Add ComputeSHA1 method; resolves #363. #364
codecov bot commented Oct 8, 2019
Codecov Report
@@            Coverage Diff            @@
##           master    #364    +/-   ##
=========================================
+ Coverage   75.93%  76.24%  +0.3%
=========================================
  Files          39      40     +1
  Lines        1392    1410    +18
  Branches      267     267
=========================================
+ Hits         1057    1075    +18
  Misses        218     218
  Partials      117     117
Builds nicely locally. Re:
That makes sense to me?
- Update tests where needed
- Add SHA1 method to ExtractImageDetails
- Add SHA1 to DataFrames binary extraction and analysis
I'll update the bleeding edge documentation once we're good to go here.
New commit looks great! Runs nicely with this script:

import io.archivesunleashed._
import io.archivesunleashed.df._
val df = RecordLoader.loadArchives("example.arc.gz", sc).extractImageDetailsDF()
df.select($"url", $"filename", $"extension", $"mime_type_web_server", $"mime_type_tika", $"width", $"height", $"md5", $"sha1", $"bytes").orderBy(desc("md5")).show()
ruebot commented Oct 8, 2019
GitHub issue(s): #363
What does this Pull Request do?
Add ComputeSHA1 method.
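For readers unfamiliar with what such a method involves, here is a minimal sketch of computing a SHA-1 checksum over a byte array using only the JDK's MessageDigest. The names Sha1Sketch and sha1Hex are hypothetical and chosen for illustration; this is not the ComputeSHA1 implementation from this pull request, which may differ in naming and in how it hex-encodes the digest.

```scala
import java.security.MessageDigest

// Hypothetical helper for illustration only; not the ComputeSHA1
// object added by this pull request.
object Sha1Sketch {
  /** Returns the SHA-1 digest of `bytes` as a 40-character lowercase hex string. */
  def sha1Hex(bytes: Array[Byte]): String = {
    // "SHA-1" is a standard algorithm name every JDK is required to support.
    val digest = MessageDigest.getInstance("SHA-1").digest(bytes)
    // Format each byte as two hex digits; %02x treats a negative Byte as unsigned.
    digest.map(b => "%02x".format(b)).mkString
  }
}
```

Calling Sha1Sketch.sha1Hex("abc".getBytes("UTF-8")) should yield a9993e364706816aba3e25717850c26c9cd0d89d, the standard SHA-1 test vector for "abc". An MD5 variant would differ only in the algorithm name passed to MessageDigest.getInstance.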
How should this be tested?
Additional Notes:
Do we want to add this method to all our DataFrame methods? Basically add a sha1 column, along with the md5 column? If so, something like this here: