Add office document binary extraction. (#346)
- Add Word Processor DF and binary extraction
- Add Spreadsheets DF and binary extraction
- Add Presentation Program DF and binary extraction
- Add Text files DF and binary extraction
- Add tests for new DF and binary extractions
- Add test fixtures for new DF and binary extractions
- Resolves #303
- Resolves #304
- Resolves #305
- Use aut-resources repo to distribute our shaded tika-parsers 1.22
- Close TikaInputStream
- Add RDD filters on MimeTypeTika values
- Add CodeCov configuration yaml
- Includes work by @jrwiebe, see #346 for all commits before squash
Showing 11 changed files with 602 additions and 50 deletions.
- +26 −0 .codecov.yml
- +4 −0 pom.xml
- +1 −0 src/main/scala/io/archivesunleashed/matchbox/DetectMimeTypeTika.scala
- +0 −23 src/main/scala/io/archivesunleashed/matchbox/ExtractAtMentions.scala
- +270 −27 src/main/scala/io/archivesunleashed/package.scala
- BIN src/test/resources/warc/example.docs.warc.gz
- BIN src/test/resources/warc/example.txt.warc.gz
- +72 −0 src/test/scala/io/archivesunleashed/df/ExtractPresentationProgramDetailsTest.scala
- +85 −0 src/test/scala/io/archivesunleashed/df/ExtractSpreadsheetDetailsTest.scala
- +66 −0 src/test/scala/io/archivesunleashed/df/ExtractTextFilesDetailsTest.scala
- +78 −0 src/test/scala/io/archivesunleashed/df/ExtractWordProcessorDetailsTest.scala
@@ -0,0 +1,26 @@
codecov:
  notify:
    require_ci_to_pass: yes

coverage:
  precision: 2
  round: down
  range: "50...80"

  status:
    project: yes
    patch: yes
    changes: no

parsers:
  gcov:
    branch_detection:
      conditional: yes
      loop: yes
      method: no
      macro: no

comment:
  layout: "header, diff"
  behavior: default
  require_changes: no
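
The `coverage` keys above control how a coverage percentage is displayed: `precision` sets the number of decimal places, `round` sets the rounding direction, and `range` sets the bounds of the red-to-green colour scale. The following Python sketch illustrates that behaviour only; it is not Codecov's implementation. The function names (`round_coverage`, `parse_range`, `classify`) are hypothetical, the three-colour bucketing simplifies what Codecov renders as a gradient, and treating "nearest" as half-up rounding is an assumption.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_UP, ROUND_HALF_UP

# Maps the config's `round` values to decimal rounding modes.
# "nearest" -> half-up is an assumption made for this sketch.
_ROUNDING = {"down": ROUND_DOWN, "up": ROUND_UP, "nearest": ROUND_HALF_UP}


def round_coverage(percent, precision=2, mode="down"):
    """Round a coverage percentage to `precision` decimal places."""
    step = Decimal(10) ** -precision  # precision=2 -> Decimal('0.01')
    return Decimal(str(percent)).quantize(step, rounding=_ROUNDING[mode])


def parse_range(spec):
    """Split a range string such as "50...80" into (low, high) floats."""
    low, high = spec.split("...")
    return float(low), float(high)


def classify(percent, low, high):
    """Bucket a percentage: below `low` is red, at or above `high` is green."""
    if percent < low:
        return "red"
    if percent >= high:
        return "green"
    return "yellow"


if __name__ == "__main__":
    low, high = parse_range("50...80")
    shown = round_coverage("79.999", precision=2, mode="down")
    print(shown, classify(float(shown), low, high))
```

With `round: down`, a value just under a threshold is never rounded up across it, so a project at 79.999% is still reported below the upper bound of 80.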
0 comments on commit c824ad8