Add office document binary extraction. (#346)
- Add Word Processor DF and binary extraction
- Add Spreadsheets DF and binary extraction
- Add Presentation Program DF and binary extraction
- Add Text files DF and binary extraction
- Add tests for new DF and binary extractions
- Add test fixtures for new DF and binary extractions
- Resolves #303
- Resolves #304
- Resolves #305
- Use aut-resources repo to distribute our shaded tika-parsers 1.22
- Close TikaInputStream
- Add RDD filters on MimeTypeTika values
- Add CodeCov configuration yaml
- Includes work by @jrwiebe, see #346 for all commits before squash
Showing changed files with 602 additions and 50 deletions.
- +26 −0 .codecov.yml
- +4 −0 pom.xml
- +1 −0 src/main/scala/io/archivesunleashed/matchbox/DetectMimeTypeTika.scala
- +0 −23 src/main/scala/io/archivesunleashed/matchbox/ExtractAtMentions.scala
- +270 −27 src/main/scala/io/archivesunleashed/package.scala
- BIN src/test/resources/warc/example.docs.warc.gz
- BIN src/test/resources/warc/example.txt.warc.gz
- +72 −0 src/test/scala/io/archivesunleashed/df/ExtractPresentationProgramDetailsTest.scala
- +85 −0 src/test/scala/io/archivesunleashed/df/ExtractSpreadsheetDetailsTest.scala
- +66 −0 src/test/scala/io/archivesunleashed/df/ExtractTextFilesDetailsTest.scala
- +78 −0 src/test/scala/io/archivesunleashed/df/ExtractWordProcessorDetailsTest.scala
@@ -0,0 +1,26 @@
codecov:
  notify:
    require_ci_to_pass: yes

coverage:
  precision: 2
  round: down
  range: "50...80"

  status:
    project: yes
    patch: yes
    changes: no

parsers:
  gcov:
    branch_detection:
      conditional: yes
      loop: yes
      method: no
      macro: no

comment:
  layout: "header, diff"
  behavior: default
  require_changes: no
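
As a rough illustration of the `coverage` settings above, the sketch below shows what `precision: 2` combined with `round: down` does to a raw percentage, and where a value falls relative to the `range: "50...80"` bounds. The function names `round_coverage` and `position_in_range` are hypothetical and this is not Codecov's implementation; it only assumes that `round: down` truncates toward zero at the configured number of decimal places and that the two `range` numbers act as lower and upper bounds for the coverage display.

```python
from decimal import Decimal, ROUND_DOWN


def round_coverage(raw: str, precision: int = 2) -> Decimal:
    """Truncate a coverage percentage to `precision` decimal places.

    Mirrors `precision: 2` with `round: down` (assumed to mean truncation).
    """
    quantum = Decimal(10) ** -precision  # precision=2 gives Decimal("0.01")
    return Decimal(raw).quantize(quantum, rounding=ROUND_DOWN)


def position_in_range(
    value: Decimal,
    low: Decimal = Decimal(50),
    high: Decimal = Decimal(80),
) -> str:
    """Report where a percentage sits relative to the `range` bounds."""
    if value <= low:
        return "at or below lower bound"
    if value >= high:
        return "at or above upper bound"
    return "inside range"


if __name__ == "__main__":
    for raw in ("42.5", "75.129", "79.999", "91.0"):
        rounded = round_coverage(raw)
        print(raw, "->", rounded, "|", position_in_range(rounded))
```

Using `Decimal` rather than `float` keeps the truncation exact, so a value such as 79.999 is reported as 79.99 and not rounded up to 80.00.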
0 comments on commit c824ad8