Update keepValidPages to include a filter on 200 OK. (#360)
- Add status code filter to keepValidPages
- Add MimeTypeTika to valid pages DF
- Update tests since we filter more and better now 😄
- Resolves #359
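
To illustrate what a status code filter of this kind does, here is a minimal sketch in plain Scala collections. It is not the code from this commit: the `Record` case class, its field names, the MIME type set, and the `ValidPagesSketch` object are all hypothetical stand-ins, and the only criterion taken from the commit message is that a valid page must have a 200 OK status.

```scala
// Hypothetical stand-in for an archive record; the field names are
// assumptions for illustration, not the library's actual API.
final case class Record(
  url: String,
  mimeType: String,
  httpStatus: String,
  crawlDate: Option[String]
)

object ValidPagesSketch {
  // Assumed set of MIME types that count as a "page" in this sketch.
  private val pageMimeTypes = Set("text/html", "application/xhtml+xml")

  // A record is kept only if it has a crawl date, looks like a page,
  // and was served with HTTP status 200 (the filter this commit describes).
  def isValidPage(r: Record): Boolean =
    r.crawlDate.isDefined &&
      pageMimeTypes.contains(r.mimeType) &&
      r.httpStatus == "200"

  def keepValidPages(records: Seq[Record]): Seq[Record] =
    records.filter(isValidPage)

  // Small made-up sample: one valid page, one 404, one non-page resource.
  val sample: Seq[Record] = Seq(
    Record("http://example.com/", "text/html", "200", Some("20080430")),
    Record("http://example.com/missing", "text/html", "404", Some("20080430")),
    Record("http://example.com/logo.png", "image/png", "200", Some("20080430"))
  )
}
```

Calling `ValidPagesSketch.keepValidPages(ValidPagesSketch.sample)` keeps only the first record. A stricter filter like this lowers the record counts a test suite expects, which is consistent with the test updates listed in this commit.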
Showing 22 additions and 19 deletions.
- +7 −3 src/main/scala/io/archivesunleashed/package.scala
- +7 −8 src/test/scala/io/archivesunleashed/RecordRDDTest.scala
- +3 −3 src/test/scala/io/archivesunleashed/app/DomainFrequencyExtractorTest.scala
- +1 −1 src/test/scala/io/archivesunleashed/app/DomainGraphExtractorDfTest.scala
- +1 −1 src/test/scala/io/archivesunleashed/app/DomainGraphExtractorTest.scala
- +1 −1 src/test/scala/io/archivesunleashed/app/PlainTextExtractorTest.scala
- +2 −2 src/test/scala/io/archivesunleashed/df/SimpleDfTest.scala
Commit 9b3e025