Tree: 59b1d4e949
-
Spark 3.0.0 + Java 11 support. (#375)
- Update to Spark 3.0.0
- Update to Java 11
- Update README
- Remove Java8 support
- Resolves #375
Verified
This commit was created on GitHub.com and signed with a verified signature using GitHub’s key. GPG key ID: 4AEE18F83AFDEB23
-
Add Python implementation of SaveBytes. (#482)
- Resolves #478
- Tweak formatting in DataFrameLoader
-
Bump xercesImpl from 2.11.0 to 2.12.0 (#481)
Bumps xercesImpl from 2.11.0 to 2.12.0.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
[Skip Travis] Trim README down given aut.docs.archivesunleashed.org (#480)
ruebot committed Jun 8, 2020
-
[maven-release-plugin] prepare release aut-0.80.0
ruebot committed Jun 3, 2020
Verified
This commit was signed with a verified signature. ruebot (Nick Ruest), GPG key ID: 417FAF1A0E1080CD
-
Remove RDD suffixes on file, class, and object names. (#479)
- Remove all the RDD suffixes added previously
- Rename image_graph to imagegraph (Python)
- Rename GetExtensionMime to GetExtensionMIME (Scala)
- Remove textFiles (Scala)
- Remove text_files (Python)
- Remove TextFilesInformationExtractor
- Rename all affected files as needed
- Update tests as needed
-
PEP8 Python app method names. (#477)
- Resolve #468
-
Move Python UDF methods out of their own class. (#475)
- Resolve #467
- README button colour tweak for UserDocs
-
Add DataFrame udf tests. (#474)
- Resolves #473
-
Add ExtractPopularImages, WriteGEXF, and WriteGraphML to Python. (#466)
- Resolves #409
- Add Python implementations of:
  - ExtractPopularImages
  - WriteGraphML
  - WriteGEXF
- Clean up formatting in app.py, and udfs
- Clean up doc comments on the Scala side
-
[maven-release-plugin] prepare release aut-0.70.0
ruebot committed May 4, 2020
-
[skip travis] README updates (#460)
ruebot committed May 4, 2020
- `$` should only be used if output is also shown (mdl)
- Add UserDoc badge, and yank buried documentation section
- Additional formatting and typo fixes
-
Add RemovePrefixWWWDF to DomainFrequencyExtractor. (#457)
- Resolves #456
- Update test
-
[skip travis] Updating Java install instructions, resolves #445 (#455)
ianmilligan1 committed Apr 23, 2020
-
Add option to save to Parquet for app. (#454)
- Resolves #448
- Update test
- Add CSV headers to coalesce CSV output
- Update README
-
Update PlainTextExtractor to output a single column; text. (#453)
- Resolves #452
- PlainTextExtractor runs ExtractBoilerplate on `content`
- Update test
-
Add a number of additional app extractors. (#451)
- Resolves #447
- Add new extractors:
  - AudioInformationExtractor
  - ImageInformationExtractor
  - PDFInformationExtractor
  - PresentationProgramInformationExtractor
  - SpreadsheetInformationExtractor
  - TextFilesInformationExtractor
  - VideoInformationExtractor
  - WebGraphExtractor
  - WordProcessorInformationExtractor
- Add tests for the new extractors
- Update CommandLineApp to use new extractors
- Add domain and language columns to WebPagesExtractor
- Change "TEXT" to "csv"
- Lower case "GEXF" and "GRAPHML"
-
Remove RDD option in app; DataFrame only now. (#450)
- Resolves #449
- Update and rename tests where applicable
- Update README to reflect updates
-
[maven-release-plugin] prepare release aut-0.60.0
ruebot committed Apr 15, 2020
-
Remove GraphX support; resolves #442. (#443)
- Remove graphx dependencies from pom
- Remove ExtractGraphX and related tests
- Remove WriteGraphXML and related tests
-
Add graphml output to CommandLineApp and DomainGraphExtractor. (#438)
* Resolves #435
* Adds GRAPHML option to CommandLineApp
* Adds DataFrame method to DomainGraphExtractor
* Updates CommandLineApp, and WriteGraphML tests
-
Align RDD and DF output for DomainGraphExtractor. (#437)
- Resolves #436
- Fix WWW prefix removal for RDD, which was double escaping
- Update DF so it matches RDD output (it wasn't even close before 🤦)
- Update tests so they're basically testing the same thing
-
Add imagegraph, and webgraph to command line app. (#432)
- Resolves #431
- Adds webpages, and imagegraph to command line app
- Adds tests for new functionality
- Clean-up doc comments
- Convert files with dos line endings to unix line endings
- Update CommandLineApp tests