Add DomainGraphExtractor examples #52
Merged
+16
−4
The RDD commands are throwing an error; the DF ones work.
current/aut-spark-submit-app.md
Outdated
Text output:

```shell
spark-submit --class io.archivesunleashed.app.CommandLineAppRunner path/to/aut-fatjar.jar --extractor ImageGraphExtractor --input /path/to/warcs/* --output output/path --output-format TEXT
```
ianmilligan1
Apr 8, 2020
Member
I couldn't get this to work without the `--df` flag (i.e. the commands below worked). Running the above command leads to:
20/04/08 16:09:28 ERROR CommandLineApp: ImageGraphExtractor not supported with RDD. The following extractors are supported:
20/04/08 16:09:28 ERROR CommandLineApp: DomainFrequencyExtractor
20/04/08 16:09:28 ERROR CommandLineApp: DomainGraphExtractor
20/04/08 16:09:28 ERROR CommandLineApp: PlainTextExtractor
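To illustrate the error above, here is a minimal sketch of a wrapper that adds `--df` only when the chosen extractor is not in the RDD-supported list. The function names `needs_df_flag` and `build_command` are hypothetical and not part of aut, and the list of RDD-supported extractors is taken solely from the log output in this comment, so it may not hold for other versions.

```shell
#!/bin/sh
# Hypothetical helper, not part of aut.
# The RDD-supported extractors below are copied from the error log in this
# thread (Apr 8, 2020); treat the list as an assumption for other versions.
needs_df_flag() {
  case "$1" in
    DomainFrequencyExtractor|DomainGraphExtractor|PlainTextExtractor)
      echo "no" ;;
    *)
      echo "yes" ;;
  esac
}

# Prints (does not run) a spark-submit command line, appending --df
# when the extractor is not in the RDD-supported list.
build_command() {
  cmd="spark-submit --class io.archivesunleashed.app.CommandLineAppRunner path/to/aut-fatjar.jar --extractor $1 --input /path/to/warcs/* --output output/path"
  if [ "$(needs_df_flag "$1")" = "yes" ]; then
    cmd="$cmd --df"
  fi
  echo "$cmd"
}

build_command ImageGraphExtractor
```

The sketch only prints the command so it can be inspected before running; any `--output-format` option would still need to be appended by hand.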
current/aut-spark-submit-app.md
Outdated
GEXF output:

```shell
spark-submit --class io.archivesunleashed.app.CommandLineAppRunner path/to/aut-fatjar.jar --extractor ImageGraphExtractor --input /path/to/warcs/* --output output/path --output-format GEXF
```
current/aut-spark-submit-app.md
Outdated
```shell
spark-submit --class io.archivesunleashed.app.CommandLineAppRunner path/to/aut-fatjar.jar --extractor ImageGraphExtractor --input /path/to/warcs/* --output output/path --df --partition 1
spark-submit --class io.archivesunleashed.app.CommandLineAppRunner path/to/aut-fatjar.jar --extractor ImageGraphExtractor --input /path/to/warcs/* --output output/path --df --output-format GEXF
```
Looks good (of course, GEXF results are off as per aut/#436).
ruebot commented Apr 8, 2020
No description provided.