Update Spark jobs to use 0.60.0 of aut, and DataFrames instead of RDD #386
data migration:
ruebot added a commit that referenced this issue on Apr 16, 2020
- Resolves #386
- Move the faux txt derivatives to what they actually are; csv.
- Update Spark job to use DataFrames
- Update auk documentation and lessons with correct file extension (s/txt/csv)
- Data migration needs to be completed on prod
  - rename full-text and full-domains
    - s/.txt/.csv/g
  - on all -fullurls.txt
    - remove the first and last character on each line. ( )
ruebot added a commit that referenced this issue on Apr 16, 2020
- Resolves #386
- Move the faux txt derivatives to what they actually are; csv.
- Update Spark job to use DataFrames
- Update auk documentation and lessons with correct file extension (s/txt/csv)
- Data migration needs to be completed on prod
  - rename full-text and full-domains
    - s/.txt/.csv/g
  - on all -fullurls.txt
    - remove the first and last character on each line. ( )
- TravisCI should only test Ruby 2.6.5
- Update tests to reflect changes
- Rename text fixtures
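
The data migration steps listed in the commit message could be scripted roughly as follows. This is only a minimal sketch, not the migration that was actually run on prod: the function names and the example file names are hypothetical, and it assumes the "first and last character" on each line of a `-fullurls.txt` file are a wrapping pair of parentheses, so lines without them are left untouched.

```python
from pathlib import Path


def strip_wrapping_parens(line: str) -> str:
    """Drop the first and last character of a line.

    Assumption: those characters are "(" and ")". Lines that are not
    wrapped this way are returned unchanged, so a re-run is harmless.
    """
    if len(line) >= 2 and line[0] == "(" and line[-1] == ")":
        return line[1:-1]
    return line


def rename_txt_to_csv(path: Path) -> Path:
    """Rename a .txt derivative to .csv (the s/.txt/.csv/g step)."""
    target = path.with_suffix(".csv")
    path.rename(target)
    return target


def migrate_fullurls(path: Path) -> Path:
    """Unwrap every line of a -fullurls.txt file and store it as .csv."""
    target = path.with_suffix(".csv")
    lines = path.read_text(encoding="utf-8").splitlines()
    cleaned = [strip_wrapping_parens(line) for line in lines]
    target.write_text("".join(f"{line}\n" for line in cleaned), encoding="utf-8")
    path.unlink()  # remove the old .txt once the .csv has been written
    return target
```

For example, `migrate_fullurls(Path("1-fullurls.txt"))` would write `1-fullurls.csv` next to the original and delete the `.txt` file, while `rename_txt_to_csv` covers the full-text and full-domains derivatives, which only need the new extension.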
ruebot commented Apr 16, 2020
No description provided.