
Use find with cat; address #200.

- We may hit (and probably have with the mt.gov job) '-bash: /bin/cat: Argument list too long'
ruebot committed Nov 19, 2018
1 parent dd65d90 commit e8f6f26fcf478b01b94068fa08048772f703fe34
Showing with 2 additions and 2 deletions.
  1. +2 −2 app/jobs/graphpass_job.rb
@@ -28,11 +28,11 @@ def perform(user_id, collection_id)
 graphpass_cmd = graphpass + graphpass_flags
 logger.info 'Executing: ' + graphpass_cmd
 system(graphpass_cmd)
-combine_full_url_output_cmd = 'cat ' + collection_derivatives + '/all-domains/output/part* > ' + collection_derivatives + '/all-domains/' + c.collection_id.to_s + '-fullurls.txt'
+combine_full_url_output_cmd = 'find ' + collection_derivatives + '/all-domains/output -iname "part*" -type f -exec cat {} > ' + collection_derivatives + '/all-domains/' + c.collection_id.to_s + '-fullurls.txt \;'
 logger.info 'Executing: ' + combine_full_url_output_cmd
 system(combine_full_url_output_cmd)
 FileUtils.rm_rf(collection_derivatives + '/all-domains/output')
-combine_full_text_output_cmd = 'cat ' + collection_derivatives + '/all-text/output/part* > ' + collection_derivatives + '/all-text/' + c.collection_id.to_s + '-fulltext.txt'
+combine_full_text_output_cmd = 'find ' + collection_derivatives + '/all-text/output -iname "part*" -type f -exec cat {} > ' + collection_derivatives + '/all-text/' + c.collection_id.to_s + '-fulltext.txt \;'
 logger.info 'Executing: ' + combine_full_text_output_cmd
 system(combine_full_text_output_cmd)
 FileUtils.rm_rf(collection_derivatives + '/all-text/output')
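
For comparison, the same concatenation could in principle be done without a shell at all. The sketch below is not what this commit does. `concat_parts` is a hypothetical helper, and it assumes the part files sit directly inside the output directory. `Dir.glob` expands the pattern inside the Ruby process, so no argument list is passed to an exec call and the "Argument list too long" limit should not apply. Unlike the `find` command above, it is case-sensitive, does not recurse into subdirectories, and sorts the parts by name (`find` does not promise any particular order).

```ruby
require 'fileutils'

# Hypothetical helper, not part of this commit: concatenate every regular
# file matching part* directly under output_dir into dest, in name order.
# Returns the number of part files that were copied.
def concat_parts(output_dir, dest)
  # Expand the pattern in-process; skip directories that happen to match.
  parts = Dir.glob(File.join(output_dir, 'part*'))
             .select { |path| File.file?(path) }
             .sort

  File.open(dest, 'wb') do |out|
    # IO.copy_stream streams each part, so large files are not read
    # into memory all at once.
    parts.each { |part| IO.copy_stream(part, out) }
  end

  parts.length
end
```

In the job this might be called as `concat_parts(collection_derivatives + '/all-domains/output', collection_derivatives + '/all-domains/' + c.collection_id.to_s + '-fullurls.txt')`, again only as an illustration under the assumptions stated above.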
