Update Docker lesson plan with new graph write commands #92

Closed
ianmilligan1 opened this Issue Jan 28, 2019 · 4 comments

ianmilligan1 commented Jan 28, 2019

Thanks for catching this @SamFritz!

ianmilligan1 self-assigned this Jan 28, 2019

ianmilligan1 commented Jan 29, 2019

I wasn't able to reproduce your error, @SamFritz. Can you try to recreate it on your end using the Docker docs and drop the error message here?

import io.archivesunleashed._
import io.archivesunleashed.app._
import io.archivesunleashed.matchbox._

val links = RecordLoader.loadArchives("/aut-resources/Sample-Data/*.gz", sc)
  .keepValidPages()
  .map(r => (r.getCrawlDate, ExtractLinks(r.getUrl, r.getContentString)))
  .flatMap(r => r._2.map(f => (r._1, ExtractDomain(f._1).replaceAll("^\\s*www\\.", ""), ExtractDomain(f._2).replaceAll("^\\s*www\\.", ""))))
  .filter(r => r._2 != "" && r._3 != "")
  .countItems()
  .filter(r => r._2 > 5)

WriteGEXF(links, "/data/links-for-gephi-test.gexf")

This works for me when I launch Docker with docker run --rm -it -v "/Users/ianmilligan1/desa:/data" archivesunleashed/docker-aut:0.17.0.

SamFritz commented Jan 29, 2019

I think I was experiencing wifi interference yesterday when testing.

This script runs successfully.

SamFritz commented Jan 29, 2019

@ianmilligan1 Closing issue as script does not need updating.

SamFritz closed this Jan 29, 2019

ianmilligan1 commented Jan 29, 2019

Great, thanks!
