
A Vision of the Library’s Role in Archiving Scholarly Artifacts

Invited talk at ASLI 2019

Published in: Internet

  1. A Vision of the Library’s Role in Archiving Scholarly Artifacts. ASLI 2019, Phoenix, AZ, 01/09/2019. Martin Klein, Los Alamos National Laboratory, @mart1nkle1n, https://orcid.org/0000-0003-0130-2097. The Scholarly Orphans project is funded by the Andrew W. Mellon Foundation.
  2. Scholarly Orphans – Project Motivation
  3. Research and Research Communication on the Web
     • Consideration
       • Researchers are increasingly using a variety of web platforms for collaboration and communication
     • Why?
       • Many of these platforms have desirable characteristics
         • Versioning
         • Time stamping
         • Social embedding
       • Their institutions do not provide platforms that have global reach
         • Collaboration, cf. GitHub ~ productivity
         • Communication, cf. SlideShare ~ visibility
  4. Emma Schymanski
     https://orcid.org/0000-0001-6868-8145
     https://github.com/schymane
     https://www.slideshare.net/EmmaSchymanski
     https://figshare.com/authors/Emma_Schymanski/5087039
     https://publons.com/author/1538491/emma-schymanski#profile
  5. Research and Research Communication on the Web
     • Consideration
       • Researchers deposit artifacts in these web platforms
     • Web platforms:
       • Dedicated to scholarship:
         • Commercial: e.g., FigShare, Publons
         • Not for profit: e.g., OSF, Zenodo
       • General purpose:
         • Commercial: e.g., GitHub, SlideShare
         • Not for profit: e.g., Wikipedia, Wikidata
  6. Research and Research Communication on the Web
     • Consideration
       • Researchers deposit artifacts in these web platforms
     • Status quo - The researchers’ institutions commonly:
       • Do not know about the existence of these artifacts
       • Do not have a copy of these artifacts
  7. Research and Research Communication on the Web
     • Consideration
       • Researchers deposit artifacts in these web platforms
     • Status quo – Uncertainty regarding long-term accessibility of these artifacts:
       • General purpose platforms don’t provide long-term access guarantees; platforms dedicated to scholarship commonly do
       • Uncertainty regarding the sustainability of unhindered long-term access to artifacts in these platforms:
         • Commercial: when is the change in business model coming?
         • Not for profit: will the next round of grant applications or member contributions be successful?
  8. Research and Research Communication on the Web
     • Consideration
       • Researchers deposit artifacts in these web platforms
     • Status quo - These artifacts are not systematically archived:
       • No frameworks like LOCKSS/Portico exist for these artifacts
       • Researchers only selectively deposit artifacts in portals that provide archival guarantees, in order to obtain a citable DOI
       • Can’t expect researchers to (also) upload all artifacts in institutional repositories (IRs)
       • Web archives only incidentally archive these artifacts
         • Anecdotal & Hiberlink evidence
  9. Emma’s SlideShare Artifact: 0 Mementos
     https://www.slideshare.net/EmmaSchymanski/dmcm2018-community-resources-connecting-chemistry-and-toxicity-knowledge
     http://timetravel.mementoweb.org/
  10. Hiberlink Evidence
      Web resources referenced in Elsevier corpus (1996-2012) without representative Memento in public web archives
      Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE https://doi.org/10.1371/journal.pone.0115253
  11. The Need for an Archiving Infrastructure
      Herbert Van de Sompel & Andrew Treloar (2014) A Perspective on Archiving the Scholarly Web https://hvdsomp.info/papers/Papers/2014/iPres2014_Sompel_Treloar.pdf
  12. Scholarly Orphans – Project Overview
  13. The Scholarly Orphans Project
      • How to capture Scholarly Orphans (i.e., the scholarly artifacts deposited in web portals) for long-term archiving?
      • Explores an institution-driven paradigm
        • Academic institutions typically have a long shelf life
          • A basic premise underlying e.g., LOCKSS, perma.cc
        • An academic institution should be interested in capturing the artifacts (intellectual property) its scholars deposit on the web
          • Collecting and archiving such artifacts aligns with the mission of academic libraries
  14. The Scholarly Orphans Project
      • Explores a paradigm inspired by web archiving
        • Scale of the problem
          • Can’t expect researchers to upload all artifacts in an institutional repository
          • Bilateral agreements for archival purposes with most web portals unlikely
  15. A Web Archiving Perspective
  16. Scholarly Orphans – Prototype Pipeline Overview
  17. Prototype Pipeline
  18. Tracking Artifacts
  19. Tracking Artifacts - Description
      • In order to track artifacts that were recently deposited by an institutional researcher in a portal, one reasonably needs:
        • The web identity of the researcher in the portal
          • Algorithmic discovery
          • Discovery via a registry
  20. Algorithmic Discovery of Web Identities
      James Powell, Harihar Shankar, Marko Rodriguez, and Herbert Van de Sompel (2014) EgoSystem: Where are our alumni? In: code4lib http://journal.code4lib.org/articles/9519
  21. Discovery of Web Identities via a Registry (ORCID)
      Martin Klein and Herbert Van de Sompel (2017) Discovering Scholarly Orphans Using ORCID In: JCDL2017 https://arxiv.org/abs/1703.09343
  22. Emma’s ORCID Record
      https://orcid.org/0000-0001-6868-8145
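
Registry-based discovery can be illustrated with a minimal sketch. It assumes a researcher's registry record links to portal profile pages like the ones listed for Emma Schymanski on slide 4, and it derives a per-portal web identity from each URL. The `PORTAL_RULES` table and the `web_identities` function are hypothetical names introduced for illustration; they are not part of the project's code or of the ORCID API.

```python
from urllib.parse import urlparse

# Hypothetical rules: for each portal host, the portal name and the index of
# the path segment that carries the researcher's identity in a profile URL.
PORTAL_RULES = {
    "github.com": ("GitHub", 0),             # /<user>
    "www.slideshare.net": ("SlideShare", 0),  # /<user>
    "figshare.com": ("FigShare", 2),          # /authors/<name>/<id>
    "publons.com": ("Publons", 1),            # /author/<id>/<slug>
}

# Profile URLs as shown on slide 4 of the deck.
EXAMPLE_URLS = [
    "https://github.com/schymane",
    "https://www.slideshare.net/EmmaSchymanski",
    "https://figshare.com/authors/Emma_Schymanski/5087039",
    "https://publons.com/author/1538491/emma-schymanski#profile",
]


def web_identities(urls):
    """Map portal name -> identity for every URL that matches a known portal."""
    found = {}
    for url in urls:
        parsed = urlparse(url)
        rule = PORTAL_RULES.get(parsed.netloc.lower())
        if rule is None:
            continue  # unknown host: not a portal we have a rule for
        portal, index = rule
        segments = [s for s in parsed.path.split("/") if s]
        if index < len(segments):
            found[portal] = segments[index]
    return found


if __name__ == "__main__":
    for portal, identity in sorted(web_identities(EXAMPLE_URLS).items()):
        print(portal, identity)
```

URLs on hosts without a rule are skipped, so the table can grow one portal at a time.
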
  23. Tracking Artifacts - Description
      • In order to track artifacts that were recently deposited by an institutional researcher in a portal, one reasonably needs:
        • The web identity of the researcher in the portal
          • Algorithmic discovery
          • Discovery via a registry
        • A portal API that supports:
          • Access by web identity
          • Access to contributions “since …” for the web identity
      • Result of tracking:
        • URI(s) of new artifact(s) discovered in the portal
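
The two API requirements above (access by web identity, and access to contributions "since …") can be sketched as follows. `InMemoryPortal` is a hypothetical stand-in for a real portal API, and the record URIs are made up; only the August 1 2018 start date is taken from the deck (slide 35). This is an illustration of the idea, not the project's tracker.

```python
from datetime import datetime, timezone


class InMemoryPortal:
    """Hypothetical stand-in for a portal API (no real portal is contacted)."""

    def __init__(self, records):
        # Each record is (web_identity, created, artifact_uri).
        self._records = list(records)

    def contributions(self, identity, since):
        """Return (created, uri) pairs for `identity` created after `since`."""
        return [
            (created, uri)
            for who, created, uri in self._records
            if who == identity and created > since
        ]


def track(portal, identity, since):
    """Return the URIs of new artifacts and the timestamp to use next time."""
    items = sorted(portal.contributions(identity, since))
    uris = [uri for _, uri in items]
    next_since = items[-1][0] if items else since
    return uris, next_since


UTC = timezone.utc
START = datetime(2018, 8, 1, tzinfo=UTC)  # tracking window start per slide 35
RECORDS = [  # made-up example data
    ("schymane", datetime(2018, 7, 15, tzinfo=UTC), "https://portal.example/a/1"),
    ("schymane", datetime(2018, 8, 30, tzinfo=UTC), "https://portal.example/a/2"),
    ("schymane", datetime(2018, 9, 12, tzinfo=UTC), "https://portal.example/a/3"),
    ("someone-else", datetime(2018, 9, 1, tzinfo=UTC), "https://portal.example/b/1"),
]

if __name__ == "__main__":
    new_uris, next_since = track(InMemoryPortal(RECORDS), "schymane", START)
    print(new_uris, next_since.isoformat())
```

Returning the timestamp of the newest artifact lets the next run ask only for contributions created after it.
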
  24. Capturing Artifacts
  25. Capturing Artifacts - Description
      • The capture process takes as input the URI of a new artifact discovered in a portal
      • Its task is to create a representative institutional capture of the artifact
      • Result of capture:
        • WARC file for new artifact in an institutional archive
  26. Capturing Artifacts - Description
      • Challenges:
        • Delineate the web boundary of the artifact
          • More than the input artifact URI
          • The boundary is in the eye of the beholder
        • Create a high-fidelity capture using an approach that scales for a steady stream of new artifacts
          • Unsolved problem
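
To make the capture result more concrete, the sketch below hand-builds a single WARC/1.0 `resource` record for one payload. It is a simplified illustration only: the function name, the example URI, and the payload are assumptions, and a real capture would come from a crawler or browser-based tool, would typically contain request and response records, and would cover every resource inside the artifact's web boundary.

```python
import uuid
from datetime import datetime, timezone


def warc_resource_record(target_uri, payload, content_type, when):
    """Serialize one WARC/1.0 'resource' record as bytes.

    `payload` must be bytes and `when` a timezone-aware datetime.
    """
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "WARC-Date: "
        + when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF, a blank line separates header and block,
    # and two CRLFs terminate the record.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = warc_resource_record(
        "https://portal.example/artifact/1",  # made-up artifact URI
        b"<html>hello</html>",
        "text/html",
        datetime(2019, 1, 9, tzinfo=timezone.utc),
    )
    print(record.decode("utf-8"))
```
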
  27. Capturing Artifacts
  28. Capturing Artifacts
  29. Capturing Artifacts
  30. Memento Tracer - Framework
      http://tracer.mementoweb.org
  31. Archiving Artifacts
  32. Archiving Artifacts - Description
      • The archiving process takes as input the URI of a WARC file generated by the capture process
      • Its task is to ingest the WARC file in a cross-institutional web archive
        • This can be achieved using off-the-shelf web archiving software, e.g., pywb, OpenWayback
      • Result of archiving:
        • Mementos pertaining to the newly discovered artifact in a cross-institutional, Memento-compliant web archive
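
A Memento-compliant archive exposes the Mementos of a resource through a TimeMap, commonly serialized as `application/link-format`. The sketch below parses such a TimeMap and lists its Mementos, which is one way to check whether an artifact has any captures at all. The sample TimeMap and its `archive.example` and `portal.example` URIs are invented for illustration, and the parser is a simplification that handles only quoted attribute values.

```python
import re
from email.utils import parsedate_to_datetime

# One link entry: <uri> followed by zero or more ;name="value" attributes.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')

SAMPLE_TIMEMAP = (  # invented example data
    '<https://portal.example/artifact/1>; rel="original",\n'
    '<https://archive.example/timemap/link/https://portal.example/artifact/1>'
    '; rel="self"; type="application/link-format",\n'
    '<https://archive.example/20181001120000/https://portal.example/artifact/1>'
    '; rel="first memento"; datetime="Mon, 01 Oct 2018 12:00:00 GMT",\n'
    '<https://archive.example/20190109080000/https://portal.example/artifact/1>'
    '; rel="last memento"; datetime="Wed, 09 Jan 2019 08:00:00 GMT"\n'
)


def parse_timemap(text):
    """Return a list of (uri, attributes) pairs for every link entry."""
    return [
        (uri, dict(PARAM_RE.findall(raw_params)))
        for uri, raw_params in LINK_RE.findall(text)
    ]


def mementos(text):
    """Return (uri, datetime) for each entry whose rel includes 'memento'."""
    result = []
    for uri, params in parse_timemap(text):
        if "memento" in params.get("rel", "").split():
            result.append((uri, parsedate_to_datetime(params["datetime"])))
    return result


if __name__ == "__main__":
    for uri, captured in mementos(SAMPLE_TIMEMAP):
        print(captured.isoformat(), uri)
```

The regular expression works on whole link entries instead of splitting on commas, because the `datetime` values themselves contain commas.
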
  33. Demo - myresearch.institute
  34. myresearch.institute - Researchers
      • Uniquely identified by ORCIDs
      • Web identities in multiple portals
      • Create various types of artifacts
  35. myresearch.institute - Portals
      • Tracking started August 27 2018
      • Tracking artifacts created starting August 1 2018
      • >9,400 artifacts tracked to date for all 16 researchers
  36. myresearch.institute - Artifacts
      • schema.org typology:
        • Answer
        • Article
        • BlogPosting
        • Comment
        • Dataset
        • PresentationDigitalDocument
        • Question
        • Review
        • SoftwareSourceCode
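
As an illustration of how the typology might be applied, the sketch below builds a small schema.org description of a tracked artifact as a JSON-LD-style dictionary. The deck does not say how myresearch.institute stores its metadata, so the `describe_artifact` function and the choice of the `url` and `author` properties are assumptions; the type names are the nine listed above, and the example URI and ORCID are the ones shown on slides 9 and 4.

```python
import json

# The nine schema.org types listed on the slide.
ARTIFACT_TYPES = frozenset({
    "Answer", "Article", "BlogPosting", "Comment", "Dataset",
    "PresentationDigitalDocument", "Question", "Review", "SoftwareSourceCode",
})


def describe_artifact(uri, artifact_type, orcid):
    """Return a schema.org description of one artifact (hypothetical shape)."""
    if artifact_type not in ARTIFACT_TYPES:
        raise ValueError(f"unsupported artifact type: {artifact_type}")
    return {
        "@context": "https://schema.org",
        "@type": artifact_type,
        "url": uri,
        "author": {"@type": "Person", "@id": orcid},
    }


if __name__ == "__main__":
    description = describe_artifact(
        "https://www.slideshare.net/EmmaSchymanski/"
        "dmcm2018-community-resources-connecting-chemistry-and-toxicity-knowledge",
        "PresentationDigitalDocument",
        "https://orcid.org/0000-0001-6868-8145",
    )
    print(json.dumps(description, indent=2))
```
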
  37. myresearch.institute - Demo: https://myresearch.institute/
  38. Scholarly Orphans – Summary
  39. Summary (1/2)
      • The Scholarly Orphans project explores an institution-driven approach to capture scholarly artifacts deposited in web portals
      • These artifacts are out of scope of existing archival approaches such as LOCKSS, Portico, web archives
      • Institutions have a long shelf life, should be interested in collecting these artifacts, and have feasible scale for identity/artifact discovery
  40. Summary (2/2)
      • Components of the experimental pipeline:
        • Tracker: Automatically discover artifacts because researchers will not upload them to the institution
        • Capturer: High-fidelity artifact captures through crowd-sourcing navigation patterns with Memento Tracer
        • Archiver: Cross-institutional, Memento-compliant scholarly web archive
  41. Acknowledgments
      • Los Alamos National Laboratory:
        • Lyudmila Balakireva
        • Martin Klein
        • James Powell
        • Harihar Shankar
        • Herbert Van de Sompel
      • Old Dominion University:
        • Sawood Alam
        • Grant Atkins
        • Shawn Jones
        • Mat Kelly
        • Michael L. Nelson
      • myresearch.institute – all volunteering researchers
  42. Martin Klein, Los Alamos National Laboratory, @mart1nkle1n, https://orcid.org/0000-0003-0130-2097. The Scholarly Orphans project is funded by the Andrew W. Mellon Foundation.
