
Knowledge Graph Maintenance


Thoughts on combining humans and machines to maintain knowledge graphs. Thanks to:

https://dfdazac.github.io
https://thiviyansingam.com

Published in: Technology


  1. Knowledge Graph Maintenance. Prof. Paul Groth | @pgroth | pgroth.com | indelab.org. Thanks to Daniel Daza, Thiviyan Thanapalsingam and Frank van Harmelen. Knowledge Graph Conference 2020
  2. Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure, written by Nadia Eghbal. Source: https://www.fordfoundation.org/work/learning/research-reports/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure/
  3. Source: Azzaoui, K., Jacoby, E., Senger, S., Rodríguez, E. C., Loza, M., Zdrazil, B., … Ecker, G. F. (2013). Scientific competency questions as the basis for semantically enriched open pharmacological space development. Drug Discovery Today, 18(17–18), 843–852. https://doi.org/10.1016/j.drudis.2013.05.008
  4. Source: https://www.biocuration2019.org/about
  5. Source: https://www.wired.com/story/inside-the-alexa-friendly-world-of-wikidata/
  6. Source: https://stats.wikimedia.org/v2/#/en.wikipedia.org/contributing/user-edits/normal||2001-01-01~2019-09-01|~total|
  7. Crowdsourcing 100,000s of hand-annotated examples: the TAC Relation Extraction Dataset. Sources: Zhang, Yuhao, et al. "Position-aware attention and supervised data improve slot filling." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. Karen Fort, Gilles Adda, Kevin Bretonnel Cohen. "Amazon Mechanical Turk: Gold Mine or Coal Mine?" Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2011, pp. 413-420. doi:10.1162/COLI_a_00057
  8. Data Work == People Work
  9. Sources of change for a knowledge organization system (KOS):
     1. dealing with changing cultural and societal norms, specifically to address or correct bias
     2. political influence
     3. new concepts and terminology arising from discoveries or change in perspective within a technical/scientific community
     4. gardening
     5. incremental contributorship
     6. progressive formalization
     7. software and automation
     8. integration of large numbers of data sources
     9. variance in algorithm training data
     [Diagram labels: Concept1, Concept2, Concept3, KOS, Professional Curators, Literature, Software, Non-professional contributors, Data, Society & Politics. Sources are grouped in the diagram as (4, 5, 6), (7, 8, 9), (3), (1, 2).]
     Source: Michael Lauruhn and Paul Groth. "Sources of Change for Modern Knowledge Organization Systems." Knowledge Organization 43, no. 8 (2016).
  10. Apply ML
  11. Link Prediction & KG Curation
     [Pipeline diagram labels: Content, Universal schema, Surface form relations, Structured relations, Factorization model, Matrix Construction, Open Information Extraction, Entity Resolution, Matrix Factorization, Knowledge graph, Curation, Predicted relations, Matrix Completion, Taxonomy, Triple Extraction, Concept Resolution.]
     Figures on the slide: 14M SD articles; 475M triples; 3.3 million relations; 49M relations; ~15k -> 1M entries.
     Source: Paul Groth, Sujit Pal, Darin McBeath, Brad Allen, Ron Daniel. "Applying Universal Schemas for Domain Specific Ontology Expansion." 5th Workshop on Automated Knowledge Base Construction (AKBC) 2016.
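The matrix construction, factorization and completion steps named on slide 11 can be illustrated with a minimal sketch. It is not the method from the cited paper: a truncated SVD stands in for the factorization model, which the slide does not specify, and the entity pairs, relation names, rank and threshold are all hypothetical. Rows are entity pairs, columns mix surface form relations (`text:`) and structured relations (`kg:`), and a low-rank reconstruction scores the cells that were never observed.

```python
import numpy as np

# Hypothetical toy data, not taken from the slides.
# Columns mix surface-form relations (text:) and structured relations (kg:).
RELATIONS = ["text:works at", "kg:employer", "text:born in", "kg:birthplace"]
PAIRS = [
    ("Ada", "Acme"), ("Bob", "Initech"), ("Cy", "Globex"),    # employment pairs
    ("Ada", "Oslo"), ("Bob", "Lima"), ("Cy", "Kyoto"),        # birthplace pairs
    ("Dee", "Hooli"),                                         # seen only in text
]

# Matrix construction: 1.0 where a relation was observed for an entity pair.
OBSERVED = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 0],   # "Dee works at Hooli" appears in text, no structured fact yet
], dtype=float)


def complete(matrix, rank):
    """Matrix factorization and completion via a rank-`rank` truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def predict_relations(matrix, scores, row, threshold=0.3):
    """List unobserved relations for `row` scoring above `threshold`, best first."""
    candidates = [
        (scores[row, col], RELATIONS[col])
        for col in range(matrix.shape[1])
        if matrix[row, col] == 0 and scores[row, col] > threshold
    ]
    return [name for _, name in sorted(candidates, reverse=True)]


SCORES = complete(OBSERVED, rank=2)
PREDICTED = predict_relations(OBSERVED, SCORES, row=6)

if __name__ == "__main__":
    print(PAIRS[6], "->", PREDICTED)
```

In this toy matrix the text-only pair receives a positive score for `kg:employer` because that column co-occurs with `text:works at` in other rows. Such predicted relations are candidates for the curation step, not confirmed facts.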
  12. Link Prediction
  13. Inductive Prediction
  14. Inductive Prediction
  15. Inductive Prediction
  16. Future: Sub-graph Prediction
  17. Future: Learning KG Pipelines End-to-End. Source: Paul T. Groth, Antony Scerri, Ron Daniel, Bradley P. Allen: End-to-End Learning for Answering Structured Queries Directly over Text. DL4KG@ESWC 2019: 57-70
  18. Data Work == People Work
  19. Knowledge Engineering Revisited
     • Knowledge graphs are built ad hoc
     • 100s of components (extractors, scrapers, quality, scoring, user feedback, …)
     • Unique for each organization
     • Existing knowledge engineering theory does not apply:
       • Assumes small scale
       • Assumes slow change
       • People-centric
       • Expressive representations
     • An updated theory and methods for knowledge engineering designed for the demands of modern knowledge graphs
  20. knowledgescientist.org
  21. Conclusion
     • Knowledge graphs require maintenance
     • Maintenance is frequently people work
     • New ML-based methods & new human + machine workflows
     • Interested? Happy to talk more
     Paul Groth | @pgroth | pgroth.com | indelab.org. Thanks to Daniel Daza, Thiviyan Thanapalsingam and Frank van Harmelen
