
US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data to serve computational toxicology and environmental science


The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program utilizes computational and data-driven approaches that integrate chemistry, exposure and biological data to help characterize potential risks from chemical exposure. The National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences, including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. The CompTox Chemicals Dashboard website provides access to data associated with ~900,000 chemical substances. New data are added on an ongoing basis, including the registration of new and emerging chemicals, data extracted from the literature, chemicals studied in our labs, and data of interest to specific research projects at the EPA. Hazard and exposure data have been assembled from a large number of public databases, and as a result the dashboard surfaces hundreds of thousands of data points. Other data include experimental and predicted physicochemical property data, in vitro bioassay data for over 4000 chemicals and 2000 assays, and millions of chemical identifiers (names and CAS Registry Numbers) to facilitate searching. Other integrated modules include an interactive read-across module, real-time physicochemical and toxicity endpoint prediction and an integrated search of PubMed. This presentation will provide an overview of the latest release of the CompTox Chemicals Dashboard and how it has developed into an integrated data hub for environmental data. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

Published in: Science

  1. US-EPA CompTox Chemicals Dashboard – integrating chemistry and biology data to serve computational toxicology and environmental science
     Antony Williams, Chris Grulke, Ann Richard, Richard Judson, Imran Shah, Grace Patlewicz, John Wambaugh, Katie Paul-Friedman, Jeremy Dunne and Jeff Edwards
     National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
     August 2019 ACS Fall Meeting, San Diego
     http://www.orcid.org/0000-0002-2668-4821
     The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA.
  2. CompTox Portal
  3. CompTox Chemicals Dashboard
     • A publicly accessible website delivering access to:
       – ~875,000 chemicals with related property data
       – Searchable by chemical, product use, gene and assay (ToxCast)
       – Experimental and predicted physicochemical property data
       – “Bioactivity data” for the ToxCast/Tox21 project
       – Links to other agency websites and public data resources
       – “Literature” searches for chemicals using public resources
       – “Batch searching” for thousands of chemicals
       – DOWNLOADABLE Open Data for reuse and repurposing
  4. CompTox Chemicals Dashboard: https://comptox.epa.gov/dashboard (875k Chemical Substances)
  5. BASIC Search
  6. Detailed Chemical Pages
  7. Experimental and Predicted Data
  8. Transparency for prediction models
  9. OPERA Predicted Properties (OPERA Models: https://github.com/kmansouri/OPERA)
  10. Access to Chemical Hazard Data
  11. Hazard Data from “ToxVal_DB”
      • The ToxVal Database contains the following data:
        – ~800,000 toxicity values
        – ~30 sources of data
        – ~22,000 sub-sources
        – ~5000 journals cited
        – ~70,000 literature citations
  12. In Vitro Bioassay Screening ToxCast and Tox21
  13. In Vitro Bioassay Screening ToxCast and Tox21
  14. Bioactivity: Downloadable Data https://www.epa.gov/chemical-research/exploring-toxcast-data-downloadable-data
  15. Sources of Exposure to Chemicals
  16. Sources of Exposure to Chemicals
  17. An “Executive Summary” Quick Look Tox Info
  18. Identifiers to Support Searches
  19. Built in “Modules”
  20. Abstract Sifter for Excel
  21. Literature Searching
  22. Literature Searching
  23. Literature Searching
  24. Generalized Read-Across (GenRA)
  25. Related Publications
  26. Mapped Relationships
  27. Relationships in the Data
  28.
  29. Bisphenol A: 27 Total MS-Ready Mappings
  30. Related Substances – Transformation Products, “Monomer-Polymer”. What No Structures???
  31. Quality Control
  32. Quality Control of the Database
      • We have full-time curators checking data
  33. Names to CASRN Mappings
  34. Subtleties: E/Z-stereochemistry, E-stereochemistry, “4-Decene”
  35. CAS Registry Numbers
  36. Crowdsourced Curation
  37. Chemical Lists and Categories
  38. Category example – PAHs
  39. EPAHFR: Hydraulic Fracturing
  40. List of Assays
  41. From Assay to Chemicals…
  42. Other Searches
  43. Product/Use Categories
  44. Lubricant
  45. Lots of UVCBs in Commerce…
  46. Other Searches: Chemical-Biology
  47. Assay/Gene Search
  48. Assay/Gene Search
  49. Batch Searching
  50. Batch Searching
      • Singleton searches are useful but people generally want data on LOTS of chemicals!
      • Typical questions
        – What is the list of chemicals for the formula CxHyOz?
        – What is the list of chemicals for a mass +/- error?
        – Can I get chemical lists in Excel files? In SDF files?
        – Can I include properties in the download file?
  51. Batch Search Names: Excel Download
  52. Add Other Data of Interest
  53. Built in Checks…
  54. Related Substance Relationships
  55. Real-Time Predictions
  56. Real-Time Predictions
  57. Real-Time Predictions with detailed calculation reports
  58. Real-Time Predictions with detailed calculation reports
  59. Open Data Download Files
  60. Downloadable Data
  61. Conclusion
      • Building an integrated hub for environmental chemistry data to serve computational toxicology
      • Transparent access to data and models – file downloads, SQL data dumps and web services
      • Expansion of functionality to serve all data streams generated by NCCT across the agency & community
      • Data QUALITY is a key focus – ongoing curation
      • We are committed to open API development over time
  62. Acknowledgements EPA-RTP
      • An enormous team of contributors from NCCT, especially the IT software development team
      • Our curation team for their care and focus on data quality
      • Multiple centers and laboratories across the EPA
      • Many public domain databases and open data contributors
  63. Contact
      Antony Williams, NCCT, US EPA Office of Research and Development
      Williams.Antony@epa.gov
      ORCID: https://orcid.org/0000-0002-2668-4821
      https://doi.org/10.1186/s13321-017-0247-6
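
Slide 50 lists "What is the list of chemicals for a mass +/- error" as a typical batch-search question. The kind of filter that question implies can be sketched as follows, assuming a small in-memory list of records. The `Chemical` class, the `within_mass_window` function, the ppm tolerance convention and the three sample records are hypothetical illustrations, not the Dashboard's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    """Hypothetical record type; not the Dashboard's schema."""
    name: str
    formula: str
    monoisotopic_mass: float  # in daltons


def within_mass_window(chemicals, target_mass, tolerance_ppm):
    """Return the chemicals whose mass lies within target_mass +/- tolerance_ppm."""
    # Convert the relative tolerance (parts per million) to an absolute window.
    delta = target_mass * tolerance_ppm / 1e6
    return [c for c in chemicals if abs(c.monoisotopic_mass - target_mass) <= delta]


# Illustrative records with monoisotopic masses rounded to four decimals.
CHEMICALS = [
    Chemical("Bisphenol A", "C15H16O2", 228.1150),
    Chemical("Caffeine", "C8H10N4O2", 194.0804),
    Chemical("Benzene", "C6H6", 78.0470),
]

if __name__ == "__main__":
    for hit in within_mass_window(CHEMICALS, target_mass=228.115, tolerance_ppm=10):
        print(hit.name, hit.formula, hit.monoisotopic_mass)
```

Expressing the error in ppm rather than as a fixed mass difference is one common convention in mass spectrometry; an absolute window in daltons would work the same way with `delta` supplied directly.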

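
Slides 33 to 35 concern names-to-CASRN mappings and CAS Registry Numbers. One check that can be applied to any CASRN, independently of the Dashboard, is its check digit: the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The sketch below assumes only that rule. `is_valid_casrn` is a hypothetical helper, and the slides do not say whether the Dashboard's built-in checks work this way.

```python
import re

# A CASRN has 2 to 7 digits, a hyphen, 2 digits, a hyphen, and 1 check digit.
_CASRN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_casrn(casrn: str) -> bool:
    """Return True if the string is well-formed and its check digit matches."""
    match = _CASRN_PATTERN.match(casrn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    weighted = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return weighted % 10 == check_digit


if __name__ == "__main__":
    for candidate in ["80-05-7", "7732-18-5", "80-05-8", "Bisphenol A"]:
        print(candidate, is_valid_casrn(candidate))
```

A passing check digit only shows that the number is internally consistent; it does not confirm that the number is mapped to the right chemical name, which is the curation problem the slides describe.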