EPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicals

The US-EPA CompTox Chemicals Dashboard provides access to data associated with ~900,000 chemical substances. Available data include experimental and predicted physicochemical properties, environmental fate and transport data, in vivo and in silico toxicity data, in vitro bioassay data, exposure data and a variety of other types of information. The data are under continuous expansion and curation and the experimental data have been used to develop QSAR and QSPR models. A number of these models are available via a web interface so that users can submit a chemical structure and predict properties in real time. The dashboard also provides access to pre-compiled chemical lists and categories, including pesticides, and chemicals detected in the environment via non-targeted mass spectrometry analysis. The data are searchable using chemical identifiers (systematic names, trade names, CAS Registry Numbers), by structure, mass and formula. Batch searches allow for data associated with thousands of chemicals to be obtained in a few seconds, with just a few button clicks, and downloaded to the desktop in formats including spreadsheets and chemical structure file formats. This presentation will provide an overview of the Dashboard and its applications to accessing source data associated with agriculturally related chemicals. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

Published in: Science
  1. EPA’s CompTox Chemicals Dashboard, a tool with information on ~900,000 chemicals. Antony Williams, Center for Computational Toxicology and Exposure, US-EPA, RTP, NC. CREEC, April 2020. http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA.
  2. CompTox Chemicals Dashboard, https://comptox.epa.gov/dashboard: SEARCH, TOX DATA, BIOACTIVITY, SIMILARITY, READ-ACROSS, PUBMED, BATCH SEARCH
  3. BASIC Search
  4. Detailed Chemical Pages
  5. Properties, Fate and Transport
  6. Properties, Fate and Transport, e.g. Solubility
  7. Properties, Fate and Transport, e.g. logP
  8. Sources of Exposure to Chemicals
  9. Identifiers to Support Searches
  10. Link Access
  11. Mass Spec Links
  12. NIST WebBook https://webbook.nist.gov/chemistry/
  13. MassBank of North America https://mona.fiehnlab.ucdavis.edu
  14. Batch Searching
  15. Aggregate data for a list of chemicals
  16. Batch Search Names: Excel Download
  17. Add Other Data of Interest
  18. Chemical Lists of Interest…
  19. 225 Chemical Lists (and growing)
  20. “Volatilome” Human Breath
  21. “Volatilome” Saliva
  22. PFAS lists of Chemicals
  23. Building a “reference” PFAS list
      • PFAS structure list (PFASSTRUCT) is expanded from public databases, EPA agency lists and literature
      • Approaching 7000 structures
        – 98.8% have associated CAS Numbers
      • Compare with PubChem: 220,720 structures
  24. Formula Search can find isomers
  25. Active expansion of the PFAS list: from 2 to 8 variants of PFOS
  26. Disinfection By-Products
  27. Mycotoxins
      • Two lists: 328 and 88 members
  28. Tire Crumb Rubber (298)
  29. Terpenes in Vape (37)
  30. Hydraulic Fracturing (1640)
  31. Opioids and Metabolites (160)
  32. “MS-ready” structures
  33. Overview of MS-Ready Structures
      • All structure-based chemical substances are algorithmically processed to
        – Split multicomponent chemicals into individual structures
        – Desalt and neutralize individual structures
        – Remove stereochemical bonds from all chemicals
      • MS-Ready structures are then mapped to original substances to provide a path from chemicals detected by mass spectrometry back to the original substances
  34. [No text on slide]
  35. MS-Ready Mappings from Details Page
  36. MS-Ready Mappings: set of 20 substances for “PFOS”
  37. Mass and Formula Searching
  38. Advanced Searches: Mass Search
  39. Advanced Searches: Mass Search
  40. MS-Ready Structures for Formula Search
  41. MS-Ready Mappings
      • EXACT Formula: C10H16N2O8: 3 Hits
  42. MS-Ready Mappings
      • Same Input Formula: C10H16N2O8
      • MS Ready Formula Search: 125 Chemicals
  43. MS-Ready Mappings
      • 125 chemicals returned in total
        – 8 of the 125 are single component chemicals
        – 3 of the 8 are isotope-labeled
        – 3 are neutral compounds and 2 are charged
      • Multiple components, stereo, isotopes and charge all collapsed and mapped through MS-Ready
  44. “UVCB” Chemicals
  45. UVCB Chemicals
  46. UVCBs challenge in non-target analysis
      • Complex mixtures (UVCBs) are a huge and very challenging part of the unknowns in many environmental samples
      • Figure: homologue screening plots from Swiss wastewater (Schymanski et al. 2014, left) and Novi Sad (right)
  47. Public TSCA Inventory on Dashboard: 31,460 Chemicals (1/24/2020)
  48. Many Chemicals are “Complex”: >14000 chemicals are UVCBs
  49. “Markush Structures” https://en.wikipedia.org/wiki/Markush_structure
  50. How to represent complexity?
  51. In the Dashboard: Abstract Sifter
  52. Literature Searching
  53. Literature Searching
  54. Abstract Sifter for Excel
  55. Conclusion
      • Dashboard access to data for ~875,000 chemicals (~895k in the Spring Release)
      • MS-Ready data facilitates structure identification
      • Related metadata facilitates candidate ranking
      • Relationship mappings and chemical lists of great utility
      • Curation and mutual sharing of chemical lists is important (e.g. NORMAN)
  56. Acknowledgements
      • ILS: Kamel Mansouri
      • EPA ORD: Ann Richard, Chris Grulke, John Wambaugh, Jeremy Dunne, Jeff Edwards, Grace Patlewicz, Alex Chao, Kristin Isaacs, Charles Lowe, James McCord, Seth Newton, Katherine Phillips, Tom Purucker, Jon Sobus, Mark Strynar, Elin Ulrich, Joach Pleil
      • GDIT: Ilya Balabin, Tom Transue, Tommy Cathey
      • TEAMS: IT Development Team, Curation Team
      • Collaborators: Emma Schymanski, NORMAN Network, Andrew McEachran, Jerry Zweigenbaum
  57. MANY presentations online: https://tinyurl.com/w5hqs55
  58. Contact: Antony Williams, CCTE, US EPA Office of Research and Development, Williams.Antony@epa.gov. ORCID: https://orcid.org/0000-0002-2668-4821. https://doi.org/10.1186/s13321-017-0247-6
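
The MS-Ready processing steps listed on slide 33 (split, desalt and neutralize, remove stereo, then map back to the original substances) can be illustrated with a small sketch. This is not the Dashboard's implementation: it is a toy that works on SMILES strings with plain string rules, and the counterion list, the `[O-]` neutralization rule, the stereo regex, and the names `ms_ready` and `build_index` are all assumptions made for illustration. A real workflow would use a cheminformatics toolkit and a canonical structure key rather than raw string equality.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny counterion list (not the Dashboard's).
COUNTERIONS = {"[Na+]", "[K+]", "[Li+]", "[NH4+]", "[Cl-]", "[Br-]"}

# Matches simple tetrahedral stereo atoms such as [C@H] or [C@@H].
STEREO_ATOM = re.compile(r"\[([A-Z][a-z]?)@{1,2}H?\]")


def ms_ready(smiles):
    """Return the set of toy 'MS-Ready' component strings for one substance."""
    forms = set()
    for part in smiles.split("."):            # 1. split multicomponent substances
        if part in COUNTERIONS:               # 2a. desalt: drop known counterions
            continue
        part = part.replace("[O-]", "O")      # 2b. neutralize (toy rule, wrong for nitro groups)
        part = STEREO_ATOM.sub(r"\1", part)   # 3a. remove tetrahedral stereo marks
        part = part.replace("/", "").replace("\\", "")  # 3b. remove double-bond stereo marks
        forms.add(part)
    return forms


def build_index(substances):
    """Map each MS-Ready string back to the names of the original substances."""
    index = defaultdict(set)
    for name, smiles in substances.items():
        for form in ms_ready(smiles):
            index[form].add(name)
    return index


# PFOS as the free acid: a sulfonic acid head group plus eight fluorinated carbons.
PFOS = "OS(=O)(=O)" + "C(F)(F)" * 7 + "C(F)(F)F"

substances = {
    "PFOS": PFOS,
    "PFOS potassium salt": "[K+].[O-]" + PFOS[1:],
    "L-alanine": "C[C@H](N)C(O)=O",
    "D-alanine": "C[C@@H](N)C(O)=O",
}

index = build_index(substances)
for form, names in index.items():
    print(sorted(names), "->", form)
```

In this toy data the salt and the free acid collapse onto one MS-Ready string, as do the two alanine enantiomers, which is the kind of many-to-one mapping slide 43 describes.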

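The mass and formula searches on slides 24 and 37 to 42 rest on two simple calculations: turning a molecular formula into a monoisotopic mass, and checking whether an observed mass falls within a tolerance of it. The sketch below shows both. The function names, the 5 ppm default, and the observed mass are illustrative assumptions, and the element table covers only the elements needed for the two formulae used here; it does not reflect how the Dashboard itself performs searches.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope, for a few elements only.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "S": 31.97207117,
}


def parse_formula(formula):
    """Parse a simple formula such as 'C10H16N2O8' into element counts.

    Parentheses, charges and isotope labels are not handled.
    """
    counts = {}
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(n or 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONO[el] * n for el, n in parse_formula(formula).items())


def within_ppm(observed, theoretical, ppm=5.0):
    """True if the observed mass is within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


# Hypothetical observed neutral mass from a non-targeted analysis run.
observed = 292.0910
candidates = ["C10H16N2O8", "C8HF17O3S"]
hits = [f for f in candidates if within_ppm(observed, monoisotopic_mass(f))]
print(hits)
```

Because isomers share a formula, they also share a monoisotopic mass, so a mass or formula hit yields a candidate list rather than a single structure.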