
ASMS Fall Metabolomics Informatics Workshop 2018 Identifying Unknown Metabolites

Talk on characterising unknown metabolites from the ASMS Fall Metabolomics Informatics Workshop 2018 in San Francisco, California.
https://www.asms.org/conferences/fall-workshop/program
Slides with active hyperlinks are accessible via the tinyurl on the front page.

Published in: Science

  1. Characterising “Unknown” Metabolites
     Known known, known unknown, unknown known, unknown unknown …
     Emma L. Schymanski, FNR ATTRACT Fellow and PI in Environmental Cheminformatics
     Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg
     Email: emma.schymanski@uni.lu
     … and many colleagues who contributed to my science over the years!
     ASMS Fall Meeting, San Francisco, California, November 29-30, 2018
     https://tinyurl.com/asmsfall2018-unknowns
     Image © www.seanoakley.com/
  2. Turning Unknowns into Knowns
     o Knowns and Unknowns
     o Overview of Resources
       • Compound databases
       • “Make your own” molecules
       • Spectral libraries
     o Walk-through Swiss Wastewater
       • Targets
       • Suspect screening approaches
       • Annotation of non-targets with MetFrag
     o Exchanging information for annotating unknowns …
     o Take home messages
  3. Knowns and Unknowns …
     Peisl, Schymanski & Wilmes, 2018 Anal. Chim. Acta, DOI: 10.1016/j.aca.2017.12.034
     o Known known
       -> Expected in sample
       -> Confirmed by mass spectrometry
       -> Reference standard available
     o Unknown known
       -> Known as part of expert knowledge or a mixture
       -> Undocumented as an individual compound
     o Known unknown
       -> “Suspected” or unknown to investigator
       -> Documented in databases, literature
     o Unknown unknown
       -> Compound not previously documented
       -> Full elucidation and confirmation required
  4. Searching for Known Small Molecules …
     Peisl, Schymanski & Wilmes, 2018 Anal. Chim. Acta, DOI: 10.1016/j.aca.2017.12.034
     o Compound Databases
       • PubChem: >96 million, https://pubchem.ncbi.nlm.nih.gov/
       • ChemSpider: >69 million, http://www.chemspider.com/
       • CompTox Chemicals Dashboard: >765 000, https://comptox.epa.gov/dashboard/
       • Human Metabolome DB (HMDB): >114 000, http://www.hmdb.ca/
  5. Searching for Known Small Molecules …
     Peisl, Schymanski & Wilmes, 2018 Anal. Chim. Acta, DOI: 10.1016/j.aca.2017.12.034
     o Compound Databases … isn’t 96 million enough?
       Quick answer … NO!
     [Figure labels: E. coli data (N. Zamboni, IMSB, ETH Zürich); in silico prediction]
  6. Searching for More (Un)Known Small Molecules …
     Jeffryes et al, 2015, MINEs, J. Cheminf, 7:44. DOI: 10.1186/s13321-015-0087-1
     o In silico metabolite prediction: example of MINE (2015)
       KEGG MINE:    13,307 => 571,368
       EcoCyc MINE:   1,832 =>  54,719
       YMDB MINE:     1,978 => 100,755
       HMDB MINE:    23,035 => 400,414
  7. Searching for More (Un)Known Small Molecules …
     Jeffryes et al, 2015, MINEs, J. Cheminf, 7:44. DOI: 10.1186/s13321-015-0087-1
     o In silico metabolite prediction: example of MINE (2015)
       • First generation only … combinatorial explosion!
       KEGG MINE:    13,307 => 571,368
       EcoCyc MINE:   1,832 =>  54,719
       YMDB MINE:     1,978 => 100,755
       HMDB MINE:    23,035 => 400,414
     Speculation … PubChem MINE: 95 million => 1.6 billion … first generation only?!?!
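
The expansion behind the MINE numbers can be checked with a few lines of arithmetic. The sketch below computes the first-generation expansion factor for each database quoted on the slide and applies the smallest factor to the 95 million PubChem figure. That the slide's "1.6 billion" speculation was derived this way is an assumption; the numbers merely happen to agree.

```python
# First-generation MINE sizes as quoted on the slide (Jeffryes et al. 2015):
# (compounds in source database, compounds in the predicted MINE)
mine_sizes = {
    "KEGG": (13_307, 571_368),
    "EcoCyc": (1_832, 54_719),
    "YMDB": (1_978, 100_755),
    "HMDB": (23_035, 400_414),
}


def expansion_factor(source: int, predicted: int) -> float:
    """Predicted compounds per source compound."""
    return predicted / source


factors = {name: expansion_factor(*sizes) for name, sizes in mine_sizes.items()}
for name, factor in factors.items():
    print(f"{name}: x{factor:.1f}")

# Assumption: extrapolate to PubChem with the smallest observed factor.
pubchem_compounds = 95_000_000
estimate = pubchem_compounds * min(factors.values())
print(f"PubChem estimate: {estimate:.2e} compounds")
```

The factors range from roughly 17 (HMDB) to roughly 51 (YMDB), so even the most conservative extrapolation lands in the billions after a single generation.
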
  8. Searching for MORE (Un)Known Small Molecules …
     Source: A. Kerber, R. Laue, M. Meringer, C. Rücker (2005) MATCH 54 (2), 301-312.
     o Structure Generation
       • But of course most of these do not exist
     [Figure: number of structures (log scale, 1 to 100,000,000) versus molecular mass (50 to 150), comparing molecular graphs from structure generation with the Beilstein Registry and the NIST MS Library]
     Structure generation: 100 million at mass = 150 Da
     Spectral libraries (NIST MS Library): ~1-200 at mass = 150
  9. Searching for Small Molecules in Spectral Libraries
     Peisl, Schymanski & Wilmes, 2018 Anal. Chim. Acta, DOI: 10.1016/j.aca.2017.12.034
     o … to find what is “on record” with MS “fingerprint”
       • Too many different MS/MS libraries (and they are still too small)
  10. Do we need all these libraries?
      Vinaixa, Schymanski, Navarro, Neumann, Salek, Yanes, 2016, TrAC, DOI: 10.1016/j.trac.2015.09.005
      o Yes … most libraries still have many unique entries
      [Figure legend: = HMDB, GNPS, MassBank, ReSpect]
      Compound lists provided by: S. Stein, R. Mistrik, Agilent
  11. Mind the Gap!
      Frainay, C. et al. (2018) “Mind the Gap: …” Metabolites: http://www.mdpi.com/2218-1989/8/3/51
      o Only 23-60 % of (defined) metabolites in Genome-Scale Metabolic Networks are covered by (combined!) Mass Spectral Libraries
  12. Mind the Gap!
      Frainay, C. et al. (2018) “Mind the Gap: …” Metabolites: http://www.mdpi.com/2218-1989/8/3/51
      o Best library to choose depends highly on your dataset
        • Example: MSforID (https://msforid.com/) is poor for metabolic networks, but great for forensic toxicology!
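
The coverage and "unique entries" statements on the last three slides reduce to set operations on compound identifiers. The sketch below is only an illustration: the identifiers and library names are invented placeholders, not data from Frainay et al. or Vinaixa et al., and a real comparison would first have to normalise identifiers (for example to InChIKeys).

```python
def coverage(network: set, library: set) -> float:
    """Fraction of network metabolites that have an entry in the library."""
    if not network:
        raise ValueError("network must contain at least one metabolite")
    return len(network & library) / len(network)


# Placeholder identifiers (hypothetical, for illustration only).
network = {"M1", "M2", "M3", "M4", "M5"}
libraries = {
    "library A": {"M1", "M2", "X9"},
    "library B": {"M2", "M3"},
}

# Coverage per library and for all libraries combined.
per_library = {name: coverage(network, lib) for name, lib in libraries.items()}
combined = coverage(network, set().union(*libraries.values()))

# Entries that appear in exactly one library.
unique_entries = {
    name: lib - set().union(*(other for n, other in libraries.items() if n != name))
    for name, lib in libraries.items()
}

print(per_library, combined, unique_entries)
```

Even in this toy case the combined coverage (0.6) exceeds that of either library alone (0.4 each), which is the argument for keeping several libraries.
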
  13. Environmental Chemistry and Metabolomics …
      … have surprisingly many things in common …
      Source: Fenner et al. (2013) Science, 341(6147), 752-758. DOI: 10.1126/science.1236281
  14. What is in our (Swiss) Wastewater?
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      o 10 Wastewater Treatment Plants
      o 24 hr flow-proportional samples
      o February 2010
      o 364 target substances
      [Map of Switzerland and neighbours (France, Germany, Austria, Italy) showing the plants: Vernier, Uetendorf, Zug, Werdhölzli (Zürich), Bioggio, Bussigny-près-Lausanne, Hallau, Zwillikon, Thal, Winterthur. Map © Eawag/BAFU/SwissTopo]
  15. Target, Suspect and Non-Target Screening
      Sampling, extraction (SPE), HPLC separation and HR-MS/MS, then:
      o TARGET ANALYSIS (knowns) => Targets found => Confirmation and quantification of compounds present
      o SUSPECT SCREENING (suspects) => Suspects found; suspects spectrum search => Spectral match
      o NON-TARGET SCREENING (no prior knowledge) => Masses of interest (molecular formula) => DATABASE SEARCH / STRUCTURE GENERATION
      o Candidate selection (retention time, MS/MS, calculated properties)
      Time, Effort & Number of Compounds …
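
The three-way split on this slide can be sketched as a simple decision rule. This is a minimal sketch under stated assumptions, not the workflow used in the cited studies: the tolerances (5 ppm, 0.2 min), the list contents and the names `Feature`, `within_ppm` and `classify` are all hypothetical. The only value taken from the talk is the last feature (m/z 213.9637, RT 4.54 min).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    mz: float      # measured m/z of the ion
    rt_min: float  # retention time in minutes


def within_ppm(measured: float, theoretical: float, ppm: float) -> bool:
    """True if the measured mass is within +/- ppm of the theoretical mass."""
    return abs(measured - theoretical) / theoretical * 1e6 <= ppm


def classify(
    feature: Feature,
    targets: List[Tuple[str, float, float]],  # (name, m/z, RT) from reference standards
    suspects: List[Tuple[str, float]],        # (name, m/z) only, no standard available
    ppm: float = 5.0,
    rt_tol_min: float = 0.2,
) -> Tuple[str, Optional[str]]:
    # Targets have reference standards, so mass AND retention time must agree.
    for name, mz, rt in targets:
        if within_ppm(feature.mz, mz, ppm) and abs(feature.rt_min - rt) <= rt_tol_min:
            return "target", name
    # Suspects have no standard: only the exact mass can be checked at this stage.
    for name, mz in suspects:
        if within_ppm(feature.mz, mz, ppm):
            return "suspect", name
    # Everything else is a non-target: a mass of interest with no prior knowledge.
    return "non-target", None


# Invented placeholder lists, for illustration only.
targets = [("target A", 250.0000, 5.00)]
suspects = [("suspect B", 300.0000)]
features = [Feature(250.0005, 5.05), Feature(300.0010, 7.30), Feature(213.9637, 4.54)]

labels = [classify(f, targets, suspects)[0] for f in features]
print(labels)
```

A suspect hit here is only a mass match; as the slide notes, candidate selection still needs retention time, MS/MS and calculated properties.
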
  16. Identification Strategies and Confidence
      Schymanski et al, 2014, ES&T. DOI: 10.1021/es5002105 & Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7
      Non-target HR-MS(/MS) acquisition, then peak picking or XICs against a target list, a suspect list, or neither:
      o Level 1: Confirmed Structure, by reference standard (start of target screening)
      o Level 2: Probable Structure, by library/diagnostic evidence
      o Level 3: Tentative Candidate(s): suspect, substructure, class (start of suspect screening)
      o Level 4: Unequivocal Molecular Formula, insufficient structural evidence
      o Level 5: Mass of Interest: multiple detection, trends, … (start of non-target screening)
      Identification confidence increases from Level 5 to Level 1; “downgrading” with contradictory evidence
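
The five levels lend themselves to a small data structure. The sketch below is a simplified reading of the scheme, assuming each level can be reduced to a single yes/no piece of evidence; the function `assign_level` and its parameter names are hypothetical and are not part of the cited papers.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Identification confidence levels as listed on the slide (1 = highest)."""
    CONFIRMED_STRUCTURE = 1    # reference standard
    PROBABLE_STRUCTURE = 2     # library or diagnostic evidence
    TENTATIVE_CANDIDATES = 3   # suspect, substructure, class
    UNEQUIVOCAL_FORMULA = 4    # insufficient structural evidence
    MASS_OF_INTEREST = 5       # multiple detection, trends, ...


def assign_level(
    reference_standard: bool = False,
    library_or_diagnostic_evidence: bool = False,
    candidate_structures: bool = False,
    unequivocal_formula: bool = False,
) -> Confidence:
    """Return the best level supported by the available evidence (simplified)."""
    if reference_standard:
        return Confidence.CONFIRMED_STRUCTURE
    if library_or_diagnostic_evidence:
        return Confidence.PROBABLE_STRUCTURE
    if candidate_structures:
        return Confidence.TENTATIVE_CANDIDATES
    if unequivocal_formula:
        return Confidence.UNEQUIVOCAL_FORMULA
    return Confidence.MASS_OF_INTEREST


print(assign_level(unequivocal_formula=True).name)
```

Because `IntEnum` members compare as integers, "downgrading" with contradictory evidence amounts to moving to a larger level number.
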
  17. Target Analysis: Status Quo (>364 targets)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Target list; sampling, extraction (SPE), HPLC separation and HR-MS/MS => TARGET ANALYSIS => Targets found => Confirmation and quantification of compounds present
      TPs!
  18. Target Analysis: Status Quo (>364 targets)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Target list; sampling, extraction (SPE), HPLC separation and HR-MS/MS => TARGET ANALYSIS => Targets found => Confirmation and quantification of compounds present
      [Plot axes: m/z, RT]
  19. Swiss Wastewater: Top 30 Peaks (ESI-)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Peak labels: Artificial Sweeteners, Diclofenac
      Pictures: www.coca-cola.com; www.rivella.ch; www.voltargengel.com
  20. Suspect Screening: Different Approaches
      Target list and suspect list; sampling, extraction (SPE), HPLC separation and HR-MS/MS
      TARGET ANALYSIS => Targets found => Confirmation and quantification of compounds present
      SUSPECT SCREENING => Suspects found => Candidate selection (retention time, MS/MS, calculated properties)
      o Screen for predicted transformation products of known parent compounds
      o Look for “well known” substances without reference standards
      o Screen for known homologue series
      o Search in mass spectral libraries
  21. Suspect Screening: Benzotriazole TPs
      Huntscha et al. 2014, ES&T, 48(8), 4435-4443. DOI: 10.1021/es405694z
      Prediction: [UM-PPS] ↓ Eawag-PPS ↓ [enviPath]
      28 suspects => HPLC separation and HR-MS/MS => SUSPECT SCREENING
      o 11 masses for 6 suspect formulas
      o 7 with MS/MS
      o 1 reference std.
      o 1 TP confirmed
      o 1 TP “likely”, no std.
  22. Suspect Screening: Benzotriazole TPs
      Huntscha et al. 2014, ES&T, 48(8), 4435-4443. DOI: 10.1021/es405694z
      Prediction: [UM-PPS] ↓ Eawag-PPS ↓ [enviPath]
      28 suspects => HPLC separation and HR-MS/MS => SUSPECT SCREENING
      o 11 masses for 6 suspect formulas
      o 7 with MS/MS
      o 1 reference std.
      o 1 TP confirmed
      o 1 TP “likely”, no std.
      [Two chemical structure drawings, annotated:]
      - Predicted with Eawag-PPS
      - No standard
      - Not in ChemSpider
      - In the Dashboard -> DTXSID10212177
      - Confirmed with reference std.
      - Observed in WWTP effluents
  24. Suspect Screening: “Screen Smart”
      Moschet et al 2013, Anal. Chem. DOI: 10.1021/ac4021598
      o Screened 213 pesticides & TPs without standards => confirmed 19 new IDs
      o Browse: https://comptox.epa.gov/dashboard/chemical_lists/swisspest
  25. NORMAN Network Suspect List Exchange
      o http://www.norman-network.com/?q=node/236
      [Screenshot labels: References, Full Lists, InChIKeys]
  26. Lists on CompTox Chemicals Dashboard
      https://comptox.epa.gov/dashboard/chemical_lists/
      More lists become available with every release
  28. RECAP: Target Analysis: Status Quo (>364 targets)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Target list; sampling, extraction (SPE), HPLC separation and HR-MS/MS => TARGET ANALYSIS => Targets found => Confirmation and quantification of compounds present
      [Plot axes: m/z, RT]
  29. Grouping Isotopes and Adducts
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      [Figure: bar charts of peak numbers (axis 0 to 15,000) for positive and negative mode, split into Noise/Blank, Targets and Non-targets; percentage labels 2 %, 27 %, 100 % and 1 %, 30 %, 100 %]
  31. Swiss Wastewater: Top 30 Peaks (ESI-)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      [Two fragment structure drawings: m/z = 79.96 and m/z = 183.01]
      Picture: www.momsteam.com
  32. Surfactant Screening From Literature
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Literature sources
      o Formulas, masses (ions), retention times and intensities
      o Spectra of selected compounds (different instruments)
      Gonzalez et al. Rapid Comm. Mass Spec. 2008, 22: 1445-54
      Lara-Martin et al. EST. 2010, 44: 1670-1676
  33. Homologous Series Detection
      M. Loos & H Singer, 2017. J. Cheminf. DOI: 10.1186/s13321-017-0197-z & Schymanski et al. 2014, ES&T DOI: 10.1021/es4044374
      http://www.envihomolog.eawag.ch/
      Search for discrete mass differences
      [Two sulfonate structure drawings with variable chain lengths m and n, one bearing a C9H19 group]
  34. Homologous Series Detection
      M. Loos & H Singer, 2017. J. Cheminf. DOI: 10.1186/s13321-017-0197-z & Schymanski et al. 2014, ES&T DOI: 10.1021/es4044374
      http://www.envihomolog.eawag.ch/
      [Structure drawings of three homologous series with variable chain lengths m and n: DATS, SPAC, STAC]
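
"Search for discrete mass differences" can be sketched as chaining masses that differ by one repeating unit. This is a deliberately naive illustration, not the algorithm behind envihomolog: it ignores retention time and isotope/adduct grouping, and the function name `find_series`, the 0.002 Da tolerance and the input masses are all assumptions made for this example.

```python
from typing import List

# Monoisotopic mass of one CH2 unit in Da (12.000000 + 2 * 1.007825).
CH2 = 14.01565


def find_series(
    mzs: List[float],
    delta: float = CH2,
    tol: float = 0.002,
    min_length: int = 3,
) -> List[List[float]]:
    """Greedily chain masses that differ by `delta` within `tol` (in Da)."""
    mzs = sorted(mzs)
    used = set()
    series = []
    for i, start in enumerate(mzs):
        if i in used:
            continue
        chain = [i]
        current = start
        while True:
            # Look for the next heavier mass that is one repeating unit away.
            nxt = next(
                (j for j in range(chain[-1] + 1, len(mzs))
                 if abs(mzs[j] - (current + delta)) <= tol),
                None,
            )
            if nxt is None:
                break
            chain.append(nxt)
            current = mzs[nxt]
        if len(chain) >= min_length:
            used.update(chain)
            series.append([mzs[j] for j in chain])
    return series


# Synthetic input: three members of one CH2 series plus two unrelated masses.
masses = [300.0, 300.0 + CH2, 300.0 + 2 * CH2, 310.5, 450.2]
series = find_series(masses)
print(series)
```

Passing a different `delta` searches for other repeating units with the same code.
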
  35. Swiss Wastewater: Top 30 Peaks (ESI-)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Peak labels: Acesulfame, Diclofenac, Cyclamate, Saccharin, C10DATS, C10SPAC, SPA5C, C15DATS, STA6C, C9DATS, SPA2DC
      [Structure drawings: SPAC and DATS, with variable chain lengths n and m]
  36. Supporting Evidence for Homologues
      Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
      Data: Schymanski et al. 2014, ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      o Chromatography and MS/MS Annotation
      o SPA-9C, m+n=6 [structure drawing]
      o Formulas: http://sourceforge.net/projects/genform/ (Meringer et al, 2011, MATCH 65, 259-290)
      o Literature: LIT00034,35; Sample: ETS00002; Standard: ETS00016,17,19,20
      o https://github.com/MassBank/RMassBank/
  37. Cross-Linking Homologues in the Dashboard
      Schymanski, Grulke, Williams et al, in prep. & Williams et al. 2017 J. Cheminformatics 9:61 DOI: 10.1186/s13321-017-0247-6
      https://comptox.epa.gov/dashboard/chemical_lists/eawagsurf
  39. Searching for Small Molecules in Spectral Libraries
      Peisl, Schymanski & Wilmes, 2018 Anal. Chim. Acta, DOI: 10.1016/j.aca.2017.12.034
  40. What about Non-Target Screening?
      Sampling, extraction (SPE), HPLC separation and HR-MS/MS, then:
      o TARGET ANALYSIS (target list) => Targets found => Confirmation and quantification of compounds present
      o SUSPECT SCREENING (suspect list) => Suspects found
      o NON-TARGET SCREENING (no prior information) => Masses of interest (molecular formula) => DATABASE SEARCH / STRUCTURE GENERATION
      o Candidate selection (retention time, MS/MS, calculated properties)
      Number of compounds
  42. MetFrag2.3: Non-target Identification
      Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
      Status: 2010 => 2016
      o m/z [M-H]-: 213.9637, ± 5 ppm
      o Candidates: ChemSpider or PubChem
      o MS/MS (fragment tolerances 5 ppm, 0.001 Da), peaks as m/z and intensity:
          134.0054   339689
          150.0001    77271
          213.9607   632466
      o RT: 4.54 min (355 InChI/RTs)
      o References: External Refs, Data Sources, RSC Count, PubMed Count
      o Suspect Lists
      o Elements: C,N,S
      o Substructure: [sulfonic acid group drawing]
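
The precursor settings on this slide translate into a mass window for the candidate search. The sketch below shows only that arithmetic; it does not call MetFrag, and the helper names are made up for illustration. The proton mass of 1.007276 Da is a standard constant rather than a value from the slide.

```python
from typing import Tuple

PROTON_MASS = 1.007276  # Da, standard value (not taken from the slide)


def ppm_window(mass: float, ppm: float) -> Tuple[float, float]:
    """Lower and upper bound of a +/- ppm window around a mass."""
    half_width = mass * ppm * 1e-6
    return mass - half_width, mass + half_width


def neutral_mass_from_deprotonated(mz: float) -> float:
    """Neutral monoisotopic mass for a singly charged [M-H]- ion."""
    return mz + PROTON_MASS


precursor_mz = 213.9637  # [M-H]- from the slide
low, high = ppm_window(precursor_mz, ppm=5.0)
neutral = neutral_mass_from_deprotonated(precursor_mz)

print(f"search window: {low:.4f} to {high:.4f} (half-width {(high - low) / 2:.4f} Da)")
print(f"neutral mass: {neutral:.4f}")
```

At this mass, 5 ppm corresponds to a half-width of about 0.001 Da, so the relative and absolute tolerances on the slide are of similar size.
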
  43. MetFrag2.3: Non-target Identification
      Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
      Test set of 473 Eawag Target Substances
      Columns: MetFrag 2010 | MetFrag2.3, fragments only | MetFrag2.3 + references + retention time
      ChemSpider (1)  Top 1 Ranks:    73   | 105  | 420
      ChemSpider (1)  % Top 1 Ranks:  15 % | 22 % | 89 %
      PubChem (2)     Top 1 Ranks:    -    | 30   | 336
      PubChem (2)     % Top 1 Ranks:  -    | 6 %  | 71 %
      (1) www.chemspider.com; ~34 million entries
      (2) https://pubchem.ncbi.nlm.nih.gov/; ~74 million entries
      http://c-ruttkies.github.io/MetFrag/
      Similar results with 3 independent datasets of 310, 289 and 225 substances from Eawag and UFZ (www.massbank.eu)
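
The percentages in the table follow directly from the Top 1 counts and the test set size of 473. The short check below recomputes them; the dictionary keys are shorthand labels chosen for this example.

```python
TEST_SET_SIZE = 473  # Eawag target substances, from the slide

# Top 1 rank counts as given on the slide: (database, method) -> count
top1_ranks = {
    ("ChemSpider", "MetFrag 2010"): 73,
    ("ChemSpider", "MetFrag2.3 fragments only"): 105,
    ("ChemSpider", "MetFrag2.3 + references + retention time"): 420,
    ("PubChem", "MetFrag2.3 fragments only"): 30,
    ("PubChem", "MetFrag2.3 + references + retention time"): 336,
}

# Share of the test set ranked first, rounded to whole percent.
percent_top1 = {
    key: round(100 * count / TEST_SET_SIZE) for key, count in top1_ranks.items()
}

for (database, method), pct in percent_top1.items():
    print(f"{database:10s} {method:45s} {pct:3d} %")
```

The recomputed values (15, 22, 89, 6 and 71 %) match the slide, and they show that the gain from adding references and retention time is far larger than the gain from improved fragmentation alone.
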
  44. The Power of the Metadata (Top 1 ranks)
      Schymanski et al, 2017, J Cheminf., DOI: 10.1186/s13321-017-0207-1
      www.casmi-contest.org
  45. MetFrag2.3: Non-target Identification
      Ruttkies, Schymanski, Wolf, Hollender, Neumann (2016) J. Cheminf., DOI: 10.1186/s13321-016-0115-9
      Try with the Web Interface: http://msbi.ipb-halle.de/MetFragBeta/
  47. Swiss Wastewater: Top 30 Peaks (ESI-)
      Schymanski et al. (2014), ES&T, 48: 1811-1818. DOI: 10.1021/es4044374
      Peak labels: Acesulfame, Diclofenac, Cyclamate, Saccharin, C10DATS, C10SPAC, SPA5C, C15DATS, STA6C, C9DATS, SPA2DC
      [Structure drawing of the newly annotated peak]
      Now 13 of the top 30 (tentatively) identified
  48. We still have many unknowns …
      [Two figures: Environment (left), Cells (right)]
      (left) Data from Schymanski et al 2014, ES&T DOI: 10.1021/es4044374. (right) E. coli data provided by N. Zamboni, IMSB, ETH Zürich.
  49. Biological matrices also have many homologues
      Lipid extract of Mycobacterium smegmatis
      [Spectrum labels: C23F48O7, +CF2]
  50. Exchanging Knowledge … Open Science Helps!
      We need to be able to find and annotate the unexpected!
      [Spectrum labels: C23F48O7, +CF2]
  51. Exchanging Knowledge … Open Science Helps!
      We need to be able to find and annotate the unexpected!
  52. Take Home Messages: Unknowns and High Resolution Mass Spectrometry
      o Over 60 % of HR-MS peaks are potentially relevant but unknown
      [Two figures: Environment, Cells]
  53. Take Home Messages: Unknowns and High Resolution Mass Spectrometry
      o Over 60 % of HR-MS peaks are potentially relevant but unknown
      o Annotating unknowns requires data and evidence from many different sources
      o Many excellent workflows available to collate this information
      o Incorporation of all available metadata is critical to success!
      o E.g. MetFrag2.3 has greatly improved the speed and success of tentative identification of “known unknowns”: 15 % => 89 % Ranked Number 1
      o http://c-ruttkies.github.io/MetFrag/
  54. Take Home Messages: Unknowns and High Resolution Mass Spectrometry
      o Over 60 % of HR-MS peaks are potentially relevant but unknown
      o Annotating unknowns requires data and evidence from many different sources
      o Exchange expert knowledge worldwide
      o Community efforts contribute greatly to improved cross-annotation
      o Information in the public domain helps everyone!
      o You never know when it will help you :)
      Schymanski et al. 2015, ABC, DOI: 10.1007/s00216-015-8681-7; Alygizakis et al. 2018 ES&T, DOI: 10.1021/acs.est.8b00365
  55. Acknowledgements
      emma.schymanski@uni.lu
      Further Information:
      o https://massbank.eu/MassBank/
      o http://c-ruttkies.github.io/MetFrag/
      o https://comptox.epa.gov/dashboard/
      o http://www.norman-network.com/?q=node/236
      o https://wwwen.uni.lu/lcsb/research/environmental_cheminformatics
      EU Grant 603437