ASMS Fall 2018 Metabolomics Informatics Workshop Peak Picking

Principles of Peak Picking and Alignment in Pictures and further "doing". ASMS Fall Metabolomics Informatics Workshop 2018.
https://www.asms.org/conferences/fall-workshop/program

Published in: Science

  1. Principles of Peak Picking and Alignment
     Emma L. Schymanski, FNR ATTRACT Fellow and PI in Environmental Cheminformatics, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg
     Email: emma.schymanski@uni.lu
     … and many colleagues who contributed to my science over the years!
     ASMS Fall Meeting, San Francisco, California, November 29-30, 2018
     Image © www.seanoakley.com/
     https://tinyurl.com/asmsfall2018-peaks
     How many peaks will a peak picker pick if a peak picker only picks peaks?
  2. (nevertheless, I will do my best!) DISCLAIMER! MS1 / MS2: two very different worlds …
  3. Presenting Peak Picking: Plan
     o Why Peak Pick
     o Terminology
       • Peak Picking vs Centroid vs Profile …
     o Peak Picking & Peak Pickers
       • “best of” xcms and enviPick
       • Peak Picking in Pictures
       • Peak Picking Parameters
       • Alleviating Peak Picking Parameter Panic
     o Alignment ( / Profiling)
       • “best of” xcms and enviMass
     o Peak Picking Pointers
     o Don’t just listen to me … do it!
  4. Why Peak Pick (I): Example scheme of liquid chromatography - mass spectrometry
     Sampling → Extraction (SPE) → HPLC separation → HR-MS/MS
     Image © www.planetorbitrap.com/q-exactive
  5. Why Peak Pick (II): This is what the output “really” looks like …
     Image © www.planetorbitrap.com/q-exactive
  6. Why Peak Pick (III): Identification = turning numbers into structures
     [Figure: chemical structures; source: massbank.eu]
  7. TERMINOLOGY!
     o Peak picking can be multi-directional, i.e.
       • in mass … or time …
  8. Mass: Centroid vs Profile Data (enviPat)
     https://www.envipat.eawag.ch/index.php and Loos et al. (2015) Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
  9. Mass: Centroid vs Profile Data (enviPat)
     https://www.envipat.eawag.ch/index.php and Loos et al. (2015) Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
  10. TERMINOLOGY!
      http://proteowizard.sourceforge.net/
      o Peak picking can be multi-directional (mass, time)
        • Peak picking in ProteoWizard MSConvert is “centroiding” masses (turning profile mode data into centroided data for efficient processing)
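
To make the profile-to-centroid step on slides 8 to 10 concrete, here is a minimal Python sketch. It is an illustration only, not the algorithm used by MSConvert or enviPat: it assumes a profile peak is simply a contiguous run of non-zero intensities and replaces each run with its intensity-weighted mean m/z. The function name `centroid_peaks` and the example values are hypothetical.

```python
def centroid_peaks(mz, intensity):
    """Collapse each contiguous run of non-zero profile points into one centroid.

    Returns a list of (centroid_mz, summed_intensity) tuples, where the
    centroid is the intensity-weighted mean m/z of the run.
    Assumption: zero intensity separates neighbouring profile peaks.
    """
    centroids = []
    run = []  # (mz, intensity) points of the profile peak being collected

    def flush():
        # Turn the collected run into a single centroid, then reset it.
        if run:
            total = sum(i for _, i in run)
            weighted = sum(m * i for m, i in run) / total
            centroids.append((weighted, total))
            run.clear()

    for m, i in zip(mz, intensity):
        if i > 0:
            run.append((m, i))
        else:
            flush()
    flush()  # handle a run that reaches the end of the spectrum
    return centroids


# Hypothetical profile data: one symmetric peak around m/z 200.000.
profile_mz = [199.998, 199.999, 200.000, 200.001, 200.002, 200.010]
profile_intensity = [0.0, 50.0, 100.0, 50.0, 0.0, 0.0]
centroids = centroid_peaks(profile_mz, profile_intensity)
```

Three profile points become one (m/z, intensity) pair, which is why centroided data are so much smaller and faster to process than profile data.
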
  11. Peak Picking (in time)
      Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
      o Peak picking along time axis (chromatographic peaks)
  12. Peak Picking
      Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
      o Peak picking along time axis (chromatographic peaks)
  13. Peak Picking
      Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
      o Peak picking along time axis (chromatographic peaks)
  14. Peak Picking
      Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
      o Peak picking along time axis (chromatographic peaks)
      Figure legend: several samples overlaid; red = KO, blue = wild type; rectangle = chromatographic peaks identified per sample
  15. Peak Picking
      o Several options for peak picking
        • XCMS and centWave
          • Tautenhahn et al. 2008, DOI: 10.1186/1471-2105-9-504
          • http://bioconductor.org/packages/xcms/
        • MZmine 2
          • Pluskal et al. 2010, DOI: 10.1186/1471-2105-11-395
          • http://mzmine.github.io/
        • enviPick / enviMass
          • Loos 2018, DOI: 10.5281/zenodo.1213098
          • http://www.looscomputing.ch/eng/enviMass/overview.htm
        • Plenty of other open, research and vendor options ...
  16. Peak Picking
      o Result is something like this (from Formulator output):
  17. Peak Picking – XCMS & XCMS Online
      o http://bioconductor.org/packages/xcms/
  18. Peak Picking – XCMS & XCMS Online
      o https://xcmsonline.scripps.edu/
  19. Peak Picking – enviMass and enviPick
      o http://www.looscomputing.ch/eng/enviMass/overview.htm
      o R packages …
  20. Peak Picking in Pictures
      http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
      Figure legend: red = peaks, grey = noise
  21. Peak Picking .. Somewhat simpler picture
      http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
  22. centWave – Gaussian with “Mexican Hat”
      https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  23. centWave – Gaussian with “Mexican Hat”
      https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  24. centWave – Gaussian with “Mexican Hat”
      https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
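
The idea behind the “Mexican Hat” slides can be sketched in a few lines of Python: correlating a chromatographic trace with a Mexican hat wavelet gives a strong response where the trace looks like a Gaussian peak of matching width, and almost none on a flat baseline. This is a single-scale toy, not centWave itself (which works across several scales and on regions of interest); the function names, the scale and the synthetic trace are assumptions made for illustration.

```python
import math


def mexican_hat(t, scale):
    """Mexican hat (Ricker) wavelet, unnormalised, evaluated at offset t."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def wavelet_response(signal, scale):
    """Correlate the signal with a Mexican hat wavelet at one scale.

    Points beyond the ends of the signal are ignored (treated as absent),
    so values close to the edges are less reliable.
    """
    half = int(5 * scale)  # the wavelet is practically zero beyond 5 scales
    kernel = [mexican_hat(k, scale) for k in range(-half, half + 1)]
    n = len(signal)
    response = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                acc += signal[idx] * weight
        response.append(acc)
    return response


# Synthetic trace: constant baseline of 10 plus a Gaussian peak at scan 50.
trace = [10.0 + 100.0 * math.exp(-0.5 * ((i - 50) / 4.0) ** 2) for i in range(100)]
response = wavelet_response(trace, scale=4)
apex = max(range(len(response)), key=response.__getitem__)
```

Because the wavelet sums to roughly zero, the constant baseline contributes almost nothing away from the edges, while the peak produces a clear maximum at its apex. Choosing a scale far from the true peak width weakens that maximum, which is one reason the `peakwidth` parameter matters.
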
  25. But … peaks are not perfect!
      http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
      o See enviMass website for explanation …
  26. Critical Point: Separating Peaks from Baseline
  27. Peak Picking Parameters
      https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
      o There are a lot of options to tweak!
        • I will just run through (main) centWave parameters
        • enviPick is too complicated => further reading!
  28. Peak Picking Parameters: centWave
      ppm: maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million)
      NOTE: dependent on your mass spectrometer
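
A ppm tolerance is relative, so the absolute m/z window it allows grows with the mass. The arithmetic can be sketched as follows; the helper names `ppm_tolerance` and `within_ppm` are hypothetical and are not part of xcms.

```python
def ppm_tolerance(mz, ppm):
    """Absolute m/z tolerance corresponding to a relative tolerance in ppm."""
    return mz * ppm / 1e6


def within_ppm(mz_a, mz_b, ppm):
    """True if mz_b lies within +/- ppm of mz_a."""
    return abs(mz_a - mz_b) <= ppm_tolerance(mz_a, ppm)


# At m/z 200, 5 ppm is 0.001; at m/z 800 the same 5 ppm is 0.004.
tolerance_low_mass = ppm_tolerance(200.0, 5)
tolerance_high_mass = ppm_tolerance(800.0, 5)
```

This is why the setting depends on the instrument: the ppm value should reflect the scan-to-scan mass deviation the mass spectrometer actually delivers.
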
  29. Peak Picking Parameters: centWave
      peakwidth: chromatographic peak width, given as range (min, max) in seconds
      NOTE: highly dependent on your chromatography!
  30. Peak Picking Parameters: centWave
      snthresh: signal to noise ratio cutoff
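
As a rough illustration of what a signal-to-noise cutoff compares, the sketch below uses the form (peak maximum - baseline) / noise standard deviation. How the baseline and noise are estimated here (mean and sample standard deviation of all points outside the peak) is an assumption made for simplicity and differs from centWave's local estimate; the function name and the numbers are hypothetical.

```python
import statistics


def signal_to_noise(trace, peak_start, peak_end):
    """Estimate S/N for the peak spanning trace[peak_start:peak_end].

    Assumption: every point outside the peak is baseline plus noise, so the
    baseline is their mean and the noise is their sample standard deviation.
    """
    outside = trace[:peak_start] + trace[peak_end:]
    baseline = statistics.mean(outside)
    noise_sd = statistics.stdev(outside)
    peak_max = max(trace[peak_start:peak_end])
    return (peak_max - baseline) / noise_sd


# Hypothetical intensity trace with a peak in positions 4 to 6.
trace = [10.0, 12.0, 8.0, 10.0, 50.0, 110.0, 50.0, 10.0, 12.0, 8.0]
sn = signal_to_noise(trace, 4, 7)
keep_peak = sn >= 10  # 10 is an illustrative cutoff value
```

A noisier baseline lowers the ratio for the same peak height, so a fixed cutoff is stricter on noisy data.
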
  31. Peak Picking Parameters: centWave
      prefilter: prefilter=c(k,I). Prefilter step for the first phase. Mass traces are only retained if they contain at least k peaks with intensity >= I.
      Example shown: a trace with only one “stick” will fail the recommended prefilter settings.
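
The prefilter rule stated on slide 31 is simple enough to write out directly. The sketch below is a plain Python restatement of that rule, not xcms code; the function name `passes_prefilter` and the values k = 3, I = 100 are chosen for illustration.

```python
def passes_prefilter(intensities, k, min_intensity):
    """Keep a mass trace only if at least k points have intensity >= min_intensity."""
    return sum(1 for value in intensities if value >= min_intensity) >= k


# A single intense "stick" fails, however intense it is.
single_stick = passes_prefilter([5000.0], k=3, min_intensity=100.0)

# A trace with several consecutive points above the threshold passes.
real_trace = passes_prefilter([150.0, 900.0, 4000.0, 800.0, 120.0], k=3, min_intensity=100.0)
```

The check is cheap, so it removes most noise traces before the more expensive wavelet step runs.
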
  32. Too Many Peak Picking Parameters ???????
      https://bioconductor.org/packages/release/bioc/vignettes/IPO/inst/doc/IPO.html
      o IPO to the rescue!
      o Parameter optimization for xcms-based workflows …
      o Libiseller et al. 2015, DOI: 10.1186/s12859-015-0562-8
      IPO = Isotopologue Parameter Optimization
  33. Too Many Peak Picking Parameters ???????
  34. RECAP: Why Peak Pick? Identification = turning numbers into structures
      [Figure: chemical structures; source: massbank.eu]
  35. o Instruments change over time …
      o Before we can do fancy statistics, we need to make sure our samples are comparable!
  36. Alignment
      http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection
      o Alignment / Profiling => which peaks belong together across large sample sets?
  37. Alignment
      http://www.looscomputing.ch/eng/enviMass/topics/profiling.htm
      o “Profiling” in enviMass
  38. Alignment ~= Retention Time Correction
      http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection
      o Many algorithms and methods …
      o Before:
  39. Alignment ~= Retention Time Correction
      http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment
      o Many algorithms and methods …
      o After (Obiwarp algorithm in xcms)
  40. Before Alignment / After Alignment
  41. Changes over samples
      http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment
      o Difference between adjusted and raw retention times along the retention time axis
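
The quantity plotted on slide 41 (adjusted minus raw retention time) can be illustrated with a deliberately simple correction. The sketch below is not Obiwarp: it assumes a few anchor peaks whose retention times are known in both the sample and a reference run, and interpolates linearly between them. The function name `adjust_rt` and the anchor values are hypothetical.

```python
def adjust_rt(rt, anchors):
    """Map a raw retention time onto the reference time axis.

    anchors: list of (raw_rt, reference_rt) pairs, sorted by raw_rt, with
    distinct raw_rt values. Between anchors the mapping is linear; outside
    the anchored range the shift of the nearest anchor is applied.
    """
    first_raw, first_ref = anchors[0]
    last_raw, last_ref = anchors[-1]
    if rt <= first_raw:
        return rt + (first_ref - first_raw)
    if rt >= last_raw:
        return rt + (last_ref - last_raw)
    for (x0, y0), (x1, y1) in zip(anchors, anchors[1:]):
        if x0 <= rt <= x1:
            fraction = (rt - x0) / (x1 - x0)
            return y0 + fraction * (y1 - y0)


# Hypothetical anchor peaks: (raw RT in this sample, RT in the reference), in seconds.
anchors = [(100.0, 98.0), (300.0, 299.0), (500.0, 503.0)]
raw_times = [50.0, 200.0, 400.0]
differences = [adjust_rt(rt, anchors) - rt for rt in raw_times]
```

Plotting `differences` against `raw_times` for each sample gives the kind of curve shown on the slide: the drift is not a constant offset and changes along the run.
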
  42. Some advice …
      o Peak pickers are designed to pick the perfect peak
        • But life is never perfect and peaks are no different
      o Pick the peak picker that is best for your situation
        • Convenience, ease of use, designed for your data, …
        • The optimal choice is usually a compromise
      o Be sceptical (visualise your data, reality check it, etc.)
        • But don’t go overboard in evaluating peak pickers … remember your (real) goal …
  43. Peak Picking Overlap (centWave paper)
      https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  44. Verify with EIC Extraction [these are NOT picked]
      https://github.com/schymane/ReSOLUTION/blob/master/R/RMB_EIC_prescreen.R
      Panels: No peak at all; Nice peak, MSMS; Peak, no MSMS; Noise with MSMS (careful!); Isobars with MSMS (careful!)*
      Looking for chemicals known to be present in the sample
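
Extracting an ion chromatogram (EIC) for a known m/z needs no peak picker at all, which is what makes it a useful independent check. The sketch below is a generic Python illustration and is not the linked R script: it assumes each scan is a (retention time, list of (m/z, intensity)) pair and keeps the most intense point inside a ppm window. The function name `extract_eic` and the scan values are hypothetical.

```python
def extract_eic(scans, target_mz, ppm):
    """Return [(rt, intensity)] for target_mz across all scans.

    scans: list of (rt, peaks), where peaks is a list of (mz, intensity).
    For each scan the most intense point within +/- ppm of target_mz is
    kept; scans with no point in the window contribute an intensity of 0.
    """
    tolerance = target_mz * ppm / 1e6
    eic = []
    for rt, peaks in scans:
        in_window = [inten for mz, inten in peaks if abs(mz - target_mz) <= tolerance]
        eic.append((rt, max(in_window, default=0.0)))
    return eic


# Hypothetical centroided scans: (retention time in s, [(m/z, intensity), ...]).
scans = [
    (10.0, [(180.0420, 200.0), (250.1000, 50.0)]),
    (11.0, [(180.0423, 1500.0)]),
    (12.0, [(180.0500, 900.0)]),  # outside a 5 ppm window around 180.0423
]
eic = extract_eic(scans, target_mz=180.0423, ppm=5)
```

Plotting the result makes the cases on the slide easy to tell apart by eye: no signal, a clean peak, or noise that happens to fall in the mass window.
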
  45. Just because you find a peak …
      ENTACT Project: https://www.epa.gov/sites/production/files/2018-06/documents/comptox_cop_6-28-18.pdf
      o Mix 505: One candidate with this mass/formula
        • DTXSID9040001, C9H8O4
      o One chemical … How many peaks?
  46. … doesn’t mean it’s your compound of interest!
  47. Beware of artefacts!
      o Your results also depend on the acquisition data!
  48. Further reading DOING! [Vendor independent]
      o Don’t just take my word for it … don’t just read about it … DO IT. There are so many ways to try it out … complete with sample data! [Open Science!]
      o http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
      o http://www.looscomputing.ch/eng/enviMass/overview.htm
      o An interface that many enjoy, likely comes with example data but requires a login …
      o https://xcmsonline.scripps.edu/
  49. Further reading DOING! [Vendor independent]
      o http://mzmine.github.io/
      o http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/
      o MS-DIAL
  50. Acknowledgements
      emma.schymanski@uni.lu
      Further Information:
      http://bioconductor.org/packages/xcms/
      http://www.looscomputing.ch/eng/enviMass/overview.htm
      https://xcmsonline.scripps.edu/
      http://mzmine.github.io/
      EU Grant 603437
      The CompMS Community (proxy photo)
  51. Extra Slides
  52. Quality Control of Data
      Slide c/o Michael Stravs
      o Always visualise results … never take anything for granted
  53. Homologues: Challenge Peak Pickers but are Present!
      Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
      [Figure: structure of SPA-9C, m+n=6]
      www.massbank.eu ACCESSIONS (LAS, SPACs): Literature MS/MS LIT00034, LIT00037; Std Mix., Sample ETS00012, ETS00018
      https://github.com/MassBank/RMassBank/
      Tentatively Identified Spectra: http://goo.gl/0t7jGp
  54. Be wary of instrument specific phenomena!
      o R package nontarget: satellite peak removal
  55. Be wary of instrument specific phenomena II
      o Orbitrap-specific calibration issues (not observed in TOF)