Update embedded spectral library #31

Open
sneumann opened this issue Oct 1, 2019 · 1 comment

@sneumann
Member

commented Oct 1, 2019

The class de.ipbhalle.metfraglib.scoreinitialisation.OfflineMetFusionSpectralSimilarityScoreInitialiser
initializes the parameters for the MetFusion-like score, which includes reading the spectral library file. At the line

is = OfflineMetFusionSpectralSimilarityScoreInitialiser.class.getResourceAsStream("/MoNA-export-LC-MS.mb");

it loads the spectral file MoNA-export-LC-MS.mb. If nothing else is given in the
settings with OfflineSpectralDatabaseFile = ..., the class uses the embedded file located at
https://github.com/ipb-halle/MetFragRelaunched/blob/master/MetFragLib/src/main/resources/MoNA-export-LC-MS.mb
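
To illustrate the fallback described above, here is a minimal sketch of choosing between a user-supplied file and the embedded resource. The class SpectralLibrarySource and its methods are hypothetical names made up for this example, not part of MetFragLib; only the resource name and the meaning of OfflineSpectralDatabaseFile come from the description above.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not a MetFragLib class.
public class SpectralLibrarySource {
    // Name of the embedded default library, as used in the initialiser.
    static final String DEFAULT_RESOURCE = "/MoNA-export-LC-MS.mb";

    // True when no OfflineSpectralDatabaseFile value was configured.
    static boolean usesEmbeddedLibrary(String configuredPath) {
        return configuredPath == null || configuredPath.trim().isEmpty();
    }

    // Opens the configured file, or the embedded resource if none was given.
    static InputStream open(String configuredPath) throws IOException {
        if (usesEmbeddedLibrary(configuredPath)) {
            InputStream is = SpectralLibrarySource.class.getResourceAsStream(DEFAULT_RESOURCE);
            if (is == null) {
                // getResourceAsStream returns null when the resource is not on the classpath.
                throw new IOException("Embedded library not found: " + DEFAULT_RESOURCE);
            }
            return is;
        }
        return new FileInputStream(configuredPath);
    }
}
```

The null check matters because getResourceAsStream does not throw when the resource is missing; it simply returns null.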

As already mentioned, this file is in a non-standard format that includes only the little information needed by the score. It is read by the class de.ipbhalle.metfraglib.peaklistreader.MultipleTandemMassPeakListReader in https://github.com/ipb-halle/MetFragRelaunched/blob/c57f9d2b406350b2357ce9f7ce42a286cefcca13/MetFragLib/src/main/java/de/ipbhalle/metfraglib/peaklistreader/MultipleTandemMassPeakListReader.java. The reader creates a de.ipbhalle.metfraglib.collection.SpectralPeakListCollection, which is stored in the global MetFrag settings object. The score class de.ipbhalle.metfraglib.score.OfflineMetFusionSpectralSimilarityScore
later uses this data to calculate the MetFusion-like score for each candidate.

There are two possibilities now. First, you can simply create a new
spectral file in the format I used. It's quite simple, as it only needs
the parameters:

SampleName,InChI,InChIKey,IsPositiveIonMode,PrecursorIonMode,MassError,MSLevel,IonizedPrecursorMass,NumPeaks,MolecularFingerPrint

followed by the spectral data. You can easily figure out the layout by looking at the default file. The new file can then be used by defining its path with OfflineSpectralDatabaseFile = ...
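
Before writing out a converted entry, it may help to check that all ten parameters listed above are present. The following is only a sketch: SpectrumHeaderCheck is a hypothetical helper, not a MetFragLib class, and it assumes that NumPeaks is meant to equal the number of peaks that follow the header. It deliberately says nothing about how the lines are laid out in the file; for that, the default file remains the reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a MetFragLib class.
public class SpectrumHeaderCheck {
    // The parameter names listed in this issue.
    static final List<String> REQUIRED = Arrays.asList(
            "SampleName", "InChI", "InChIKey", "IsPositiveIonMode",
            "PrecursorIonMode", "MassError", "MSLevel",
            "IonizedPrecursorMass", "NumPeaks", "MolecularFingerPrint");

    // Returns the required parameters that are absent or blank, in list order.
    static List<String> missingParameters(Map<String, String> header) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = header.get(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    // Assumption: NumPeaks should match the number of peaks in the entry.
    static boolean peakCountMatches(Map<String, String> header, int actualPeaks) {
        String raw = header.get("NumPeaks");
        if (raw == null) {
            return false;
        }
        try {
            return Integer.parseInt(raw.trim()) == actualPeaks;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A converter could call missingParameters on each entry and skip or report entries for which the returned list is not empty.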

The fingerprint used is the MACCSFingerprint included in the CDK implementation.

The second possibility is to define your own spectral file reader to replace the reader
de.ipbhalle.metfraglib.peaklistreader.MultipleTandemMassPeakListReader
currently used. Here, you could implement a NIST or a MassBank file
reader, which also needs to create a de.ipbhalle.metfraglib.collection.SpectralPeakListCollection
object. Note that you need to include the fingerprint of the underlying molecule for each spectrum.
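
A rough sketch of what such a reader has to deliver is shown below. All names in it (CustomReaderSketch, LibrarySpectrum, LibraryReader, parsePeaks) are hypothetical stand-ins and not the MetFragLib API; a real implementation would have to fill a SpectralPeakListCollection instead. The peak layout of one whitespace-separated "m/z intensity" pair per line is also just an assumption for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch, not MetFragLib code.
public class CustomReaderSketch {

    // Stand-in for one library entry: peaks plus the mandatory fingerprint.
    static final class LibrarySpectrum {
        final String name;
        final double[][] peaks;     // each row is {m/z, intensity}
        final BitSet fingerprint;   // fingerprint of the underlying molecule

        LibrarySpectrum(String name, double[][] peaks, BitSet fingerprint) {
            if (fingerprint == null) {
                // Every spectrum must carry a fingerprint.
                throw new IllegalArgumentException("Spectrum '" + name + "' has no fingerprint");
            }
            this.name = name;
            this.peaks = peaks;
            this.fingerprint = fingerprint;
        }
    }

    // Contract a NIST or MassBank reader would implement in this sketch.
    interface LibraryReader {
        List<LibrarySpectrum> read(List<String> lines);
    }

    // Parses "m/z intensity" lines (assumed layout), skipping blank lines.
    static double[][] parsePeaks(List<String> lines) {
        List<double[]> peaks = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            peaks.add(new double[] {
                    Double.parseDouble(parts[0]), Double.parseDouble(parts[1]) });
        }
        return peaks.toArray(new double[0][]);
    }
}
```

Rejecting entries without a fingerprint at construction time keeps incomplete spectra out of the collection early, rather than failing later during scoring.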

Thanks @c-ruttkies for the information! Yours, Steffen

@schymane

commented Oct 1, 2019

Current thoughts after talking with @adelenelai: option (1) is likely the easiest. I can count at least 6 (and probably more) formats we'll need to work with, so writing a new reader for each (option 2) seems impractical. Ideally we'd like to merge where possible and not have 1000s of libraries.
This way we could write small converters from every format to the MetFrag mb format and keep the efficient internal format that MetFrag needs, then offer users the opportunity to download the resulting library files, which they can then specify using OfflineSpectralDatabaseFile=...
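
As a rough sketch of how the "one small converter per format, one merged library" idea could be organised: LibraryMerger and its methods are hypothetical, each converted record is simply represented as a string, and merging here only means dropping exact duplicates. Real merging rules would need more thought.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not an existing MetFrag tool.
public class LibraryMerger {
    // One converter per source format, keyed by lower-cased format name.
    private final Map<String, Function<String, List<String>>> converters = new HashMap<>();
    // Converted records in insertion order, exact duplicates dropped.
    private final Set<String> records = new LinkedHashSet<>();

    void register(String format, Function<String, List<String>> converter) {
        converters.put(format.toLowerCase(Locale.ROOT), converter);
    }

    // Converts one input with the converter for its format and merges the result.
    void add(String format, String content) {
        Function<String, List<String>> converter =
                converters.get(format.toLowerCase(Locale.ROOT));
        if (converter == null) {
            throw new IllegalArgumentException("No converter registered for format: " + format);
        }
        records.addAll(converter.apply(content));
    }

    // Returns the merged records as a list.
    List<String> merged() {
        return new ArrayList<>(records);
    }
}
```

With this shape, supporting a seventh format means registering one more converter, while the merged output stays a single library file.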
@MaliRemorker @he-ob @rickhelmus
