Create file for PubChem deposition at every release #102
Comments
Hi, we have started to embed Bioschemas information,
Let's discuss with Evan and @egonw at Dagstuhl then ...
Adding @AlasdairGray to contribute his Bioschemas wisdom ...
@AlasdairGray, I guess a good first step forward is to have the aggregator website you just showed crawl the MassBank website and extract the chemical structures and the data record JSON-LD.
schymane commented Nov 21, 2019
It would be great if we could auto-create a file to deposit in PubChem with every stable release of MassBank-data.
To discuss: should the deposit cover compound information only (relatively easy), mappings to spectral IDs (slightly more information needed), or the actual spectra as well (more work on our side)?
Shall we start with getting a deposit file for compound information only? Then we need e.g.:
PUBCHEM_EXT_DATASOURCE_REGID <= InChIKey, or any other unique identifier on our side
PUBCHEM_EXT_DATASOURCE_SMILES <= SMILES
PUBCHEM_EXT_DATASOURCE_CID <= PubChem CID (if available)
PUBCHEM_SUBSTANCE_COMMENT <= here we could e.g. provide the accession IDs, collapsed
PUBCHEM_SUBSTANCE_SYNONYM <= any names from our side (can span multiple columns, but a maximum of e.g. 3 would be sensible)
@meier-rene @sneumann @tsufz what do you think?
If yes, who will look after the file?
I would contact PubChem to get us a MassBank login for deposition, so credit goes to MassBank(EU) and we can track our submissions.
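For the compound-only option, the field mapping listed above could be sketched roughly as follows. This is only an illustration under several assumptions: the input record layout (`inchikey`, `smiles`, `pubchem_cid`, `accession`, `names`) is hypothetical and not the actual MassBank-data structure, the tab-delimited output is a guess at a workable layout rather than a format confirmed by PubChem, and the helper names `build_deposit_rows` and `write_deposit_file` are made up for this example. "Collapsed" is read here as joining all accession IDs of one compound into a single comment field.

```python
import csv
import io

MAX_SYNONYMS = 3  # the "max 3" suggested above; adjust as needed

FIXED_COLUMNS = [
    "PUBCHEM_EXT_DATASOURCE_REGID",
    "PUBCHEM_EXT_DATASOURCE_SMILES",
    "PUBCHEM_EXT_DATASOURCE_CID",
    "PUBCHEM_SUBSTANCE_COMMENT",
]


def build_deposit_rows(records, max_synonyms=MAX_SYNONYMS):
    """Collapse per-spectrum records into one row per InChIKey.

    `records` is a hypothetical list of dicts, one per MassBank record.
    """
    compounds = {}
    for rec in records:
        entry = compounds.setdefault(rec["inchikey"], {
            "smiles": rec.get("smiles", ""),
            "cid": rec.get("pubchem_cid", ""),  # left empty if not available
            "accessions": [],
            "names": [],
        })
        entry["accessions"].append(rec["accession"])
        for name in rec.get("names", []):
            if name not in entry["names"]:  # keep first-seen order, no duplicates
                entry["names"].append(name)

    rows = []
    for inchikey, entry in compounds.items():
        synonyms = entry["names"][:max_synonyms]
        synonyms += [""] * (max_synonyms - len(synonyms))  # pad to fixed width
        rows.append(
            [inchikey, entry["smiles"], entry["cid"],
             "; ".join(entry["accessions"])] + synonyms
        )
    return rows


def write_deposit_file(records, handle, max_synonyms=MAX_SYNONYMS):
    """Write a header plus one tab-delimited row per compound to `handle`."""
    header = FIXED_COLUMNS + ["PUBCHEM_SUBSTANCE_SYNONYM"] * max_synonyms
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(build_deposit_rows(records, max_synonyms))


if __name__ == "__main__":
    # Caffeine with two placeholder accession IDs (not real MassBank accessions).
    example = [
        {"inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
         "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
         "pubchem_cid": 2519,
         "accession": "EXAMPLE-0001",
         "names": ["Caffeine"]},
        {"inchikey": "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
         "smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
         "pubchem_cid": 2519,
         "accession": "EXAMPLE-0002",
         "names": ["Caffeine", "1,3,7-Trimethylxanthine"]},
    ]
    buffer = io.StringIO()
    write_deposit_file(example, buffer)
    print(buffer.getvalue())
```

Running something like this as part of each stable release would give one file per release; extending it to spectral ID mappings or spectra would need additional fields that are still to be discussed.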