Submit record to Massbank EU #54

Merged
merged 3 commits into from Apr 9, 2019

4 participants
@zzjl20
Contributor

zzjl20 commented Apr 1, 2019

This submission contains MS and MS2 records.
Already checked with .scripts/validate.sh.
Contact me at:
donghan-l@nig.ac.jp

zzjl20 added some commits Apr 1, 2019

MS/MS2 records from Riken NPDepo
Riken NPDepo publish MS record (prefix: CB) and MS2 record (prefix: NGA)
MS/MS2 records from Riken NPDepo
Riken NPDepo publish MS record (prefix: CB) and MS2 record (prefix: NGA)
@meier-rene

Collaborator

meier-rene commented Apr 1, 2019

Thank you for this contribution. In principle this looks quite good, but I have one issue. Although these records pass the validator, they have some problems. In the "COMMENT" section I can find information like "COMMENT: Origin: Animal, CSID: 10334, SubCategory_DNP : Lipids, CASID (tmp): [18951-77-4], Fatty acids". It would be better to place things like "CSID: 10334" and "CASID (tmp): [18951-77-4]" in the CH$LINK section. Please check CH$LINK.
I would rather see it this way:

CH$LINK: PUBCHEM CID:10334
CH$LINK: CAS 18951-77-4

in the record file.

Do you think it would be possible to create the records in this way? Do you use your own software to create the record files?

add CH$LINK
abstract database info from COMMENT, add CH$LINK
@meier-rene

Collaborator

meier-rene commented Apr 9, 2019

Nice work. Thank you for your contribution.

@meier-rene meier-rene merged commit 46c2ad9 into MassBank:master Apr 9, 2019

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
@meowcat


meowcat commented Apr 15, 2019

There are still problems here. Things listed as CH$LINK: PUBCHEM CID should really be CH$LINK: CHEMSPIDER instead! At least for the ones I checked.
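
For illustration, moving the identifiers out of COMMENT with the ChemSpider label could look roughly like the sketch below. It assumes the COMMENT lines follow the pattern quoted earlier in this thread ("CSID: …" and "CASID (tmp): […]"); the function name `links_from_comment` and both regular expressions are hypothetical and are not taken from the contributor's script or from the MassBank validator.

```python
import re

# Assumed patterns, based only on the COMMENT line quoted in this thread.
CSID_RE = re.compile(r"CSID:\s*(\d+)")
CAS_RE = re.compile(r"CASID\s*\(tmp\):\s*\[(\d{2,7}-\d{2}-\d)\]")


def links_from_comment(comment_line):
    """Return CH$LINK lines for identifiers found in one COMMENT line.

    Hypothetical helper: a CSID is treated as a ChemSpider ID,
    not as a PubChem CID.
    """
    links = []
    cas = CAS_RE.search(comment_line)
    if cas:
        links.append(f"CH$LINK: CAS {cas.group(1)}")
    csid = CSID_RE.search(comment_line)
    if csid:
        links.append(f"CH$LINK: CHEMSPIDER {csid.group(1)}")
    return links


example = (
    "COMMENT: Origin: Animal, CSID: 10334, SubCategory_DNP : Lipids, "
    "CASID (tmp): [18951-77-4], Fatty acids"
)
for line in links_from_comment(example):
    print(line)
```

A COMMENT line without either identifier simply yields an empty list, so the same function can be run over every record without special cases.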

@meier-rene

Collaborator

meier-rene commented Apr 15, 2019

Yes @meowcat, you are right. I tested some and it was never correct. Automatic validation doesn't check this yet. Implementing a test for this is on the roadmap, but I'm already a bit afraid of the number of mistakes I will need to fix...

@meowcat


meowcat commented Apr 15, 2019

I think it will be hard to validate this strictly, since there are sometimes multiple true and half-true answers (stereoisomers, mixtures, salts etc. will not all have a simple answer).

@schymane

Member

schymane commented Apr 15, 2019

It should be easy to check: the ChemSpider ID and PubChem CID should be an InChIKey match, or at the very least an InChIKey first-block match. Everything else is clearly wrong. Entries that fail an InChIKey check should be validated.
Could this case be a misassigned identifier?
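
A minimal sketch of the InChIKey check described above, assuming the InChIKey for the linked ChemSpider or PubChem entry has already been retrieved by some other means (the lookup itself is not shown). The function names `first_block` and `classify_match` are hypothetical, and the example keys (L-alanine, D-alanine, glycine) are illustrative only and do not come from the records in this pull request.

```python
def first_block(inchikey):
    """Return the first hyphen-separated block of an InChIKey.

    This block encodes the molecular skeleton (connectivity).
    """
    return inchikey.strip().upper().split("-")[0]


def classify_match(record_key, database_key):
    """Compare a record's InChIKey with the key of a linked database entry.

    Returns "full" for identical keys, "first-block" when only the first
    block agrees (e.g. stereoisomers), and "mismatch" otherwise.
    """
    if record_key.strip().upper() == database_key.strip().upper():
        return "full"
    if first_block(record_key) == first_block(database_key):
        return "first-block"
    return "mismatch"


# Illustrative keys only (assumed examples, not from the submitted records).
L_ALANINE = "QNAYBMKLOCPYGJ-REOHSDEBSA-N"
D_ALANINE = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"
GLYCINE = "DHMQDGOQFOQNFH-UHFFFAOYSA-N"

print(classify_match(L_ALANINE, L_ALANINE))
print(classify_match(L_ALANINE, D_ALANINE))
print(classify_match(L_ALANINE, GLYCINE))
```

Reporting three outcomes rather than a pass/fail flag keeps the stereoisomer and salt cases raised earlier visible: a "first-block" result can be queued for manual review, while a "mismatch" points to a misassigned identifier.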
