[SOLVED] Add number of compounds to Record Index page #175
Comments
Indeed, useful as "Reference URL" here:
Good idea. You and I have also discussed the need to collapse some of the chemicals together, so the curation effort would affect those numbers. If you want me to do anything on looking for duplicates with the mapping exercise, let me know. I will dedicate a little time every day.
On that note we could add the number of compounds by unique InChIKeys and also the numbers by unique first block to collapse down the (stereo)isomers ... would be an interesting statistic to have.
Implemented with 50fb7ca and rolled out on the dev server. I added 3 numbers: Unique Spectra corresponds to the total number of accessions, Unique Compounds is the count of unique InChIKeys and Unique Isomers is the count of unique first blocks of InChIKeys. I have not added a section for records without InChIKeys, which is around 3000 atm. With some work it will come down to less than 900. This can be closed with the next rollout of the official MassBank server.
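For readers following along, the three numbers described above could be computed roughly as in the sketch below. This is not the code from 50fb7ca; the function name `record_index_stats`, the `(accession, inchikey)` input shape, the dictionary keys, and the sample accessions are all assumptions made for illustration. It only assumes that the first block of an InChIKey is the part before the first hyphen.

```python
from typing import Dict, Iterable, Optional, Tuple


def record_index_stats(
    records: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, int]:
    """Count accessions, unique InChIKeys and unique InChIKey first blocks.

    `records` yields (accession, inchikey) pairs; the InChIKey may be
    None or empty for records that have no structure assigned.
    """
    accessions = set()
    full_keys = set()
    first_blocks = set()
    without_key = 0

    for accession, inchikey in records:
        accessions.add(accession)
        key = (inchikey or "").strip().upper()
        if not key:
            # Record has no InChIKey: counted separately, not as a compound.
            without_key += 1
            continue
        full_keys.add(key)
        # The first block (before the first hyphen) encodes connectivity only,
        # so stereoisomers collapse onto the same value.
        first_blocks.add(key.split("-", 1)[0])

    return {
        "unique_spectra": len(accessions),
        "unique_compounds": len(full_keys),
        "unique_isomers": len(first_blocks),
        "records_without_inchikey": without_key,
    }


# Hypothetical accessions with sample InChIKeys, for illustration only.
sample = [
    ("EX000001", "QNAYBMKLOCPYGJ-REOHSAOJSA-N"),
    ("EX000002", "QNAYBMKLOCPYGJ-UWTATZPHSA-N"),
    ("EX000003", "RYYVLZVUVIJVGH-UHFFFAOYSA-N"),
    ("EX000004", "RYYVLZVUVIJVGH-UHFFFAOYSA-N"),
    ("EX000005", None),
]
stats = record_index_stats(sample)
print(stats)
```

Under this definition the first-block count can never exceed the full-key count, and the number of spectra that do have an InChIKey is simply `unique_spectra - records_without_inchikey` when accessions are unique.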
I updated the entry in Wikidata: https://www.wikidata.org/wiki/Property:P6689
Can you revert that? The numbers I added in wikidata were the numbers with accession IDs AND InChIKeys (the data we provided), unique spectra contains several without InChIKeys ...
…-------------------------------------------
PI: EnvCheminf @ LCSB
FNR ATTRACT Fellow
emma.schymanski@uni.lu
On Tue, Apr 30, 2019 at 10:29 AM +0200, "Egon Willighagen" <notifications@github.com> wrote:
I updated the entry in Wikidata: https://www.wikidata.org/wiki/Property:P6689
Maybe we need to add the number of spectra with InChIKeys to the record index as well?
So ... it seems this still appears only on the msbi.ipb-halle record index, but is there an issue with the numbers? More isomers than compounds? What do we mean by "isomer" vs "compound"? Can we name them more accurately?
Yep, should be solved next weekend. I am reluctant to change the server during the week because of service availability. And we still have some issues with the deployment...
Yes, that's true, but @tsufz is working on that issue.
I implemented my understanding of the topic:
Well, the problem is that isomers are defined on many different levels, and most would count a unique stereoisomer as a unique compound - hence the confusion.
Interesting ontological discussion :) So, the IUPAC Gold Book does not have a definition of
schymane commented Apr 27, 2019
The number of compounds in MassBank is not available anywhere ... we should have basic stats on how many compounds by unique InChIKeys and a number of records without InChIKeys (for instance).
A total number of spectra would also be good (and the answer is not >186,000, see #174) :-) but can be calculated relatively easily by adding pos and neg numbers - this is not the case for compounds (e.g. adding by name - due to naming inconsistencies, and the number of letters/numbers there in the range...