InChIKey and matching DTXSID dump for MassBank #77

Open
schymane opened this issue Jun 17, 2019 · 3 comments

2 participants
@schymane
Member

commented Jun 17, 2019

@meier-rene are you able to produce a dump file with all InChIKeys in MassBank and, where they have them, the corresponding DTXSIDs? I need all the InChIKeys for one file, and all the DTXSIDs for another.
I've browsed and found several variants of such files, but none containing exactly this information paired. If you have one already that I missed, please point me to it ;-)
Thanks!

@meier-rene

Collaborator

commented Jun 17, 2019

There was no script available to create exactly the information you requested.

inchikey_comptox_report.txt

For later use, here is the script that creates that report:

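#!/bin/bash
# For every record file in each subdirectory, print its INCHIKEY and
# COMPTOX link lines on one output line, then clean up and deduplicate.
for x in *; do
        # Skip the "figure" directory.
        if [ "$x" = "figure" ]; then
                continue
        fi
        if [ -d "$x" ]; then
                # Quote the name and skip directories that cannot be entered.
                cd "$x" || continue
                for y in *.txt; do
                        # Skip the unexpanded pattern when there are no .txt files.
                        [ -e "$y" ] || continue
                        # Left unquoted on purpose so both grep results land on one line.
                        echo $(grep INCHIKEY "$y") $(grep COMPTOX "$y")
                done
                cd ..
        fi
done | sed 's/CH\$LINK://g' | sed -r '/^\s*$/d' | sort -u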
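
The request above asks for the InChIKeys in one file and the DTXSIDs in another. A minimal sketch of that split follows, assuming each report line has the form ` INCHIKEY <key>  COMPTOX <dtxsid>`, with the COMPTOX part absent for records that have no DTXSID. The function name `split_report` and its three arguments are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not from the original thread): split the paired
# report into one file of InChIKeys and one file of DTXSIDs.
# Assumes report lines look like " INCHIKEY <key>  COMPTOX <dtxsid>".
split_report() {
        report="$1"      # paired report, e.g. inchikey_comptox_report.txt
        keys_out="$2"    # output file for InChIKeys
        ids_out="$3"     # output file for DTXSIDs

        # Print the token that follows each INCHIKEY label.
        awk '{ for (i = 1; i < NF; i++) if ($i == "INCHIKEY") print $(i + 1) }' \
                "$report" | sort -u > "$keys_out"

        # Print the token that follows each COMPTOX label; lines
        # without a DTXSID contribute nothing.
        awk '{ for (i = 1; i < NF; i++) if ($i == "COMPTOX") print $(i + 1) }' \
                "$report" | sort -u > "$ids_out"
}
```

It could be called as `split_report inchikey_comptox_report.txt inchikeys.txt dtxsids.txt`, where the two output file names are placeholders.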
@schymane

Member Author

commented Jun 17, 2019

Thanks @meier-rene - DTXSIDs sent to @ChemConnector to update
https://comptox.epa.gov/dashboard/chemical_lists/massbankref

and I'll be using the InChIKeys to update NORMAN-SLE shortly
https://www.norman-network.com/nds/SLE/

Thanks for the rapid turnover ;-)

@schymane

Member Author

commented Jun 18, 2019

@meier-rene the CompTox list is updated with all public DTXSIDs from your dump. Here's the list of non-public entries (390 total) that we should remove; see #68.
Once this is done I think we can close #66, #68 and this issue.

MassBankEU_DTXSIDs_Level6_notPublic_17062019.txt

meier-rene pushed a commit that referenced this issue Jun 19, 2019
