MassBank Accession / InChIKey CC0 dump for Wikidata #57

Open
schymane opened this Issue Apr 2, 2019 · 4 comments

Member

schymane commented Apr 2, 2019

@meier-rene, can we get a CC0 dump of MassBank Accession IDs with InChIKey mappings for @egonw to add to Wikidata? He's registering this property now ;-)

egonw commented Apr 2, 2019

Collaborator

meier-rene commented Apr 2, 2019

Accession_to_InChi-Key.txt

This file maps every accession that has a Creative Commons license to its InChIKey. It was generated with the following bash script:

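#!/bin/bash
# Prints one "ACCESSION INCHIKEY" pair per line for every record directory
# below the current directory.
for x in *; do
        if [ "$x" = "figure" ]; then
                continue
        fi
        # Literature_Specs have LICENSE: Unknown
        if [ "$x" = "Literature_Specs" ]; then
                continue
        fi
        if [ -d "$x" ]; then
                # Quote the name and skip the directory if cd fails,
                # so grep never runs in the wrong place.
                cd "$x" || continue
                        # grep -R prints "<file>.txt:<tag>: INCHIKEY <key>";
                        # the two awk calls keep <file> and <key>, and sed drops
                        # the literal ".txt" (the dot is escaped so it cannot match any character).
                        grep -R INCHIKEY -- * | awk -F: '{print $1 $3}' | awk '{print $1" "$3}' | sed 's|\.txt||g'
                cd ..
        fi
done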

It most likely contains more accessions than are currently online, because I accepted some new records this morning.
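
The script above excludes unlicensed records by directory name. As a rough alternative sketch, the same mapping could be filtered on each record's own LICENSE field instead. This is not the script that produced the attached file. It assumes one record per .txt file named after its accession, a license line that starts with "LICENSE: CC", and a link line of the form "CH$LINK: INCHIKEY <key>"; the function name cc_inchikey_map is made up for illustration.

```shell
#!/bin/bash
# Sketch only: print "ACCESSION INCHIKEY" for every record whose LICENSE
# field starts with "CC".
# Assumptions (not confirmed in this thread): one record per .txt file,
# file name == accession, fields "LICENSE: CC ..." and "CH$LINK: INCHIKEY <key>".
cc_inchikey_map() {
    local root="${1:-.}" f key
    find "$root" -type f -name '*.txt' -print0 | sort -z |
    while IFS= read -r -d '' f; do
        # Skip records that do not declare a Creative Commons license.
        grep -q '^LICENSE: CC' "$f" || continue
        # Take the first InChIKey link line in the record.
        key=$(awk '$1 == "CH$LINK:" && $2 == "INCHIKEY" { print $3; exit }' "$f")
        if [ -n "$key" ]; then
            printf '%s %s\n' "$(basename "$f" .txt)" "$key"
        fi
    done
}
```

Called as `cc_inchikey_map path/to/records > mapping.txt` (both paths are placeholders), this would pick up any directory whose records gain a CC license later, without editing the skip list.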

Member Author

schymane commented Apr 2, 2019

Thanks for the dump!
Re Literature specs: I created those myself by manually extracting literature data, and @egonw just clarified that we can therefore put a CC0 license on them, as it was my work. Do you want to update these records with CC0 (or CC-BY for consistency?) so that we can include them as well? Likely quicker on your end than mine ... ;-)

egonw commented Apr 2, 2019

Since these are only identifiers, some even argue that there is no data at all and that it cannot be copyrighted.

But a quick note that I am free to enter the content of the text into Wikidata will do fine.
