"AUTHOR"ship in MassBank spectra #194
Comments
Another one:
Last one, then I leave you alone
I agree with @meowcat.
Should I propose a PR to the record format description, proposing the use of [dtc] and [com] if appropriate?
You are welcome to propose a PR for the record format description. But, if I understood it correctly, this will be a change which breaks compatibility. This means it will take some time until it's included in all needed places.
I don't believe this would break compatibility; it would merely explicitly allow putting MARC relator tags behind names. Right now, there is actually no specification at all for how the author names should be presented. So maybe this would require a small adaptation in the validator, but otherwise I don't think a change is required...
Do we have a pointer on how R package descriptions do that?
@meowcat Then I probably misunderstood your proposed extension/changes. Could you please give an example?
@sneumann: I would argue that there is currently no syntax that would be changed, since a variety of formats is used for the AUTHORS field, with more or less the same info in slightly different iterations: https://github.com/MassBank/MassBank-data/blob/master/Eawag/EA016651.txt
Different order of first and last name, use of brackets, spelling of double initials, use of punctuation, different specifications for the institutes both in format and detail, etc. So whatever anyone chooses to put in their
My current PR, as a basis for discussion, is mostly a suggestion for how people might want to specify authorship, since this would fit in with any scheme people are currently using, and would be neither more nor less machine-readable than before. I agree that a thought-out new version of AUTHORS (or a substitute) should be machine-readable. (Ideally we would incorporate ORCID.)
Ah, I found a few more interesting schemes. Including my semicolon! Note that many of these are actually by "us" as in "the people discussing here". https://github.com/MassBank/MassBank-data/blob/master/CASMI_2012/SMI00021.txt
This is not meant to criticise any of the formats that were used, only pointing out the complete absence of anything systematic, even among high-quality and involved contributors.
Yes, I totally agree it's a good time to start using some conventions; @MaliRemorker and I were discussing how to write the author statement for the hopefully soon-to-be-coming LCSB records, and he's looking into your suggestions on our side. I agree with @sneumann that we should retain a plain-text AUTHOR field (but add some recommendations for use to the documentation to avoid this in the future) and add a machine-readable one as an extra, to retain backwards compatibility and ease of use for users. ORCIDs ... maybe a separate field?
meowcat commented Aug 26, 2019
Could we devise a better specification for the AUTHOR field in the MassBank record?
Currently:
In particular, can we specify who should be in the author list, or differentiate who contributed the data and who made the records?
In "Eawag additional specs", we have situations where the record creator and uploader is not an author on the PUBLICATION. (example) My understanding is that we just added the record creator to a subset of the paper authors, somewhere between first and last author. I don't see this as a really clear and transparent solution, since as the record creator I wouldn't want to steal authority from the actual paper authors.
On the other hand, for the MetaboLights records (example), which I created from publicly available data, I am not listed at all. This is also not ideal, since the listed AUTHORS may not even know about this record existing and should not be held responsible for problems in it, e.g. if processing went wrong.
I suggest allowing the use of MARC relator terms. For example, [dtc] for where the data comes from, and [com] for who made the record.
This is how the terms are specifically defined for R packages: https://journal.r-project.org/archive/2012-1/RJournal_2012-1_Hornik~et~al.pdf
(This is already a loose application of [dtc] in the MetaboLights case since they did not actively submit the data; but it's the one that makes most sense, it seems)
There is also e.g.
In R usage, [cre] is the package maintainer, but the MARC definition is
so I would leave this out probably...
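To make the [dtc]/[com] suggestion a bit more concrete, here is a minimal Python sketch of how a validator or reader might pull optional relator tags out of an AUTHORS value. Everything about the layout is an assumption for illustration only: the comma-separated entries, the tags in trailing square brackets, the function name parse_authors, and the placeholder names are all hypothetical, since no such syntax is specified for the field today. Entries without a tag pass through unchanged, so existing free-text values would still be accepted.

```python
import re

# Hypothetical convention (not part of any current record specification):
# each comma-separated entry may end with one or more lowercase relator
# codes in square brackets, e.g. "Doe J [dtc]" or "Roe R [com, dtc]".
_ENTRY = re.compile(r"^(?P<name>[^\[\]]+?)\s*(?:\[(?P<roles>[a-z, ]+)\])?$")


def parse_authors(value):
    """Split an AUTHORS value into (name, roles) pairs."""
    result = []
    # Split on commas, except commas that sit inside square brackets.
    for entry in re.split(r",\s*(?![^\[]*\])", value):
        entry = entry.strip()
        if not entry:
            continue
        match = _ENTRY.match(entry)
        if match is None:
            # Entries that do not fit the pattern are kept verbatim, untagged.
            result.append((entry, []))
            continue
        roles = match.group("roles")
        role_list = [r.strip() for r in roles.split(",") if r.strip()] if roles else []
        result.append((match.group("name").strip(), role_list))
    return result


if __name__ == "__main__":
    # Placeholder names, not taken from any real record.
    for name, roles in parse_authors("Doe J [dtc], Roe R [com, dtc], Some Institute"):
        print(name, roles)
```

Because the tags are optional, this kind of parsing would not reject any of the author formats already in use; it only recognises tags where contributors choose to add them.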