Translator for clinicaltrials.gov #2153

Open: wants to merge 5 commits into base: master

@rdvelazquez

rdvelazquez commented Apr 1, 2020

TODO:

  • Decide the itemType to use. Originally journalArticle, with some discussion of using dataset or creating a new type; now using report as recommended by @bwiernik.
  • Set up a local development environment so I can test this out using the Zotero modules and tests. All three tests are passing now (an earlier issue with matching the "extra" field is resolved).
  • Implement the search (if we think it's a feature this should have). I don't see the need for being able to cite all the trials from a particular search of clinicaltrials.gov at this point (it could always be added later if needed, after item parsing and the other TODOs).
  • Determine the values for translatorType and browserSupport in the metadata. I think these are correct; just following these docs.
  • Fix the lint errors and warnings

closes #1952
relates to manubot/manubot#216

@rdvelazquez

Author

rdvelazquez commented Apr 1, 2020

@bwiernik and @adam3smith, any input on what item type to use for clinical trials?

@bwiernik

bwiernik commented Apr 1, 2020

I would use Report for this.

@rdvelazquez

Author

rdvelazquez commented Apr 1, 2020

@dhimmel and @agitter, two quick questions that I thought you may have answers to:

  1. Date: Clinical trials seem to have lots of dates; I'm currently using the "LastUpdateSubmitDate". Is this the date we would want in the citation, or would we want "StudyFirstSubmitDate" or some other date? Here's an example of the date info available:
              "StatusVerifiedDate":"March 2020",
              "OverallStatus":"Completed",
              "ExpandedAccessInfo":{
                "HasExpandedAccess":"No"
              },
              "StartDateStruct":{
                "StartDate":"February 6, 2020",
                "StartDateType":"Actual"
              },
              "PrimaryCompletionDateStruct":{
                "PrimaryCompletionDate":"February 25, 2020",
                "PrimaryCompletionDateType":"Actual"
              },
              "CompletionDateStruct":{
                "CompletionDate":"February 25, 2020",
                "CompletionDateType":"Actual"
              },
              "StudyFirstSubmitDate":"February 6, 2020",
              "StudyFirstSubmitQCDate":"February 6, 2020",
              "StudyFirstPostDateStruct":{
                "StudyFirstPostDate":"February 7, 2020",
                "StudyFirstPostDateType":"Actual"
              },
              "LastUpdateSubmitDate":"March 22, 2020",
              "LastUpdatePostDateStruct":{
                "LastUpdatePostDate":"March 24, 2020",
                "LastUpdatePostDateType":"Actual"
              }
  2. Author: I'm using the "ResponsiblePartyInvestigatorFullName" if one exists and the "LeadSponsorName" if it doesn't. Is this standard / ok? It seems like the "LeadSponsorName" will sometimes be a company.

I looked at https://www.who.int/ictrp/How_to_cite.pdf and https://blogs.uoregon.edu/annie/2017/10/25/clinical-trial-apa-format/ but neither seemed conclusive.

@bwiernik

bwiernik commented Apr 1, 2020

For things like preprints, Zotero translators typically save the last updated date (i.e., the date of the version of the item actually being viewed) as the date. The first submit date could be stored in Extra with the label "Original date:"
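
A minimal sketch of that date choice, in plain JavaScript outside the Zotero translator framework. The field names come from the JSON fragment quoted earlier in this thread; the helper name `pickDates` and the shape of its return value are hypothetical, and which Extra label to use is left to the caller.

```javascript
// Hypothetical helper: choose which of the registry's dates to cite.
// Field names are taken from the JSON fragment quoted in this thread.
function pickDates(statusModule) {
  return {
    // Date of the version being viewed, used as the item's main date.
    date: statusModule.LastUpdateSubmitDate,
    // Date the record was first submitted, a candidate for Extra.
    firstSubmitted: statusModule.StudyFirstSubmitDate
  };
}

const dates = pickDates({
  StudyFirstSubmitDate: "February 6, 2020",
  LastUpdateSubmitDate: "March 22, 2020"
});
console.log(dates.date);           // "March 22, 2020"
console.log(dates.firstSubmitted); // "February 6, 2020"
```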

@adam3smith

Collaborator

adam3smith commented Apr 1, 2020

Agree on the regular date, but I'd be careful with using original date too widely. Its most common use is for historical publication dates of reprinted works, which are often rendered in citation styles. I don't really see that that'd be true for clinical trials (or preprints, for that matter).

@bwiernik

bwiernik commented Apr 1, 2020

That's a really good point. "Submitted" might be a better (and rarely used in citation styles) variable.

@dhimmel

dhimmel commented Apr 1, 2020

I'm using the "ResponsiblePartyInvestigatorFullName" if one exists and the "LeadSponsorName" if it doesn't. Is this standard / ok? It seems like the "LeadSponsorName" will sometimes be a company.

Just looking at a random record NCT04291053:

            "SponsorCollaboratorsModule":{
              "ResponsibleParty":{
                "ResponsiblePartyType":"Principal Investigator",
                "ResponsiblePartyInvestigatorFullName":"Chen Xiaoping",
                "ResponsiblePartyInvestigatorTitle":"Principal Investigator",
                "ResponsiblePartyInvestigatorAffiliation":"Tongji Hospital"
              },
              "LeadSponsor":{
                "LeadSponsorName":"Tongji Hospital",
                "LeadSponsorClass":"OTHER"
              }
            },
Some documentation

From https://prsinfo.clinicaltrials.gov/definitions.html:

3. Sponsor/Collaborators

Responsible Party, by Official Title *
Definition: An indication of whether the responsible party is the sponsor, the sponsor-investigator, or a principal investigator designated by the sponsor to be the responsible party. Select one.

  • Sponsor: The entity (for example, corporation or agency) that initiates the study
  • Principal Investigator: The individual designated as responsible party by the sponsor (see Note)
  • Sponsor-Investigator: The individual who both initiates and conducts the study
    Note: The sponsor may designate a principal investigator as the responsible party if such principal investigator meets all of the following requirements: is responsible for conducting the study; has access to and control over the data from the study; has the right to publish the results of the study; and has the ability to meet all of the requirements for submitting and updating clinical study information.

Investigator Information [*]
If the Responsible Party, by Official Title is either "Principal Investigator" or "Sponsor-Investigator," the following is required:

  • Investigator Name: Name of the investigator, including first and last name
  • Investigator Official Title: The official title of the investigator at the primary organizational affiliation
    Limit: 254 characters.
  • Investigator Affiliation: Primary organizational affiliation of the individual;
    Limit: 160 characters.

Name of the Sponsor *
Definition: The name of the entity or the individual who is the sponsor of the clinical study.
Limit: 160 characters.

Note: When a clinical study is conducted under an investigational new drug application (IND) or investigational device exemption (IDE), the IND or IDE holder is considered the sponsor. When a clinical study is not conducted under an IND or IDE, the single person or entity who initiates the study, by preparing and/or planning the study, and who has authority and control over the study, is considered the sponsor.

Collaborators
Definition: Other organizations (if any) providing support. Support may include funding, design, implementation, data analysis or reporting. The responsible party is responsible for confirming all collaborators before listing them.
Limit: 160 characters.

I think perhaps we want everything: the lead investigator, the sponsor, and collaborators. Each one of these could be a different author. I don't know too much about clinical trials, however, so I would be interested in what others think.
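
As a rough illustration of collecting all three kinds of party, here is a sketch in plain JavaScript. The helper name `listParties` is hypothetical. The investigator and sponsor field names are taken from the NCT04291053 record quoted above; `CollaboratorList`, `Collaborator`, and `CollaboratorName` are assumed names, since that record lists no collaborators.

```javascript
// Hypothetical helper: list every named party in a SponsorCollaboratorsModule.
// CollaboratorList / Collaborator / CollaboratorName are ASSUMED field names;
// they do not appear in the record quoted in this thread.
function listParties(module) {
  const parties = [];

  const responsible = module.ResponsibleParty || {};
  if (responsible.ResponsiblePartyInvestigatorFullName) {
    parties.push({
      role: responsible.ResponsiblePartyType || "Investigator",
      name: responsible.ResponsiblePartyInvestigatorFullName
    });
  }

  const sponsor = module.LeadSponsor || {};
  if (sponsor.LeadSponsorName) {
    parties.push({ role: "Sponsor", name: sponsor.LeadSponsorName });
  }

  const collaborators = (module.CollaboratorList || {}).Collaborator || [];
  for (const collaborator of collaborators) {
    if (collaborator.CollaboratorName) {
      parties.push({ role: "Collaborator", name: collaborator.CollaboratorName });
    }
  }
  return parties;
}

const parties = listParties({
  ResponsibleParty: {
    ResponsiblePartyType: "Principal Investigator",
    ResponsiblePartyInvestigatorFullName: "Chen Xiaoping"
  },
  LeadSponsor: { LeadSponsorName: "Tongji Hospital", LeadSponsorClass: "OTHER" }
});
console.log(parties);
```

For this record the result has two entries, the principal investigator followed by the sponsor, and no collaborators.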

@rdvelazquez

Author

rdvelazquez commented Apr 2, 2020

Thank you all for the quick responses. Much appreciated!

My last commit attempts to incorporate that feedback.

  • I'm including the initial submission date in extra.submittedDate
  • I'm including the lead investigator, sponsor, and collaborators as creators and, as a way to be able to tell who is who, I'm also including this info in the extra section explicitly stating what type of creator they were.
    (There aren't many creatorTypes for report, so I'm just listing everyone as an author, but including this info in extra will let downstream analyses untangle who was a collaborator vs. sponsor, etc.)

@rdvelazquez

Author

rdvelazquez commented Apr 2, 2020

When I run the tests locally it says that the extra field doesn't match. Here's the testing output:

         -   "extra": {
         -     "submittedDate": "February 6, 2020"
         -     "responsiblePartyInvestigator": "undefined"
         -     "sponsor": "Gilead Sciences"
         -   }
         +   "extra": "[object Object]"

i.e. it's getting "[object Object]"

@adam3smith or someone else experienced with Zotero, any input on this?

@adam3smith

Collaborator

adam3smith commented Apr 2, 2020

Haven't looked at your code, but Extra needs to be a string with the different values newline separated. It looks like you have it as an array?
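
To illustrate the point, a small sketch in plain JavaScript: coercing a plain object to a string yields "[object Object]", which matches the test output above, while joining label/value pairs with newlines yields the kind of string Extra expects. The helper name `formatExtra` is hypothetical, and the labels are the ones from the failing test.

```javascript
// Hypothetical helper: turn label/value pairs into an Extra-style string,
// one "label: value" pair per line.
function formatExtra(fields) {
  return Object.entries(fields)
    .map(([label, value]) => `${label}: ${value}`)
    .join("\n");
}

const fields = {
  submittedDate: "February 6, 2020",
  sponsor: "Gilead Sciences"
};

// Coercing the object itself loses all of its content.
console.log(String(fields)); // "[object Object]"

// Joining the pairs keeps one value per line.
const extraText = formatExtra(fields);
console.log(extraText);
```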

@rdvelazquez

Author

rdvelazquez commented Apr 2, 2020

Thanks @adam3smith! That worked. All three tests are now passing. Ready for review.

@bwiernik

bwiernik commented Apr 2, 2020

A few immediate comments:

  1. I would enter submittedDate just as submitted. That is the actual CSL variable. This will make it accessible for citations if needed.
  2. I would suggest not saving the same information in multiple places; that has a decent chance of producing unexpected results in citations. I'll defer to @adam3smith as to where Principal Investigators, other collaborators, and the lead sponsor should go. My intuition is that the Principal Investigator should be stored as Author, any collaborators stored as Contributor, and the Sponsor stored in Extra (and only in Extra) labeled "Sponsor:". Perhaps conditionally if there is no Principal Investigator, then Sponsor could instead be stored as an author.
  3. Use proper narrative capitalization and spacing when storing data in Extra, rather than camelCase (e.g., Principal investigator: rather than responsiblePartyInvestigator:).
  4. Undefined or missing values should be dropped rather than being stored as undefined.
  5. "ClinicalTrials.gov" should be stored in the institution (publisher) field, and "Clinical trial registration" in the reportType field.
  6. The registration number (I think NCTId) should be stored in reportNumber
  7. The BriefSummary part of the DescriptionModule should be stored in abstract
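
The suggestions above could be sketched roughly as follows, in plain JavaScript rather than inside the translator framework. The helper name `buildItem`, the flat `trial` input shape, and the simplified `name` property on creators are all assumptions for illustration, and the title is a placeholder. The abstract is written to `abstractNote`, which is Zotero's name for the abstract field.

```javascript
// Hypothetical mapping of an already-parsed record onto report-style fields.
// The flat `trial` shape and the creator `name` property are simplifications
// assumed for this sketch, not the registry's JSON or Zotero's creator format.
function buildItem(trial) {
  const creators = [];
  if (trial.investigator) {
    creators.push({ name: trial.investigator, creatorType: "author" });
  } else if (trial.sponsor) {
    // No principal investigator: fall back to the sponsor as author.
    creators.push({ name: trial.sponsor, creatorType: "author" });
  }
  for (const name of trial.collaborators || []) {
    creators.push({ name, creatorType: "contributor" });
  }

  // Only defined values are added, so "undefined" never reaches Extra.
  const extraLines = [];
  if (trial.firstSubmitted) {
    extraLines.push(`submitted: ${trial.firstSubmitted}`);
  }
  // The sponsor goes in Extra only when it is not already the author.
  if (trial.investigator && trial.sponsor) {
    extraLines.push(`Sponsor: ${trial.sponsor}`);
  }

  return {
    itemType: "report",
    title: trial.title,
    date: trial.lastUpdated,
    institution: "ClinicalTrials.gov",
    reportType: "Clinical trial registration",
    reportNumber: trial.nctId,
    abstractNote: trial.briefSummary,
    creators,
    extra: extraLines.join("\n")
  };
}

const item = buildItem({
  nctId: "NCT04291053",
  title: "Placeholder title",
  investigator: "Chen Xiaoping",
  sponsor: "Tongji Hospital"
});
console.log(item.creators);
console.log(item.extra); // "Sponsor: Tongji Hospital"
```

Because no first-submission date is supplied in this example, the `submitted` line is omitted from Extra instead of being stored as undefined.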