MIDAS Journal Translator #1030

str4w · Mar 23, 2016

This is a translator for the journals hosted by Kitware - Insight Journal, VTK Journal, etc.

zuphilip · Apr 10, 2016

zuphilip reviewed Apr 10, 2016

View changes

zuphilip · Apr 10, 2016

MIDAS Journals.js

+   splitDownloadPath = downloadAllXPath.iterateNext().value.split('/');
+   version=splitDownloadPath[splitDownloadPath.length-1]
+   newItem.extra="Revision: " + version
+   newItem.issue=splitDownloadPath[splitDownloadPath.length-2]


All three lines above are missing a semicolon.

zuphilip · Apr 10, 2016

zuphilip reviewed Apr 10, 2016

View changes

zuphilip · Apr 10, 2016

MIDAS Journals.js

+   	   //Zotero.debug("splitname");
+   	   //Zotero.debug(splitname);
+   	   var z=1;
+   	   name="";


var is missing

zuphilip · Apr 10, 2016

zuphilip reviewed Apr 10, 2016

View changes

zuphilip · Apr 10, 2016

MIDAS Journals.js

+	"translatorID": "86c35e86-3f97-4e80-9356-8209c97737c2",
+	"label": "MIDAS Journals",
+	"creator": "Rupert Brooks",
+	"target": "(insight-journal|midasjournal|vtkjournal).org/browse/publication",


What is in front this? Normally we have something like ^https?://(www)?. Can we do this here also?

Moreover, escape the point in the regexp.

zuphilip · Apr 10, 2016

Can you give some specific examples for your translator?

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+*/
+
+function detectWeb(doc, url) {
+	if (url.match("browse/publication")) return "journalArticle";


this is already guaranteed by the target regexp above. Since the only required field for translation is item.title, you probably want to check here that you can get the title.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   var namespace = doc.documentElement.namespaceURI; 
+   var nsResolver = namespace ? function(prefix) {
+	  if (prefix == 'x') return namespace; else return null; 
+	   } : null; 


Drop all of this and, below, use ZU.xpath(doc, "your xpath here") for an element array or ZU.xpathText(doc, "your xpath here") for the .textContent of the elements.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   var submittedXPath = doc.evaluate('//*[@id="publication"]/div[@class="submittedby"]', doc, nsResolver, XPathResult.ANY_TYPE, null); 
+   var abstractXPath = doc.evaluate('//*[@id="publication"]/div[@class="abstract"]', doc, nsResolver, XPathResult.ANY_TYPE, null); 
+   var downloadAllXPath = doc.evaluate('//a[contains(text(),"Download All")]/@href', doc, nsResolver, XPathResult.ANY_TYPE, null); 
+   var pdfXPath1 = doc.evaluate('//a[contains(text(),"Download Paper")]/@href', doc, nsResolver, XPathResult.ANY_TYPE, null); 


Move all of these down to where they are actually assigned to the Zotero item.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   //var pdfXPath2 = doc.evaluate('//a[contains(text(),".pdf")]/@href', doc, nsResolver, XPathResult.ANY_TYPE, null); 
+
+
+   pdfhref=pdfXPath1.iterateNext();


missing var. Apply throughout, please.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+	  pdflink=tmp[0]+'//' +tmp[2]+pdfhref.value;
+	  //Zotero.debug(pdflink);
+   }
+   datematch=new RegExp("on\\w+([0-9]+)\\-([0-9]+)\\-([0-9]+)");


Instead of new RegExp(...) use /.../ (and you don't need to declare them ahead of time. Use "foo".match(/o/) instead of regexp.exec("foo"))

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+	  //Zotero.debug(pdflink);
+   }
+   datematch=new RegExp("on\\w+([0-9]+)\\-([0-9]+)\\-([0-9]+)");
+   datematch=new RegExp("on +([0-9]+)\\-([0-9]+)\\-([0-9]+)");


add spaces before and after =. Apply throughout please.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   newItem.issue=splitDownloadPath[splitDownloadPath.length-2]
+
+   var a=0;
+   while (a<authors.length) {


This (and below) should be a for loop. for (var i=0; i<authors.length; i++) { ...

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   	   	  name=name+splitname[z]+" ";
+   	   	  z=z+1;
+   	   }
+   	   name=name+splitname[0];


Please leave a comment in the code of what kind of string you are processing in this whole block. I can see that you're reshuffling something like Prastawa M. into M. Prastawa and then calling cleanAuthor to parse it. It would be much simpler and cleaner to

var m = "Prastawa M.".match(/(\S+)\s+(.+)/); item.creators.push({ firstName: m[2], lastName: m[1], creatorType: 'author' })`

would be even better to take prestructured data from the XML, see below comment

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+   //pdf	
+   if(pdfhref) {
+   		newItem.attachments.push({
+		   title: newItem.publicationTitle+" "+newItem.issue + " v" + version + " PDF",


Simply "Full Text PDF" is sufficient. The specifics are available in the metadata

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+		title: newItem.publicationTitle+" "+newItem.issue + " v" + version +" Snapshot",
+		url: url,
+		mimeType:"text/html"
+	});


There is generally no reason to attach a snapshot if we're attaching a full text pdf

I see your point, but most of the other translators I looked at (at least ArXiv, and ACM) do this. It seems to be the standard practice.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+}
+
+function doWeb(doc, url) {
+	if(detectWeb(doc,url)=="journalArticle") {


Given you detectWeb, this is guaranteed.

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+	if(detectWeb(doc,url)=="journalArticle") {
+	   var articles = new Array();
+	   articles = [url];
+	   Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();});


I'm not sure what you're trying to do here. The page is already loaded. Why re-load it?

aurimasv · Apr 10, 2016

aurimasv reviewed Apr 10, 2016

View changes

aurimasv · Apr 10, 2016

MIDAS Journals.js

+	   var articles = new Array();
+	   articles = [url];
+	   Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();});
+	   Zotero.wait();


This is no longer necessary. It was needed a long time ago.

aurimasv · Apr 10, 2016

As @zuphilip says, please provide some examples and, preferably, add test pages using Scaffold.

zuphilip · Jan 1, 2018

@str4w What is the status here?

str4w · Jan 2, 2018

Oops. Honestly, I assessed the number of comments at the time as "Too large to complete this week", and then I totally forgot about it after that. It is back on my radar, I will try to look at it in the near future.

zuphilip · Jan 2, 2018

Okay, thank you for the response. Can you start with the test cases?

I see that there is now also several export options, which we could also use instead of scraping from the website:

We could use for example the structured data in the XML for the authors (full first names, split into first and last names, several authors are split). What do you think?

adam3smith · Jan 2, 2018

looking at this quickly, wouldn't it make sense to just import BibTeX or Medline (for which we already have import translators) and then add journal title and ISSN quickly based on the URL?

zuphilip · Jan 2, 2018

BibTeX does not look that good, e.g. for http://www.vtkjournal.org/browse/publication/880 we have

  Author	= "I. Herrera and C. Buchart and D. Borro",

Medline format could work also as there are also the full first names there, but publication title seems missing and maybe more. That is the reason I asked for examples...

adam3smith · Jan 2, 2018

right, publication titles seem to be missing everywhere -- that's why I thought we'd add them based on URL.

str4w · Apr 10, 2018

Sorry this dragged so long. This should be improved from the last iteration. Questions -

Looking at some of the other examples, sometimes they use
Zotero.
Z.
ZU.
what is the correct way, what are these globals?

More importantly, this would be cleaner and the data would be more complete if it could pull the exported XML (sadly, as the previous comments point out, that data is incomplete and we can't just use that for everything). To get that XML, need to make an HTTP post and parse the output - how does one do that - can the Zotero.doGet method do it?

zuphilip · Apr 15, 2018

@str4w I worked directly on top of your work here and created a PR into your fork: str4w#1 . If you merge this PR then it will be automatically update here.

There are some special functions which you can use in coding Zotero translators and they either start with Zotero. (or short Z.) or Zotero.Utilities. (short ZU.). See also https://www.zotero.org/support/dev/translators/functions

We need a post call here and therfore I used Zotero.doPost. It is important then to continue inside this function and not afterwards. See my changes.

str4w · Apr 15, 2018

Thanks for the explanations and improvements. I merged with my fork so it appears here.

zuphilip · Apr 15, 2018

Okay, great. Let me know whether there is anything left from your point of view.

str4w · Apr 16, 2018

I'm running a few tests to be sure. A couple things - I see that you have added multiples. However, I tried this link
http://www.vtkjournal.org/browse/journal/53
and detectWeb did not detect it.

I tried this link http://www.insight-journal.org/browse/publication/921 and the doWeb returns the tags below. I don't understand the + and -. Is it normal? I am not actually using the autogenerated tags in Zotero, so this doesn't trouble me directly, just wondering.

         "tags": [
           {
     -       "tag": "Reproducibility"
     +       "tag": "IPython"
           }
           {
     -       "tag": "Tardigrades"
     +       "tag": "Reproducibility"
           }
           {
             "tag": "Scipy"
           }
           {
     -       "tag": "IPython"
     +       "tag": "Tardigrades"
           }
           {
     -       "tag": "notebook"
     +       "tag": "dexy"
           }
           {
     -       "tag": "dexy"
     +       "tag": "docker"
           }
           {
             "tag": "github"
           }
           {
     -       "tag": "docker"
     +       "tag": "notebook"
           }
         ]

zuphilip · Apr 16, 2018

Okay, I added another path for this multiple with some filtering on the expected urls. Moroever, some clean-ups. Please merge again: str4w#2

Yes, what you see with the tags is normal. The differences which are indicated with the -/+ are just the ordering, which is performed alphabetically for the test cases.

str4w · Apr 16, 2018

I checked a bunch of articles, ran the tests, looks good. I added the ISSN tag, since it is available in many cases, why not have it?

Are you sure you want the article number to go into the pages field? If that is consistent with what other translators do then it would make sense, it just doesn't fit the literal definition of that field as I understand it.

Assuming the ISSN tag is ok with you and the page field for the article number was intentional, looks good to me, go for it.

zuphilip · Apr 16, 2018

Are you sure you want the article number to go into the pages field? If that is consistent with what other translators do then it would make sense, it just doesn't fit the literal definition of that field as I understand it.

@adam3smith Do I remember correctly? Can you confirm (or correct) that saving an article id is best done in the pages field?

(I will merge it as soon as we clarified this.)

adam3smith · Apr 16, 2018

Do I remember correctly? Can you confirm (or correct) that saving an article id is best done in the pages field?

Yes, that's right

zuphilip · Apr 16, 2018

Merged! 🎉 Thank you @str4w for the work! (You should reset your master branch now before any new work.)

MIDAS Journal Translator

d2dcf4c

zuphilip added the New Translator label Nov 12, 2017

zuphilip added the waiting label Jan 1, 2018

Rework based on pull request comments

Loading status checks…

f1e171c

zuphilip removed the waiting label Apr 15, 2018

zuphilip added some commits Apr 15, 2018

Update MIDAS Journals.js

e08bd9b

[MIDAS Journals]: Use XML data for authors

c0d4ba7

Merge pull request #1 from zuphilip/str4w-master
Update MIDAS and use XML data

Verified

This commit was created on GitHub.com and signed with a verified signature using GitHub’s key.

GPG key ID: 4AEE18F83AFDEB23 Learn about signing commits

Loading status checks…

8bb6f32

[MIDAS Journals]: Add another multiple, clean-up

0f0c806

zuphilip merged commit d8bfa59 into zotero:master Apr 16, 2018
1 check passed

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

zotero/translators

Join GitHub today

MIDAS Journal Translator #1030

Conversation

str4w commented Mar 23, 2016

zuphilip reviewed Apr 10, 2016 View changes

This comment has been minimized.

zuphilip reviewed Apr 10, 2016 View changes

This comment has been minimized.

zuphilip reviewed Apr 10, 2016 View changes

This comment has been minimized.

This comment has been minimized.

zuphilip commented Apr 10, 2016

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

aurimasv reviewed Apr 10, 2016 View changes

This comment has been minimized.

This comment has been minimized.

aurimasv commented Apr 10, 2016

zuphilip added the New Translator label Nov 12, 2017

zuphilip added the waiting label Jan 1, 2018

This comment has been minimized.

zuphilip commented Jan 1, 2018

This comment has been minimized.

str4w commented Jan 2, 2018

This comment has been minimized.

zuphilip commented Jan 2, 2018

This comment has been minimized.

adam3smith commented Jan 2, 2018

This comment has been minimized.

zuphilip commented Jan 2, 2018

This comment has been minimized.

adam3smith commented Jan 2, 2018

This comment has been minimized.

str4w commented Apr 10, 2018

zuphilip removed the waiting label Apr 15, 2018

zuphilip added some commits Apr 15, 2018

This comment has been minimized.

zuphilip commented Apr 15, 2018

This comment has been minimized.

str4w commented Apr 15, 2018

This comment has been minimized.

zuphilip commented Apr 15, 2018

This comment has been minimized.

str4w commented Apr 16, 2018

This comment has been minimized.

zuphilip commented Apr 16, 2018

str4w added some commits Apr 16, 2018

This comment has been minimized.

str4w commented Apr 16, 2018

This comment has been minimized.

zuphilip commented Apr 16, 2018

This comment has been minimized.

adam3smith commented Apr 16, 2018

Hide details View details zuphilip merged commit d8bfa59 into zotero:master Apr 16, 2018 1 check passed

1 check passed

zuphilip reviewed Apr 10, 2016

View changes

zuphilip reviewed Apr 10, 2016

View changes

zuphilip reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

aurimasv reviewed Apr 10, 2016

View changes

zuphilip merged commit `d8bfa59` into zotero:master Apr 16, 2018
1 check passed