SAGE Journals - Resolved tag issue + detection issues #2077

placardo · 2019-12-06T18:19:13Z

Fix #1789 and fix #2075. This works in my browser but it doesn't work in Scaffold; when the scrape function evaluate url.match() line 85 it throws an error, because in Scaffold the URL is https://journals.sagepub.com/action/**cookieAbsent** instead of the DOI but it seems to work otherwise.

From issue #1789 :

There are examples with keywords in several languages, where I would suggest to take only the keywords in one languge (e.g. English), see e.g.

Does some Zotero Utility or something else exists to select only one language?


        Resolved tag issue + detection issues

adam3smith

It looks like the tags are grouped. If you only take the first group, that should work nicely.

SAGE Journals.js


        Selected article language tags

adam3smith · 2019-12-07T19:27:57Z

SAGE Journals.js

@@ -2,14 +2,14 @@
 	"translatorID": "908c1ca2-59b6-4ad8-b026-709b7b927bda",
 	"label": "SAGE Journals",
 	"creator": "Sebastian Karcher",
-	"target": "^https?://journals\\.sagepub\\.com(/doi/((abs|full)/)?10\\.|/action/doSearch\\?|/toc/)",
+	"target": "^https?://journals\\.sagepub\\.com(/doi/((abs|full|pdf)/)?10\\.|/action/doSearch\\?|/toc/)",


Did you test this on a PDF page? Given that the filename for the post request (l. 91) is scraped from the page, I don't see how it would work.

I added the pdf because a lot of webpages besides doi/abs and doi/full have doi/pdf in their URL, see eg the search results included in the translator tests (https://journals.sagepub.com/action/doSearch?text1=test)
On Firefox when I'm on the actual pdf page the PDF translator is activated instead of SAGE Journals translator and on Chrome the SAGE Journal translator is sometimes activated and yet I can still import the item (eg here : https://journals.sagepub.com/doi/pdf/10.1177/003754979005400307?casa_token=8AElK4mX07EAAAAA:WxK8lh-rsHkjRwcM1Z8EldOJF-eKIOC3nCaL0gCLxHTQOohZcnxb9ykF8F7KJSboIuwQsisPgNl8cQ I think that's when I previously visited the webpage linking to the pdf), sometimes only the PDF translator shows up (eg here which goes directly to pdf without a webpage: https://journals.sagepub.com/doi/pdf/10.1177/003754979005400307)

Good catch on these turning up in search results, but that actually means we need to replace /doi/pdf in search results with /doi/abs to avoid import errors. Let me know if that makes sense and you can add it.

I don't have an explanation for why you'd get a working import from Chrome in the case you have -- there's really no way this should work that I can think of. The fact that it's always showing the PDF translator for Firefox is due to a restriction of the extension framework there, but that's obviously not a reason to include PDF URLs. With things working correctly, I'd expect false positives there so please rever that part both in the target and the detect function.

Actually this works without the filename argument in the POST request, if you remove it you still have a valid and complete response I guess the DOI is enough but I wasn't able to find any documentation about which arguments are necessary.
Should I really remove it from detect and target? I don't mean to squabble but it would defeat the fix as loads of pages have pdf in their URL but do not lead directly to a pdf because of the paywall and the translator wouldn't be triggered on those pages

You're right, the pdf should stay in the taget (and please do squabble if you think any of my or zuphilip's comments are wrong). I thought it redirected on paywalled pages, but it doesn't.
I've removed the unneeded post parameter (thanks for that) and redirected multiple imports to the abstract page (so we can get language, improved abstract, and where they exist, keywords).

If you could please update tests (since they now get keywords), this is ready to merge.

Oh, and please fix linting errors and warnings. Should be quick.

I added the tests manually since it does not work in Scaffold (cf above) and fixed the errors and warnings

oh, sorry! I could have done it then. I don't have that problem, so not sure what's going on with that.

np glad to help! I will look into it I'm sure there is a reasonable explanation


        Remove unnecessary post parameter

Go to abstract page for multiples


        linting + tests

adam3smith · 2019-12-10T18:39:59Z

Thanks!

Resolved tag issue + detection issues

Loading status checks…

95c745d

placardo changed the title ~~Resolved tag issue + detection issues~~ SAGE Journal - Resolved tag issue + detection issues Dec 6, 2019

placardo changed the title ~~SAGE Journal - Resolved tag issue + detection issues~~ SAGE Journals - Resolved tag issue + detection issues Dec 6, 2019

adam3smith reviewed Dec 6, 2019

View changes

SAGE Journals.js Outdated Show resolved Hide resolved

SAGE Journals.js Show resolved Hide resolved

Selected article language tags

Loading status checks…

ce024f3

zuphilip added the Improvements label Dec 7, 2019

adam3smith reviewed Dec 7, 2019

View changes

adam3smith and others added 2 commits Dec 10, 2019

Remove unnecessary post parameter

Verified

This commit was created on GitHub.com and signed with a verified signature using GitHub’s key.

GPG key ID: 4AEE18F83AFDEB23 Learn about signing commits

Loading status checks…

1e6f1a6

Go to abstract page for multiples

linting + tests

Loading status checks…

78351dd

adam3smith merged commit 4d202f1 into zotero:master Dec 10, 2019
1 check passed

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

placardo deleted the placardo:sagejournal branch Dec 10, 2019

Please note that GitHub no longer supports your web browser.

zotero/translators

SAGE Journals - Resolved tag issue + detection issues #2077

SAGE Journals - Resolved tag issue + detection issues #2077

placardo commented Dec 6, 2019

adam3smith left a comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

adam3smith commented Dec 10, 2019

Please note that GitHub no longer supports your web browser.

zotero/translators

Join GitHub today

SAGE Journals - Resolved tag issue + detection issues #2077

SAGE Journals - Resolved tag issue + detection issues #2077

Conversation

placardo commented Dec 6, 2019

adam3smith left a comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

placardo Dec 8, 2019 • edited

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

adam3smith commented Dec 10, 2019

placardo Dec 8, 2019 •

edited