SAGE Journals - Resolved tag issue + detection issues #2077

Merged

adam3smith merged 4 commits into zotero:master from placardo:sagejournal on Dec 10, 2019
Conversation

placardo (Contributor) commented Dec 6, 2019

Fix #1789 and fix #2075. This works in my browser, but it doesn't work in Scaffold: when the scrape function evaluates url.match() on line 85, it throws an error, because in Scaffold the URL is https://journals.sagepub.com/action/**cookieAbsent** instead of the DOI. It seems to work otherwise.
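A null check on the match result would avoid that crash. A minimal sketch, assuming the DOI sits in the URL path; the regex and function name here are illustrative, not the translator's actual line 85:

```javascript
// Illustrative guard: return null instead of throwing when the URL
// carries no DOI (e.g. Scaffold's /action/cookieAbsent page).
function extractDOI(url) {
	var match = url.match(/\/doi\/(?:(?:abs|full|pdf)\/)?(10\.[^?#]+)/);
	return match ? match[1] : null;
}
```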

From issue #1789 :

There are examples with keywords in several languages, where I would suggest taking only the keywords in one language (e.g. English), see e.g.

Does a Zotero utility or something else exist to select only the keywords in one language?

@placardo placardo changed the title Resolved tag issue + detection issues SAGE Journal - Resolved tag issue + detection issues Dec 6, 2019
@placardo placardo changed the title SAGE Journal - Resolved tag issue + detection issues SAGE Journals - Resolved tag issue + detection issues Dec 6, 2019
adam3smith (Collaborator) left a comment

It looks like the tags are grouped. If you only take the first group, that should work nicely.
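If the groups reach the translator as one array of keywords per language, keeping only the first group is a one-liner. A minimal sketch under that assumption; the data shape and function name are hypothetical:

```javascript
// Hypothetical shape: one array of keywords per language, with the
// page's primary language (e.g. English) first. Keeping only the
// first group yields a single language's tags.
function firstTagGroup(groups) {
	return groups.length > 0 ? groups[0] : [];
}
```

For example, `firstTagGroup([["simulation", "modelling"], ["simulation", "modélisation"]])` would keep only the English group.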

SAGE Journals.js: two review threads resolved (one outdated)
@@ -2,14 +2,14 @@
 	"translatorID": "908c1ca2-59b6-4ad8-b026-709b7b927bda",
 	"label": "SAGE Journals",
 	"creator": "Sebastian Karcher",
-	"target": "^https?://journals\\.sagepub\\.com(/doi/((abs|full)/)?10\\.|/action/doSearch\\?|/toc/)",
+	"target": "^https?://journals\\.sagepub\\.com(/doi/((abs|full|pdf)/)?10\\.|/action/doSearch\\?|/toc/)",
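The widened pattern can be sanity-checked against sample URLs. A sketch; the regex is the amended target above with the JSON double-escaping reduced to a plain JavaScript literal:

```javascript
// The amended target regex, written as a JavaScript literal.
var target = /^https?:\/\/journals\.sagepub\.com(\/doi\/((abs|full|pdf)\/)?10\.|\/action\/doSearch\?|\/toc\/)/;

// doi/pdf URLs now match, alongside doi/abs, doi/full, and bare doi/ paths.
target.test('https://journals.sagepub.com/doi/pdf/10.1177/003754979005400307'); // true
target.test('https://journals.sagepub.com/doi/abs/10.1177/003754979005400307'); // true
target.test('https://journals.sagepub.com/action/doSearch?text1=test');         // true
```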

adam3smith (Collaborator) commented Dec 7, 2019
Did you test this on a PDF page? Given that the filename for the post request (l. 91) is scraped from the page, I don't see how it would work.

placardo (Author, Contributor) commented Dec 7, 2019

I added the pdf because a lot of pages besides doi/abs and doi/full have doi/pdf in their URL; see e.g. the search results included in the translator tests (https://journals.sagepub.com/action/doSearch?text1=test).
On Firefox, when I'm on the actual PDF page, the PDF translator is activated instead of the SAGE Journals translator. On Chrome, the SAGE Journals translator is sometimes activated and I can still import the item (e.g. here: https://journals.sagepub.com/doi/pdf/10.1177/003754979005400307?casa_token=8AElK4mX07EAAAAA:WxK8lh-rsHkjRwcM1Z8EldOJF-eKIOC3nCaL0gCLxHTQOohZcnxb9ykF8F7KJSboIuwQsisPgNl8cQ; I think that happens when I have previously visited the web page linking to the PDF). Sometimes only the PDF translator shows up (e.g. here, which goes directly to the PDF without a web page: https://journals.sagepub.com/doi/pdf/10.1177/003754979005400307).

adam3smith (Collaborator) commented Dec 8, 2019

Good catch on these turning up in search results, but that actually means we need to replace /doi/pdf in search results with /doi/abs to avoid import errors. Let me know if that makes sense and you can add it.

I don't have an explanation for why you'd get a working import from Chrome in the case you describe -- there's really no way this should work that I can think of. The fact that Firefox always shows the PDF translator is due to a restriction of the extension framework there, but that's obviously not a reason to include PDF URLs. With things working correctly, I'd expect false positives there, so please revert that part both in the target and the detect function.
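Rewriting /doi/pdf links in search results to /doi/abs, as suggested above, could look roughly like this; the helper name is hypothetical:

```javascript
// Hypothetical helper: point a search-result PDF link at the abstract
// page so the importer lands on a scrapeable page instead of a raw PDF.
function toAbstractUrl(url) {
	return url.replace('/doi/pdf/', '/doi/abs/');
}
```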

placardo (Author, Contributor) commented Dec 8, 2019

Actually, this works without the filename argument in the POST request: if you remove it, you still get a valid and complete response. I guess the DOI is enough, but I wasn't able to find any documentation about which arguments are necessary.
Should I really remove it from detect and target? I don't mean to squabble, but that would defeat the fix: lots of pages have pdf in their URL yet do not lead directly to a PDF because of the paywall, and the translator wouldn't be triggered on those pages.
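If the DOI alone suffices, the POST body shrinks to essentially one parameter. A sketch; the parameter names here are assumptions about SAGE's citation-export endpoint, not documented API:

```javascript
// Assumed parameter names; per the testing described above, only the
// DOI appears to be required, so the filename parameter is omitted.
function buildCitationPostBody(doi) {
	return 'doi=' + encodeURIComponent(doi) + '&format=ris&include=abs';
}
```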

adam3smith (Collaborator) commented Dec 10, 2019

You're right, the pdf should stay in the target (and please do squabble if you think any of my or zuphilip's comments are wrong). I thought it redirected on paywalled pages, but it doesn't.
I've removed the unneeded POST parameter (thanks for that) and redirected multiple imports to the abstract page (so we can get the language, an improved abstract, and, where they exist, keywords).

If you could please update tests (since they now get keywords), this is ready to merge.

adam3smith (Collaborator) commented Dec 10, 2019

Oh, and please fix linting errors and warnings. Should be quick.

placardo (Author, Contributor) commented Dec 10, 2019

I added the tests manually, since it does not work in Scaffold (cf. above), and fixed the errors and warnings.

adam3smith (Collaborator) commented Dec 10, 2019

Oh, sorry! I could have done it then. I don't have that problem, so I'm not sure what's going on there.

placardo (Author, Contributor) commented Dec 10, 2019

No problem, glad to help! I will look into it; I'm sure there is a reasonable explanation.

adam3smith and others added 2 commits Dec 10, 2019
Go to abstract page for multiples
@adam3smith adam3smith merged commit 4d202f1 into zotero:master Dec 10, 2019
1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)
adam3smith (Collaborator) commented Dec 10, 2019

Thanks!

@placardo placardo deleted the placardo:sagejournal branch Dec 10, 2019