New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attachment URLs are proxied even when they shouldn't be #1612

Closed
dstillman opened this Issue Dec 24, 2018 · 2 comments

Comments

Projects
None yet
2 participants
@dstillman
Member

dstillman commented Dec 24, 2018

Thanks to some very helpful debugging, I think I've figured out what's breaking PDF downloads for some EBSCOhost users.

We appear to be converting all non-link attachment URLs to proxied URLs:

if(attachment.url) {
// Remap attachment (but not link) URLs
// TODO: provide both proxied and un-proxied URLs (also for documents)
// because whether the attachment is attached as link or file
// depends on Zotero preferences as well.
attachment.url = translate.resolveURL(attachment.url, attachment.snapshot === false);
}

We've done this for many years, presumably because, while some translators extract the download URL from the page, others build it from available metadata, and in the latter case the translator doesn't know whether it needs to be proxied and just builds a normal (or relative) URL. This usually works fine, either because the URL should in fact be proxied like the parent page or because the proxy handles an unnecessarily proxied URL. But on EBSCO, PDF URLs don't need to be proxied and for some — but not all — users the unnecessarily proxied PDF URLs (e.g., content.ebscohost.com.ezproxy2.library.drexel.edu) return 404s.

Two possible fixes:

  1. Translators could specify in the attachment object that a URL came from the page and shouldn't be proxied further.

  2. Translators could specify in the attachment object that a URL was manually built and should be proxied, and we would otherwise stop automatically proxying URLs.

(2) feels a bit cleaner, since it makes changing the URL an explicit action, but it would break translators that rely on the current behavior. Also, some translators return relative links, and proxying those by default — i.e., matching the current domain — seems like the more natural and expected option. So I think we have to go with (1).

For the flag, some options:

  • proxy: false - Most obvious, but a little misleading, since the URL could already be proxied, and really we're just saying it shouldn't be changed. Similarly, if the translator returns a relative URL and also specifies this option, it sort of implies that it should be built from an unproxied URL, when in fact it would still be resolved using the current URL, proxied or otherwise, and just wouldn't be force-converted to a proxied URL. But if documented properly, this might be OK.
  • direct: true - Opposite problem from proxy: false, and less clear what it's referring to
  • proper: true - More accurate, but just our internal term
  • convertURL: true - Accurate, but not clear that it's about the proxy

I guess I'm inclined to go with proxy: false unless someone has a better idea.

@adam3smith

This comment has been minimized.

Contributor

adam3smith commented Dec 25, 2018

Yes, please go with 1) -- having to include the proxy is very commonly required and having to do this (and especially: fix this) manually for all instances would be quite tedious.

I like the proxy: false flag, but generally don't think it matters a ton as long as its (as you say) well documented.

@dstillman dstillman closed this in 4bc8fab Dec 26, 2018

dstillman added a commit to zotero/zotero-connectors that referenced this issue Dec 26, 2018

Update Zotero submodule
This includes `proxy: false` support from zotero/zotero#1612.

dstillman added a commit to zotero/translators that referenced this issue Dec 26, 2018

EBSCOhost: Fix PDF downloads via some proxy servers
Depends on zotero/zotero#1612, which will be in Zotero Connector 5.0.51
@dstillman

This comment has been minimized.

Member

dstillman commented Dec 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment