Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target regexp and priority cleanup #789

Closed
wants to merge 1 commit into from

Conversation

@aurimasv
Copy link
Contributor

commented Oct 26, 2014

  • Make sure periods are always escaped
  • Where appropriate, always start with "^https?://"
  • For optional subdomains, always use "([^/]*\.)?", for mandatory subdomains, use "[^/]+\."
  • Remove all instances of non-capturing parentheses and replace with capturing parentheses. This doesn't matter in target regexp and makes it a bit cleaner.
  • Adjust priorities based on #765 (probably not exhaustive)

Possibly unnecessary and I'm not sure how great such a huge amount of translator updates is on Zotero servers. I'm going to do a general cleanup of processDocuments, doGet, doPost calls to use relative or at least protocol-relative URLs because of breakage in Chrome, so this should probably wait until the next commit.

@aurimasv

This comment has been minimized.

Copy link
Contributor Author

commented Oct 26, 2014

Also, if anyone has any other thoughts on general translator updates, let me know.

One thing that comes to mind is that we need to bump up Framework code.

@aurimasv

This comment has been minimized.

Copy link
Contributor Author

commented Nov 17, 2014

TODO: add missing license blocks

I'm thinking to add them in the name of whoever submitted the translator in the first place (though a lot have been rewritten) and ask the person to OK the addition. Alternatively, could the license be in Zotero's name?

@aurimasv

This comment has been minimized.

Copy link
Contributor Author

commented Nov 22, 2014

TODO: adjust all translators to convert titles to sentence case instead of title case if they are in all-upper case.

Though this might actually have to rely on more complicated logic, which should just be copied from https://github.com/zotero/zotero/blob/4.0/chrome/content/zotero/bindings/itembox.xml#L1952 into something like Zotero.Utilities.toSentenceCase

@aurimasv aurimasv force-pushed the aurimasv:target-priorities-cleanup branch from 6fd62ec to df92f55 Feb 12, 2015

Target regexp and priority cleanup
* Make sure periods are always escaped
* Where appropriate, always start with "^https?://"
* For optional subdomains, always use "([^/]+\\.)?", for mandatory subdomains, use "[^/]+\\."
* Tether domain matches with / on the right so that proxies can be determined
* Remove unnecessary matching of proxies
* Remove all instances of non-capturing parentheses and replace with capturing parentheses. This doesn't matter in target regexp
* Adjust priorities based on #765 (probably not exhaustive)

@aurimasv aurimasv force-pushed the aurimasv:target-priorities-cleanup branch from df92f55 to 8c761a2 Feb 12, 2015

@zuphilip zuphilip referenced this pull request Jan 1, 2017

Merged

Target tweaks #1211

@adam3smith adam3smith closed this in ed21c7b Jan 8, 2017

@zuphilip

This comment has been minimized.

Copy link
Collaborator

commented Jan 8, 2017

Not all of the points might be already dealt in #1211 or other commits.

I didn't look at:

  • For optional subdomains, always use "([^/]*.)?", for mandatory subdomains, use "[^/]+."

use relative or at least protocol-relative URLs

adjust all translators to convert titles to sentence case instead of title case if they are in all-upper case.

If anyone wants to continue on something here, then I would simply suggest to open a new issue.

zuphilip added a commit to zuphilip/translators that referenced this pull request Mar 28, 2018

Target tweaks (zotero#1211)
Closes zotero#789
The priority issue is separate, but we may also have done that since.
* Always start target with ^, when appropriate
* Start target always with ^https?, when approriate
* Don't escape slash in target regexp
* Escape periods in target when needed
Found by
$ grep '"target": .*[^\]\.\w' *.js
* Simplify groups in regexp of target
Remove all instances of non-capturing parentheses and replace with capturing parentheses. This doesn't matter in target regexp and makes it a bit cleaner.

zuphilip added a commit to zuphilip/translators that referenced this pull request Mar 28, 2018

Target tweaks (zotero#1211)
Closes zotero#789
The priority issue is separate, but we may also have done that since.
* Always start target with ^, when appropriate
* Start target always with ^https?, when approriate
* Don't escape slash in target regexp
* Escape periods in target when needed
Found by
$ grep '"target": .*[^\]\.\w' *.js
* Simplify groups in regexp of target
Remove all instances of non-capturing parentheses and replace with capturing parentheses. This doesn't matter in target regexp and makes it a bit cleaner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.