Add new Google Research translator #1503

GuyAglionby · Dec 23, 2017

There was some embedded metadata but it didn't capture everything + it didn't capture the PDF, so added a translator :)

did you close this on purpose?

adam3smith · Dec 27, 2017

did you close this on purpose?

Oh -- no, sorry about that. Reopened.

I will commit changes later on to take into account the feedback on the other commit

GuyAglionby · Dec 27, 2017

Oh -- no, sorry about that. Reopened.

I will commit changes later on to take into account the feedback on the other commit

OK, will wait with the review then, thanks!

adam3smith · Dec 27, 2017

OK, will wait with the review then, thanks!

Hi, sorry for the delay but I've now rewritten the translator so it can handle author pages, search results, and the pages itself. I apologise that this is quite a big PR.

Just a weird thing I'd like to explain: some of the search results are direct links to the paper's PDF in the format https://research.google.com/pubs/archive/#####.pdf . This ID (the #####) can be used to find the publication's info page containing all the necessary info to create a Zotero item, which is what I have done. However, in some cases this page doesn't exist, so the extra GET checks that.

I'm not happy about how this is done on line 140 -- ideally we'd use responseObj.status, however when I try this I get a Permission Denied error, which I understand is CSRF protection. I don't know if there's any way to get around this, and would appreciate any pointers :)

Any other comments welcomed, thanks in advance!

GuyAglionby · Jan 28, 2018

Hi, sorry for the delay but I've now rewritten the translator so it can handle author pages, search results, and the pages itself. I apologise that this is quite a big PR.

Just a weird thing I'd like to explain: some of the search results are direct links to the paper's PDF in the format https://research.google.com/pubs/archive/#####.pdf . This ID (the #####) can be used to find the publication's info page containing all the necessary info to create a Zotero item, which is what I have done. However, in some cases this page doesn't exist, so the extra GET checks that.

I'm not happy about how this is done on line 140 -- ideally we'd use responseObj.status, however when I try this I get a Permission Denied error, which I understand is CSRF protection. I don't know if there's any way to get around this, and would appreciate any pointers :)

Any other comments welcomed, thanks in advance!

However, in some cases this page doesn't exist, so the extra GET checks that.

Can you provide an example of a page where that happens?

dstillman · Jan 28, 2018

However, in some cases this page doesn't exist, so the extra GET checks that.

Can you provide an example of a page where that happens?

Sorry, not sure how this slipped my mind. First result of https://research.google.com/search.html?q=simple%20risk%20bounds is https://research.google.com/pubs/archive/35621.pdf , where the corresponding https://research.google.com/pubs/pub35621.html is a 404.

GuyAglionby · Jan 28, 2018

Sorry, not sure how this slipped my mind. First result of https://research.google.com/search.html?q=simple%20risk%20bounds is https://research.google.com/pubs/archive/35621.pdf , where the corresponding https://research.google.com/pubs/pub35621.html is a 404.

adam3smith · Jan 28, 2018

adam3smith requested changes Jan 28, 2018

View changes

I'm happy with this, just 2 small nits. @dstillman any objections to the handling of the 404s?

Google Research.js

adam3smith · Jan 28, 2018

Google Research.js

+		]
+	},
+	{
+		"type": "web",


could you also add a test for search results (or am I just overlooking that)?

This may be a silly question, but if I add a test as follows:

{
	"type": "web",
	"url": "https://research.google.com/search.html?q=algorithm",
	"defer": true,
	"items": "multiple"
}

It fails with TranslatorTester: Google Research Test 1: failed (Translation failed: no items returned). It seems like this is because the search results are loaded by JS after the page is loaded, which is why I put the defer in there, but that doesn't seem to be actually making the test wait. I think I'm missing something?

I'm also happy to add a test case for the direct PDF link, but am having the same issue with the test failing as with the CiteSeer parser.

Happy to make the other change, will push with the test cases as and when. Thanks!

GuyAglionby · Jan 28, 2018

This may be a silly question, but if I add a test as follows:

{
	"type": "web",
	"url": "https://research.google.com/search.html?q=algorithm",
	"defer": true,
	"items": "multiple"
}

It fails with TranslatorTester: Google Research Test 1: failed (Translation failed: no items returned). It seems like this is because the search results are loaded by JS after the page is loaded, which is why I put the defer in there, but that doesn't seem to be actually making the test wait. I think I'm missing something?

I'm also happy to add a test case for the direct PDF link, but am having the same issue with the test failing as with the CiteSeer parser.

Happy to make the other change, will push with the test cases as and when. Thanks!

Yeah, the test framework sometimes hits limits, do what @zuphilip did for the other translator, i.e. comment out the test so that we have it ready for manual testing as needed. Once you remove the itemID, this is ready to go. Thanks!

adam3smith · Feb 1, 2018

Yeah, the test framework sometimes hits limits, do what @zuphilip did for the other translator, i.e. comment out the test so that we have it ready for manual testing as needed. Once you remove the itemID, this is ready to go. Thanks!

Amended as requested. Thanks!

GuyAglionby · Feb 1, 2018

Amended as requested. Thanks!

Hey there, just wanted to double check that I wasn't missing any other requested changes before this is merged?

GuyAglionby · Feb 15, 2018

Hey there, just wanted to double check that I wasn't missing any other requested changes before this is merged?

I don't think so. I'm just behind with review & merge

adam3smith · Feb 15, 2018

I don't think so. I'm just behind with review & merge

No problem at all -- I just wanted to be sure. Cheers!

GuyAglionby · Feb 15, 2018

No problem at all -- I just wanted to be sure. Cheers!

superseded by #1741

adam3smith · Sep 15, 2018

superseded by #1741

zuphilip added the New Translator label Dec 25, 2017

GuyAglionby closed this Dec 27, 2017

GuyAglionby reopened this Dec 27, 2017

zuphilip added the waiting label Jan 1, 2018

Add new Google Research translator

1c8c248

zuphilip removed the waiting label Jan 28, 2018

Amend Google Research translator

7c533b8

GuyAglionby referenced this pull request Sep 9, 2018
Open
Add Google AI translator #1741

adam3smith closed this Sep 15, 2018

zotero/translators

Add new Google Research translator #1503

GuyAglionby commented Dec 23, 2017

zuphilip added the New Translator label Dec 25, 2017

GuyAglionby closed this Dec 27, 2017

This comment has been minimized.

adam3smith commented Dec 27, 2017

This comment has been minimized.

GuyAglionby commented Dec 27, 2017

GuyAglionby reopened this Dec 27, 2017

This comment has been minimized.

adam3smith commented Dec 27, 2017

zuphilip added the waiting label Jan 1, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018 •

edited

This comment has been minimized.

dstillman commented Jan 28, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018

adam3smith requested changes Jan 28, 2018

View changes

This comment has been minimized.

zuphilip removed the waiting label Jan 28, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018

This comment has been minimized.

adam3smith commented Feb 1, 2018

This comment has been minimized.

GuyAglionby commented Feb 1, 2018

This comment has been minimized.

GuyAglionby commented Feb 15, 2018

This comment has been minimized.

adam3smith commented Feb 15, 2018

This comment has been minimized.

GuyAglionby commented Feb 15, 2018

GuyAglionby referenced this pull request Sep 9, 2018

Add Google AI translator #1741

This comment has been minimized.

adam3smith commented Sep 15, 2018

adam3smith closed this Sep 15, 2018

zotero/translators

Join GitHub today

Add new Google Research translator #1503

Conversation

GuyAglionby commented Dec 23, 2017

zuphilip added the New Translator label Dec 25, 2017

GuyAglionby closed this Dec 27, 2017

This comment has been minimized.

adam3smith commented Dec 27, 2017

This comment has been minimized.

GuyAglionby commented Dec 27, 2017

GuyAglionby reopened this Dec 27, 2017

This comment has been minimized.

adam3smith commented Dec 27, 2017

zuphilip added the waiting label Jan 1, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018 • edited

This comment has been minimized.

dstillman commented Jan 28, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018

adam3smith requested changes Jan 28, 2018 View changes

This comment has been minimized.

zuphilip removed the waiting label Jan 28, 2018

This comment has been minimized.

GuyAglionby commented Jan 28, 2018

This comment has been minimized.

adam3smith commented Feb 1, 2018

This comment has been minimized.

GuyAglionby commented Feb 1, 2018

This comment has been minimized.

GuyAglionby commented Feb 15, 2018

This comment has been minimized.

adam3smith commented Feb 15, 2018

This comment has been minimized.

GuyAglionby commented Feb 15, 2018

GuyAglionby referenced this pull request Sep 9, 2018

This comment has been minimized.

adam3smith commented Sep 15, 2018

adam3smith closed this Sep 15, 2018

GuyAglionby commented Jan 28, 2018 •

edited

adam3smith requested changes Jan 28, 2018

View changes