Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ISBN to Internet Archive metadata #2137

Open
wants to merge 1 commit into
base: master
from

Conversation

@mvolz
Copy link
Contributor

mvolz commented Mar 6, 2020

Books on IA have ISBNs but this info isn't being
scraped. Add support to add this info.

Add book to tests with ISBN (Example URL)
https://archive.org/details/darktowersdeutsc0000enri/mode/2up

Books on IA have ISBNs but this info isn't being
scraped. Add support to add this info.

Add book to tests with ISBN (Example URL)
https://archive.org/details/darktowersdeutsc0000enri/mode/2up
@mvolz

This comment has been minimized.

Copy link
Contributor Author

mvolz commented Mar 6, 2020

This work but for some reason the 10 digit isbns aren't making it into the final citation... do those get cleaned out or something upstream?

@adam3smith

This comment has been minimized.

Copy link
Collaborator

adam3smith commented Mar 6, 2020

which citation output (how generated) are you referring to?
Zotero does have a clean.ISBN function that might discard the second ISBN, but I'm not sure where it would be run automatically (as opposed to being called explicitly in the translator)

@mvolz

This comment has been minimized.

Copy link
Contributor Author

mvolz commented Mar 6, 2020

which citation output (how generated) are you referring to?
Zotero does have a clean.ISBN function that might discard the second ISBN, but I'm not sure where it would be run automatically (as opposed to being called explicitly in the translator)

The test case I added, https://archive.org/details/darktowersdeutsc0000enri/mode/2up

When I do ZU.debug(newItem.ISBN) the field has 4 isbns, but when I use the translator file and curl it the isbn 10s are missing. I'm not too bothered by that if you aren't but I was wondering if that was a symptom of something going wrong!

curl -d 'https://archive.org/details/darktowersdeutsc0000enri/mode/2up' -H 'Content-Type: text/plain' http://127.0.0.1:1969/web

[{"key":"9S6E5TKU","version":0,"itemType":"book","creators":[{"firstName":"David","lastName":"Enrich","creatorType":"author"},{"name":"Internet Archive","creatorType":"contributor"}],"tags":[{"tag":"Trump, Donald, 1946-","type":1}],"title":"Dark towers : Deutsche Bank, Donald Trump, and an epic trail of destruction","abstractNote":"x, 402 pages ; 24 cm; \"A searing exposé by an award-winning journalist of the most scandalous bank in the world, including its shadowy ties to Donald Trump's business empire\"--; Includes bibliographical references (pages 367-390) and index; \"A searing expose by an award-winning journalist of the most scandalous bank in the world, including its shadowy ties to Donald Trump's business empire\"--","date":"2020","publisher":"New York, NY : Custom House","language":"eng","numPages":"426","ISBN":"9780062878816 9780062878830","url":"http://archive.org/details/darktowersdeutsc0000enri","libraryCatalog":"Internet Archive","accessDate":"2020-03-06T13:45:37Z","shortTitle":"Dark towers"}]
@mvolz

This comment has been minimized.

Copy link
Contributor Author

mvolz commented Mar 6, 2020

PS I've linted a little, do you prefer it to go as additional commits in this thread, or as a separate PR?

@adam3smith

This comment has been minimized.

Copy link
Collaborator

adam3smith commented Mar 6, 2020

linting in the same PR, different (non-squashed, obv) commit, please. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.