New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support schema.org in RDF.js #1634

Merged
merged 7 commits into from May 14, 2018

Conversation

Projects
None yet
3 participants
@zuphilip
Collaborator

zuphilip commented Apr 23, 2018

I cherry-picked some commits by @dgerber and continued there the support of schema.org in RDF. Everything works quite well and I checked the dependent translators which seem to be unaffected by the changes here.

This is also the first step for the JSON-LD support #917 (still much to do afterwards). Also the Codemeta format #1587 is supported in RDF.

dgerber and others added some commits Feb 25, 2017

Update schema.org support in RDF.js
* Extended the function getFirstResults such that multiple
nodes in an array are supported.
* Fix isPartOf calculation for larger hierarchies.
* Integrate type detection from schema.org with similar
detection based on rdf:type in bib/bibo ontology.
* Update schema.org type mapping and exclude also some
generic types.
* Add more fields from schema.org
* Delete itemID which is sometimes dynamic and therefore not
reliable for the test cases.
* Some clean up work, e.g. semi-colons, inlcudes, array/object notation
* Add a test case for a journal article in schema.org
Add support for Codemeta format
Most of the properties are from schema.org and
therefore already supported. Adjust them here
and add some more for software citations.
@adam3smith

I haven't dones any testing, just a read through and have some nits, nothing major except for the IDs which I think we can't throw out.

I'd want to test this properly, but generally quite happy with this. Anything you think requires discussion?

Show outdated Hide outdated RDF.js
return result[0];
function getFirstResults(nodes, properties, onlyOneString) {
if (!nodes.length) nodes = [nodes];
for (let j=0; j<nodes.length; j++) {

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

for (let node of nodes)

@adam3smith

adam3smith May 8, 2018

Collaborator

for (let node of nodes)

Show outdated Hide outdated RDF.js
} else {
let c = { creatorType: creatorType };
c.lastName = getFirstResults(obj,
[ n.foaf+"familyName", n.foaf+"lastName",

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

looks like indenting went wrong here?

@adam3smith

adam3smith May 8, 2018

Collaborator

looks like indenting went wrong here?

case 'videogameclip':
t.soGuess = 'videoRecording'; break;
case 'tvclip':
case 'tvepisode':

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

why split up this & following case?

@adam3smith

adam3smith May 8, 2018

Collaborator

why split up this & following case?

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

This case uses t.so and the next one t.soGuess. Thus, they have different order in the final calculation:

	var itemType = t.zotero || t.bib || t.prism || t.eprints || t.og || t.dc ||
		t.so ||
		exports.defaultUnknownType || t.zoteroGuess || t.bibGuess ||
		t.prismGuess || t.ogGuess || t.dcGuess || t.soGuess;

I put "tvseries" (a whole series is maybe something else) and "episode" (this sounds generic) as a guess in the second case. Do you suggest to use t.so for all of them?

@zuphilip

zuphilip May 8, 2018

Collaborator

This case uses t.so and the next one t.soGuess. Thus, they have different order in the final calculation:

	var itemType = t.zotero || t.bib || t.prism || t.eprints || t.og || t.dc ||
		t.so ||
		exports.defaultUnknownType || t.zoteroGuess || t.bibGuess ||
		t.prismGuess || t.ogGuess || t.dcGuess || t.soGuess;

I put "tvseries" (a whole series is maybe something else) and "episode" (this sounds generic) as a guess in the second case. Do you suggest to use t.so for all of them?

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

Does "+1" mean it is okay like it is or should I change it?

@zuphilip

zuphilip May 8, 2018

Collaborator

Does "+1" mean it is okay like it is or should I change it?

This comment has been minimized.

@adam3smith

adam3smith May 9, 2018

Collaborator

that it's OK

@adam3smith

adam3smith May 9, 2018

Collaborator

that it's OK

case 'episode':
t.soGuess = 'tvBroadcast'; break;
case 'radioclip':
case 'radioepisode':

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

ditto

@adam3smith

adam3smith May 8, 2018

Collaborator

ditto

Show outdated Hide outdated RDF.js
@@ -754,19 +862,22 @@ function importItem(newItem, node) {
if(creatorType == "author") {
creators = getFirstResults(node, [n.bib+"authors", n.dc+"creator", n.dc1_0+"creator",
n.dcterms+"creator", n.eprints+"creators_name",
n.dc+"contributor", n.dc1_0+"contributor", n.dcterms+"contributor"]);
n.dc+"contributor", n.dc1_0+"contributor", n.dcterms+"contributor",
n.so+"author", n.so+"creator"]);

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

we should think about where we but so in the hierarchy of preference. I'd have said we prefer it to dc, say?

@adam3smith

adam3smith May 8, 2018

Collaborator

we should think about where we but so in the hierarchy of preference. I'd have said we prefer it to dc, say?

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

Before dc+"contributor" or before dc+"creator"?

@zuphilip

zuphilip May 8, 2018

Collaborator

Before dc+"contributor" or before dc+"creator"?

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

I'm not sure, but my inclination would be to say the so comes right after bib and z in expected quality, but that could be totally off base, what do you think?

@adam3smith

adam3smith May 8, 2018

Collaborator

I'm not sure, but my inclination would be to say the so comes right after bib and z in expected quality, but that could be totally off base, what do you think?

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

Okay

@zuphilip

zuphilip May 8, 2018

Collaborator

Okay

Show outdated Hide outdated RDF.js
@@ -1236,9 +1357,10 @@ function importNext(nodes, index, collections, resolve, reject) {
}
var newItem = new Zotero.Item();
newItem.itemID = Zotero.RDF.getResourceURI(node);
//newItem.itemID = Zotero.RDF.getResourceURI(node);

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

don't we need this for Zotero RDF& collection handling?

@adam3smith

adam3smith May 8, 2018

Collaborator

don't we need this for Zotero RDF& collection handling?

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

You are possibly right! - The itemID lead to some problems but now I can't remember. I have to look into this again...

@zuphilip

zuphilip May 8, 2018

Collaborator

You are possibly right! - The itemID lead to some problems but now I can't remember. I have to look into this again...

This comment has been minimized.

@zuphilip

zuphilip May 8, 2018

Collaborator

Ah, now I see again. The generated ID will be different for different runs of the test cases, i.e. the test cases will constantly fail, also only the itemID is different. However, the itemID is probably important for the collections. Thus, I will revert these two changes.

@zuphilip

zuphilip May 8, 2018

Collaborator

Ah, now I see again. The generated ID will be different for different runs of the test cases, i.e. the test cases will constantly fail, also only the itemID is different. However, the itemID is probably important for the collections. Thus, I will revert these two changes.

Show outdated Hide outdated RDF.js
if (importItem(newItem, node)) {
delete newItem.itemID;

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

ditto

@adam3smith

adam3smith May 8, 2018

Collaborator

ditto

Show outdated Hide outdated RDF.js
case 'visualartwork':
case 'sculpture':
t.so = 'artwork'; break;
//case 'datacatalog':

This comment has been minimized.

@adam3smith

adam3smith May 8, 2018

Collaborator

we may want to include this in the dataset handling (line 1097)?

@adam3smith

adam3smith May 8, 2018

Collaborator

we may want to include this in the dataset handling (line 1097)?

@zuphilip

This comment has been minimized.

Show comment
Hide comment
@zuphilip

zuphilip May 9, 2018

Collaborator

I updated the code according to your review. Please have a look at the new version.

Testing is currently limited, because schema.org will normally not be encoded within RDF. On the positive side, this means also that I expect that everything worked the same as before with RDF and EM translators. The PR here enables us to concentrate on the next steps: transformations of different formats and integration into EM. Thereby or afterwards, I expect we will see much more examples and can improve the handling of schema.org further.

There shouldn't be anything critical here or anything I see we have to discuss further. The itemID was certainly a good point, but it is now included again.

Collaborator

zuphilip commented May 9, 2018

I updated the code according to your review. Please have a look at the new version.

Testing is currently limited, because schema.org will normally not be encoded within RDF. On the positive side, this means also that I expect that everything worked the same as before with RDF and EM translators. The PR here enables us to concentrate on the next steps: transformations of different formats and integration into EM. Thereby or afterwards, I expect we will see much more examples and can improve the handling of schema.org further.

There shouldn't be anything critical here or anything I see we have to discuss further. The itemID was certainly a good point, but it is now included again.

@adam3smith

This comment has been minimized.

Show comment
Hide comment
@adam3smith

adam3smith May 14, 2018

Collaborator

Yes, this is great; thanks for working on it.

Collaborator

adam3smith commented May 14, 2018

Yes, this is great; thanks for working on it.

@adam3smith adam3smith merged commit 5a17ede into zotero:master May 14, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment