Skip to content
Please note that GitHub no longer supports your web browser.

We recommend upgrading to the latest Google Chrome or Firefox.

Learn more
Permalink
Browse files

Merge pull request #988 from fnielsen/feature/finetuning-missing-orga…

…nization-page-598

Enriching the missing page for organization aspect
  • Loading branch information
Daniel-Mietchen committed Dec 8, 2019
2 parents ec22839 + b029602 commit 5022628796948ec07c2feaba415d6e7ecc7d36a9
Showing with 115 additions and 9 deletions.
  1. +115 −9 scholia/app/templates/organization_missing.html
@@ -6,50 +6,139 @@

<script type="text/javascript">
missingAuthorResolvingSparql = `
missingAuthorItemsSparql = `
SELECT
(COUNT(?work) AS ?count)
?author_name
(CONCAT(
'https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name=',
ENCODE_FOR_URI(?author_name)) AS ?author_resolver_url)
WHERE {
{
SELECT DISTINCT ?author_name {
?researcher_ ( wdt:P108 | wdt:P463 | wdt:P1416 ) / wdt:P361* wd:{{ q }} .
?researcher_ skos:altLabel | rdfs:label ?author_name_ .
# The SELECT-DISTINCT-BIND trick here is due to Stanislav Kralin
# https://stackoverflow.com/questions/53933564
BIND (STR(?author_name_) AS ?author_name)
}
LIMIT 2000
}
OPTIONAL { ?work wdt:P2093 ?author_name . }

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

Why are we having an OPTIONAL here when there is a HAVING below?

}
GROUP BY ?author_name
HAVING (?count > 0)
ORDER BY DESC(?count)
`
missingAuthorResolvingNoInitialsSparql = `
# Find frequent unresolved author names that are coauthors with
# people associated with an organization.
SELECT
?works
?researcher ?researcherLabel
?author_name
(CONCAT(
'https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name=',
ENCODE_FOR_URI(?author_name)) AS ?resolver_url)
"https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name=",
ENCODE_FOR_URI(?author_name ), "&filter=wdt%3AP50+wd%3A", ?qid ) AS ?resolver_url)

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

There are two whitespaces here

WITH {
SELECT DISTINCT ?researcher_ WHERE {
?researcher_ ( wdt:P108 | wdt:P463 | wdt:P1416 ) / wdt:P361* wd:{{q}} .
}
} AS %researchers
WITH {
# SELECT DISTINCT ?author_name (COUNT(DISTINCT ?work) as ?count) ?author (REPLACE(STR(?author), ".*Q", "Q") AS ?qid) WHERE {
SELECT
(COUNT(DISTINCT ?work) AS ?works)
?author_name
(SAMPLE(?researcher_) AS ?researcher)
(REPLACE(STR(?researcher), ".*Q", "Q") AS ?qid)
WHERE {
INCLUDE %researchers
?work wdt:P50 ?researcher_ ;
wdt:P2093 ?author_name .
FILTER(!regex (LCASE(?author_name), "^.*?((\\\\b\\\\w{1}\\\\b)|[.]).*$")).
}
GROUP BY ?author_name
ORDER BY DESC(?works)
LIMIT 200
} AS %researchers_and_number_of_works
WHERE {
INCLUDE %researchers_and_number_of_works
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,da,de,es,fr,nl,no,ru,sv,zh" . }
}
ORDER BY DESC(?works)
`
missingAuthorResolvingWithInitialsSparql = `
# Find frequent unresolved author names that are coauthors with
# people associated with an organization.
SELECT
?works
?researcher ?researcherLabel
?author_name
(CONCAT(
"https://tools.wmflabs.org/author-disambiguator/?doit=Look+for+author&name=",
ENCODE_FOR_URI(?author_name ), "&filter=wdt%3AP50+wd%3A", ?qid ) AS ?resolver_url)
WITH {
SELECT DISTINCT ?researcher_ WHERE {
?researcher_ ( wdt:P108 | wdt:P463 | wdt:P1416 ) / wdt:P361* wd:{{ q }} .
?researcher_ ( wdt:P108 | wdt:P463 | wdt:P1416 ) / wdt:P361* wd:{{q}} .

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

It is unclear why the space has been erased here.

}
} AS %researchers
WITH {
# SELECT DISTINCT ?author_name (COUNT(DISTINCT ?work) as ?count) ?author (REPLACE(STR(?author), ".*Q", "Q") AS ?qid) WHERE {

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

Unclear why the uncommented code is there.

SELECT
(COUNT(DISTINCT ?work) AS ?works)
?author_name
(SAMPLE(?researcher_) AS ?researcher)
(REPLACE(STR(?researcher), ".*Q", "Q") AS ?qid)
WHERE {
INCLUDE %researchers
?work wdt:P50 ?researcher_ ;
wdt:P2093 ?author_name .
FILTER(regex (LCASE(?author_name), "^.*?((\\\\b\\\\w{1}\\\\b)|[.]).*$")).
}
GROUP BY ?author_name
ORDER BY DESC(?works)
LIMIT 2000
LIMIT 200
} AS %researchers_and_number_of_works
WHERE {
INCLUDE %researchers_and_number_of_works
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,da,de,es,fr,nl,no,ru,sv,zh" . }
}
ORDER BY DESC(?works)
`
$(document).ready(function() {
sparqlToDataTable(
missingAuthorResolvingSparql, "#missing-author-resolving",
missingAuthorResolvingWithInitialsSparql, "#missing-author-resolving-initials",
{
linkPrefixes: {
researcher: '../../author/',
},
}
);
sparqlToDataTable(
missingAuthorResolvingNoInitialsSparql, "#missing-author-no-initials",
{
linkPrefixes: {
researcher: '../../author/',
},
});
}
);
sparqlToDataTable(
missingAuthorItemsSparql, "#missing-author-items",
);
});
@@ -64,13 +153,30 @@ <h1 id="h1">Organization</h1>

Missing information with respect to the organization.

<h2 id="h2">Missing author items</h2>

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

Bad identifier.


The authors listed below may only be represented as strings in Wikidata
with no link to Wikidata items.
Follow the link to use the Author disambiguator tool to try to resolve
the authors.

<table class="table table-hover" id="missing-author-items"></table>

<h2 id="h3">Author name strings to be resolved</h2>

<h2>Author name strings to be resolved</h2>

<h3 id="h31">Author name strings without initials</h3>

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

Bad identifier


The author may be represented as a string rather than a Wikidata item.
Follow the link to use the Author disambiguator tool to try to resolve the author name.
To facilitate curation, author name strings containing single-letter initials have been removed from this table.
<table class="table table-hover" id="missing-author-no-initials"></table>

<h3 id="h32">Author name strings with initials</h3>

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

The identifier does not follow the pattern.


This table shows author name strings that contain single-letter initials.

<table class="table table-hover" id="missing-author-resolving"></table>
<table class="table table-hover" id="missing-author-resolving-initials"></table>

This comment has been minimized.

Copy link
@fnielsen

fnielsen Dec 16, 2019

Owner

The identifier does not follow the pattern.



{% endblock %}

1 comment on commit 5022628

@Daniel-Mietchen

This comment has been minimized.

Copy link
Collaborator Author

Daniel-Mietchen commented on 5022628 Jan 12, 2020

@fnielsen looks like this has all been addressed by b28d1e3 , so this one can be closed.

Please sign in to comment.
You can’t perform that action at this time.