Add more stats to Scholia main page #1022

Open
Daniel-Mietchen opened this Issue Oct 26, 2018 · 8 comments

Daniel-Mietchen commented Oct 26, 2018

It would be useful to have more stats on the Scholia homepage. The following query (from here) provided that in principle but caused problems with the embeds, since it had too many characters:

SELECT ?count ?description
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] ?p [] . }
} AS %triples
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { ?property a wikibase:Property.  }
} AS %properties
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P50 []. }
} AS %authors
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P69 [] . }
} AS %almamater
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P108 [] . }
} AS %employer
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P166 [] . }
} AS %award_received
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P212 [] . }
} AS %isbn13
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P225 []. }
} AS %taxa
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P234 []. }
} AS %inchi
WITH {
  SELECT (COUNT(DISTINCT ?serials) AS ?count) WHERE { ?serials wdt:P236 [] . }
} AS %issn
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P356 []. }
} AS %dois
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P496 []. }
} AS %orcids
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P625 []. }
} AS %geoloc
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P638 [] . }
} AS %pdb
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P686 [] . }
} AS %gene
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P698 []. }
} AS %pmids
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P699 [] . }
} AS %disease
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P859 [] . }
} AS %sponsor
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P818 [] . }
} AS %arxivID
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P921 []. }
} AS %topics
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P932 []. }
} AS %pmcids
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P1416 [] . }
} AS %affiliation
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P2093 []. }
} AS %authorstrings
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P2427 [] . }
} AS %GRID
WITH {
  SELECT (COUNT(*) AS ?count) WHERE { [] wdt:P2860 [] . }
} AS %cites
WHERE {
  {
    INCLUDE %triples
    BIND("Total number of triples" AS ?description)
  }
  UNION
  {
    INCLUDE %properties
    BIND("Total number of properties" AS ?description)
  }
  UNION
  {
    INCLUDE %pmids
    BIND("Items with a PubMed ID" AS ?description)
  }
  UNION
  {
    INCLUDE %pmcids
    BIND("Items with a PubMed Central ID" AS ?description)
  }
  UNION
  {
    INCLUDE %dois
    BIND("Items with a Digital Object Identifier (DOI)" AS ?description)
  }
  UNION
  {
    INCLUDE %cites
    BIND("Citations" AS ?description)
  }
  UNION
  {
    INCLUDE %authors
    BIND("Links from items about works to items about their authors" AS ?description)
  }
  UNION
  {
    INCLUDE %authorstrings
    BIND("Author name strings on items about works" AS ?description)
  }
  UNION
  {
    INCLUDE %orcids
    BIND("Items about authors with an ORCID profile that has public content" AS ?description)
  }
  UNION
  {
    INCLUDE %taxa
    BIND("Items with a taxon name" AS ?description)
  }
  UNION
  {
    INCLUDE %geoloc
    BIND("Items with a geolocation" AS ?description)
  }
  UNION
  {
    INCLUDE %topics
    BIND("Links from items about works to items about their main subjects" AS ?description)
  }
  UNION
  {
    INCLUDE %inchi
    BIND("Items with an International Chemical Identifier (InChI)" AS ?description)
  }
  UNION
  {
    INCLUDE %isbn13
    BIND("Items with a 13-digit International Standard Book Number (ISBN 13)" AS ?description)
  }
  UNION
  {
    INCLUDE %award_received
    BIND("Links from items about people or others to an award they have received" AS ?description)
  }
  UNION
  {
    INCLUDE %affiliation
    BIND("Links from items about people to items about groups they are affiliated with" AS ?description)
  }
  UNION
  {
    INCLUDE %employer
    BIND("Links from items about people to items about their employer" AS ?description)
  }
  UNION
  {
    INCLUDE %almamater
    BIND("Links from items about people to items about the educational establishments they attended" AS ?description)
  }
  UNION
  {
    INCLUDE %issn
    BIND("Items with an International Standard Serial Number (ISSN)" AS ?description)
  }
  UNION
  {
    INCLUDE %arxivID
    BIND("Items with an arxivID" AS ?description)
  }
  UNION
  {
    INCLUDE %GRID
    BIND("Items about institutions with an identifier from the Global Research Identifier Database (GRID ID)" AS ?description)
  }
  UNION
  {
    INCLUDE %sponsor
    BIND("Links from items about anything to items about corresponding sponsors" AS ?description)
  }
  UNION
  {
    INCLUDE %disease
    BIND("Items indexed in the Disease Ontology" AS ?description)
  }
  UNION
  {
    INCLUDE %gene
    BIND("Items indexed in the Gene Ontology" AS ?description)
  }
  UNION
  {
    INCLUDE %pdb
    BIND("Protein structures indexed in the Protein Data Bank" AS ?description)
  }
}
ORDER BY DESC(?count)

The longest URL that worked for me is this one (or http://tinyurl.com/ycbxjzdz for short), which has 2532 characters in the actual query, as compared to 4867 in the query above.

There are several ways to address this:

  • remove unnecessary characters (e.g. spaces)
    • shorten variable names
    • write the repetitive parts of the query in a more efficient way
    • reduce the text to be displayed under description
  • split the stats section into more than one panel, e.g. one for general stats, one more for bibliometrics, possibly some more for discipline-specific identifiers etc.
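The first option (removing unnecessary characters) could be automated rather than done by hand. The following is only a minimal sketch: `minify_sparql` is a hypothetical helper, not part of Scholia, and it assumes the query uses double-quoted string literals only and contains no `#` comments.

```python
import re

# Matches a double-quoted SPARQL string literal, including escaped characters.
_LITERAL = re.compile(r'("(?:[^"\\]|\\.)*")')


def minify_sparql(query: str) -> str:
    """Collapse whitespace outside double-quoted literals (hypothetical helper).

    Assumptions: no single-quoted literals and no '#' comments in the query.
    """
    # re.split with a capturing group keeps the literals at the odd indices.
    parts = _LITERAL.split(query)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(part)  # string literal: keep verbatim
            continue
        part = re.sub(r"\s+", " ", part)            # collapse runs of whitespace
        part = re.sub(r"\s*([{}])\s*", r"\1", part)  # no spaces needed around braces
        out.append(part)
    return "".join(out).strip()


sample = (
    'SELECT (COUNT(*) AS ?count)  WHERE {\n'
    '  [] wdt:P50 [] .\n'
    '  BIND("Two  spaces kept" AS ?d)\n'
    '}'
)
print(len(sample), len(minify_sparql(sample)))
```

Text inside the description strings is left untouched, so shortening those labels would still be a manual decision.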

Daniel-Mietchen commented Oct 27, 2018

Here is a version that has 2524 characters for the query itself and works fine in embeds:

SELECT ?count ?description
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] ?p []}} AS %triples
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P50 []}} AS %authors
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P69 []}} AS %almamater
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P108 []}} AS %employer
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P166 []}} AS %awards
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P212 []}} AS %isbn13
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P225 []}} AS %taxa
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P356 []}} AS %dois
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P496 []}} AS %orcids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P625 []}} AS %geoloc
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P698 []}} AS %pmids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P921 []}} AS %topics
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P932 []}} AS %pmcids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2093 []}} AS %authorstrings
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2860 []}} AS %cites
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P859 []}} AS %sponsor

WHERE {
{INCLUDE %triples BIND("Total number of triples" AS ?d)} UNION
{INCLUDE %pmids BIND("Items with a PubMed ID" AS ?d)} UNION
{INCLUDE %pmcids BIND("Items with a PubMed Central ID" AS ?d)} UNION
{INCLUDE %dois BIND("Items with a Digital Object Identifier (DOI)" AS ?d)} UNION
{INCLUDE %cites BIND("Citations" AS ?d)} UNION
{INCLUDE %authors BIND("Links from items about works to items about their authors" AS ?d)} UNION
{INCLUDE %authorstrings BIND("Author name strings on items about works" AS ?d)} UNION
{INCLUDE %orcids BIND("Items about authors with an ORCID profile that has public content" AS ?d)} UNION
{INCLUDE %taxa BIND("Items with a taxon name" AS ?d)} UNION
{INCLUDE %geoloc BIND("Items with a geolocation" AS ?d)} UNION
{INCLUDE %topics BIND("Links from items about works to items about their main subjects" AS ?d)} UNION
{INCLUDE %isbn13 BIND("Items with a 13-digit International Standard Book Number (ISBN 13)" AS ?d)} UNION
{INCLUDE %awards BIND("Links from items about people or others to an award they have received" AS ?d)} UNION
{INCLUDE %employer BIND("Links from items about people to items about their employer" AS ?d)} UNION
{INCLUDE %almamater BIND("Links from items about people to items about the educational establishments they attended" AS ?d)} UNION
{INCLUDE %sponsor BIND("Links from items about anything to items about corresponding sponsors" AS ?d)}
BIND (?c AS ?count)
BIND (?d AS ?description)
}
ORDER BY DESC(?c)

Daniel-Mietchen commented Oct 27, 2018

Here is another version that also has the number of properties and still works:

SELECT ?count ?description
WITH {SELECT (COUNT(*) AS ?c) WHERE {?s ?p ?o}} AS %trip
WITH {SELECT (COUNT(*) AS ?c) WHERE {?p a wikibase:Property}} AS %prop
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P50 []}} AS %auth
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P69 []}} AS %alma
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P108 []}} AS %employer
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P166 []}} AS %awards
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P212 []}} AS %isbn13
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P225 []}} AS %taxa
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P356 []}} AS %dois
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P496 []}} AS %orcids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P625 []}} AS %geo
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P698 []}} AS %pmids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P921 []}} AS %topics
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P932 []}} AS %pmcids
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2093 []}} AS %au_strg
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2860 []}} AS %cites
WITH {SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P859 []}} AS %sponsor
WHERE {
{INCLUDE %trip BIND("Total number of triples" AS ?d)} UNION
{INCLUDE %prop BIND("Total number of properties" AS ?d)} UNION
{INCLUDE %pmids BIND("Items with a PubMed ID" AS ?d)} UNION
{INCLUDE %pmcids BIND("Items with a PubMed Central ID" AS ?d)} UNION
{INCLUDE %dois BIND("Items with a Digital Object Identifier (DOI)" AS ?d)} UNION
{INCLUDE %cites BIND("Citations" AS ?d)} UNION
{INCLUDE %auth BIND("Links from items about works to items about their authors" AS ?d)} UNION
{INCLUDE %au_strg BIND("Author name strings on items about works" AS ?d)} UNION
{INCLUDE %orcids BIND("Items about authors with an ORCID profile" AS ?d)} UNION
{INCLUDE %taxa BIND("Items with a taxon name" AS ?d)} UNION
{INCLUDE %geo BIND("Items with a geolocation" AS ?d)} UNION
{INCLUDE %topics BIND("Links from items about works to items about their main subjects" AS ?d)} UNION
{INCLUDE %isbn13 BIND("13-digit International Standard Book Numbers (ISBN 13)" AS ?d)} UNION
{INCLUDE %awards BIND("Links from items about people or others to an award they have received" AS ?d)} UNION
{INCLUDE %employer BIND("Links from items about people to items about their employer" AS ?d)} UNION
{INCLUDE %alma BIND("Links from items about people to items about the educational establishments they attended" AS ?d)} UNION
{INCLUDE %sponsor BIND("Links from items about anything to items about corresponding sponsors" AS ?d)}
BIND (?c AS ?count)
BIND (?d AS ?description)
}
ORDER BY DESC(?c)

It actually has 2581 characters in the query proper but fewer special ones, which means fewer characters for URL encoding. The total length of the query URL is then 4275, and the embed URL (short version is http://tinyurl.com/y7ywvq8z) has 4027 characters.
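To illustrate why the raw character count and the URL length can diverge, here is a small sketch using Python's standard `urllib.parse.quote`. Passing `safe=""` (percent-encode everything except letters, digits and `_.-~`) is an assumption; browsers and the query service GUI may leave some characters unescaped, so treat the numbers as an upper estimate.

```python
from urllib.parse import quote


def encoded_length(query: str) -> int:
    """Length of the query after percent-encoding (assumption: safe='')."""
    return len(quote(query, safe=""))


# Two triple patterns with the same raw length but different encoded length:
# each of '[', ']', '?' and ' ' expands to three characters (%5B, %5D, %3F, %20).
for pattern in ("[] ?p []", "?s ?p ?o"):
    print(pattern, len(pattern), encoded_length(pattern))
```

Both patterns are 8 characters long in the raw query, but the bracketed form encodes to 22 characters and the variable form to 18.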

Daniel-Mietchen commented Oct 27, 2018

Here is another version that may well be compact enough to provide space to include all the remaining properties that were included in the original query of this ticket:

SELECT ?count ?description
WHERE {
{{SELECT (COUNT(*) AS ?c) WHERE {?s ?p ?o}} BIND("Total number of triples" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {?p a wikibase:Property}} BIND("Total number of properties" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P50 []}} BIND("Links from items about works to items about their authors" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P69 []}} BIND("Links from items about people to items about the educational establishments they attended" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P108 []}} BIND("Links from items about people to items about their employer" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P166 []}} BIND("Links from items about people or others to an award they have received" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P225 []}} BIND("Items with a taxon name" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P496 []}} BIND("Items about authors with an ORCID profile" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P625 []}} BIND("Items with a geolocation" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P698 []}} BIND("Items with a PubMed ID" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P859 []}} BIND("Links from items about anything to items about corresponding sponsors" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P921 []}} BIND("Links from items about works to items about their main subjects" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P932 []}} BIND("Items with a PubMed Central ID" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P356 []}} BIND("Items with a Digital Object Identifier (DOI)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2093 []}} BIND("Author name strings on items about works" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2860 []}} BIND("Citations" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P212 []}} BIND("13-digit International Standard Book Numbers (ISBN 13)" AS ?d)} 
BIND (?c AS ?count)
BIND (?d AS ?description)
}
ORDER BY DESC(?c)

Daniel-Mietchen commented Oct 27, 2018

So here is the full list of all 25 stats from the initial query, and it all still seems to work:

SELECT ?count ?description
WHERE {
{{SELECT (COUNT(*) AS ?c) WHERE {?s ?p ?o}} BIND("Total number of triples" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {?p a wikibase:Property}} BIND("Total number of properties" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P50 []}} BIND("Links from items about works to items about their authors" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P69 []}} BIND("Links from items about people to items about the educational establishments they attended" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P108 []}} BIND("Links from items about people to items about their employer" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P166 []}} BIND("Links from items about people or others to an award they have received" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P212 []}} BIND("13-digit International Standard Book Numbers (ISBN 13)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P225 []}} BIND("Items with a taxon name" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P234 []}} BIND("Items with an International Chemical Identifier (InChI)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P236 []}} BIND("Items with an International Standard Serial Number (ISSN)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P496 []}} BIND("Items about authors with an ORCID profile" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P625 []}} BIND("Items with a geolocation" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P638 []}} BIND("Protein structures indexed in the Protein Data Bank" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P686 []}} BIND("Items indexed in the Gene Ontology" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P698 []}} BIND("Items with a PubMed ID" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P699 []}} BIND("Items indexed in the Disease Ontology" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P818 []}} BIND("Items with an arxivID" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P859 []}} BIND("Links from items about anything to items about corresponding sponsors" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P921 []}} BIND("Links from items about works to items about their main subjects" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P932 []}} BIND("Items with a PubMed Central ID" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P356 []}} BIND("Items with a Digital Object Identifier (DOI)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P1416 []}} BIND("Links from items about people to items about groups they are affiliated with" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2093 []}} BIND("Author name strings on items about works" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2427 []}} BIND("Items about institutions with an identifier from the Global Research Identifier Database (GRID ID)" AS ?d)} UNION
{{SELECT (COUNT(*) AS ?c) WHERE {[] wdt:P2860 []}} BIND("Citations" AS ?d)} 
BIND (?c AS ?count)
BIND (?d AS ?description)
}
ORDER BY DESC(?c)
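Since every branch of the query above follows the same template, the query text could also be generated from a list of (triple pattern, description) pairs instead of being maintained by hand. This is a sketch only: `STATS`, `branch` and `build_stats_query` are hypothetical names, only three of the 25 stats are listed, and descriptions are assumed not to contain double quotes.

```python
# (triple pattern, description) pairs; only a few of the stats are shown here.
STATS = [
    ("?s ?p ?o", "Total number of triples"),
    ("[] wdt:P50 []", "Links from items about works to items about their authors"),
    ("[] wdt:P2860 []", "Citations"),
]


def branch(pattern: str, label: str) -> str:
    """One UNION branch: a counting subquery plus its description."""
    return (
        "{{SELECT (COUNT(*) AS ?c) WHERE {" + pattern + "}} "
        'BIND("' + label + '" AS ?d)}'
    )


def build_stats_query(stats) -> str:
    """Assemble the full query in the same layout as the hand-written one."""
    branches = " UNION\n".join(branch(p, d) for p, d in stats)
    lines = [
        "SELECT ?count ?description",
        "WHERE {",
        branches,
        "BIND (?c AS ?count)",
        "BIND (?d AS ?description)",
        "}",
        "ORDER BY DESC(?c)",
    ]
    return "\n".join(lines)


print(build_stats_query(STATS))
```

Adding a stat then means adding one pair to the list, and the character count of the result can be checked before it is pasted into an embed.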

Daniel-Mietchen commented Oct 27, 2018

Here is some background info on past and current limits to URL lengths in different browsers:
https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers .
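Given such limits, the idea of splitting the stats into several panels could be sketched as a greedy grouping that starts a new panel whenever the encoded query would exceed a chosen budget. Everything here is an assumption for illustration: the helper names, the use of `quote(..., safe="")` as the length measure, and the budget itself, which would have to be picked from what actually works in the embeds.

```python
from urllib.parse import quote


def query_for(stats) -> str:
    """Compact stats query for a list of (pattern, description) pairs."""
    branches = " UNION\n".join(
        "{{SELECT (COUNT(*) AS ?c) WHERE {" + p + '}} BIND("' + d + '" AS ?d)}'
        for p, d in stats
    )
    return (
        "SELECT ?count ?description\nWHERE {\n" + branches
        + "\nBIND (?c AS ?count)\nBIND (?d AS ?description)\n}\nORDER BY DESC(?c)"
    )


def split_into_panels(stats, max_encoded_len: int):
    """Greedily group stats so each panel's encoded query fits the budget.

    A single stat that exceeds the budget on its own still gets its own panel.
    """
    panels, current = [], []
    for stat in stats:
        candidate = current + [stat]
        too_long = len(quote(query_for(candidate), safe="")) > max_encoded_len
        if current and too_long:
            panels.append(current)  # close the current panel
            current = [stat]        # and start a new one with this stat
        else:
            current = candidate
    if current:
        panels.append(current)
    return panels


EXAMPLE_STATS = [
    ("[] wdt:P698 []", "Items with a PubMed ID"),
    ("[] wdt:P356 []", "Items with a Digital Object Identifier (DOI)"),
    ("[] wdt:P225 []", "Items with a taxon name"),
]
# The budget of 400 is an arbitrary illustrative value, not a known limit.
for panel in split_into_panels(EXAMPLE_STATS, max_encoded_len=400):
    print([description for _, description in panel])
```

This keeps the input order, so thematic grouping (general stats, bibliometrics, discipline-specific identifiers) would come from how the list is sorted beforehand.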

Daniel-Mietchen commented Nov 25, 2018

Also worth adding the count of scholarly articles:

SELECT (COUNT(*) AS ?count) WHERE { ?item wdt:P31 wd:Q13442814. }

Daniel-Mietchen commented Nov 25, 2018

Daniel-Mietchen commented Feb 6, 2019

Just got

<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

in response to a complex query that resulted in a long URL. Background on HTTP 431: https://httpstatuses.com/431 .
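For programmatic access (not for iframe embeds, which carry the query in the URL), one possible way around this kind of error is to send the query in the body of a POST request, which the SPARQL 1.1 Protocol allows. The sketch below only builds the request with the standard library and does not send it; the endpoint URL and the User-Agent string are assumptions, and whether this avoids the 431 in a given setup is untested here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_post_request(query: str) -> Request:
    """Put the query into the form-encoded body so the URL itself stays short."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace with a real contact string.
            "User-Agent": "stats-example/0.1 (hypothetical)",
        },
        method="POST",
    )


req = build_post_request(
    "SELECT (COUNT(*) AS ?count) WHERE { ?item wdt:P31 wd:Q13442814. }"
)
print(req.get_method(), req.full_url, len(req.data))
# To actually run it: urllib.request.urlopen(req) and parse the JSON response.
```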
