Aim: To perform a range of meta-analyses of the published palaeontology literature.
Open Paleo is an open project that anyone can contribute to on GitHub. All data sources, methods, code, and results are openly shared for collaboration and inspection as the project evolves.
We strongly encourage others to participate in the project, propose their own ideas, and contribute to or re-use any of the data or other information available here.
This project will include looking at factors such as:
Ultimately, this information might prove useful in developing standards, protocols, and best practices for palaeontological research and publishing.
Two research papers have already come out of this project:
An overview of Open Access publishing in palaeontology, Tennant and Lomax (2019)
Open Science in Dinosaur Paleontology, Tennant and Farke (2019)
We selected the 20 most-cited paleontology journals according to Google Scholar in 2018.
Metadata were extracted from Scopus journal by journal (as CSV files). The only filter applied was on publication date, restricting results to articles published between 2015 and 2016. This includes information such as:
Using the visdat R package to visually inspect the data, we were able to spot misaligned rows and block-shifted columns. These formatting errors were then fixed in MS Excel, and the files were saved again in CSV format with UTF-8 encoding. Following this, the headers were reformatted to make them easier to work with during analysis, and empty rows and columns were removed from the data using the janitor R package.
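The header formatting and empty row/column removal described above can be sketched in a few lines. This is only an illustration in Python using the standard library, not the project's own R code (which used janitor). The function names `clean_name` and `clean_table` and the sample values are made up for this example.

```python
import csv
import io
import re


def clean_name(header: str) -> str:
    """Convert a column header to lower snake_case, e.g. 'Source title' -> 'source_title'."""
    name = re.sub(r"[^0-9a-zA-Z]+", "_", header.strip())
    return name.strip("_").lower()


def clean_table(rows: list[list[str]]) -> list[list[str]]:
    """Tidy a table given as a header row followed by data rows."""
    header, body = rows[0], rows[1:]
    width = len(header)
    # Pad short rows to the header width; cells beyond the header are ignored below.
    body = [row + [""] * (width - len(row)) for row in body]
    # Drop rows in which every cell is blank.
    body = [row for row in body if any(cell.strip() for cell in row)]
    # Keep only columns that have at least one non-blank data cell.
    keep = [i for i in range(width) if any(row[i].strip() for row in body)]
    cleaned_header = [clean_name(header[i]) for i in keep]
    return [cleaned_header] + [[row[i] for i in keep] for row in body]


# Hypothetical sample with one empty column and one empty row.
raw = (
    "Authors,Source title,,Year\n"
    "Author A,Journal X,,2015\n"
    ",,,\n"
    "Author B,Journal Y,,2016\n"
)
for row in clean_table(list(csv.reader(io.StringIO(raw)))):
    print(row)
```

The sketch only removes fully blank rows and columns. It does not detect shifted or misaligned cells, which in this project were found by visual inspection and fixed by hand.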
Data for PLOS ONE were obtained using the rplos package in R. The code, resulting data, and Unpaywall query results can all be found here. Note that some of these data differ from those obtained through Scopus queries.
The next phase is to use the Unpaywall DOI checker on the DOI list for each journal. This provides information such as:
All of the results of these steps are available within this repository.
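For readers who want to repeat the Unpaywall step programmatically, the following is a minimal Python sketch that assumes the public Unpaywall REST API (v2) rather than the DOI checker tool used in this project. The helper names, the example DOI, and the email address are hypothetical, and only a handful of response fields are kept.

```python
import json
import urllib.parse
import urllib.request

UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI; Unpaywall asks for a contact email."""
    return f"{UNPAYWALL_API}{urllib.parse.quote(doi)}?email={urllib.parse.quote(email)}"


def fetch_record(doi: str, email: str) -> dict:
    """Fetch the Unpaywall record for one DOI (requires network access)."""
    with urllib.request.urlopen(unpaywall_url(doi, email), timeout=30) as response:
        return json.load(response)


def summarise(record: dict) -> dict:
    """Reduce a record to the few open access fields kept in this sketch."""
    # best_oa_location is null for closed articles, so fall back to an empty dict.
    location = record.get("best_oa_location") or {}
    return {
        "doi": record.get("doi"),
        "is_oa": record.get("is_oa", False),
        "oa_status": record.get("oa_status"),
        "host_type": location.get("host_type"),
        "license": location.get("license"),
    }


# Made-up record shaped like an Unpaywall response, used so the example runs offline.
example = {
    "doi": "10.1234/example",
    "is_oa": True,
    "oa_status": "green",
    "best_oa_location": {"host_type": "repository", "license": None},
}
print(summarise(example))
```

In practice, `fetch_record` would be called once per DOI in each journal's list, with a pause between requests, and the summaries written back out as CSV.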
While Unpaywall checks whether legitimate versions of articles have been made OA (i.e., via green self-archiving routes), researchers also often share their articles in ways that do not comply with copyright, for example on platforms such as ResearchGate or Academia.edu.
Therefore, data will be cross-checked with Google Scholar, which has this information at the article level, to see:
WikiCite provides a lot of integrated data around scholarly literature, linking research papers with authors, topics, species, and much more. All data are released under CC0, and the project integrates many online resources. Scholia gives an idea of what it can do for paleontology.