Think about "Ten Simple Rules for Disambiguating Author Name Strings" #1133

Open
Daniel-Mietchen opened this Issue Feb 6, 2019 · 1 comment

Daniel-Mietchen commented Feb 6, 2019

Start with a basic version of the rules, apply it to a model corpus, and refine until a useful set of 10 emerges.

Potential model corpus:

  • https://tools.wmflabs.org/author-disambiguator/?fuzzy=0&name=Li+Li

Some seeds for rules:

  • take stock of the information available beyond the author name string, e.g. affiliations, publication dates, journal names, co-authors (via IDs or name strings, including co-authors of co-authors), topic, citation patterns, language
  • check whether someone else has already solved your current concrete disambiguation problem
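
To make the first seed more concrete, here is a rough sketch of how the extra information could be turned into pairwise evidence that two occurrences of the same name string belong to the same person. Everything in it is an assumption made for illustration: the `Record` fields, the `evidence` and `score` helpers, the five-year window, and the weights are hypothetical and not taken from the Author Disambiguator tool or any existing rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record for one occurrence of an author name string."""
    name: str
    coauthors: frozenset = frozenset()  # co-author IDs or name strings
    topics: frozenset = frozenset()
    affiliation: str = ""
    journal: str = ""
    year: int = 0  # 0 means "unknown"


def jaccard(a, b):
    """Overlap of two sets, in [0, 1]; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def evidence(r1, r2):
    """Per-feature signals in [0, 1] that r1 and r2 share an author."""
    both_years_known = bool(r1.year and r2.year)
    return {
        "coauthors": jaccard(r1.coauthors, r2.coauthors),
        "topics": jaccard(r1.topics, r2.topics),
        "affiliation": float(bool(r1.affiliation) and r1.affiliation == r2.affiliation),
        "journal": float(bool(r1.journal) and r1.journal == r2.journal),
        # Assumed window: publications within five years of each other.
        "years_close": float(both_years_known and abs(r1.year - r2.year) <= 5),
    }


# Arbitrary illustrative weights; they would need tuning on a model corpus.
WEIGHTS = {
    "coauthors": 0.4,
    "topics": 0.2,
    "affiliation": 0.2,
    "journal": 0.1,
    "years_close": 0.1,
}


def score(r1, r2):
    """Weighted sum of the evidence signals, in [0, 1]."""
    signals = evidence(r1, r2)
    return sum(WEIGHTS[key] * signals[key] for key in WEIGHTS)
```

Two "Li Li" records that share a co-author, a topic, and an affiliation would score higher than two that share only the name string. Where to put the threshold for "same person" is exactly what refining against the model corpus would have to settle.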

Daniel-Mietchen commented Feb 6, 2019

  • number of characters in strings
  • specificity of topics