NER Learning Guide #248
ianmilligan1 added the question label on Jan 11, 2019
ianmilligan1 self-assigned this on Jan 11, 2019
ianmilligan1 referenced this issue on Jan 11, 2019: Discussion: Do we want to implement LGA as a derivative in AUK? #245 (Closed)
If we have |
Let's continue this in archivesunleashed/auk-notebooks#24. If it turns out that this route doesn't work, we can re-open, but I think a separate learning guide might be overkill for this.
ianmilligan1 closed this on Mar 5, 2019
ianmilligan1 commented on Jan 11, 2019
In #246 we explored using NER within AUK to generate new derivatives with named entities, and concluded that it was too computationally intensive (by several orders of magnitude) to justify adding it to the platform.
I suggested:
So let's add a new learning guide under full text.
Some Questions
What platform should we use? The simplest option is to point users to the Archives Unleashed Toolkit and have them use this script, which is found here.
They would just add the classifier, find the extracted text files, and then get NER output.
Option Two would be to build on the earlier learning guide on NLTK and point them to that in a Python environment.
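For Option Two, here is a rough sketch of what the guide might walk through. It assumes the extracted full text sits in a directory of plain `.txt` files, and that NLTK plus its tokenizer, tagger, and chunker data are already installed (the resource names to download vary by NLTK version). The function names `nltk_entities` and `count_entities` are made up for illustration; this is not the script mentioned above.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Iterable, Tuple

Entity = Tuple[str, str]  # (label, text), e.g. ("PERSON", "Ada Lovelace")


def nltk_entities(text: str) -> Iterable[Entity]:
    """Yield (label, text) pairs using NLTK's built-in named-entity chunker."""
    import nltk  # imported lazily so the rest of the sketch runs without NLTK

    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        for chunk in nltk.ne_chunk(tagged):
            # Named entities come back as subtrees; other tokens are plain tuples.
            if hasattr(chunk, "label"):
                yield chunk.label(), " ".join(tok for tok, _ in chunk.leaves())


def count_entities(
    text_dir: str,
    extractor: Callable[[str], Iterable[Entity]] = nltk_entities,
) -> Counter:
    """Count entities across every .txt file under text_dir (assumed layout)."""
    counts: Counter = Counter()
    for path in sorted(Path(text_dir).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts.update(extractor(text))
    return counts
```

Calling `count_entities("path/to/extracted-text").most_common(20)` would list the most frequent entities. The extractor is passed in as an argument so the same loop could be pointed at a different NER tool if NLTK turns out to be too slow or too noisy.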
Any thoughts? I don't have too much experience with NER.