Create tests for ExtractEntities.scala #48
Tests exist for this, but are disabled because there is no NER classifier in the repo. Is it possible to include the NER classifier in-house, or is there a licensing issue with doing so?
Looks like I shouldn't have removed it here 0ec8ab1, and I don't see @lintool I assume
I'm interested to see that NER seems to be included in Tika, which we already use for the PDFParser. I could take this on as a November project, I think. It may mean some configuration. I think NER is basic GPL, so I don't think it's a licensing issue.
Our thinking back when we designed the entity extractor was less about licensing and more about not wanting the repository to balloon out of control in size. It was a general aversion to having too much cruft build up within the repo. @lintool
Yes, that's exactly it. My usual treatment is to create a separate repo, e.g.,
We have tika-parsers in our pom.xml. There is a version of NER in there -- maybe we could use that? https://wiki.apache.org/tika/TikaAndNER
@greebie Maven isn't going to pull that classifier file, from what I can tell.
Okay -- back to square one. I think we are stuck not being able to test this one. |
Although I may be able to mock a NERClassifier ... let's keep this open until I can figure it out.
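For what it's worth, here is a minimal sketch of what mocking the classifier could look like, so that the extraction logic can be tested without shipping a model file. Everything named here (`EntityClassifier`, `StubClassifier`, `EntityExtraction.extract`) is hypothetical and made up for illustration; the real classifier and ExtractEntities.scala in the repo may have quite different signatures, so treat this as a starting point only.

```scala
// Hypothetical interface: an assumption, not the repo's actual classifier API.
trait EntityClassifier {
  // Maps an entity label (e.g. "PERSON") to the entities found in the text.
  def classify(text: String): Map[String, Seq[String]]
}

// Stub that labels whitespace-separated tokens from a fixed lookup table,
// so no NER model file is needed at test time.
class StubClassifier(known: Map[String, String]) extends EntityClassifier {
  override def classify(text: String): Map[String, Seq[String]] = {
    val tokens = text.split("\\s+").toSeq.filter(_.nonEmpty)
    tokens
      .flatMap(token => known.get(token).map(label => label -> token))
      .groupBy(_._1)
      .map { case (label, pairs) => label -> pairs.map(_._2) }
  }
}

// Hypothetical stand-in for the extraction logic under test: it runs
// whatever classifier it is given over (url, content) pairs.
object EntityExtraction {
  def extract(
      records: Seq[(String, String)],
      classifier: EntityClassifier): Seq[(String, Map[String, Seq[String]])] =
    records.map { case (url, content) => url -> classifier.classify(content) }
}
```

A test could then build `new StubClassifier(Map("Alice" -> "PERSON", "Ottawa" -> "LOCATION"))`, pass it to `EntityExtraction.extract` along with a few hand-written records, and assert on the returned entities. The point of the design is that the extraction code takes the classifier as a parameter rather than loading a model file itself.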
ruebot commented Oct 2, 2017 (edited)
ExtractEntities.scala has no test coverage. We need to create tests for it.