DetectLanguage.scala: class LanguageIdentifier in package language is deprecated #286
Comments
ruebot added the RA-Task and clean-up labels on Oct 17, 2018
ruebot referenced this issue on Oct 17, 2018 in merged pull request #285: Update Apache Tika - security vulnerabilities; resolves #131.
@borislin if you have time, do you want to take this one on? It should be an easy one.
ruebot referenced this issue on Oct 17, 2018 in closed issue #131: Update to Apache Tika 1.19.1; security vulnerabilities in 1.12
ruebot added the upstream deprecation label on Oct 17, 2018
@ruebot Sure, I'll work on this.
ruebot assigned borislin on Oct 18, 2018
Update: The current code to fix this issue is at https://github.com/archivesunleashed/aut/tree/refactor-detect-language. I can't test my code right now because of a lot of dependency errors in the Maven log (build.log). After discussing with @ruebot, it turns out that this is more complicated than we thought, and we need more time to sort out this dependency hell before pushing a PR for this issue.
ianmilligan1 unassigned borislin on Jan 11, 2019
I just pushed a fix for the dependency errors. They were caused by a conflict between versions of Guava: Hadoop 2.6.5 brings in Guava 11, while tika-langdetect requires a more modern version (1.19.1 calls for 17.0). I created a version of tika-langdetect that shades Guava, basically following what is described here. I pushed my changes to … The build is still failing, but now it's because two tests fail:
I haven't looked into this yet. This shading solution is obviously not ideal, but it might do in the short term, since we should be using the updated Tika. The long-term solution would be to upgrade Hadoop and our other dependencies.
jrwiebe self-assigned this on Jan 23, 2019
jrwiebe added a commit that referenced this issue on Jan 23, 2019
I remember going down this rabbit hole, and had set up a bunch of exclusions on the Guava dependencies. Maybe it would be worth going down that path again? That said, the transitive dependencies on this project are not fun to sort out!
Started digging into the test failures. I suspect Tika is returning more with this version, and we need to dig into that more. But maybe we should update our implementation too? I hadn't noticed this example before in the API documentation.
Boris was never able to build it, and ran out of time to finish it before he left, so that explains why it never got that far.
ruebot commented on Oct 17, 2018
Follow-on to #285
I believe we need to update DetectLanguage to use this method.
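For illustration only, here is a minimal sketch of what moving off the deprecated LanguageIdentifier might look like. It assumes tika-langdetect 1.19.1 is on the classpath and uses its OptimaizeLangDetector; the object name DetectLanguageSketch and the empty-string fallback are hypothetical choices, not the aut implementation.

```scala
import org.apache.tika.langdetect.OptimaizeLangDetector
import org.apache.tika.language.detect.LanguageDetector

// Hypothetical stand-in for aut's DetectLanguage object; not the project's code.
object DetectLanguageSketch {
  // Loading the language models is expensive, so load them once, lazily,
  // and reuse the same detector for every call.
  private lazy val detector: LanguageDetector =
    new OptimaizeLangDetector().loadModels()

  /** Returns the language code Tika reports for `input`.
    * Assumption: null or empty input maps to an empty string.
    */
  def apply(input: String): String = {
    if (input == null || input.isEmpty) ""
    else detector.detect(input).getLanguage
  }
}
```

A call such as `DetectLanguageSketch(text)` would stand in for the deprecated `new LanguageIdentifier(text).getLanguage`. The newer detector may return different results for some inputs, so existing test expectations might need revisiting.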