DetectMimeTypeTika.scala - do we actually use it? #330

Open
ruebot opened this issue Jul 25, 2019 · 1 comment

@ruebot (Member) commented Jul 25, 2019

Following up on the question in Slack.

What's the use case for DetectMimeTypeTika? We use getMimeType elsewhere, but I'm a little confused about how it actually works.

The only thing that I can see that calls it is a test for it.

I was digging through the Git history here and on the Warcbase repo, and can't really tell what it's used for, but it goes way back to Pig days. Maybe it's just legacy and we can remove it?

@lintool @jrwiebe @ianmilligan1 thoughts?

ruebot added a commit that referenced this issue Jul 25, 2019

@jrwiebe (Contributor) commented Jul 25, 2019

I was using it for binary extraction, since the MIME type recorded in the WARCs is not always reliable. (I haven't committed the binary extraction methods yet, in part because I think I was having some Tika-related memory issues.)
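
For illustration only, content-based detection of this kind might look roughly like the sketch below. It assumes Apache Tika (`tika-core`) is on the classpath and uses the `Tika` facade's `detect(byte[])` method. The object name `TikaMimeSketch` and the helper `disagrees` are hypothetical and are not taken from the toolkit's actual `DetectMimeTypeTika` code.

```scala
import org.apache.tika.Tika

// Hypothetical sketch, not the project's DetectMimeTypeTika implementation.
object TikaMimeSketch {
  // One Tika facade instance, reused across calls.
  private val tika = new Tika()

  /** Detect a MIME type from the payload bytes, ignoring any recorded header. */
  def detect(content: Array[Byte]): String = tika.detect(content)

  /** Hypothetical helper: true when the detected type differs from the type
    * recorded in the WARC (parameters such as "; charset=..." are ignored).
    */
  def disagrees(recorded: String, content: Array[Byte]): Boolean = {
    val recordedBase = recorded.takeWhile(_ != ';').trim
    !detect(content).equalsIgnoreCase(recordedBase)
  }
}
```

A binary extraction step could then filter on `TikaMimeSketch.detect(bytes)` instead of the recorded type, or use `disagrees` to flag records where the two differ.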
