DetectMimeTypeTika.scala - do we actually use it? #330

Closed
ruebot opened this issue Jul 25, 2019 · 3 comments

Comments

@ruebot
Member

commented Jul 25, 2019

Following up on the question in Slack.

What's the use case for DetectMimeTypeTika? We use getMimeType elsewhere, but I'm a little confused how it actually works.

The only thing that I can see that calls it is a test.

I was digging through the Git history here and on the Warcbase repo, and can't really tell what it's used for, but it goes way back to Pig days. Maybe it's just legacy and we can remove it?

@lintool @jrwiebe @ianmilligan1 thoughts?

ruebot added a commit that referenced this issue Jul 25, 2019

@jrwiebe

Contributor

commented Jul 25, 2019

I was using it for binary extraction, since the MIME type recorded in the WARCs is not always reliable. (I haven't committed the binary extraction methods yet, in part because I think I was having some Tika-related memory issues.)
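
To illustrate the idea of trusting the content over the recorded MIME type, here is a minimal sketch. It is not the project's code and it does not use Tika: the `MimeSniff` object, its tiny signature table, and `effectiveMimeType` are all hypothetical names made up for this example. With Tika on the classpath, the hand-rolled `sniff` would presumably be replaced by a call such as `new Tika().detect(bytes)`.

```scala
object MimeSniff {
  // Hypothetical, deliberately tiny signature table (magic bytes -> MIME type).
  // A real detector such as Tika recognizes far more formats than this.
  private val signatures: Seq[(Array[Byte], String)] = Seq(
    (Array(0x25, 0x50, 0x44, 0x46).map(_.toByte), "application/pdf"), // "%PDF"
    (Array(0x89, 0x50, 0x4E, 0x47).map(_.toByte), "image/png"),       // "\u0089PNG"
    (Array(0xFF, 0xD8, 0xFF).map(_.toByte), "image/jpeg"),
    (Array(0x47, 0x49, 0x46, 0x38).map(_.toByte), "image/gif")        // "GIF8"
  )

  // Returns the MIME type implied by the leading bytes, if any signature matches.
  def sniff(content: Array[Byte]): Option[String] =
    signatures.collectFirst {
      case (magic, mime) if content.take(magic.length).sameElements(magic) => mime
    }

  // Prefers the type detected from the content; falls back to the type
  // declared in the record when nothing is recognized.
  def effectiveMimeType(declared: String, content: Array[Byte]): String =
    sniff(content).getOrElse(declared)
}
```

For example, a record declared as `text/html` whose payload starts with `%PDF` would be treated as `application/pdf`, while an unrecognized payload keeps its declared type.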

@ruebot

Member Author

commented Jul 25, 2019

@jrwiebe yeah, I noticed that last night, and somehow ended up hitting that error from #308 again. I moved it over to getMimeType because of that.

@ruebot

Member Author

commented Jul 31, 2019

I'll mark this as answered and close it since we've captured the meaning well in the discussion on #302.

@ruebot ruebot closed this Jul 31, 2019
