Consider an option to use rtika to detect file type #180
Interesting. For reference: https://github.com/ropensci/rtika But, but, rJava
I think that the CRAN version does not use rJava; instead it offers an install_* function and pushes system calls out to a CLI. So although you do need Java on your system, you don’t need a working rJava.
So, via magic numbers? Even those can fail. For example, .ODS and .XLSX files can have the same signatures. And they're both .zip files anyway, so that's another level of ambiguity. On the other hand, it doesn't take that much time for I have a proof of concept here (though many formats not yet loaded and discussion/brainstorming needed before submitting it as a PR)
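To make the ZIP ambiguity concrete, here is a minimal sketch in Python rather than R. It is not how rtika or the proof of concept works. It assumes stand-in archives built in memory (not valid spreadsheets), and the helper names `make_archive` and `sniff_zip_container` are hypothetical. Both archives start with the same four bytes, so the signature alone cannot separate them; looking at the member names inside the container can.

```python
import io
import zipfile

# Local file header signature shared by every non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def make_archive(entries):
    """Build an in-memory ZIP archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def sniff_zip_container(data):
    """Guess a spreadsheet format from the members of a ZIP container."""
    if not data.startswith(ZIP_MAGIC):
        return "not-zip"
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        # OpenDocument packages carry a "mimetype" member naming the format.
        if "mimetype" in names:
            mimetype = zf.read("mimetype").decode("ascii", "replace").strip()
            if mimetype == "application/vnd.oasis.opendocument.spreadsheet":
                return "ods"
        # Office Open XML workbooks keep their parts under "xl/".
        if "[Content_Types].xml" in names and any(n.startswith("xl/") for n in names):
            return "xlsx"
    return "zip"


# Stand-in archives with hypothetical minimal content (assumption: real files
# contain far more, but these member names are what the check relies on).
fake_ods = make_archive({
    "mimetype": b"application/vnd.oasis.opendocument.spreadsheet",
    "content.xml": b"<x/>",
})
fake_xlsx = make_archive({
    "[Content_Types].xml": b"<Types/>",
    "xl/workbook.xml": b"<workbook/>",
})

same_signature = fake_ods[:4] == fake_xlsx[:4] == ZIP_MAGIC
```

Here `same_signature` is true for both stand-ins, while `sniff_zip_container` tells them apart, which is roughly the extra level of inspection a content-based detector has to do for container formats.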
I'm not super keen on trying to parse with every imaginable import function. That might produce some kind of unanticipated weird behavior if one of those underlying functions changes to start supporting a different kind of file or if we add future functionality that changes the deterministic order of import attempts. We could add a separate function that does that, though, like
jsonbecker commented Apr 26, 2018
It’d be cool to add an option that defaults to false that would use the new rtika package to detect file types rather than file extensions.
I’d be willing to take this work on if it seems interesting.