Skip to content
Please note that GitHub no longer supports your web browser.

We recommend upgrading to the latest Google Chrome or Firefox.

Learn more
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use readtext for other extensions #28

Merged
merged 1 commit into from Oct 17, 2019

Conversation

@adam3smith
Copy link
Contributor

commented Oct 17, 2019

The idea is to use readtext whenever it explicitly supports a format (list taken from their manual)

The idea is to use `readtext` whenever it explicitly supports a format (list taken from their manual)
@adam3smith adam3smith requested a review from greebie Oct 17, 2019
@greebie

This comment has been minimized.

Copy link
Collaborator

commented Oct 17, 2019

I think this accomplishes the task described in the heading, so I think we should be fine merging. However, I did test this with the following as a test file:

 <?xml version="1.0" encoding="UTF-8"?>
 <links>
 <link>https://www.google.com</link>
 <link>https://www.example.com</link>
 <notalink>NOT A LINK</notalink>
 </links>

and it returns an empty list, which means we end up with an

 Error in matrix(unlist(newlst), nrow = length(newlst), byrow = T) : 
  'data' must be of a vector type, was 'NULL'

Which should not happen anyway (if we have a 'NULL' it should quietly ignore the entry).

I suggest we merge this and add an issue either to include the empty link boilerplate (which really should go away anyway) or use a partial function (if R has that or a similar feature) to disinclude all null entries in the lapply statement. The problem with XML can be catalogued as part of #18 (part of the solution would be to try and improve the regex).

(If you agree just thumbs up and I'll merge).

@adam3smith

This comment has been minimized.

Copy link
Contributor Author

commented Oct 17, 2019

might be worthwhile to just use read_xml and read_html for the respective files rather than doing crazy regex...
XKCD Regex

@greebie greebie merged commit 7cbae93 into master Oct 17, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@greebie

This comment has been minimized.

Copy link
Collaborator

commented Oct 17, 2019

Weird - I got a broken merge warning in my email, but checks passed according to the website.

@greebie

This comment has been minimized.

Copy link
Collaborator

commented Oct 17, 2019

Problem fixed itself after reset. Seemed like a blip with the R package for Travis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.