Use readtext for other extensions #28

adam3smith · 2019-10-17T13:20:31Z

The idea is to use `readtext` whenever it explicitly supports a format (list taken from their manual).



greebie · 2019-10-17T13:52:22Z

This accomplishes the task described in the heading, so I think we should be fine merging. However, I did test it with the following file:

 <?xml version="1.0" encoding="UTF-8"?>
 <links>
 <link>https://www.google.com</link>
 <link>https://www.example.com</link>
 <notalink>NOT A LINK</notalink>
 </links>

and it returns an empty list, which means we end up with an

 Error in matrix(unlist(newlst), nrow = length(newlst), byrow = T) : 
  'data' must be of a vector type, was 'NULL'

That should not happen anyway (if we have a `NULL`, it should quietly ignore the entry).

I suggest we merge this and add an issue either to include the empty link boilerplate (which really should go away anyway) or to use a partial function (if R has that or a similar feature) to exclude all NULL entries in the `lapply` statement. The problem with XML can be catalogued as part of #18 (part of the solution would be to try to improve the regex).
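A minimal sketch of the second option, using base R only. It assumes the per-file results are collected in a list like `newlst` from the error message above, with two fields per entry; the helper name `rows_to_matrix` and the sample entries are hypothetical and not taken from the package.

```r
# Hypothetical helper: build the result matrix while skipping NULL entries.
rows_to_matrix <- function(lst, ncol = 2) {
  # Filter() keeps only the elements for which the predicate is TRUE,
  # so Negate(is.null) drops every NULL entry.
  kept <- Filter(Negate(is.null), lst)

  # If nothing is left, return an empty matrix instead of calling
  # matrix() on NULL, which is what raises the error quoted above.
  if (length(kept) == 0) {
    return(matrix(character(0), nrow = 0, ncol = ncol))
  }

  matrix(unlist(kept), nrow = length(kept), byrow = TRUE)
}

# Illustrative input: the middle entry stands in for a file with no URLs.
newlst <- list(
  c("a.txt", "https://www.google.com"),
  NULL,
  c("b.txt", "https://www.example.com")
)

result <- rows_to_matrix(newlst)
empty_result <- rows_to_matrix(list(NULL))
```

`Filter` is part of base R, so this adds no dependency; the zero-row branch is one possible way to exit quietly when every entry is `NULL`.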

(If you agree just thumbs up and I'll merge).

adam3smith · 2019-10-17T14:48:23Z

Might be worthwhile to just use `read_xml` and `read_html` for the respective files rather than doing crazy regex...
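A rough sketch of what that could look like for the XML test file above, assuming the `xml2` package is installed. Treating every leaf element as a candidate and keeping only text that starts with `http://` or `https://` is an illustrative choice, not something this PR implements.

```r
library(xml2)

# The test file from the earlier comment, inlined as a string.
xml_string <- '<?xml version="1.0" encoding="UTF-8"?>
<links>
<link>https://www.google.com</link>
<link>https://www.example.com</link>
<notalink>NOT A LINK</notalink>
</links>'

# read_xml() accepts literal XML as well as a file path or URL.
doc <- read_xml(xml_string)

# Select elements that have no child elements and take their text.
leaf_text <- xml_text(xml_find_all(doc, "//*[not(*)]"))

# Keep only the entries that look like URLs.
urls <- leaf_text[grepl("^https?://", leaf_text)]
```

Parsing the document first means the URL check only runs on element text, so tag names and the XML declaration never reach the pattern match.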

greebie · 2019-10-17T15:05:32Z

Weird - I got a broken merge warning in my email, but checks passed according to the website.

greebie · 2019-10-17T16:17:53Z

Problem fixed itself after reset. Seemed like a blip with the R package for Travis.

adam3smith requested a review from greebie Oct 17, 2019

greebie approved these changes Oct 17, 2019


greebie merged commit 7cbae93 into master Oct 17, 2019
2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded

continuous-integration/travis-ci/pr The Travis CI build passed

This was referenced Oct 17, 2019

Tools for Misfigured Urls #18

Open

More Elegant Exit when result returns NULL #29

Open
