New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExtractBoilerpipeText to remove headers as well. #253 #256

Merged
merged 3 commits into from Aug 11, 2018

Conversation

Projects
None yet
3 participants
@greebie
Contributor

greebie commented Aug 11, 2018

GitHub issue(s):
#253

What does this Pull Request do?

Alters ExtractBoilerpipeText to remove headers from text output as well.

How should this be tested?

The test I created + Travis and Codecov should cover it.

Additional Notes:

I attempted to provide a .keepHeader option, but to do so would require moving the .apply call from the object and that would require documentation changes. This seemed too much to accommodate a fairly marginal use-case.

This option can be provided by returning a class with .keepHeader and .text attributes. I do not think it's too necessary.

Interested parties

@ianmilligan1

Thanks in advance for your help with the Archives Unleashed Toolkit!

greebie added some commits Aug 10, 2018

@codecov

This comment has been minimized.

Show comment
Hide comment
@codecov

codecov bot Aug 11, 2018

Codecov Report

Merging #256 into master will decrease coverage by 0.16%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #256      +/-   ##
==========================================
- Coverage   70.52%   70.35%   -0.17%     
==========================================
  Files          41       41              
  Lines        1038     1039       +1     
  Branches      191      191              
==========================================
- Hits          732      731       -1     
- Misses        240      242       +2     
  Partials       66       66
Impacted Files Coverage Δ
...ivesunleashed/matchbox/ExtractBoilerpipeText.scala 66.66% <100%> (-20.84%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4cf9a7...a35da9a. Read the comment docs.

codecov bot commented Aug 11, 2018

Codecov Report

Merging #256 into master will decrease coverage by 0.16%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #256      +/-   ##
==========================================
- Coverage   70.52%   70.35%   -0.17%     
==========================================
  Files          41       41              
  Lines        1038     1039       +1     
  Branches      191      191              
==========================================
- Hits          732      731       -1     
- Misses        240      242       +2     
  Partials       66       66
Impacted Files Coverage Δ
...ivesunleashed/matchbox/ExtractBoilerpipeText.scala 66.66% <100%> (-20.84%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4cf9a7...a35da9a. Read the comment docs.

@ianmilligan1

Much cleaner output!

i.e.

(20091219,www.davidsuzuki.org,http://www.davidsuzuki.org/_pvw706D01C3/Oceans/PNCIMA/Unique-Features-Within-PNCIMA.asp,Here are more of the many unique features within PNCIMA: Endangered Species •There are 32 species listed as Endangered, Threatened, and Special Concern by the Committee on the Status of Endangered Wildlife in Canada (COSEWIC) that live in or migrate through the PNCIMA. North Pacific Right Whales •They are the most endangered whale in the world ...
@ianmilligan1

This comment has been minimized.

Show comment
Hide comment
@ianmilligan1

ianmilligan1 Aug 11, 2018

Member

I'm happy to merge if this looks good to you as well @ruebot.

Member

ianmilligan1 commented Aug 11, 2018

I'm happy to merge if this looks good to you as well @ruebot.

@ruebot

This comment has been minimized.

Show comment
Hide comment
@ruebot

ruebot Aug 11, 2018

Member

Do we need to update any documentation?

Member

ruebot commented Aug 11, 2018

Do we need to update any documentation?

@ianmilligan1

This comment has been minimized.

Show comment
Hide comment
@ianmilligan1

ianmilligan1 Aug 11, 2018

Member

Nope, it's a straight change (just removes the http headers from the boilerplate removal, which is what we should have been doing all along) - tested w/ the documentation script and works well.

Member

ianmilligan1 commented Aug 11, 2018

Nope, it's a straight change (just removes the http headers from the boilerplate removal, which is what we should have been doing all along) - tested w/ the documentation script and works well.

@ruebot

This comment has been minimized.

Show comment
Hide comment
@ruebot

ruebot Aug 11, 2018

Member

@greebie does this resolve #253? The title and description of that issue, and this PR don't align well.

Member

ruebot commented Aug 11, 2018

@greebie does this resolve #253? The title and description of that issue, and this PR don't align well.

@ruebot ruebot merged commit 84a4c09 into master Aug 11, 2018

4 checks passed

codecov/patch 100% of diff hit (target 70.52%)
Details
codecov/project Absolute coverage decreased by -0.16% but relative coverage increased by +29.47% compared to e4cf9a7
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@ruebot ruebot deleted the issue-253 branch Aug 11, 2018

@greebie

This comment has been minimized.

Show comment
Hide comment
@greebie

greebie Aug 12, 2018

Contributor

Yes it does resolve. I thought the solution was something else, but this seemed more straight-forward.

Contributor

greebie commented Aug 12, 2018

Yes it does resolve. I thought the solution was something else, but this seemed more straight-forward.

@ruebot

This comment has been minimized.

Show comment
Hide comment
@ruebot

ruebot Aug 12, 2018

Member

Ok. In the future, please be more explicit because as it stands now, it is very unclear. We need to know how and why the PR resolves the issues, and if it differs from the issue created, why.

Member

ruebot commented Aug 12, 2018

Ok. In the future, please be more explicit because as it stands now, it is very unclear. We need to know how and why the PR resolves the issues, and if it differs from the issue created, why.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment