[PRE REVIEW]: htmldate: A Python package to extract publication dates from web pages #2360

Closed
whedon opened this issue Jun 18, 2020 · 32 comments

@whedon whedon commented Jun 18, 2020

Submitting author: @adbar (Adrien Barbaresi)
Repository: https://github.com/adbar/htmldate
Version: v0.6.3
Editor: @danielskatz
Reviewers: @geoffbacon, @proycon
Managing EiC: Arfon Smith

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Author instructions

Thanks for submitting your paper to JOSS @adbar. Currently, there isn't a JOSS editor assigned to your paper.

@adbar if you have any suggestions for potential reviewers, please mention them here in this thread (without tagging them with an @). In addition, the people on this list have already agreed to review for JOSS and may be suitable for this submission (please start at the bottom of the list).

Editor instructions

The JOSS submission bot @whedon is here to help you find and assign reviewers and start the main review. To find out what @whedon can do for you type:

@whedon commands
@whedon whedon added the pre-review label Jun 18, 2020

@whedon whedon commented Jun 18, 2020

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

@arfon arfon commented Jun 18, 2020

👋 @adbar - thanks for your submission to JOSS. From a quick inspection of this submission it's not entirely obvious that it meets our submission criteria. In particular, this item:

  • Your software should have an obvious research application

Could you confirm here that there is a research application for this software (and explain what that application is)? The section 'what should my paper contain' has some guidance for the sort of content we're looking to be present in the paper.md.

Many thanks!

@whedon whedon commented Jun 18, 2020

Software report (experimental):

github.com/AlDanial/cloc v 1.84  T=5.05 s (51.8 files/s, 49499.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                           236          53096          10389         181722
Python                          12            307            776           2484
reStructuredText                 5            261            232            231
Markdown                         3             98              0            170
TeX                              1             16              0            133
YAML                             2              9              3            124
DOS Batch                        1              8              1             26
INI                              1              3              6             15
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                           262          53802          11414         184914
-------------------------------------------------------------------------------


Statistical information for the repository '2360' was gathered on 2020/06/18.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
Adrien Barbaresi               239          8040           4514           99.60
DerKozmonaut                     4            34              2            0.29
Vincent Barbaresi                1             4             10            0.11

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
Adrien Barbaresi           3566           44.4          8.5               12.25
DerKozmonaut                  1            2.9         10.8              100.00
@whedon whedon added the Python label Jun 18, 2020

@whedon whedon commented Jun 18, 2020

Reference check summary:

OK DOIs

- None

MISSING DOIs

- https://doi.org/10.1515/zgl-2017-0017 may be missing for title: Die Korpusplattform des "Digitalen Wörterbuchs der deutschen Sprache" (DWDS)
- https://doi.org/10.18653/v1/w16-2602 may be missing for title: Efficient construction of metadata-enhanced web corpora
- https://doi.org/10.1162/coli.2007.33.1.147 may be missing for title: Googleology is bad science
- https://doi.org/10.1007/s10579-009-9081-4 may be missing for title: The WaCky Wide Web: a collection of very large linguistically processed web-crawled corpora

INVALID DOIs

- None

@whedon whedon commented Jun 18, 2020

@danielskatz danielskatz commented Jun 19, 2020

@whedon assign me as editor

@whedon whedon commented Jun 19, 2020

OK, the editor is @danielskatz

@danielskatz danielskatz removed the waitlisted label Jun 19, 2020

@danielskatz danielskatz commented Jun 19, 2020

@adbar - In the introduction to your paper, please add some text to describe the kinds of research that will use this software. For example, on your readme, you mention "methodological approach to derive information from web documents in order to build text databases for research (chiefly linguistics and natural language processing). There are web pages for which neither the URL nor the server response provide a reliable way to find out when a document was published or modified." It would be useful to include some of this information in the paper.
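
To illustrate the kind of task the readme describes, here is a minimal sketch that looks for a publication date in a page's own markup when the URL and server response give no reliable hint. It is not htmldate's implementation: the `DATE_KEYS` set, the `MetaDateParser` class, and the `guess_publication_date` function are hypothetical names, and the sketch assumes dates appear in ISO `YYYY-MM-DD` form inside `<meta>` or `<time>` tags.

```python
import re
from html.parser import HTMLParser

# Assumption: a small, hypothetical set of metadata keys that often carry a date.
DATE_KEYS = {"article:published_time", "date", "dc.date", "datepublished"}


class MetaDateParser(HTMLParser):
    """Collect date-like values from <meta> tags and <time datetime=...> tags."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # The key may sit in "property", "name", or "itemprop".
            key = attrs.get("property") or attrs.get("name") or attrs.get("itemprop") or ""
            if key.lower() in DATE_KEYS and attrs.get("content"):
                self.candidates.append(attrs["content"])
        elif tag == "time" and attrs.get("datetime"):
            self.candidates.append(attrs["datetime"])


def guess_publication_date(html):
    """Return the first YYYY-MM-DD string found among the candidates, else None."""
    parser = MetaDateParser()
    parser.feed(html)
    for value in parser.candidates:
        match = re.search(r"\d{4}-\d{2}-\d{2}", value)
        if match:
            return match.group(0)
    return None
```

A real extractor would also need to handle free-text dates, other date formats, and conflicting candidates, which this sketch deliberately leaves out.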

@danielskatz danielskatz commented Jun 19, 2020

Also, note that some of your references are missing DOIs. After making changes, write a new comment in this thread with @whedon generate pdf to regenerate the PDF, and use @whedon check references to check the references. Whedon sometimes gets references wrong and makes incorrect suggestions, so please verify each one.

@danielskatz danielskatz commented Jun 19, 2020

Finally, please suggest potential reviewers by mentioning them here in this thread (by GitHub username, if they have one, without tagging them with an @). The people on this list have already agreed to review for JOSS and may be suitable for this submission (please start at the bottom of the list).

@danielskatz danielskatz commented Jun 23, 2020

👋 @adbar - Can you confirm that you've seen the requests to you? And when do you think you might have an updated paper ready, along with reviewer suggestions?

@adbar adbar commented Jun 26, 2020

Thank you for your feedback! I've seen the requests and I will try to improve the text accordingly. I'm planning to have it done by the end of next week.

@adbar adbar commented Jul 3, 2020

I revised the introduction and added the missing DOIs.
@whedon generate pdf
@whedon check references

@adbar adbar commented Jul 3, 2020

Suggested reviewers: ajoer, hadware, tresoldi, geoffbacon, proycon
Are there any further requirements for the pre-review phase?

@danielskatz danielskatz commented Jul 3, 2020

@adbar - commands to whedon need to be the first thing in a comment, and there can only be one command per comment.
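
The rule can be pictured with a small sketch. This is an assumption about the behaviour described here, not whedon's actual code: `BOT_PREFIX` and `parse_bot_command` are hypothetical names, and the sketch assumes only the first line of a comment is read as the command.

```python
# Assumption: the bot is addressed with this mention followed by a space.
BOT_PREFIX = "@whedon "


def parse_bot_command(comment):
    """Return the command if the comment starts with a bot mention, else None."""
    text = comment.strip()
    if not text.startswith(BOT_PREFIX):
        # A mention after other text is ignored.
        return None
    # Only the first line is treated as the command; later lines are ignored.
    first_line = text.splitlines()[0]
    return first_line[len(BOT_PREFIX):].strip()
```

Under these assumptions, a comment that opens with prose and then lists two commands yields no command at all, which matches what happened with the earlier comment in this thread.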

@danielskatz danielskatz commented Jul 3, 2020

@whedon generate pdf

@whedon whedon commented Jul 3, 2020

@danielskatz danielskatz commented Jul 3, 2020

@whedon check references

@whedon whedon commented Jul 3, 2020

Reference check summary:

OK DOIs

- 10.1515/zgl-2017-0017 is OK
- 10.18653/v1/w16-2602 is OK
- 10.1162/coli.2007.33.1.147 is OK
- 10.1007/s10579-009-9081-4 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@danielskatz danielskatz commented Jul 3, 2020

👋 @ajoer - Would you be willing to review this submission for JOSS?

@danielskatz danielskatz commented Jul 3, 2020

👋 @geoffbacon - Would you be willing to review this submission for JOSS?

@geoffbacon geoffbacon commented Jul 4, 2020

Hi @danielskatz - I'd be happy to!

@danielskatz danielskatz commented Jul 4, 2020

Thanks - I'll add you in the system, though we won't start the review until we get one more reviewer.

@danielskatz danielskatz commented Jul 4, 2020

@whedon assign @geoffbacon as reviewer

@whedon whedon commented Jul 4, 2020

OK, @geoffbacon is now a reviewer

@danielskatz danielskatz commented Jul 4, 2020

👋 @proycon - Would you be willing to review this submission for JOSS?

@proycon proycon commented Jul 4, 2020

@danielskatz Sure, I'd be glad to help!

@danielskatz danielskatz commented Jul 4, 2020

Thanks - I'll add you and start the review.

@danielskatz danielskatz commented Jul 4, 2020

@whedon add @proycon as reviewer

@whedon whedon commented Jul 4, 2020

OK, @proycon is now a reviewer

@danielskatz danielskatz commented Jul 4, 2020

@whedon start review

@whedon whedon commented Jul 4, 2020

OK, I've started the review over in #2439.

@whedon whedon closed this Jul 4, 2020