WARCs with datetime before year 1900 cause error in indexer #603
machawk1 added the bug, External project dependence, and ipwb indexer labels and removed the External project dependence label on Jan 26, 2019
This seems to be an issue with strftime, with potential solutions provided here.
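As a rough illustration of one possible workaround, the sketch below builds a 14-digit timestamp from the datetime fields directly instead of calling strftime. It assumes the error comes from Python 2's datetime.strftime, which rejects years before 1900, and that the indexer wants a YYYYMMDDhhmmss value. The function names warc_date_to_datetime and to_14_digit are hypothetical and are not taken from the ipwb code base.

```python
from datetime import datetime


def warc_date_to_datetime(warc_date):
    """Parse a WARC-Date value such as '1899-12-31T23:59:59Z'.

    Hypothetical helper; assumes the second-precision UTC form.
    """
    return datetime.strptime(warc_date, "%Y-%m-%dT%H:%M:%SZ")


def to_14_digit(dt):
    """Format a datetime as YYYYMMDDhhmmss without calling strftime.

    Reading the fields directly sidesteps any year-range limit that
    strftime may impose on the running interpreter or platform.
    """
    return "{:04d}{:02d}{:02d}{:02d}{:02d}{:02d}".format(
        dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second
    )


if __name__ == "__main__":
    # A fabricated pre-1900 date, similar in spirit to the attached example.
    print(to_14_digit(warc_date_to_datetime("1899-12-31T23:59:59Z")))
```

Whether this fits depends on where the strftime call actually lives. If it is inside an external dependency, as the label suggests, the fix may need to go there instead.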
machawk1 added the External project dependence label on Jan 26, 2019
shawnmjones commented on Jan 27, 2019
When would a WARC have a datetime prior to the year 1900?
@shawnmjones A WARC generated through conventional means should not, since 1900 predates the creation of both the WARC spec and the Web. The WARC spec cites the W3C profile of ISO 8601:1988 as the basis for WARC-Date. Dates prior to 1900 are legal there, so they should not cause an exception. However, a date prior to 1900 in this field is likely the result of a misinterpretation, a misconfiguration, or a fabricated example, as attached ↑.
machawk1 commented on Jan 26, 2019
fb_fab_dates 2.warc.txt