Memento headers and commas #239
acoburn added the area/http label Oct 12, 2018
acoburn added this to the 0.8.0 Release milestone Oct 12, 2018
ajs6f
Oct 12, 2018
Member
Shouldn't we be engaging with the Memento community as well about something like this?
That's a really good idea. I'll send a question to their dev list.
ajs6f
Oct 12, 2018
Member
Yeah, @hvdsomp is very responsive and I guarantee you'll get some good discussion.
added a commit that referenced this issue Oct 12, 2018
acoburn referenced a pull request that will close this issue Oct 12, 2018
Open: Suppress timemap link parameters by default #240
added a commit that referenced this issue Oct 14, 2018
martinklein0815
Oct 15, 2018
Thanks for the ping, @ajs6f and @acoburn! I am replying on behalf of @hvdsomp and other members of the Memento crew.
Suppressing the optional "from" and "until" parameters in a timemap link is definitely a pragmatic solution but one that comes with the loss of relevant information - the time span covered in the timemap.
The preferable solution, in our opinion, is to use and promote HTTP link header parsers that do "the right thing", meaning that they recognize commas in headers and interpret them in context. This would likely also help with parsing other headers that hold a datetime, such as Memento-Datetime, Last-Modified, and Date. In our experience, the popular Python-based "requests" library is a good example of a library that distinguishes between commas that separate links and commas used within quotes (as in the Memento datetime). For the following link header:
Link: <https://github.com/leeper/pdfcount>; rel="original", <https://scholarlyorphans.org/memento/https://github.com/leeper/pdfcount>; rel="timegate", <https://scholarlyorphans.org/memento/timemap/link/https://github.com/leeper/pdfcount>; rel="timemap"; type="application/link-format", <https://scholarlyorphans.org/memento/20180828012815/https://github.com/leeper/pdfcount>; rel="memento"; datetime="Tue, 28 Aug 2018 01:28:15 GMT"; collection="memento"
the parsed and formatted output is:
{
'original': {'url': 'https://github.com/leeper/pdfcount', 'rel': 'original'},
'timegate': {'url': 'https://scholarlyorphans.org/memento/https://github.com/leeper/pdfcount', 'rel': 'timegate'},
'timemap': {'url': 'https://scholarlyorphans.org/memento/timemap/link/https://github.com/leeper/pdfcount', 'rel': 'timemap', 'type': 'application/link-format'},
'memento': {'url': 'https://scholarlyorphans.org/memento/20180828012815/https://github.com/leeper/pdfcount', 'rel': 'memento', 'datetime': 'Tue, 28 Aug 2018 01:28:15 GMT', 'collection': 'memento'}
}
and for the link header with from and until parameters:
Link: <http://a.example.org/>; rel="original timegate", <http://a.example.org/?version=all&style=timemap> ; rel="timemap"; type="application/link-format" ; from="Tue, 15 Sep 2000 11:28:26 GMT" ; until="Wed, 20 Jan 2010 09:34:33 GMT"
it outputs:
{
'original timegate': {'url': 'http://a.example.org/', 'rel': 'original timegate'},
'timemap': {'url': 'http://a.example.org/?version=all&style=timemap', 'rel': 'timemap', 'type': 'application/link-format', 'from': 'Tue, 15 Sep 2000 11:28:26 GMT', 'until': 'Wed, 20 Jan 2010 09:34:33 GMT'}
}
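To make "commas in context" concrete, here is a minimal sketch of quote-aware splitting using only the Python standard library. It is not the code that requests uses, and the names split_outside_quotes and parse_link_header are hypothetical. It assumes link targets are wrapped in angle brackets and parameter values are double-quoted.

```python
def split_outside_quotes(value, sep):
    """Split on sep, ignoring separators inside "..." or <...>."""
    parts, current = [], []
    in_quotes = in_uri = escaped = False
    for ch in value:
        if escaped:
            escaped = False          # character after a backslash is literal
        elif in_quotes:
            if ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
        elif in_uri:
            if ch == ">":
                in_uri = False
        elif ch == '"':
            in_quotes = True
        elif ch == "<":
            in_uri = True
        elif ch == sep:
            # A separator outside quotes and URIs ends the current part.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_link_header(value):
    """Return one dict per link: the target URL plus its parameters."""
    links = []
    for link in split_outside_quotes(value, ","):
        target, *params = split_outside_quotes(link, ";")
        entry = {"url": target.strip("<> ")}
        for param in params:
            name, _, raw = param.partition("=")
            entry[name.strip().lower()] = raw.strip().strip('"')
        links.append(entry)
    return links


header = (
    '<http://a.example.org/>; rel="original timegate", '
    '<http://a.example.org/?version=all&style=timemap>; rel="timemap"; '
    'type="application/link-format"; '
    'from="Tue, 15 Sep 2000 11:28:26 GMT"; '
    'until="Wed, 20 Jan 2010 09:34:33 GMT"'
)

for link in parse_link_header(header):
    print(link)
```

A plain header.split(",") on the same value yields four fragments instead of two links, which is the failure mode under discussion.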
We are not aware of many Java-based libraries that perform equally well. We developed our own Java-based library to correctly parse headers and we are happy to share the code base. Our implementation is based on input from here:
https://jar-download.com/artifacts/org.jboss.resteasy.mobile/resteasy-mobile/1.0.0/source-code/org/jboss/resteasy/plugins/delegates/LinkHeaderDelegate.java
Another example of a project that adopted the same code base to implement a link header parser is:
https://github.com/temenostech/IRIS/blob/master/interaction-core/src/main/java/com/temenos/interaction/core/hypermedia/LinkHeaderDelegate.java
Regardless, we would be interested in seeing our custom implementation integrated into off-the-shelf Java parsers - can you see a collaborative path to make this happen?
acoburn commented Oct 12, 2018
The Memento specification describes two optional parameters in the associated link headers: from and until. These parameters, if included, MUST be formatted according to RFC 1123. For example: until="Wed, 20 Jan 2010 09:34:33 GMT"
Generating this value is no problem. The problem is that many downstream libraries that parse Link headers don't expect a comma to appear inside a link, even though putting it inside a quoted parameter is legit. This is actually a pretty significant problem from a practical perspective, and I would like to suggest that the from and until parameters be removed from the memento headers by default.
I will introduce a configuration property that can be set to enable those parameters, but I would like the default for them to be turned off.
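The trade-off can be sketched in a few lines of Python (chosen for brevity; the project itself is Java). The function timemap_link and its include_range switch are hypothetical stand-ins for the proposed configuration property, not the actual implementation. The dates are formatted with the standard library's email.utils.format_datetime, which produces the RFC 1123 style shown above when given a UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def timemap_link(url, first, last, include_range=False):
    """Build a timemap link value; include_range is a hypothetical switch."""
    link = f'<{url}>; rel="timemap"; type="application/link-format"'
    if include_range:
        # usegmt=True requires datetimes whose tzinfo is timezone.utc.
        link += f'; from="{format_datetime(first, usegmt=True)}"'
        link += f'; until="{format_datetime(last, usegmt=True)}"'
    return link


first = datetime(2000, 9, 15, 11, 28, 26, tzinfo=timezone.utc)
last = datetime(2010, 1, 20, 9, 34, 33, tzinfo=timezone.utc)
url = "http://a.example.org/?version=all&style=timemap"

default_link = timemap_link(url, first, last)
ranged_link = timemap_link(url, first, last, include_range=True)

# A parser that naively splits on commas sees one link by default,
# but three fragments once the quoted dates are present.
print(len(default_link.split(",")))
print(len(ranged_link.split(",")))
```

With the parameters off, the link contains no commas at all, so even a naive downstream parser handles it; the cost is that the time span covered by the timemap is no longer advertised.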