Make it possible to restrict the number of memento-related headers #402

Open
acoburn opened this issue May 15, 2019 · 7 comments

5 participants
@acoburn
Member

commented May 15, 2019

Over time, the number of Memento headers for a given resource will continue to grow. It turns out that certain HTTP proxies may have trouble with such a large number of (uncompressed) Link headers. Given that it will always be possible to retrieve a TimeMap resource with the complete list of Memento URLs for a resource, it seems sensible to (a) provide a default limit on the number of Memento headers produced in GET responses and (b) make that limit configurable.

What would folks think about a scenario in which, by default, only the most recent 20 Memento versions are listed in the headers? Alternatively, it would also be possible to include the first Memento plus, say, the 19 most recent version URLs. The number 20 is somewhat arbitrary here, but it seems like an OK value, unless folks think it should be lower (or higher).
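The two options above could be sketched roughly as follows. This is only an illustration: the class name `MementoHeaderLimiter`, its method names, and the choice of a `SortedSet<Instant>` to hold version timestamps are assumptions, not anything taken from the Trellis code base.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: names and types are assumptions for illustration only.
final class MementoHeaderLimiter {

    private MementoHeaderLimiter() { }

    /** Option 1: keep only the most recent {@code limit} versions, oldest first. */
    static List<Instant> mostRecent(SortedSet<Instant> versions, int limit) {
        List<Instant> all = new ArrayList<>(versions);
        int from = Math.max(0, all.size() - Math.max(0, limit));
        return new ArrayList<>(all.subList(from, all.size()));
    }

    /** Option 2: keep the first version plus the most recent {@code limit - 1}. */
    static List<Instant> firstPlusMostRecent(SortedSet<Instant> versions, int limit) {
        if (versions.isEmpty() || limit <= 0) {
            return new ArrayList<>();
        }
        // A sorted set removes the duplicate when the first version
        // is also among the most recent ones.
        SortedSet<Instant> kept = new TreeSet<>();
        kept.add(versions.first());
        kept.addAll(mostRecent(versions, limit - 1));
        return new ArrayList<>(kept);
    }
}
```

With a limit of 20 and 25 stored versions, the first method would return versions 6 through 25, while the second would return version 1 followed by versions 7 through 25. The limit would presumably come from configuration, with 20 as the default.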

/cc @mjgiarlo @jermnelson @jmartin-sul

@acoburn acoburn added this to To Do in Trellis Linked Data server via automation May 15, 2019

@acoburn acoburn added this to the 0.9.0 Release milestone May 15, 2019

@mjgiarlo

commented May 15, 2019

@acoburn Like you say, the TimeMap should contain the full set. I wonder whether in that case it would be defensible to return 0 Memento versions and assume the client will GET the TimeMap and act on it should the client need access to versions?

I don't have strong inclinations though. Any of the ideas you propose would likely work for our purposes.

@ajs6f

Member

commented May 15, 2019

Can we call on the Memento gang for this? @azaroth42, @hvdsomp, @phonedude, any thoughts? (And thanks in advance!) Has this problem come up for Memento sites (seems like it must have at some point)?

@mjgiarlo

commented May 15, 2019

> Can we call on the Memento gang for this? @azaroth42, @hvdsomp, @phonedude, any thoughts? (And thanks in advance!) Has this problem come up for Memento sites (seems like it must have at some point)?

📲 📝

@acoburn

Member Author

commented May 15, 2019

@mjgiarlo or possibly just the first and last version URLs. Given the Java types involved, that would be an exceedingly cheap operation, and putting these values in Link headers is nice for LDP clients, even if it is only a small subset.
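A minimal sketch of why this could be cheap, assuming the version timestamps are held in a `java.util.SortedSet<Instant>` (the comment does not name the types, so this is a guess, and `FirstAndLast` is a hypothetical name). A sorted set exposes both ends directly, so nothing needs to iterate over the full list of versions:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical helper; the SortedSet<Instant> type is an assumption.
final class FirstAndLast {

    private FirstAndLast() { }

    /** Returns the oldest and newest versions (a single entry if they coincide). */
    static List<Instant> of(SortedSet<Instant> versions) {
        List<Instant> out = new ArrayList<>();
        if (versions.isEmpty()) {
            return out;
        }
        Instant first = versions.first();
        Instant last = versions.last();
        out.add(first);
        if (!last.equals(first)) {
            out.add(last);
        }
        return out;
    }
}
```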

@mjgiarlo

commented May 15, 2019

@acoburn That too would be 💯 🅰️-🆗

@azaroth42

commented May 15, 2019

When I was working on Memento, our systems only ever included: First, Last, Timemap, Timegate, Prev, Next.

For very thorough archives, such as content management systems with full version persistence like Wikipedia, the number of mementos is overwhelming, and the timemap has the full list. HTTP header links are good for navigation, not for data management :)
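As a rough illustration of that fixed navigational set, the sketch below builds Link header values for TimeGate, TimeMap, first, last, prev and next. The relation names and the `datetime` attribute follow the Memento specification (RFC 7089); the class name `MementoNavLinks`, the `?version=` and `?ext=timemap` URL patterns, and the use of the resource URL itself as the TimeGate are assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableSet;

// Hypothetical helper; URL patterns below are assumptions, not a real API.
final class MementoNavLinks {

    // HTTP-date style timestamp, always rendered in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    private MementoNavLinks() { }

    static String memento(String resource, Instant time, String rel) {
        return "<" + resource + "?version=" + time.getEpochSecond() + ">; rel=\""
                + rel + "\"; datetime=\"" + HTTP_DATE.format(time) + "\"";
    }

    /** Builds at most six links, however many versions exist. */
    static List<String> build(String resource, NavigableSet<Instant> versions,
            Instant current) {
        List<String> links = new ArrayList<>();
        links.add("<" + resource + ">; rel=\"timegate\"");
        links.add("<" + resource + "?ext=timemap>; rel=\"timemap\"");
        if (versions.isEmpty()) {
            return links;
        }
        links.add(memento(resource, versions.first(), "first memento"));
        links.add(memento(resource, versions.last(), "last memento"));
        // Nearest neighbours of the version being served, if any.
        Instant prev = versions.lower(current);
        Instant next = versions.higher(current);
        if (prev != null) {
            links.add(memento(resource, prev, "prev memento"));
        }
        if (next != null) {
            links.add(memento(resource, next, "next memento"));
        }
        return links;
    }
}
```

The header count stays constant as the archive grows, and a client that needs every version can follow the `timemap` link.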

@hvdsomp

commented May 15, 2019
