Collaboration with EDGI resource maintainers #22

Open
patcon opened this issue Jun 25, 2017 · 7 comments
Contributor

@patcon patcon commented Jun 25, 2017

Heyo! I'm with Environmental Data & Governance Initiative (EDGI), the new kids on the archiving block. We have two related working groups building software:

  1. Archiving working group. (aka Archiving WG) Building software like alpha.archivers.space for scraping data from gov sites, facilitating a community pipeline for metadata collection, and keeping those datasets in the hands of civil society (potentially via decentralized file systems).
  2. Website Change Monitoring working group. (aka Web Monitoring WG) For building out our system to monitor diffs in gov domains via a pipeline that vets these changes past the eyes of domain experts and into the hands of journalists.

In an effort to help orient our Archiving WG (1), one of our members (h/t @mhucka!) started a similar effort to your awesome list, but in spreadsheet form. The main difference is that we were aiming for a comprehensive survey, not just the "awesome" stuff. This comprehensive approach ensures that we and our collaborators know both what we know about and what we don't.

Related: edgi-govdata-archiving/overview#145

Inspired by that effort, Web Monitoring WG (2) started another awesome list:
https://github.com/edgi-govdata-archiving/awesome-website-change-monitoring

Questions

  1. Any thoughts on how we could integrate our efforts? If there's mutual interest:
    1. Might you be amenable to making the resource explicitly collaborative by instituting a clear contribution process, like the one a CONTRIBUTING.md provides? (We'd be happy to share our CONTRIBUTING.md!)
    2. Is our separate website change monitoring tool resource aligned? Might it be part of a collaboration? Should it remain a separate resource?

Curious to hear your thoughts! Thanks!

cc: @dcwalk

Member

@ruebot ruebot commented Jun 26, 2017

@patcon pull requests are definitely welcome. I think it would be great if there were some sections around y'all's work in the current list. A contributing document would be great as well.

Member

@anjackson anjackson commented Jun 26, 2017

Thanks for this, @patcon. This all looks great!

We want to support tool adoption and development (for all, not just IIPC members) but we've had trouble maintaining our tools lists over long periods of time. The old Tools page on the IIPC website had become hopelessly outdated, and having been involved with this kind of thing before (here's the relevant part of yet another tool registry called COPTR), I've been wary of setting up anything too complex/burdensome.

This is why we've ended up with a relatively brief 'awesome list' rather than something more detailed and comprehensive (like that superb spreadsheet of yours).

So, given all this, how about we try working together as follows:

First, we (IIPC) more clearly define the scope of our awesome list and contribution process (thanks for the template!).

We then aim to include (all/most?) tools in both resources, but the IIPC Awesome List focuses on giving a brief description of each in prose, and defers to either of your spreadsheets for more details (where available). We might mention which IIPC members are using which tools, but that's about it.

Does that sound like a reasonable approach, for now at least? We're open to anything that will help build some community momentum in this area!

anjackson added a commit to ukwa/awesome-web-archiving that referenced this issue Jun 26, 2017
ruebot added a commit that referenced this issue Jun 26, 2017
* Some clean up and added Slack.

* Separate the basic and more advanced stuff, and add the intro video in.

* Added some new links and detail responding to #22.

Member

@anjackson anjackson commented Jun 27, 2017

I've made a first pass at improving the list. Feedback welcome.

I'll look at the CONTRIBUTING.md when I get time.


@mhucka mhucka commented Jun 27, 2017

@anjackson Yup, what you describe sounds reasonable. When I first saw the awesome list, I wondered if that and the spreadsheet should be merged somehow, but then realized that the awesome list provides something the spreadsheet doesn't: a curated list of software grouped by purpose, with short summaries of each software package. It also has information about resources other than software. So, they each have their role.

One thing we're thinking of doing for the spreadsheet is to create a front-end form to enter the information. This would make it easier to fill out entries, and would also improve the correctness of the entries by giving people better guidance about the information being sought. (The latter by virtue of the fact that in a web form, one can explain in more detail the info being sought.) It might also help with crowd-sourcing the data entry.

Anyway, thanks for being willing and interested in connecting these resources!

Contributor Author

@patcon patcon commented Jun 27, 2017

I am an open data fan, and so am pondering whether the README could be generated from more structured CSV data, so long as it's a subset... ;)

(I like the idea of a world where pretty human-friendly awesome lists are generated from structured data, be that yaml or csv or whatever.)
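
A minimal sketch of that idea, assuming a hypothetical CSV with `category`, `name`, `url`, and `description` columns (neither the column names nor the sample rows come from the actual spreadsheet). It groups rows by category and emits a Markdown list that could serve as a generated README:

```python
import csv
import io
from collections import defaultdict


def render_awesome_list(csv_text: str, title: str = "Awesome List") -> str:
    """Render CSV rows as a Markdown list grouped by category.

    The column names (category, name, url, description) are assumptions
    made for this sketch, not the layout of any real spreadsheet.
    """
    sections = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        sections[row["category"]].append(row)

    lines = [f"# {title}", ""]
    # Sort categories and entries so regenerated output is stable across runs.
    for category in sorted(sections):
        lines.append(f"## {category}")
        lines.append("")
        for row in sorted(sections[category], key=lambda r: r["name"].lower()):
            lines.append(f"* [{row['name']}]({row['url']}) - {row['description']}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
SAMPLE = """category,name,url,description
Replay,ExampleReplay,https://example.org/replay,Hypothetical replay tool.
Acquisition,ExampleCrawler,https://example.org/crawler,Hypothetical crawler.
"""

if __name__ == "__main__":
    print(render_awesome_list(SAMPLE))
```

Keeping the CSV as the source of truth would mean the README only ever shows a subset of the columns, so the fuller survey data and the short curated list could not drift apart.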

Contributor

@pirate pirate commented Jun 10, 2018

For posterity, there are a bunch of these lists floating around, to varying degrees of accuracy:

I also made an attempt at combining them all here: https://www.reddit.com/r/DataHoarder/wiki/software#wiki_website_archiving_tools
awesome-web-archiving seems like a really good list to keep up-to-date though, because it's indexed via the lists-of-lists network that is awesome-lists.

anjackson added a commit that referenced this issue Oct 16, 2018
* Some clean up and added Slack.

* Separate the basic and more advanced stuff, and add the intro video in.

* Added some new links and detail responding to #22.

* Add specific section for web publishers.

Member

@ruebot ruebot commented Feb 25, 2020

Are we good to close this one? We have a CONTRIBUTING.md, pull requests are always open, and we're in the process of resolving #48.

5 participants