Clarification around network diagrams #275

Open
ruebot opened this Issue Mar 25, 2019 · 8 comments

4 participants
@ruebot
Member

ruebot commented Mar 25, 2019

During the Team Kompromat presentation at the DC Datathon, @edsu noted that the network diagrams can be misleading. One could assume that a network diagram represents only what is in the analyzed archive itself. We should clarify that this is not the case. So, what's the best place to do it? A note on the diagram, something in the documentation? Something else?

@ruebot ruebot added the ux label Mar 25, 2019

@ianmilligan1

Member

ianmilligan1 commented Mar 25, 2019

Hmm. Maybe a note in the documentation as well as a hover-over question mark icon to display some help text like we do with the derivatives?

@greebie

Contributor

greebie commented Mar 25, 2019

Could we be specific about what is the case (we capture every domain and create an edge for every link we find in the web page)? That is a limitation of the network graphs, since I think people imagine the archives to contain everything in Way Back. (That would be really nice, of course!)

@ianmilligan1

Member

ianmilligan1 commented Mar 25, 2019

Just that what is being visualized is the domains that are captured as well as the domains that they link to (which may or may not be in the actual web archived collection).
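To make the distinction concrete, here is a minimal Python sketch. It assumes the link data is available as (source domain, target domain) pairs and that the set of crawled domains is known separately; the function name and the sample domains are hypothetical, and this is not the project's actual code.

```python
def classify_domains(links, crawled):
    """Map every domain that appears in the graph to True if it was crawled."""
    status = {}
    for source, target in links:
        for domain in (source, target):
            # A domain becomes a node as soon as it appears in any link,
            # whether or not it is part of the archived collection.
            status[domain] = domain in crawled
    return status


# Hypothetical sample data: example.net is linked to but was never crawled.
links = [
    ("example.org", "example.com"),
    ("example.org", "example.net"),
    ("example.com", "example.org"),
]
crawled = {"example.org", "example.com"}

status = classify_domains(links, crawled)
linked_only = sorted(d for d, is_crawled in status.items() if not is_crawled)
print(linked_only)  # ['example.net']
```

All three domains would show up as nodes in the diagram, even though only two of them are in the collection.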

@ianmilligan1

Member

ianmilligan1 commented Mar 25, 2019

What about something like this?

[Image: Screen Shot 2019-03-25 at 11 04 28 AM]

@greebie

Contributor

greebie commented Mar 25, 2019

That works for me!

@ruebot

Member Author

ruebot commented Mar 25, 2019

@ianmilligan1 I like that!

@edsu does that work?

@edsu


edsu commented Mar 25, 2019

Thanks for hearing this part of the presentation, and dropping it in here. You guys are awesome. I like the explanation.

I guess I was imagining (at least) two different types of users of this view.

  • Archivists might like to see what was linked to but not crawled, because it could help them build their collections.
  • Researchers who are trying to understand the content might not care too much about what was archived, and are more interested in seeing the relationships regardless of whether they were crawled. Although I guess seeing what was not crawled could help inform other visualizations, like text analysis, etc.

Maybe it would need to be two views? It would be nice if the underlying derivative Gephi file had a property indicating whether it was crawled or not. Then it could be easy for people to examine...

@greebie

Contributor

greebie commented Mar 25, 2019

Adding a "crawled" or "domain"=1 attribute to the GEXF file would not be too expensive or difficult. It might also be worth considering something in the sigmaJS view to indicate a crawl (change the text size and/or colour, or the node shape?).
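As a rough illustration of what such an attribute could look like, here is a Python sketch that writes a small GEXF 1.2 document with a boolean "crawled" node attribute, using only the standard library. The function name, the attribute id "0", and the sample domains are assumptions made for this example; the project's real derivative code is not shown in this thread and may differ.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(links, crawled):
    """Return a GEXF string whose nodes carry a boolean 'crawled' attribute."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"defaultedgetype": "directed"})

    # Declare the node attribute once; each node then refers to it by id.
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute",
                  {"id": "0", "title": "crawled", "type": "boolean"})

    nodes = ET.SubElement(graph, "nodes")
    edges = ET.SubElement(graph, "edges")

    # One node per domain that appears anywhere in the link list.
    for domain in sorted({d for pair in links for d in pair}):
        node = ET.SubElement(nodes, "node", {"id": domain, "label": domain})
        values = ET.SubElement(node, "attvalues")
        flag = "true" if domain in crawled else "false"
        ET.SubElement(values, "attvalue", {"for": "0", "value": flag})

    # One directed edge per link.
    for i, (source, target) in enumerate(links):
        ET.SubElement(edges, "edge",
                      {"id": str(i), "source": source, "target": target})

    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data: example.net is linked to but not crawled.
links = [
    ("example.org", "example.com"),
    ("example.org", "example.net"),
    ("example.com", "example.org"),
]
crawled = {"example.org", "example.com"}
gexf_xml = build_gexf(links, crawled)
print(gexf_xml)
```

With an attribute like this in the file, Gephi users could filter or colour nodes by crawl status, and a sigmaJS view could read the same value to vary node size, colour, or shape.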
