Small page for RESAW Workshop (#124)

ianmilligan1 authored and ruebot committed May 27, 2019
1 parent 89ee08d commit d00a89b0928f6e75ec5c3e2de8a8114674288b28
Showing with 43 additions and 0 deletions.
  1. +43 −0 content/aut/resaw.md
@@ -0,0 +1,43 @@
---
title: Digging into WARCs - Hands-On with the Archives Unleashed Toolkit
date: 2019-05-27T11:51:41-04:00
---

![AUK Notebook screenshot](/images/prompt.png)

## Workshop Description

Welcome to "Digging into WARCs: Hands-On with the Archives Unleashed Toolkit." The Archives Unleashed Toolkit, or AUT, is an open-source platform for managing and analyzing web archives built on Apache Spark.

This is a hands-on introductory workshop with the [Archives Unleashed Toolkit](/aut) and [Archives Unleashed Jupyter Notebooks](/notebooks). No prior technical knowledge is needed; the workshop is aimed at beginner and intermediate users alike.

This workshop is being held on **Tuesday, June 18th** from **14:00-17:30** in **room BG2 0.02** at the ["Web that Was: Archives, Traces, Reflections"](http://thewebthatwas.net) conference.

## Workshop Schedule

| Time | Content |
|-------------|----------------------------------------------------------------------|
| 1400 - 1410 | Introductions, Getting Settled |
| 1410 - 1430 | Introduction to the Archives Unleashed Toolkit (and related project) |
| 1430 - 1530 | Hands-on with the Archives Unleashed Toolkit, Gephi, etc. |
| 1530 - 1600 | Coffee Break |
| 1600 - 1630 | More Advanced Analytics (DataFrames, etc.) |
| 1630 - 1715 | Digging into WARCs with Jupyter Notebooks |
| 1715 - 1730 | Wrap Up |

## Homework

What should you bring? If you want to dig into WARCs yourself, you'll need a laptop. If not, we will be working through exercises collectively, so you are more than welcome to participate in that manner too.

If you plan to participate in the hands-on components, please complete the following homework before the workshop to reduce load on the conference WiFi:

* Install **Docker for Windows or Mac**: <https://archivesunleashed.org/aut/docker-install/>
* Install the **Anaconda Distribution for your platform**: <https://www.anaconda.com/distribution/>. If you are comfortable on the command line, try installing our Notebooks using the instructions under "Local (Anaconda)" here: <https://github.com/archivesunleashed/auk-notebooks>. If not, don’t worry!

While we will provide sample data as part of the workshop, you may want to try it out with your own data. If you have some small WARCs you can bring those, or alternatively you can crawl a few websites with <https://webrecorder.io> and export the WARC(s).
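If you do bring your own WARCs, you may want to confirm beforehand that they open cleanly. The following is a minimal sketch using only the Python standard library; it is not part of the Archives Unleashed Toolkit, the function names `count_warc_records` and `summarize_warc` are hypothetical, and it assumes a well-formed file in which every record carries a `Content-Length` header. It simply counts records by their `WARC-Type`.

```python
import gzip
from collections import Counter


def count_warc_records(stream):
    """Count records by WARC-Type in an uncompressed WARC byte stream.

    Illustrative only: assumes each record has a Content-Length header.
    """
    counts = Counter()
    while True:
        line = stream.readline()
        if not line:
            break  # end of file
        if not line.startswith(b"WARC/"):
            continue  # skip the blank lines that separate records
        headers = {}
        while True:
            header = stream.readline()
            if header in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header block
            name, _, value = header.partition(b":")
            headers[name.strip().lower()] = value.strip()
        record_type = headers.get(b"warc-type", b"unknown")
        counts[record_type.decode("ascii", "replace")] += 1
        # Read past the record body so payload bytes are never parsed as headers.
        stream.read(int(headers.get(b"content-length", b"0")))
    return counts


def summarize_warc(path):
    """Open a .warc or .warc.gz file (hypothetical helper) and count its records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as stream:
        return count_warc_records(stream)


# Example (the file name is a placeholder for your own export):
# print(summarize_warc("my-crawl.warc.gz"))
```

A file that yields at least one `response` record is probably fine to bring along; an exception or an empty result suggests the export is worth repeating.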

## Questions?

If you have any questions, please contact [Nick Ruest](mailto:nick@archivesunleashed.org) or [Ian Milligan](mailto:i2milligan@uwaterloo.ca).

![AUK Notebook screenshot](/images/gephi.png)
