---
title: Digging into WARCs - Hands-On with the Archives Unleashed Toolkit
date: 2019-05-27T11:51:41-04:00
---

![AUK Notebook screenshot](/images/prompt.png)

## Workshop Description

Welcome to "Digging into WARCs: Hands-On with the Archives Unleashed Toolkit." The Archives Unleashed Toolkit, or AUT, is an open-source platform, built on Apache Spark, for managing and analyzing web archives.

This is a hands-on introductory workshop with the [Archives Unleashed Toolkit](/aut) and [Archives Unleashed Jupyter Notebooks](/notebooks). No prior technical knowledge is needed; the workshop is aimed at beginner and intermediate users alike.

This workshop is being held on **Tuesday, June 18th** from **14:00-17:30** in **room BG2 0.02** at the ["Web that Was: Archives, Traces, Reflections"](http://thewebthatwas.net) conference.

## Workshop Schedule

| Time        | Content                                                              |
|-------------|----------------------------------------------------------------------|
| 1400 - 1410 | Introductions, Getting Settled                                       |
| 1410 - 1430 | Introduction to the Archives Unleashed Toolkit (and related project) |
| 1430 - 1530 | Hands-on with the Archives Unleashed Toolkit, Gephi, etc.            |
| 1530 - 1600 | Coffee Break                                                         |
| 1600 - 1630 | More Advanced Analytics (DataFrames, etc.)                           |
| 1630 - 1715 | Digging into WARCs with Jupyter Notebooks                            |
| 1715 - 1730 | Wrap Up                                                              |

## Homework

What should you bring? If you want to dig into WARCs yourself, you'll need a laptop. If not, we will be working through exercises collectively, so you are more than welcome to participate that way too.

If you plan to participate in the hands-on components, please complete the following homework ahead of time to reduce load on the conference WiFi:

* Install **Docker for Windows or Mac**: <https://archivesunleashed.org/aut/docker-install/>
* Install **Anaconda Distribution for your platform**: <https://www.anaconda.com/distribution/>. If you are comfortable on the command line, try installing our Notebooks using the instructions under Local (Anaconda) here: <https://github.com/archivesunleashed/auk-notebooks>. If not, don’t worry!

While we will provide sample data as part of the workshop, you may want to try it out with your own data. If you have some small WARCs, you can bring those; alternatively, you can crawl a few websites with <https://webrecorder.io> and export the WARC(s).

## Questions?

If you have any questions, please contact [Nick Ruest](mailto:nick@archivesunleashed.org) or [Ian Milligan](mailto:i2milligan@uwaterloo.ca).

![AUK Notebook screenshot](/images/gephi.png)