
Install deps with requirements.txt; resolve #32. (#33)

- Add requirements.txt
- Update Docker to install dependencies with requirements.txt
- Remove nltk_data
- Install nltk_data on Docker build
- Update README with new instructions
ruebot authored and ianmilligan1 committed Mar 7, 2019
1 parent 7e20729 commit 9afb3ebaffefcd5f856ae50367f3a9cc55ca06f0
Showing with 21 additions and 2,334,346 deletions.
  1. +3 −6 Dockerfile
  2. +13 −8 README.md
  3. BIN nltk_data/corpora/stopwords.zip
  4. +0 −32 nltk_data/corpora/stopwords/README
  5. +0 −248 nltk_data/corpora/stopwords/arabic
  6. +0 −165 nltk_data/corpora/stopwords/azerbaijani
  7. +0 −94 nltk_data/corpora/stopwords/danish
  8. +0 −101 nltk_data/corpora/stopwords/dutch
  9. +0 −179 nltk_data/corpora/stopwords/english
  10. +0 −235 nltk_data/corpora/stopwords/finnish
  11. +0 −155 nltk_data/corpora/stopwords/french
  12. +0 −231 nltk_data/corpora/stopwords/german
  13. +0 −265 nltk_data/corpora/stopwords/greek
  14. +0 −199 nltk_data/corpora/stopwords/hungarian
  15. +0 −758 nltk_data/corpora/stopwords/indonesian
  16. +0 −279 nltk_data/corpora/stopwords/italian
  17. +0 −380 nltk_data/corpora/stopwords/kazakh
  18. +0 −255 nltk_data/corpora/stopwords/nepali
  19. +0 −176 nltk_data/corpora/stopwords/norwegian
  20. +0 −203 nltk_data/corpora/stopwords/portuguese
  21. +0 −356 nltk_data/corpora/stopwords/romanian
  22. +0 −151 nltk_data/corpora/stopwords/russian
  23. +0 −313 nltk_data/corpora/stopwords/spanish
  24. +0 −114 nltk_data/corpora/stopwords/swedish
  25. +0 −53 nltk_data/corpora/stopwords/turkish
  26. BIN nltk_data/sentiment/vader_lexicon.zip
  27. BIN nltk_data/tokenizers/punkt.zip
  28. +0 −98 nltk_data/tokenizers/punkt/PY3/README
  29. BIN nltk_data/tokenizers/punkt/PY3/czech.pickle
  30. BIN nltk_data/tokenizers/punkt/PY3/danish.pickle
  31. BIN nltk_data/tokenizers/punkt/PY3/dutch.pickle
  32. BIN nltk_data/tokenizers/punkt/PY3/english.pickle
  33. BIN nltk_data/tokenizers/punkt/PY3/estonian.pickle
  34. BIN nltk_data/tokenizers/punkt/PY3/finnish.pickle
  35. BIN nltk_data/tokenizers/punkt/PY3/french.pickle
  36. BIN nltk_data/tokenizers/punkt/PY3/german.pickle
  37. BIN nltk_data/tokenizers/punkt/PY3/greek.pickle
  38. BIN nltk_data/tokenizers/punkt/PY3/italian.pickle
  39. BIN nltk_data/tokenizers/punkt/PY3/norwegian.pickle
  40. BIN nltk_data/tokenizers/punkt/PY3/polish.pickle
  41. BIN nltk_data/tokenizers/punkt/PY3/portuguese.pickle
  42. BIN nltk_data/tokenizers/punkt/PY3/slovene.pickle
  43. BIN nltk_data/tokenizers/punkt/PY3/spanish.pickle
  44. BIN nltk_data/tokenizers/punkt/PY3/swedish.pickle
  45. BIN nltk_data/tokenizers/punkt/PY3/turkish.pickle
  46. +0 −98 nltk_data/tokenizers/punkt/README
  47. +0 −159,140 nltk_data/tokenizers/punkt/czech.pickle
  48. +0 −162,767 nltk_data/tokenizers/punkt/danish.pickle
  49. +0 −97,138 nltk_data/tokenizers/punkt/dutch.pickle
  50. +0 −61,702 nltk_data/tokenizers/punkt/english.pickle
  51. +0 −206,369 nltk_data/tokenizers/punkt/estonian.pickle
  52. +0 −240,379 nltk_data/tokenizers/punkt/finnish.pickle
  53. +0 −80,529 nltk_data/tokenizers/punkt/french.pickle
  54. +0 −181,299 nltk_data/tokenizers/punkt/german.pickle
  55. +0 −89,257 nltk_data/tokenizers/punkt/greek.pickle
  56. +0 −90,202 nltk_data/tokenizers/punkt/italian.pickle
  57. +0 −162,978 nltk_data/tokenizers/punkt/norwegian.pickle
  58. +0 −245,172 nltk_data/tokenizers/punkt/polish.pickle
  59. +0 −90,795 nltk_data/tokenizers/punkt/portuguese.pickle
  60. +0 −106,925 nltk_data/tokenizers/punkt/slovene.pickle
  61. +0 −82,636 nltk_data/tokenizers/punkt/spanish.pickle
  62. +0 −133,719 nltk_data/tokenizers/punkt/swedish.pickle
  63. +0 −138,187 nltk_data/tokenizers/punkt/turkish.pickle
  64. +5 −0 requirements.txt
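
The contents of the new five-line requirements.txt are not shown in this view; judging by the pinned versions removed from the Dockerfile in the diff below, it presumably reads:

```
matplotlib==3.0.2
numpy==1.15.1
pandas==0.23.4
networkx==2.2
nltk==3.4
```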
```diff
@@ -8,15 +8,12 @@ LABEL description="Docker image for the Archives Unleashed Notebooks"
 LABEL website="https://archivesunleashed.org/"
 
 # Install auk-notebook dependencies.
-RUN pip install matplotlib==3.0.2 \
-    numpy==1.15.1 \
-    pandas==0.23.4 \
-    networkx==2.2 \
-    nltk==3.4
+COPY requirements.txt /tmp/requirements.txt
+RUN pip install -r /tmp/requirements.txt
 RUN python -m nltk.downloader punkt vader_lexicon stopwords
 
 # Copy auk-notebook files over.
 COPY data $HOME/data
-COPY nltk_data $HOME/nltk_data
 COPY auk-notebook.ipynb $HOME
 COPY auk-notebook-example.ipynb $HOME
```
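Moving the pins out of an inline `pip install` and into requirements.txt keeps the versions in one place, and exact `==` pins are what make repeated Docker builds install identical versions. A minimal sketch (not part of this repository; `exact_pins` is a hypothetical helper) of validating that every requirement line is exactly pinned:

```python
# Sketch: verify that every requirement line uses an exact "==" pin,
# so repeated builds resolve to identical package versions.
def exact_pins(requirements_text):
    """Return {package: version}; raise ValueError on any unpinned line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank lines
        if "==" not in line:
            raise ValueError(f"unpinned requirement: {line}")
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins

# The five packages the Dockerfile previously installed inline:
sample = """\
matplotlib==3.0.2
numpy==1.15.1
pandas==0.23.4
networkx==2.2
nltk==3.4
"""
print(exact_pins(sample))
```

A range specifier such as `requests>=2.0` would raise, which is the point: a pinned file fails loudly instead of silently drifting between builds.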

````diff
@@ -20,11 +20,24 @@
 * pandas (0.23.4)
 * networkx (2.2)
 * nltk (3.4)
+* punkt
+* vader_lexicon
+* stopwords
 
 ## Usage
 
 We suggest using [Docker](https://www.docker.com/get-started), or [Anaconda Distribution](https://www.anaconda.com/distribution).
 
+### Local (Anaconda)
+
+```bash
+git clone https://github.com/archivesunleashed/auk-notebooks.git
+cd auk-notebooks
+pip install -r requirements.txt
+python -m nltk.downloader punkt vader_lexicon stopwords
+jupyter notebook
+```
+
 ### Docker Hub
 
 ```bash
@@ -50,14 +63,6 @@ docker run --rm -it -p 8888:8888 -v "/path/to/own/data:/home/jovyan/data" auk-no
 This repository also uses the [Jupyter Docker Stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html), which provide [a lot of helpful options to take advantage of](https://jupyter-docker-stacks.readthedocs.io/en/latest/using/common.html#docker-options).
 
-### Local (Anaconda)
-
-```bash
-git clone https://github.com/archivesunleashed/auk-notebooks.git
-cd auk-notebooks
-jupyter notebook
-```
-
 ## License
 
 This application is available as open source under the terms of the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
````