
Install deps with requirements.txt; resolve #32. (#33)

- Add requirements.txt
- Update Docker to install dependencies with requirements.txt
- Remove nltk_data
- Install nltk_data on Docker build
- Update README with new instructions
ruebot authored and ianmilligan1 committed Mar 7, 2019
1 parent 7e20729 commit 9afb3ebaffefcd5f856ae50367f3a9cc55ca06f0
Showing with 21 additions and 2,334,346 deletions.
  1. +3 −6 Dockerfile
  2. +13 −8 README.md
  3. BIN nltk_data/corpora/stopwords.zip
  4. +0 −32 nltk_data/corpora/stopwords/README
  5. +0 −248 nltk_data/corpora/stopwords/arabic
  6. +0 −165 nltk_data/corpora/stopwords/azerbaijani
  7. +0 −94 nltk_data/corpora/stopwords/danish
  8. +0 −101 nltk_data/corpora/stopwords/dutch
  9. +0 −179 nltk_data/corpora/stopwords/english
  10. +0 −235 nltk_data/corpora/stopwords/finnish
  11. +0 −155 nltk_data/corpora/stopwords/french
  12. +0 −231 nltk_data/corpora/stopwords/german
  13. +0 −265 nltk_data/corpora/stopwords/greek
  14. +0 −199 nltk_data/corpora/stopwords/hungarian
  15. +0 −758 nltk_data/corpora/stopwords/indonesian
  16. +0 −279 nltk_data/corpora/stopwords/italian
  17. +0 −380 nltk_data/corpora/stopwords/kazakh
  18. +0 −255 nltk_data/corpora/stopwords/nepali
  19. +0 −176 nltk_data/corpora/stopwords/norwegian
  20. +0 −203 nltk_data/corpora/stopwords/portuguese
  21. +0 −356 nltk_data/corpora/stopwords/romanian
  22. +0 −151 nltk_data/corpora/stopwords/russian
  23. +0 −313 nltk_data/corpora/stopwords/spanish
  24. +0 −114 nltk_data/corpora/stopwords/swedish
  25. +0 −53 nltk_data/corpora/stopwords/turkish
  26. BIN nltk_data/sentiment/vader_lexicon.zip
  27. BIN nltk_data/tokenizers/punkt.zip
  28. +0 −98 nltk_data/tokenizers/punkt/PY3/README
  29. BIN nltk_data/tokenizers/punkt/PY3/czech.pickle
  30. BIN nltk_data/tokenizers/punkt/PY3/danish.pickle
  31. BIN nltk_data/tokenizers/punkt/PY3/dutch.pickle
  32. BIN nltk_data/tokenizers/punkt/PY3/english.pickle
  33. BIN nltk_data/tokenizers/punkt/PY3/estonian.pickle
  34. BIN nltk_data/tokenizers/punkt/PY3/finnish.pickle
  35. BIN nltk_data/tokenizers/punkt/PY3/french.pickle
  36. BIN nltk_data/tokenizers/punkt/PY3/german.pickle
  37. BIN nltk_data/tokenizers/punkt/PY3/greek.pickle
  38. BIN nltk_data/tokenizers/punkt/PY3/italian.pickle
  39. BIN nltk_data/tokenizers/punkt/PY3/norwegian.pickle
  40. BIN nltk_data/tokenizers/punkt/PY3/polish.pickle
  41. BIN nltk_data/tokenizers/punkt/PY3/portuguese.pickle
  42. BIN nltk_data/tokenizers/punkt/PY3/slovene.pickle
  43. BIN nltk_data/tokenizers/punkt/PY3/spanish.pickle
  44. BIN nltk_data/tokenizers/punkt/PY3/swedish.pickle
  45. BIN nltk_data/tokenizers/punkt/PY3/turkish.pickle
  46. +0 −98 nltk_data/tokenizers/punkt/README
  47. +0 −159,140 nltk_data/tokenizers/punkt/czech.pickle
  48. +0 −162,767 nltk_data/tokenizers/punkt/danish.pickle
  49. +0 −97,138 nltk_data/tokenizers/punkt/dutch.pickle
  50. +0 −61,702 nltk_data/tokenizers/punkt/english.pickle
  51. +0 −206,369 nltk_data/tokenizers/punkt/estonian.pickle
  52. +0 −240,379 nltk_data/tokenizers/punkt/finnish.pickle
  53. +0 −80,529 nltk_data/tokenizers/punkt/french.pickle
  54. +0 −181,299 nltk_data/tokenizers/punkt/german.pickle
  55. +0 −89,257 nltk_data/tokenizers/punkt/greek.pickle
  56. +0 −90,202 nltk_data/tokenizers/punkt/italian.pickle
  57. +0 −162,978 nltk_data/tokenizers/punkt/norwegian.pickle
  58. +0 −245,172 nltk_data/tokenizers/punkt/polish.pickle
  59. +0 −90,795 nltk_data/tokenizers/punkt/portuguese.pickle
  60. +0 −106,925 nltk_data/tokenizers/punkt/slovene.pickle
  61. +0 −82,636 nltk_data/tokenizers/punkt/spanish.pickle
  62. +0 −133,719 nltk_data/tokenizers/punkt/swedish.pickle
  63. +0 −138,187 nltk_data/tokenizers/punkt/turkish.pickle
  64. +5 −0 requirements.txt
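
The new `requirements.txt` itself (+5 −0) is collapsed in this view. Judging from the five pinned packages removed from the Dockerfile in this same commit, it presumably contains exactly those pins:

```
matplotlib==3.0.2
numpy==1.15.1
pandas==0.23.4
networkx==2.2
nltk==3.4
```

Moving pins out of the `RUN pip install` line means the same file drives both the Docker build and a local `pip install -r requirements.txt`, so the two environments cannot drift apart.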
```diff
--- a/Dockerfile
+++ b/Dockerfile
@@ -8,15 +8,12 @@
 LABEL description="Docker image for the Archives Unleashed Notebooks"
 LABEL website="https://archivesunleashed.org/"
 
 # Install auk-notebook dependencies.
-RUN pip install matplotlib==3.0.2 \
-    numpy==1.15.1 \
-    pandas==0.23.4 \
-    networkx==2.2 \
-    nltk==3.4
+COPY requirements.txt /tmp/requirements.txt
+RUN pip install -r /tmp/requirements.txt
+RUN python -m nltk.downloader punkt vader_lexicon stopwords
 
 # Copy auk-notebook files over.
 COPY data $HOME/data
-COPY nltk_data $HOME/nltk_data
 COPY auk-notebook.ipynb $HOME
 COPY auk-notebook-example.ipynb $HOME
```
````diff
--- a/README.md
+++ b/README.md
@@ -20,11 +20,24 @@
 * pandas (0.23.4)
 * networkx (2.2)
 * nltk (3.4)
+  * punkt
+  * vader_lexicon
+  * stopwords
 
 ## Usage
 
+We suggest using [Docker](https://www.docker.com/get-started), or [Anaconda Distribution](https://www.anaconda.com/distribution).
+
+### Local (Anaconda)
+
+```bash
+git clone https://github.com/archivesunleashed/auk-notebooks.git
+cd auk-notebooks
+pip install -r requirements.txt
+python -m nltk.downloader punkt vader_lexicon stopwords
+jupyter notebook
+```
+
 ### Docker Hub
 
 ```bash
````
````diff
@@ -50,14 +63,6 @@
 docker run --rm -it -p 8888:8888 -v "/path/to/own/data:/home/jovyan/data" auk-notebook
 
 This repository also uses the [Jupyter Docker Stacks](https://jupyter-docker-stacks.readthedocs.io/en/latest/index.html), which provide [a lot of helpful options to take advantage of](https://jupyter-docker-stacks.readthedocs.io/en/latest/using/common.html#docker-options).
 
-### Local (Anaconda)
-
-```bash
-git clone https://github.com/archivesunleashed/auk-notebooks.git
-cd auk-notebooks
-jupyter notebook
-```
-
 ## License
 
 This application is available as open source under the terms of the [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
````