Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use NLTK stopwords, cite example dataset. #15

Merged
merged 14 commits into from Mar 4, 2019

Conversation

Projects
None yet
3 participants
@ruebot
Copy link
Member

ruebot commented Mar 3, 2019

DO NOT MERGE UNTIL #12 is merged.

- Resolve #14
- Resolve #13
- Update notebooks to use NLTK stopwords
- Add NLTK stopwords

ruebot added some commits Mar 1, 2019

Setup Docker for notebooks
- Add Dockerfile,
- Add required nltk data
- Add instructions in README
Resolve #9
- Rename notebook files
  - Create example notebook with existing data
  - Create notebook with empty collection id, and help text
- Update language in README and notebooks
- Tweak Dockerfile
Use NLTK stopwords, cite example dataset.
- Resolve #14
- Resolve #13
- Update notebooks to use NLTK stopwords
- Add NLTK stopwords

@ruebot ruebot requested review from ianmilligan1 and greebie Mar 3, 2019

@ruebot

This comment has been minimized.

Copy link
Member Author

ruebot commented Mar 3, 2019

Binder

@ruebot

This comment has been minimized.

Copy link
Member Author

ruebot commented Mar 3, 2019

Since we have some discussion going on #13, we'll need to make sure we don't mark it as resolved in the commit message. Just partially addressed, since we're actually citing it now.

@ianmilligan1
Copy link
Member

ianmilligan1 left a comment

Tested and works well with the NLTK stop words!

ruebot added some commits Mar 4, 2019

@ruebot

This comment has been minimized.

Copy link
Member Author

ruebot commented Mar 4, 2019

Commit message:

Title: Use NLTK stopwords, update README (#15)

Body:

    - Resolve #14
    - Partially address #13
    - Resolve #17 
    - Update notebooks to use NLTK stopwords
    - Add NLTK stopwords

screenshot from 2019-03-03 21-56-01

README.md Outdated
docker build -t auk-notebook .
docker run --rm -it -p 8888:8888 auk-notebook
```console
$ docker run --rm -it -p 8888:8888 archivesunleashed/auk-notebooks

This comment has been minimized.

@ruebot

ruebot Mar 4, 2019

Author Member

@ianmilligan1 @greebie something to consider here, do you want the prompt example (console)? Or remove the prompt, and use bash? I'm fine with either.

This comment has been minimized.

@ianmilligan1

ianmilligan1 Mar 4, 2019

Member

I'm fine with either as well. Is there a preferred style in this field do you think? I've seen both..

This comment has been minimized.

@ruebot

ruebot Mar 4, 2019

Author Member

It's usually one or the other. My general preference is for the prompt to be there. But, that can cause problems for folks copying & pasting without knowing what the $ represents.

This comment has been minimized.

@ianmilligan1

ianmilligan1 Mar 4, 2019

Member

Heh, and checking our repositories, I see we use both (I tend to just provide bash commands myself without prompts). I think given our audience, maybe if everything's equal, let's just make it copy-and-paste friendly and go with bash?

This comment has been minimized.

@ruebot

ruebot Mar 4, 2019

Author Member

Cool. I'll change it back.

@greebie

This comment has been minimized.

Copy link
Collaborator

greebie commented Mar 4, 2019

Works great. One reminder is that if someone has a notebook up, the docker instructions may not show the correct output. Something to consider for the README.md.

@greebie

greebie approved these changes Mar 4, 2019

ruebot added some commits Mar 4, 2019

@ruebot

This comment has been minimized.

Copy link
Member Author

ruebot commented Mar 4, 2019

@ianmilligan1 once dockercloud finishes, this should be good to merge with the commit message above.

Once that's done, I'm jump on some of the other open issues.

@ianmilligan1 ianmilligan1 merged commit d1088fa into master Mar 4, 2019

1 check passed

ci/dockercloud Your tests passed in Docker Cloud
Details

@ianmilligan1 ianmilligan1 deleted the issue-14 branch Mar 4, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.