Building a text classifier with extremely small datasets

Text Classification With Extremely Small Datasets

Accompanying blog post: https://towardsdatascience.com/text-classification-with-extremely-small-datasets-333d322caee2

Credits:

  1. Abhijnan Chakraborty, Bhargavi Paranjape, Sourya Kakarla, and Niloy Ganguly. "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media". In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, US, August 2016.
  2. Potthast et al. (2016): https://webis.de/downloads/publications/papers/stein_2016b.pdf
  3. Terrier stop word list: https://github.com/terrier-org/terrier-desktop/blob/master/share/stopword-list.txt
  4. Downworthy: https://github.com/snipe/downworthy
  5. Dale-Chall easy word list: http://www.readabilityformulas.com/articles/dale-chall-readability-word-list.php