Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
31 lines (27 sloc) 826 Bytes

NwalaTextUtils

Collection of text processing Python functions.

Installation

Stable

$ pip install NwalaTextUtils

Dev.

$ git clone https://github.com/oduwsdl/NwalaTextUtils.git
$ cd NwalaTextUtils/
$ pip install .

Usage Examples

Dereference and Remove Boilerplate from URIs:

import json
from NwalaTextUtils.textutils import prlGetTxtFrmURIs

uris_lst = [
    'http://www.euro.who.int/en/health-topics/emergencies/pages/news/news/2015/03/united-kingdom-is-declared-free-of-ebola-virus-disease',
    'https://time.com/3505982/ebola-new-cases-world-health-organization/',
    'https://www.scientificamerican.com/article/why-ebola-survivors-struggle-with-new-symptoms/'
  ]

doc_lst = prlGetTxtFrmURIs(uris_lst)
with open('doc_lst.json', 'w') as outfile:
    json.dump(doc_lst, outfile)
You can’t perform that action at this time.