Implement Python versions of App utilities #409

Closed
ruebot opened this issue Jan 17, 2020 · 0 comments

ruebot commented Jan 17, 2020

Functionality RDD Scala DF Python DF
CommandLineApp Yes Yes
DomainFrequencyExtractor Yes Yes
DomainGraphExtractor Yes Yes
ExtractEntities Yes -
ExtractGraphX Yes No
ExtractPopularImages Yes Yes
NERCombinerJson Yes -
PlainTextExtractor Yes No
WriteGEXF Yes Yes
WriteGraph Yes Yes
WriteGraphML Yes Yes

Stealing @SinghGursimran's very helpful tables here 😃

ruebot added a commit that referenced this issue Jan 18, 2020
- Add remove_http_header, remove_prefix_www
- Rename extract_domain_func to extract_domain
- Formatting updates
- Addresses #409
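
For readers unfamiliar with the helpers named in this commit, here is a rough sketch of what functions with these names might do. This is not the project's code: the bodies below are assumptions inferred only from the function names, written in plain Python with the standard library, and the real versions may differ (for example, they may be wrapped as PySpark UDFs).

```python
from urllib.parse import urlparse


def extract_domain(url: str) -> str:
    """Return the host part of a URL, or "" if none can be parsed (assumed behaviour)."""
    host = urlparse(url).hostname
    return host or ""


def remove_prefix_www(domain: str) -> str:
    """Drop a single leading "www." from a domain name (assumed behaviour)."""
    return domain[4:] if domain.startswith("www.") else domain


def remove_http_header(content: str) -> str:
    """Strip a leading HTTP response header block, if present (assumed behaviour).

    The header is taken to be everything up to the first blank line,
    and only when the content starts with an HTTP status line.
    """
    if not content.startswith("HTTP/"):
        return content
    _head, sep, body = content.partition("\r\n\r\n")
    return body if sep else content
```

Chaining the first two, `remove_prefix_www(extract_domain(url))`, gives a bare domain suitable for frequency counts.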
ruebot added this to ToDo in DataFrames and PySpark Feb 5, 2020
ruebot added a commit that referenced this issue May 19, 2020
- Resolves #408
- Alphabetizes DataFrameLoader functions
- Alphabetizes UDFs functions
- Move DataFrameLoader to df packages
- Move UDFs out of df into their own package
- Rename UDFs (no more DF tagged to the end).
- Update tests as necessary
- Partially addresses #410, #409
- Supersedes #412.
ianmilligan1 pushed a commit that referenced this issue May 19, 2020
- Resolves #408
- Alphabetizes DataFrameLoader functions
- Alphabetizes UDFs functions
- Move DataFrameLoader to df packages
- Move UDFs out of df into their own package
- Rename UDFs (no more DF tagged to the end).
- Update tests as necessary
- Partially addresses #410, #409
- Supersedes #412.
ruebot self-assigned this May 19, 2020
ruebot moved this from ToDo to In Progress in DataFrames and PySpark May 19, 2020
ruebot added a commit that referenced this issue May 25, 2020
- Partially addresses #409
- Clean up formatting in app.py
- Cleanup doc comments on the Scala side
- TODO: Extract Entities
ruebot moved this from In Progress to In review in DataFrames and PySpark May 27, 2020
ruebot moved this from In review to Done in DataFrames and PySpark May 27, 2020