UDFs that filter on url should also filter on src #418
Comments
@SinghGursimran want this one since we're stuck in a holding pattern on the Python side of things until I sort out the Scala UDF -> Python UDF linkage?
@ruebot Shall I add a new function to handle src and dest, or accommodate them within the same function using an extra argument?
Based on the chat @lintool and I were having in Slack this morning, it'd be amending the current functions. I think we could just do this with try cases (oh, I don't know what the proper Scala term is for it
Ok....
@SinghGursimran If it helps to see an actual use case/test case, this is how it popped up: https://gist.github.com/ruebot/60b5f848252284b7f380e3d5006d7135 I tried to run the
ruebot commented Feb 10, 2020
We are currently unable to run a number of DataFrame filters on `.imageLinks()` and `webgraph()` because they have `src` and/or `dest` columns instead of `url`. The DataFrame filters should be able to filter on those columns as well.

- keepUrlsDF
- keepDomainsDF
- discardUrlsDF
- discardDomainsDF
- discardUrlPatternsDF
- keepUrlPatternsDF
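
To make the request concrete, here is a rough sketch of the column-fallback idea in plain Python. It is not the project's Scala code: rows are modelled as dictionaries rather than Spark DataFrames, and the names `URL_COLUMNS`, `url_columns_present`, `keep_urls`, and `discard_urls` are hypothetical. It also assumes that a row with both `src` and `dest` counts as a match when either column matches, which the issue does not specify.

```python
from typing import Dict, List, Set

# Column names taken from the issue text; the tuple itself is an assumption.
URL_COLUMNS = ("url", "src", "dest")

Row = Dict[str, str]


def url_columns_present(rows: List[Row]) -> List[str]:
    """Return the URL-like columns that appear in at least one row."""
    return [c for c in URL_COLUMNS if any(c in r for r in rows)]


def keep_urls(rows: List[Row], urls: Set[str]) -> List[Row]:
    """Keep rows where any URL-like column holds one of the given URLs."""
    cols = url_columns_present(rows)
    return [r for r in rows if any(r.get(c) in urls for c in cols)]


def discard_urls(rows: List[Row], urls: Set[str]) -> List[Row]:
    """Drop rows where any URL-like column holds one of the given URLs."""
    cols = url_columns_present(rows)
    return [r for r in rows if not any(r.get(c) in urls for c in cols)]


# Made-up sample data: one table with a `url` column, one with `src`/`dest`.
pages = [{"url": "http://a.org/"}, {"url": "http://b.org/"}]
links = [
    {"src": "http://a.org/", "dest": "http://c.org/"},
    {"src": "http://b.org/", "dest": "http://d.org/"},
    {"src": "http://d.org/", "dest": "http://a.org/"},
]

kept_pages = keep_urls(pages, {"http://a.org/"})
kept_links = keep_urls(links, {"http://a.org/"})
remaining_links = discard_urls(links, {"http://a.org/"})
```

The same filter call works on both tables because the column lookup happens at call time, which mirrors the "amend the current functions" approach discussed in the comments rather than adding separate `src`/`dest` variants.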