UDF CaMeL cASe consistency issues #368

Open
lintool opened this issue Oct 26, 2019 · 4 comments

lintool commented Oct 26, 2019

In terms of Scala RDD UDFs, we have:

RemoveHTML(r.getContentString)

And:

RemoveHTML(RemoveHttpHeader(r.getContentString))

I can't think of a case when you'd want clean text but want to keep the HTTP headers... so RemoveHTML should just call RemoveHttpHeader.
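A minimal sketch of that idea, assuming simplified stand-in bodies: the header stripping and the regex-based tag removal below are illustrative assumptions, not the project's actual implementations. Only the names RemoveHTML and RemoveHttpHeader come from this issue.

```scala
// Hypothetical stand-ins for the two UDFs; the bodies are simplified
// assumptions for illustration only.
object RemoveHttpHeader {
  // If the content starts with an HTTP status line, drop everything up to
  // and including the first blank line (CRLF CRLF). Otherwise return as-is.
  def apply(content: String): String = {
    if (content.startsWith("HTTP/")) {
      val idx = content.indexOf("\r\n\r\n")
      if (idx >= 0) content.substring(idx + 4) else content
    } else {
      content
    }
  }
}

object RemoveHTML {
  // Strip the HTTP header first, so callers no longer need to nest the calls.
  def apply(content: String): String = {
    val body = RemoveHttpHeader(content)
    // Naive tag removal (an assumption; a real HTML parser would be more robust).
    body.replaceAll("<[^>]*>", "").trim
  }
}
```

With this shape, RemoveHTML(r.getContentString) alone would cover both of the calls above, and input without an HTTP header passes through the header step unchanged.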

Also, we're mixing camel cases, so it should either be:

  1. RemoveHtml and RemoveHttpHeader
  2. RemoveHTML and RemoveHTTPHeader

Note the MiXEd mess we have now.

Option (1) is more conforming to Java practices, but then we have removePrefixWww, which just looks odd. Maybe we can rename to RemoveW3Prefix?

We also have ComputeMD5 and ComputeSHA1, so perhaps option (2) is better?

Thoughts?

lintool added the clean-up label Oct 26, 2019

ruebot commented Oct 26, 2019

Oh, there's more (UDFs, class, and object names)!

Might as well make a list so we have more info on whatever decision we make:

I might have missed some.


I can't think of a case when you'd want clean text but want to keep the HTTP headers... so RemoveHTML should just call RemoveHttpHeader.

👍

lintool commented Oct 26, 2019

I think my vote is for RemoveHTML, RemoveHTTPHeader, etc.
Also, fewer things to change.

ruebot commented Oct 26, 2019

I'm confused.

So, if I'm understanding you correctly, you're suggesting that we don't worry about the list I dropped in, which highlights other examples of what you originally raised, and either just roll with the inconsistency or resolve only the two you highlighted?

lintool commented Oct 26, 2019

What I meant was that RemoveHttpHeader seems to be the only UDF that's inconsistently cased. I think if we renamed RemoveHttpHeader to RemoveHTTPHeader, this issue would be resolved.
