Update Doc for Latest Changes #19

SinghGursimran · 2019-11-07T03:09:53Z

No description provided.


ruebot

Just a couple of changes; other than that, good to go.

ruebot · 2019-11-07T11:50:04Z

current/text-analysis.md

  .saveAsTextFile("plain-text-noheaders/")
 ```

 As most plain text use cases do not require HTTP headers to be in the output, we are removing headers in the following examples.

 ### Scala DF

-TODO
+```scala


Let's change this so it is consistent with the others:

    import io.archivesunleashed._
    import io.archivesunleashed.df._

    RecordLoader.loadArchives("example.warc.gz", sc)
      .extractValidPagesDF()
      .select(RemoveHTML($"content"))
      .write
      .option("header","true")
      .csv("plain-text-noheaders/")

ruebot · 2019-11-07T11:50:04Z

current/text-analysis.md

+import io.archivesunleashed._
+import io.archivesunleashed.df._
+
+RecordLoader.loadArchives("src/test/resources/warc/example.warc.gz", sc).extractValidPagesDF()


We can strip out: src/test/resources/warc/ here.


g285sing added 6 commits Nov 5, 2019

Updated Documentation

c0fe6c3

Updated Documentation

4406e65

Updated_Documentation

e7dae2a

UpdateCurrentDoc

88f9759

Update

b910251

changes

c94d32b

ruebot requested changes Nov 7, 2019

g285sing added 3 commits Nov 7, 2019

changes

052d94e

Merge branch 'master' of https://github.com/SinghGursimran/aut-docs-new

61516ad

review

9785d1d

ruebot approved these changes Nov 7, 2019

ruebot merged commit 4f73504 into archivesunleashed:master Nov 7, 2019

archivesunleashed/aut-docs-new
