Test aut with Apache Spark 2.4.0 #295

Closed
ruebot opened this issue Nov 30, 2018 · 7 comments

3 participants
@ruebot
Member

commented Nov 30, 2018

Spark 2.4.0 came out in November 2018. We should put aut through a few tests with it.

  1. Update pom.xml to use Apache Spark 2.4.0
  2. Update pom.xml to use Scala 2.11.12
  3. Test all examples here
  4. Test examples slated for 0.18.0 release
  5. Test with AUK

Did I miss anything?
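
Steps 1 and 2 can be sanity-checked from the command line before running anything. A minimal sketch, assuming the versions live in `<properties>` entries named `spark.version` and `scala.version` (hypothetical names; the real aut pom.xml may pin them differently) and that each property sits on a single line:

```shell
#!/bin/sh
# Print the value of a single-line <name>value</name> element in a pom file.
# Usage: pom_property FILE NAME
pom_property() {
  # Escape dots in the element name so they match literally in the regex.
  name=$(printf '%s' "$2" | sed 's/\./\\./g')
  sed -n "s|.*<${name}>\([^<]*\)</${name}>.*|\1|p" "$1" | head -n 1
}

# Return non-zero unless FILE declares NAME with the expected value.
# Usage: expect_property FILE NAME EXPECTED
expect_property() {
  actual=$(pom_property "$1" "$2")
  if [ "$actual" = "$3" ]; then
    echo "ok: $2 = $actual"
  else
    echo "mismatch: $2 = '$actual', expected '$3'" >&2
    return 1
  fi
}
```

For example, `expect_property pom.xml spark.version 2.4.0 && expect_property pom.xml scala.version 2.11.12` would fail fast if either bump was missed, under the naming assumption above.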

@ruebot ruebot self-assigned this Nov 30, 2018

ruebot added a commit that referenced this issue Nov 30, 2018

@ruebot

Member Author

commented Nov 30, 2018

@ianmilligan1

Member

commented Nov 30, 2018

Tested with Spark 2.4.0 on all the examples both currently live on the docs as well as those on the 0.18.0 branch. All checked out.

@ruebot

Member Author

commented Nov 30, 2018

That's great! Thank you!!

I'll test again with AUK. I did it on Monday and didn't run into this. I'm trying to figure out whether that made it into the release, because the release notes list it as a known issue, but it worked on Monday. Maybe it's more of an FYI 🤷‍♂

...anyway. Testing again!

@ruebot

Member Author

commented Nov 30, 2018

Apache Spark 2.4.0 + aut 0.17.0 in AUK works:

/home/nruest/bin/spark-2.4.0-bin-hadoop2.7/bin/spark-shell --master local\[10\] --driver-memory 30G --conf spark.network.timeout=10000000 --conf spark.executor.heartbeatInterval=600s --conf spark.driver.maxResultSize=4G --packages "io.archivesunleashed:aut:0.17.0" -i /home/nruest/tmp/auk/75/3552/2/spark_jobs/3552.scala | tee /home/nruest/tmp/auk/75/3552/2/spark_jobs/3552.scala.log

...

$ tree /home/nruest/tmp/auk/75/3552/2
/home/nruest/tmp/auk/75/3552/2
├── derivatives
│   ├── all-domains
│   │   └── output
│   │       ├── part-00000
│   │       ├── part-00001
│   │       ├── part-00002
│   │       ├── part-00003
│   │       ├── part-00004
│   │       └── _SUCCESS
│   ├── all-text
│   │   └── output
│   │       ├── part-00000
│   │       ├── part-00001
│   │       ├── part-00002
│   │       ├── part-00003
│   │       ├── part-00004
│   │       ├── part-00005
│   │       ├── part-00006
│   │       ├── part-00007
│   │       ├── part-00008
│   │       ├── part-00009
│   │       └── _SUCCESS
│   └── gephi
│       └── 3552-gephi.graphml
└── spark_jobs
    ├── 3552.scala
    └── 3552.scala.log

7 directories, 20 files
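
The `_SUCCESS` markers in the listing are what Hadoop-style output committers leave behind when a job's output is committed, so they make a quick pass/fail check. A minimal sketch, assuming the `derivatives/<name>/output` layout shown above; `check_derivatives` is a hypothetical helper, not part of AUK:

```shell
#!/bin/sh
# Report every derivatives/*/output directory under JOB_DIR that lacks a
# _SUCCESS marker. Returns non-zero if any marker is missing.
# Usage: check_derivatives JOB_DIR
check_derivatives() {
  status=0
  for out in "$1"/derivatives/*/output; do
    # Skip the unexpanded glob when no output directories exist.
    [ -d "$out" ] || continue
    if [ -f "$out/_SUCCESS" ]; then
      echo "ok: $out"
    else
      echo "missing _SUCCESS: $out" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_derivatives /home/nruest/tmp/auk/75/3552/2` against the tree above should report both `output` directories as ok; the `gephi` directory has no `output` subdirectory and is skipped.
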
@ruebot

Member Author

commented Nov 30, 2018

Apache Spark 2.4.0 + aut 0.17.1-SNAPSHOT in AUK fails:

/home/nruest/bin/spark-2.4.0-bin-hadoop2.7/bin/spark-shell --master local\[10\] --driver-memory 30G --conf spark.network.timeout=10000000 --conf spark.executor.heartbeatInterval=600s --conf spark.driver.maxResultSize=4G --packages "io.archivesunleashed:aut:0.17.1-SNAPSHOT" -i /home/nruest/tmp/auk/75/3552/2/spark_jobs/3552.scala | tee /home/nruest/tmp/auk/75/3552/2/spark_jobs/3552.scala.log

:: problems summary ::
:::: WARNINGS
                [NOT FOUND  ] commons-configuration#commons-configuration;1.8!commons-configuration.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/commons-configuration/commons-configuration/1.8/commons-configuration-1.8.jar

                [NOT FOUND  ] org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/org/apache/commons/commons-lang3/3.3.1/commons-lang3-3.3.1.jar

                [NOT FOUND  ] com.google.protobuf#protobuf-java;3.2.0!protobuf-java.jar(bundle) (1ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/com/google/protobuf/protobuf-java/3.2.0/protobuf-java-3.2.0.jar

                [NOT FOUND  ] commons-codec#commons-codec;1.11!commons-codec.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/commons-codec/commons-codec/1.11/commons-codec-1.11.jar

                [NOT FOUND  ] commons-io#commons-io;2.6!commons-io.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/commons-io/commons-io/2.6/commons-io-2.6.jar

                [NOT FOUND  ] com.fasterxml.jackson.core#jackson-core;2.9.6!jackson-core.jar(bundle) (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.9.6/jackson-core-2.9.6.jar

                [NOT FOUND  ] com.fasterxml.jackson.core#jackson-annotations;2.9.6!jackson-annotations.jar(bundle) (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.9.6/jackson-annotations-2.9.6.jar

                [NOT FOUND  ] javax.xml.bind#jaxb-api;2.3.0!jaxb-api.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/javax/xml/bind/jaxb-api/2.3.0/jaxb-api-2.3.0.jar

                [NOT FOUND  ] javax.ws.rs#javax.ws.rs-api;2.1!javax.ws.rs-api.${packaging.type} (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/javax/ws/rs/javax.ws.rs-api/2.1/javax.ws.rs-api-2.1.${packaging.type}

                [NOT FOUND  ] javax.annotation#javax.annotation-api;1.3!javax.annotation-api.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/javax/annotation/javax.annotation-api/1.3/javax.annotation-api-1.3.jar

                [NOT FOUND  ] org.apache.httpcomponents#httpcore;4.4.10!httpcore.jar (0ms)

        ==== local-m2-cache: tried

          file:/home/nruest/.m2/repository/org/apache/httpcomponents/httpcore/4.4.10/httpcore-4.4.10.jar

                ::::::::::::::::::::::::::::::::::::::::::::::

                ::              FAILED DOWNLOADS            ::

                :: ^ see resolution messages for details  ^ ::

                ::::::::::::::::::::::::::::::::::::::::::::::

                :: commons-configuration#commons-configuration;1.8!commons-configuration.jar

                :: org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar

                :: com.google.protobuf#protobuf-java;3.2.0!protobuf-java.jar(bundle)

                :: javax.xml.bind#jaxb-api;2.3.0!jaxb-api.jar

                :: commons-codec#commons-codec;1.11!commons-codec.jar

                :: javax.ws.rs#javax.ws.rs-api;2.1!javax.ws.rs-api.${packaging.type}

                :: javax.annotation#javax.annotation-api;1.3!javax.annotation-api.jar

                :: commons-io#commons-io;2.6!commons-io.jar

                :: org.apache.httpcomponents#httpcore;4.4.10!httpcore.jar

                :: com.fasterxml.jackson.core#jackson-core;2.9.6!jackson-core.jar(bundle)

                :: com.fasterxml.jackson.core#jackson-annotations;2.9.6!jackson-annotations.jar(bundle)

                ::::::::::::::::::::::::::::::::::::::::::::::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [download failed: commons-configuration#commons-configuration;1.8!commons-configuration.jar, download failed: org.apache.commons#commons-lang3;3.3.1!commons-lang3.jar, download failed: com.google.protobuf#protobuf-java;3.2.0!protobuf-java.jar(bundle), download failed: javax.xml.bind#jaxb-api;2.3.0!jaxb-api.jar, download failed: commons-codec#commons-codec;1.11!commons-codec.jar, download failed: javax.ws.rs#javax.ws.rs-api;2.1!javax.ws.rs-api.${packaging.type}, download failed: javax.annotation#javax.annotation-api;1.3!javax.annotation-api.jar, download failed: commons-io#commons-io;2.6!commons-io.jar, download failed: org.apache.httpcomponents#httpcore;4.4.10!httpcore.jar, download failed: com.fasterxml.jackson.core#jackson-core;2.9.6!jackson-core.jar(bundle), download failed: com.fasterxml.jackson.core#jackson-annotations;2.9.6!jackson-annotations.jar(bundle)]
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1306)
        at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:315)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I'll dig into the dependency issues more.
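
Every failed artifact above lists only `local-m2-cache` under "tried", i.e. paths under `~/.m2/repository`. One possible explanation, which is an assumption rather than something this thread confirms, is that those directories hold a `.pom` without the matching jar, so resolution stops there. A small sketch that pulls the tried paths out of a saved copy of the resolver output and reports which ones are absent on disk (`tried_paths` and `report_missing` are hypothetical helper names):

```shell
#!/bin/sh
# Read resolver output on stdin and print each local path listed under a
# "local-m2-cache: tried" entry, one per line, without the file: prefix.
tried_paths() {
  sed -n 's|^[[:space:]]*file:\(/.*\)$|\1|p'
}

# Read paths on stdin; print those that do not exist as regular files.
report_missing() {
  while IFS= read -r path; do
    [ -f "$path" ] || echo "missing: $path"
  done
}
```

Usage would look like `tried_paths < resolve.log | report_missing`, where `resolve.log` is a placeholder for wherever the output above was saved.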

@ruebot

Member Author

commented Dec 3, 2018

Working on getting things to build with --packages on this branch. Things are starting to get ugly. The build currently fails because a test hits the error below. I can keep chipping away at this, but I'd say let's put it on the back burner for now, since I don't think it's much of a priority.

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.338 sec <<< FAILURE!
command line app tests(io.archivesunleashed.CommandLineAppTest)  Time elapsed: 5.118 sec  <<< ERROR!
java.lang.IllegalArgumentException: Illegal pattern component: XXX
	at org.apache.commons.lang3.time.FastDatePrinter.parsePattern(FastDatePrinter.java:282)
	at org.apache.commons.lang3.time.FastDatePrinter.init(FastDatePrinter.java:149)
	at org.apache.commons.lang3.time.FastDatePrinter.<init>(FastDatePrinter.java:142)
	at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:384)
	at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:369)
	at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:91)
	at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:88)
	at org.apache.commons.lang3.time.FormatCache.getInstance(FormatCache.java:82)
	at org.apache.commons.lang3.time.FastDateFormat.getInstance(FastDateFormat.java:165)
	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:139)
	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:41)
	at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.prepareWrite(CSVFileFormat.scala:72)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:103)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
	at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:656)
	at io.archivesunleashed.app.CommandLineApp.save(CommandLineApp.scala:183)
	at io.archivesunleashed.app.CommandLineApp$$anonfun$12.apply(CommandLineApp.scala:147)
	at io.archivesunleashed.app.CommandLineApp$$anonfun$12.apply(CommandLineApp.scala:142)
	at io.archivesunleashed.app.CommandLineApp.dfHandler(CommandLineApp.scala:250)
	at io.archivesunleashed.app.CommandLineApp.process(CommandLineApp.scala:298)
	at io.archivesunleashed.app.CommandLineAppRunner$.test(CommandLineApp.scala:346)
	at io.archivesunleashed.CommandLineAppTest$$anonfun$2$$anonfun$apply$mcV$sp$1.apply(CommandLineAppTest.scala:76)
	at io.archivesunleashed.CommandLineAppTest$$anonfun$2$$anonfun$apply$mcV$sp$1.apply(CommandLineAppTest.scala:75)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at io.archivesunleashed.CommandLineAppTest$$anonfun$2.apply$mcV$sp(CommandLineAppTest.scala:75)
	at io.archivesunleashed.CommandLineAppTest$$anonfun$2.apply(CommandLineAppTest.scala:74)
	at io.archivesunleashed.CommandLineAppTest$$anonfun$2.apply(CommandLineAppTest.scala:74)
	at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
	at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
	at org.scalatest.Transformer.apply(Transformer.scala:22)
	at org.scalatest.Transformer.apply(Transformer.scala:20)
	at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
	at org.scalatest.TestSuite$class.withFixture(TestSuite.scala:196)
	at org.scalatest.FunSuite.withFixture(FunSuite.scala:1560)
	at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
	at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
	at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
	at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
	at io.archivesunleashed.CommandLineAppTest.org$scalatest$BeforeAndAfter$$super$runTest(CommandLineAppTest.scala:32)
	at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:203)
	at io.archivesunleashed.CommandLineAppTest.runTest(CommandLineAppTest.scala:32)
	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
	at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
	at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
	at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
	at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
	at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
	at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
	at org.scalatest.Suite$class.run(Suite.scala:1147)
	at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
	at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
	at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
	at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
	at io.archivesunleashed.CommandLineAppTest.org$scalatest$BeforeAndAfter$$super$run(CommandLineAppTest.scala:32)
	at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:258)
	at io.archivesunleashed.CommandLineAppTest.run(CommandLineAppTest.scala:32)
	at org.scalatest.junit.JUnitRunner.run(JUnitRunner.scala:99)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
Results :

Tests in error: 
  command line app tests(io.archivesunleashed.CommandLineAppTest): Illegal pattern component: XXX

Tests run: 106, Failures: 0, Errors: 1, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 min
[INFO] Finished at: 2018-12-03T13:49:11-05:00
[INFO] Final Memory: 95M/949M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project aut: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/nruest/git/aut/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
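
The exception is thrown inside commons-lang3's `FastDatePrinter` while Spark's `CSVOptions` is setting up a date format, and the resolver output earlier in this thread requests commons-lang3 3.3.1. One plausible but unconfirmed reading is that an older commons-lang3 on the test classpath does not recognise the `XXX` pattern. A sketch for listing which commons-lang3 versions Maven resolves, assuming the usual `group:artifact:type:version:scope` lines printed by `mvn dependency:tree` (`lang3_versions` is a hypothetical helper):

```shell
#!/bin/sh
# Read `mvn dependency:tree` output on stdin and print each distinct
# commons-lang3 version it mentions, one per line.
lang3_versions() {
  sed -n 's|.*org\.apache\.commons:commons-lang3:jar:\([^:]*\):.*|\1|p' | sort -u
}
```

It could be fed with `mvn dependency:tree -Dincludes=org.apache.commons:commons-lang3 | lang3_versions`, and the result compared against the commons-lang3 version that ships in the Spark 2.4.0 distribution's `jars` directory.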

@ruebot ruebot added the on hold label Dec 9, 2018

@jrwiebe jrwiebe self-assigned this Jan 24, 2019

@ruebot

Member Author

commented Jan 25, 2019

@jrwiebe I think I hit the Guava problem in this one too.

ruebot added a commit that referenced this issue Jul 4, 2019

Update to Spark 2.4.3 and update Tika to 1.20.
- Resolves #295
- Resolves #308
- Resolves #286
- Pulls in unfinished work by @jrwiebe and @borislin.