- First, I installed Scala 2.11.6 from scala-lang.org.
- Modified pom.xml and commented out (using <!-- -->) the compiler plugin for quasi-quotes, which are built into the Scala 2.11 library, unlike in 2.10. See sql/catalyst/pom.xml.
- Excluded Kafka from the build command line, since it has no Maven artifact for Scala 2.11.
- Ran dev/change-version-to-2.11.sh
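The pom.xml change in the steps above looks roughly like the following fragment of sql/catalyst/pom.xml. This is a sketch from memory, not the literal diff; the exact coordinates and property names may differ between Spark versions, so match against what is actually in your tree:

```xml
<!-- Quasi-quotes are part of the Scala 2.11 compiler and library, so the
     2.10-only macro dependency can be commented out: -->
<!--
<dependency>
  <groupId>org.scalamacros</groupId>
  <artifactId>quasiquotes_${scala.binary.version}</artifactId>
  <version>${scala.macros.version}</version>
</dependency>
-->
```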
I built Spark with:

```shell
mvn -Phadoop-2.4 -Phive -Pyarn -Pscala-2.11 -pl \!external/kafka,\!external/kafka-assembly,\!examples -DskipTests clean package
```
The modules you can exclude are listed in the pom directly under the modules section. "-pl" is a Maven command-line option that takes a project list (that's an el, not a one); prefixing a module with "!" excludes it.
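One detail worth calling out: the backslashes in the command above exist only to protect "!" from the shell, which otherwise treats it as history expansion in interactive bash. Single quotes work just as well; both forms pass the identical literal argument to mvn, as this small demonstration shows:

```shell
# Both quoting styles produce the same literal project list.
echo '!external/kafka,!external/kafka-assembly,!examples'
echo \!external/kafka,\!external/kafka-assembly,\!examples
```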
You may want to use -Phadoop-provided if you are going to run on YARN directly, since the application master in that deployment model will already contain the Hadoop jars you need. I included -Pyarn so I could run on YARN, but startup is very slow with anything Hadoop, so you may want to just use the Spark standalone master for everything.
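For the YARN case described above, the invocation would look something like this. This is an untested sketch, just the earlier command with -Phadoop-provided added, to be run from the Spark source root:

```shell
# Sketch: same build as before, but leaving Hadoop jars out of the
# assembly because the YARN cluster already provides them.
mvn -Phadoop-2.4 -Phadoop-provided -Phive -Pyarn -Pscala-2.11 \
    -pl \!external/kafka,\!external/kafka-assembly,\!examples \
    -DskipTests clean package
```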
Update for GA:
In the GA release, it appears they have set a flag to exclude quasi-quotes for Scala 2.11. The build did not work for me, so I still had to comment out the dependency in the sql/catalyst/pom.xml file. The Kafka modules are supposed to be used only by the scala-2.10 profile; however, the pom.xml did not work. Essentially, I still had to do everything listed above, even for GA.
Note for 1.5.x:
You need to read the instructions at: http://spark.apache.org/docs/latest/building-spark.html#building-for-scala-211.
Spark now ships with its own Maven distribution, so you always use the version the Spark team uses; look for build/mvn. I had an older Maven install, so I had to set the M2_HOME variable to the Spark-supplied Maven directory to get the Spark-supplied Maven to run correctly.
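Concretely, the M2_HOME workaround looks something like the following. The directory name is an assumption, since the Maven version that build/mvn downloads varies by Spark release; check what actually appears under build/ after the first run:

```shell
# Hypothetical paths: build/mvn downloads Maven into build/ on first use.
cd /path/to/spark                                 # your Spark source checkout
export M2_HOME="$PWD/build/apache-maven-3.3.3"    # version is an assumption
./build/mvn -DskipTests clean package
```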