Running your first Apache Spark app

The Spark Getting Started guide is pretty good, but it’s not immediately obvious that you don’t run an app that uses the Spark API as a standalone executable. If you try, you’ll get an error like this:

17/11/07 19:15:20 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at kh.textanalysis.spark.SparkWordCount.workCount(
at kh.textanalysis.spark.SparkWordCount.main(

Instead, if you’re using Maven, package the app with ‘mvn package’, start a local master node:


and then submit it to your Spark node for processing:

./bin/spark-submit \
  --class "MyApp" \
  --master local[1] \
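
To make the package-then-submit flow a bit more concrete, here is a rough sketch of a wrapper script. The Spark install location (/opt/spark) and the jar name (target/my-app-1.0.jar) are made-up placeholders, not values from my project, so swap in your own. By default it only prints the command it would run; setting RUN=1 is a convention invented for this sketch, not a Spark or Maven feature.

```shell
#!/bin/sh
# Hypothetical defaults: adjust for your own project and Spark install.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"        # assumed install location
APP_JAR="${APP_JAR:-target/my-app-1.0.jar}"   # assumed Maven output jar
MAIN_CLASS="${MAIN_CLASS:-MyApp}"             # main class passed to --class
MASTER="${MASTER:-local[1]}"                  # local mode, one worker thread

# Print the spark-submit command line so it can be inspected before running.
build_submit_cmd() {
  printf '%s --class %s --master %s %s\n' \
    "$SPARK_HOME/bin/spark-submit" "$MAIN_CLASS" "$MASTER" "$APP_JAR"
}

# Only package and submit when explicitly asked (RUN=1); otherwise dry run.
if [ "${RUN:-0}" = "1" ]; then
  mvn package
  "$SPARK_HOME/bin/spark-submit" \
    --class "$MAIN_CLASS" \
    --master "$MASTER" \
    "$APP_JAR"
else
  build_submit_cmd
fi
```

Keeping the master value in a quoted variable also stops the shell from treating the square brackets in local[1] as a glob pattern.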


