The Spark Getting Started guide is pretty good, but it’s not immediately obvious that you don’t run an app built with the Spark API as a standalone executable. If you try, you’ll get an error like this:
17/11/07 19:15:20 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
    at kh.textanalysis.spark.SparkWordCount.workCount(SparkWordCount.java:16)
    at kh.textanalysis.spark.SparkWordCount.main(SparkWordCount.java:10)
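The error is telling you that no master URL was configured. One option for quick runs inside an IDE is to set the master programmatically when building the SparkSession. A minimal sketch, assuming Spark 2.x’s Java API is on the classpath; the class name here is a placeholder, not the word-count class from the stack trace:

```java
import org.apache.spark.sql.SparkSession;

public class MyApp {
    public static void main(String[] args) {
        // Setting .master() here avoids the "A master URL must be set" error
        // when running directly from an IDE. Omit it (and let spark-submit
        // supply --master instead) when packaging for a real cluster.
        SparkSession spark = SparkSession.builder()
                .appName("MyApp")
                .master("local[*]") // run locally, one worker thread per core
                .getOrCreate();

        // ... job logic goes here ...

        spark.stop();
    }
}
```

Hard-coding the master is convenient for debugging, but passing it via spark-submit (shown below) keeps the packaged jar portable between local and cluster runs.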
Instead, if you’re using Maven, package the app with ‘mvn package’, start a local master node:
./sbin/start-master.sh
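For ‘mvn package’ to compile code against the Spark API, the project’s pom.xml needs a Spark dependency. A sketch — the Scala suffix and version (2.11 / 2.2.0, current when this was written) are assumptions, so match them to your cluster:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.2.0</version>
    <scope>provided</scope>
</dependency>
```

spark-sql brings in SparkSession (and spark-core transitively); scope ‘provided’ keeps Spark’s own jars out of your packaged artifact, since spark-submit supplies them at runtime.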
and then submit it to your Spark node for processing:
./bin/spark-submit \
--class "MyApp" \
--master local[1] \
target/MyApp-1.0.jar
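To make the placeholders concrete: --class takes the fully-qualified name of your main class, and the final argument is the jar that ‘mvn package’ wrote to target/. For the word-count class in the stack trace above, the command would look something like this (the jar name is hypothetical — yours comes from your pom’s artifactId and version):

```shell
./bin/spark-submit \
  --class kh.textanalysis.spark.SparkWordCount \
  --master local[1] \
  target/spark-word-count-1.0.jar
```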
Can you please confirm the --class and target jar details?
I have a more detailed Spark example here if this helps: https://www.kevinhooke.com/2017/11/07/apache-spark-word-count-big-data-analytics-with-a-publicly-available-data-set-part-2/