Nitendra Gautam

Submit Apache Spark Job with REST API

When working with Apache Spark, there are times when you need to trigger a Spark job on demand from outside the cluster. There are two ways to submit an Apache Spark job to a cluster.

  • Spark Submit from within the Spark cluster

To submit a Spark job from within the Spark cluster we use spark-submit. Below is a sample shell script which submits a Spark job. Most of the arguments are self-explanatory.


$SPARK_HOME/bin/spark-submit \
  --class com.nitendragautam.sparkbatchapp.main.Boot \
  --master spark:// \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 4G \
  --driver-memory 4G \
  --total-executor-cores 2 \
  /home/hduser/sparkbatchapp.jar \
  /home/hduser/NDSBatchApp/input

  • REST API from outside the Spark cluster

In this post I will explain how to trigger a Spark job with the help of the REST API. Please make sure that the Spark cluster is running before submitting the Spark job.
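Before submitting, you can quickly verify that the master is up. This is a minimal sketch, assuming the standalone master's Web UI is on its default port 8080; the hostname `spark-master.example.com` is a placeholder you must replace with your own master host.

```shell
#!/bin/sh
SPARK_MASTER_HOST="spark-master.example.com"   # placeholder: your master host

# The standalone master serves its Web UI on port 8080 by default; a quick
# probe tells us whether the cluster is reachable before we try to submit.
if curl -s -o /dev/null --max-time 5 "http://${SPARK_MASTER_HOST}:8080"; then
  echo "Spark master is reachable"
else
  echo "Spark master is not reachable; start the cluster first"
fi
```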

Figure: Apache Spark Master

Trigger Spark Batch Job Using a Shell Script

Create a shell script with the contents below and make it executable.


curl -X POST --header "Content-Type:application/json;charset=UTF-8" --data '{
  "appResource": "/home/hduser/sparkbatchapp.jar",
  "sparkProperties": {
    "spark.executor.memory": "4g",
    "spark.master": "spark://",
    "spark.driver.memory": "4g",
    "spark.driver.cores": "2",
    "spark.eventLog.enabled": "false",
    "spark.app.name": "Spark REST API201804291717022",
    "spark.submit.deployMode": "cluster",
    "spark.jars": "/home/hduser/sparkbatchapp.jar",
    "spark.driver.supervise": "true"
  },
  "clientSparkVersion": "2.0.1",
  "mainClass": "com.nitendragautam.sparkbatchapp.main.Boot",
  "environmentVariables": {},
  "action": "CreateSubmissionRequest",
  "appArgs": [
    "/home/hduser/NDSBatchApp/input"
  ]
}' http://<spark-master-host>:6066/v1/submissions/create

Once the Spark job gets submitted successfully, you will see output with the contents below.

{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20180429125849-0001",
  "serverSparkVersion" : "2.0.1",
  "submissionId" : "driver-20180429125849-0001",
  "success" : true
}
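If you want to script follow-up calls, the submissionId can be pulled out of that response. A minimal sketch, assuming the response was saved to a file (the file name `submit_response.json` is hypothetical) and that `python3` with its standard `json` module is available:

```shell
#!/bin/sh
# Hypothetical file: in practice you would save the submit response with
#   curl ... > submit_response.json
cat > submit_response.json <<'EOF'
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20180429125849-0001",
  "serverSparkVersion" : "2.0.1",
  "submissionId" : "driver-20180429125849-0001",
  "success" : true
}
EOF

# Extract the submissionId field using Python's standard json module.
SUBMISSION_ID=$(python3 -c "import json,sys; print(json.load(sys.stdin)['submissionId'])" < submit_response.json)
echo "$SUBMISSION_ID"   # prints driver-20180429125849-0001
```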

Check Status of Spark Job using REST API

If you want to check the status of your Spark job, you can use the submission ID with the shell script below.
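The original script is not shown here, so this is a minimal sketch. The Spark standalone REST submission server listens on port 6066 by default and exposes a status endpoint at /v1/submissions/status/&lt;submissionId&gt;; the hostname `spark-master.example.com` is a placeholder that must be replaced with your own master host.

```shell
#!/bin/sh
SPARK_MASTER_HOST="spark-master.example.com"   # placeholder: your master host
SUBMISSION_ID="driver-20180429125849-0001"     # from the submit response

# Status endpoint of the standalone REST submission server (default port 6066).
STATUS_URL="http://${SPARK_MASTER_HOST}:6066/v1/submissions/status/${SUBMISSION_ID}"
echo "GET ${STATUS_URL}"

# Query the endpoint; "|| true" keeps the script going if the master is down.
curl -s --max-time 5 "$STATUS_URL" || true
```

Running it against a live cluster returns the JSON status response shown below.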

{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FINISHED",
  "serverSparkVersion" : "2.0.1",
  "submissionId" : "driver-20180429125849-0001",
  "success" : true,
  "workerHostPort" : "",
  "workerId" : "worker-20180429124356-"
}