# spark-submit

## Usage

`spark-submit` executes a *PySpark-based* Prism project. To execute a Python-based Prism project, use the [`run`](https://docs.runprism.com/v0.2.3/cli/run) command instead.

To use the `spark-submit` command, you must have a PySpark profile specified in `profile.yml`.

```
Usage: prism spark-submit [OPTIONS]                                                                                                 
                                                                                                                                     
 Execute your Prism project as a PySpark job.                                                                                        
                                                                                                                                     
 Examples:                                                                                                                           
                                                                                                                                     
  • prism spark-submit                                                                                                               
  • prism spark-submit -m module01.py -m module02.py                                                                                 
  • prism spark-submit -m module01 --all-downstream                                                                                  
  • prism spark-submit -v VAR1=VALUE1 -v VAR2=VALUE2                                                                                 
  • prism spark-submit --context '{"hi": 1}'                                                                                         
                                                                                                                                     
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --module          -m  TEXT                     Modules to execute. You can specify multiple modules with as follows: -m           │
│                                                <your_first_module> -m <your_second_module>.                                       │
│ --all-downstream                               Execute all tasks downstream of modules specified with --module.                   │
│ --all-upstream                                 Execute all tasks upstream of modules specified with --module.                     │
│ --full-refresh                                 Run tasks from scratch (even the ones that are considered done)                    │
│ --log-level       -l  [info|warn|error|debug]  Set the log level.                                                                 │
│ --full-tb                                      Show the full traceback when an error occurs.                                      │
│ --vars            -v  TEXT                     Variables as key value pairs. These overwrite variables in prism_project.py. All   │
│                                                values are intepreted as strings.                                                  │
│ --context             TEXT                     Context as a dictionary. Must be a valid JSON. These overwrite variables in        │
│                                                prism_project.py.                                                                  │
│ --help                                         Show this message and exit.                                                        │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

{% hint style="info" %}
**Important:** this command is *identical* to `prism run`, except that it should only be used to submit PySpark jobs.
{% endhint %}
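
The options in the help text above can be combined in a single invocation. The following is a minimal sketch, in Python, of assembling such a command programmatically. The helper `build_spark_submit_command` is hypothetical and not part of Prism; only the flags (`-m`, `--all-downstream`, `--full-refresh`, `-v`, `--context`) come from the help text. It uses `json.dumps` so that the value passed to `--context` is valid JSON, as the option requires.

```python
import json
import shlex


def build_spark_submit_command(modules=(), variables=None, context=None,
                               all_downstream=False, full_refresh=False):
    """Build the argument list for a `prism spark-submit` invocation.

    Hypothetical helper for illustration; only the flags themselves
    are taken from the `prism spark-submit` help text.
    """
    cmd = ["prism", "spark-submit"]
    # One -m flag per module, as described under --module.
    for module in modules:
        cmd += ["-m", module]
    if all_downstream:
        cmd.append("--all-downstream")
    if full_refresh:
        cmd.append("--full-refresh")
    # Variables are passed as KEY=VALUE pairs; per the help text,
    # all values are interpreted as strings.
    for key, value in (variables or {}).items():
        cmd += ["-v", f"{key}={value}"]
    # --context must be valid JSON, so serialize the dict with json.dumps.
    if context is not None:
        cmd += ["--context", json.dumps(context)]
    return cmd


cmd = build_spark_submit_command(
    modules=["module01.py"],
    all_downstream=True,
    variables={"VAR1": "VALUE1"},
    context={"hi": 1},
)

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, presumably from within the project directory. Building the arguments as a list, rather than as a single string, avoids shell-quoting problems with the JSON passed to `--context`.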

Here's what the output looks like in the terminal:

```
$ prism spark-submit
--------------------------------------------------------------------------------
<HH:MM:SS> | INFO  | Running with prism v0.2.3...
<HH:MM:SS> | INFO  | Found project directory at /my_first_project
 
<HH:MM:SS> | INFO  | RUNNING 'parsing prism_project.py'.............................................. [RUN]
<YY/MM/DD> <HH:MM:SS> WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<HH:MM:SS> | INFO  | FINISHED 'parsing prism_project.py'............................................. [DONE in 0.03s]
<HH:MM:SS> | INFO  | RUNNING 'task DAG'.............................................................. [RUN]
<HH:MM:SS> | INFO  | FINISHED 'task DAG'............................................................. [DONE in 0.01s]
<HH:MM:SS> | INFO  | RUNNING 'creating pipeline, DAG executor'....................................... [RUN]
<HH:MM:SS> | INFO  | FINISHED 'creating pipeline, DAG executor'...................................... [DONE in 0.01s]
 
<HH:MM:SS> | INFO  | ===================== tasks (vermilion-hornet-Gyycw4kRWG) =====================
<HH:MM:SS> | INFO  | 1 of 2 RUNNING EVENT 'decorated_task.example_task'.............................. [RUN]
<HH:MM:SS> | INFO  | 1 of 2 FINISHED EVENT 'decorated_task.example_task'............................. [DONE in 0.02s]
<HH:MM:SS> | INFO  | 2 of 2 RUNNING EVENT 'class_task.ExampleTask'................................... [RUN]
<HH:MM:SS> | INFO  | 2 of 2 FINISHED EVENT 'class_task.ExampleTask'.................................. [DONE in 0.01s]
 
<HH:MM:SS> | INFO  | Done!
--------------------------------------------------------------------------------
```
