spark-submit

Usage

Use spark-submit to execute a PySpark-based Prism project. For a Python-based Prism project, use the run command instead.

To use the spark-submit command, you must have a PySpark profile specified in profile.yml.

Usage: prism spark-submit [OPTIONS]

 Execute your Prism project as a PySpark job.

 Examples:

  • prism spark-submit
  • prism spark-submit -m module01.py -m module02.py
  • prism spark-submit -m module01 --all-downstream
  • prism spark-submit -v VAR1=VALUE1 -v VAR2=VALUE2
  • prism spark-submit --context '{"hi": 1}'

╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --module          -m  TEXT                     Modules to execute. You can specify multiple modules with as follows: -m           │
│                                                <your_first_module> -m <your_second_module>.                                       │
│ --all-downstream                               Execute all tasks downstream of modules specified with --module.                   │
│ --all-upstream                                 Execute all tasks upstream of modules specified with --module.                     │
│ --log-level       -l  [info|warn|error|debug]  Set the log level.                                                                 │
│ --full-tb                                      Show the full traceback when an error occurs.                                      │
│ --vars            -v  TEXT                     Variables as key value pairs. These overwrite variables in prism_project.py. All   │
│                                                values are intepreted as strings.                                                  │
│ --context             TEXT                     Context as a dictionary. Must be a valid JSON. These overwrite variables in        │
│                                                prism_project.py.                                                                  │
│ --help                                         Show this message and exit.                                                        │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
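
If you launch the command from a Python script rather than typing it by hand, the options above can be assembled programmatically. The following is a minimal sketch, not part of Prism itself: the helper name build_spark_submit_args and its parameters are hypothetical, and it uses only the flags listed in the help text (-m, --all-downstream, -v, --context). It serializes the context with json.dumps so that the value passed to --context is valid JSON, and formats each variable as KEY=VALUE.

```python
import json
import shlex


def build_spark_submit_args(modules=(), variables=None, context=None,
                            all_downstream=False):
    """Assemble an argument list for `prism spark-submit`.

    Hypothetical helper for illustration only; it is not part of Prism.
    """
    args = ["prism", "spark-submit"]

    # Repeat -m once per module, as the help text describes.
    for module in modules:
        args += ["-m", module]

    if all_downstream:
        args.append("--all-downstream")

    # Each variable is passed as KEY=VALUE. Per the help text, Prism
    # interprets every value as a string.
    for key, value in (variables or {}).items():
        args += ["-v", f"{key}={value}"]

    # --context must be valid JSON, so serialize the dict with json.dumps.
    if context is not None:
        args += ["--context", json.dumps(context)]

    return args


args = build_spark_submit_args(
    modules=["module01.py", "module02.py"],
    variables={"VAR1": "VALUE1"},
    context={"hi": 1},
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(args))
# prism spark-submit -m module01.py -m module02.py -v VAR1=VALUE1 --context '{"hi": 1}'
```

Passing the list to subprocess.run(args, check=True) from inside the project directory would avoid shell quoting issues with the JSON string, assuming prism is on your PATH.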

Important: this command is identical to prism run, except that it should only be used to submit PySpark jobs.

Here's what the output looks like in Terminal:
