spark-submit
Usage
spark-submit executes a PySpark-based Prism project. For a Python-based Prism project, use the run command instead.
To use the spark-submit command, you must have a PySpark profile specified in profile.yml.
Usage: prism spark-submit [OPTIONS]
Execute your Prism project as a PySpark job.
Examples:
• prism spark-submit
• prism spark-submit -m module01.py -m module02.py
• prism spark-submit -m module01 --all-downstream
• prism spark-submit -v VAR1=VALUE1 -v VAR2=VALUE2
• prism spark-submit --context '{"hi": 1}'
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --module          -m  TEXT                      Modules to execute. You can specify multiple modules as follows: -m              │
│                                                 <your_first_module> -m <your_second_module>.                                     │
│ --all-downstream                                Execute all tasks downstream of modules specified with --module.                 │
│ --all-upstream                                  Execute all tasks upstream of modules specified with --module.                   │
│ --log-level       -l  [info|warn|error|debug]   Set the log level.                                                               │
│ --full-tb                                       Show the full traceback when an error occurs.                                    │
│ --vars            -v  TEXT                      Variables as key value pairs. These overwrite variables in prism_project.py. All │
│                                                 values are interpreted as strings.                                               │
│ --context             TEXT                      Context as a dictionary. Must be a valid JSON. These overwrite variables in      │
│                                                 prism_project.py.                                                                │
│ --help                                          Show this message and exit.                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
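The difference between --vars and --context is easiest to see side by side. The sketch below is not Prism's implementation: parse_vars and parse_context are hypothetical helpers that only mimic the behavior described in the help text, namely that --vars values always stay strings, while --context is parsed as JSON and so keeps its types.

```python
import json


def parse_vars(pairs):
    """Hypothetical helper: turn ["KEY=VALUE", ...] into a dict of strings."""
    result = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        result[key] = value  # never converted: "2" stays the string "2"
    return result


def parse_context(raw):
    """Hypothetical helper: parse a --context argument as a JSON object."""
    context = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(context, dict):
        raise ValueError("context must be a JSON object (dictionary)")
    return context


print(parse_vars(["VAR1=VALUE1", "VAR2=2"]))  # {'VAR1': 'VALUE1', 'VAR2': '2'}
print(parse_context('{"hi": 1}'))             # {'hi': 1}
```

In practice, this suggests using --context rather than --vars whenever a value needs to arrive as a number, boolean, or nested structure.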
Important: this command is identical to prism run, except that it should only be used to submit PySpark jobs.
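Because the two commands take the same options, a wrapper script can choose between them with a single switch. The following is a minimal sketch under that assumption; build_prism_command is a hypothetical helper, not part of Prism, and it only uses flags listed in the help text above.

```python
import shlex


def build_prism_command(pyspark, modules=(), all_downstream=False):
    """Hypothetical helper: assemble a prism CLI invocation as an argument list."""
    # PySpark projects go through spark-submit; plain Python projects use run.
    command = ["prism", "spark-submit" if pyspark else "run"]
    for module in modules:
        command += ["-m", module]
    if all_downstream:
        command.append("--all-downstream")
    return command


cmd = build_prism_command(pyspark=True, modules=["module01.py", "module02.py"])
print(shlex.join(cmd))  # prism spark-submit -m module01.py -m module02.py
```

The resulting list can be passed to subprocess.run(cmd, check=True), assuming the prism executable is on your PATH.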
Here's what the output looks like in Terminal:
$ prism spark-submit
--------------------------------------------------------------------------------
<HH:MM:SS> | INFO | Running with prism v0.2.0rc1...
<HH:MM:SS> | INFO | Found project directory at /Users/my_first_project
<HH:MM:SS> | INFO | RUNNING EVENT 'parsing prism_project.py'................................................ [RUN]
22/06/28 21:01:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
<HH:MM:SS> | INFO | FINISHED EVENT 'parsing prism_project.py'............................................... [DONE in 0.03s]
<HH:MM:SS> | INFO | RUNNING EVENT 'module DAG'.............................................................. [RUN]
<HH:MM:SS> | INFO | FINISHED EVENT 'module DAG'............................................................. [DONE in 0.01s]
<HH:MM:SS> | INFO | RUNNING EVENT 'creating pipeline, DAG executor'......................................... [RUN]
<HH:MM:SS> | INFO | FINISHED EVENT 'creating pipeline, DAG executor'........................................ [DONE in 0.01s]
<HH:MM:SS> | INFO | ===================== tasks 'vermilion-hornet-Gyycw4kRWG' =====================
<HH:MM:SS> | INFO | 1 of 1 RUNNING EVENT 'module01.py'...................................................... [RUN]
<HH:MM:SS> | INFO | 1 of 1 FINISHED EVENT 'module01.py'..................................................... [DONE in 0.01s]
Done!
---------------------------------------------------------------------------------