run

Usage

prism run executes a Python-based Prism project. For a PySpark-based Prism project, use the spark-submit command instead.

Usage: prism run [OPTIONS]                                                                                                          
                                                                                                                                     
 Execute your Prism project.                                                                                                         
                                                                                                                                     
 Examples:                                                                                                                           
                                                                                                                                     
  • prism run                                                                                                                        
  • prism run -m module01.py -m module02.py                                                                                          
  • prism run -m module01 --all-downstream                                                                                           
  • prism run -v VAR1=VALUE1 -v VAR2=VALUE2                                                                                          
  • prism run --context '{"hi": 1}'                                                                                                  
                                                                                                                                     
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --module          -m  TEXT                     Modules to execute. You can specify multiple modules with as follows: -m           │
│                                                <your_first_module> -m <your_second_module>.                                       │
│ --all-downstream                               Execute all tasks downstream of modules specified with --module                    │
│ --all-upstream                                 Execute all tasks upstream of modules specified with --module                      │
│ --full-refresh                                 Run tasks from scratch (even the ones that are considered done)                    │
│ --log-level       -l  [info|warn|error|debug]  Set the log level                                                                  │
│ --full-tb                                      Show the full traceback when an error occurs                                       │
│ --vars            -v  TEXT                     Variables as key value pairs. These overwrite variables in prism_project.py. All   │
│                                                values are intepreted as strings.                                                  │
│ --context             TEXT                     Context as a dictionary. Must be a valid JSON. These overwrite variables in        │
│                                                prism_project.py                                                                   │
│ --help                                         Show this message and exit.                                                        │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Here's what prism run does under the hood:

1. It first executes the prism compile command.
   - Recall that this command parses the tasks contained in the modules/ folder for tasks.ref(...) calls, creates the dependency graph, and stores the project metadata (e.g., the configuration, the tasks.ref(...) calls, the targets, and the topological sort) in a manifest.
2. It parses the prism_project.py file. It also parses the configuration files listed within prism_project.py (e.g., PROFILE_YML_PATH, TRIGGERS_YML_PATH).
3. It creates a pipeline consisting of the global variables from prism_project.py and the adapters from profile.yml.
4. It executes the tasks in topological order.
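
To make "topological order" concrete, here is a minimal sketch using Python's standard-library graphlib module (Python 3.9+). It is not Prism's implementation. The task names and the dependency map are hypothetical, standing in for the graph that prism compile derives from tasks.ref(...) calls.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (not taken from a real Prism project):
# each task maps to the set of tasks it references.
dependencies = {
    "extract.py": set(),
    "clean.py": {"extract.py"},
    "features.py": {"clean.py"},
    "report.py": {"clean.py", "features.py"},
}


def execution_order(deps):
    """Return the tasks in an order where every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A task appears in the result only after every task it references, so each task's inputs are ready by the time it runs.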

Example

Here's what the output looks like in the terminal:

$ prism run
--------------------------------------------------------------------------------
<HH:MM:SS> | INFO  | Running with prism v0.2.3...
<HH:MM:SS> | INFO  | Found project directory at /my_first_project
 
<HH:MM:SS> | INFO  | RUNNING 'parsing prism_project.py'.............................................. [RUN]
<HH:MM:SS> | INFO  | FINISHED 'parsing prism_project.py'............................................. [DONE in 0.03s]
<HH:MM:SS> | INFO  | RUNNING 'task DAG'.............................................................. [RUN]
<HH:MM:SS> | INFO  | FINISHED 'task DAG'............................................................. [DONE in 0.01s]
<HH:MM:SS> | INFO  | RUNNING 'creating pipeline, DAG executor'....................................... [RUN]
<HH:MM:SS> | INFO  | FINISHED 'creating pipeline, DAG executor'...................................... [DONE in 0.01s]
 
<HH:MM:SS> | INFO  | ===================== tasks (vermilion-hornet-Gyycw4kRWG) =====================
<HH:MM:SS> | INFO  | 1 of 2 RUNNING EVENT 'decorated_task.example_task'.............................. [RUN]
<HH:MM:SS> | INFO  | 1 of 2 FINISHED EVENT 'decorated_task.example_task'............................. [DONE in 0.02s]
<HH:MM:SS> | INFO  | 2 of 2 RUNNING EVENT 'class_task.ExampleTask'................................... [RUN]
<HH:MM:SS> | INFO  | 2 of 2 FINISHED EVENT 'class_task.ExampleTask'.................................. [DONE in 0.01s]
 
<HH:MM:SS> | INFO  | Done!
--------------------------------------------------------------------------------

Required arguments

There are no required arguments for run.

Optional arguments

Here are the optional arguments for run:

--full-tb: Display the full traceback if an error arises at any stage of the pipeline.
--full-refresh: Run tasks from scratch, even the ones that are considered done.
--log-level: Log level, one of info, warn, error, or debug.
--vars: Prism variables as key-value pairs key=value. These overwrite any variable definitions in prism_project.py. All values are read as strings.
--context: Prism variables as JSON. Cannot co-exist with --vars. These overwrite any variable definitions in prism_project.py.
--module: Paths to the modules that you want to run; if not specified, all modules are run. Paths should be specified relative to the modules folder. Repeat the flag to specify several modules.
--all-upstream: Run all modules upstream of those specified in --module.
--all-downstream: Run all modules downstream of those specified in --module.
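
The difference between --vars and --context is easiest to see side by side. The sketch below is an illustration only, not Prism's own parsing code, and the function names are hypothetical. It shows key=value pairs staying strings, JSON values keeping their types, and the two inputs being rejected when combined.

```python
import json


def parse_vars(pairs):
    """Turn ["KEY=VALUE", ...] into a dict; every value stays a string."""
    parsed = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        parsed[key] = value
    return parsed


def parse_context(raw):
    """Parse a JSON object; values keep their JSON types."""
    context = json.loads(raw)
    if not isinstance(context, dict):
        raise ValueError("context must be a JSON object")
    return context


def resolve_overrides(var_pairs=None, context_json=None):
    """Return the override dict, refusing a mix of both input styles."""
    if var_pairs and context_json:
        raise ValueError("--vars and --context cannot be combined")
    if context_json:
        return parse_context(context_json)
    return parse_vars(var_pairs or [])


print(resolve_overrides(var_pairs=["VAR1=VALUE1", "N=1"]))  # N stays the string "1"
print(resolve_overrides(context_json='{"hi": 1}'))          # hi stays the integer 1
```

In practice this means a number passed with -v N=1 reaches the project as the string "1", while --context '{"N": 1}' delivers an integer.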

