The previous example showed how to use the prism.decorators.target decorator to define a single target. Prism also supports multiple targets in a single task. There are two ways to do this:
Multiple prism.decorators.target calls
Using prism.decorators.target_iterator
Multiple prism.decorators.target calls
Building off of the previous example, let's say you want your HelloWorld task to also produce a file with the string "foo, bar". You can do that as follows:
In this revised task, the objects (i.e., the strings) corresponding to the different targets are returned as a tuple, and for every returned object in the tuple there is a prism.decorators.target call.
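The positional pairing described above can be illustrated without Prism. The sketch below is plain Python: pair_outputs_with_targets is a hypothetical helper, not part of Prism's API, and it assumes only that the i-th returned object is saved to the i-th target.

```python
# Plain-Python illustration of positional pairing; this is NOT Prism code.
# pair_outputs_with_targets is a hypothetical helper used only for this sketch.
def pair_outputs_with_targets(outputs, target_paths):
    """Map each target path to the returned object in the same position."""
    return dict(zip(target_paths, outputs))


# Target locations, listed in the order the targets are declared
target_paths = ["/Users/hello_world.txt", "/Users/foo_bar.txt"]

test_str = "Hello, world"
test_str2 = "foo, bar"

# Returned tuple in the same order as the targets
in_order = pair_outputs_with_targets((test_str, test_str2), target_paths)

# Returned tuple with the two objects swapped
swapped = pair_outputs_with_targets((test_str2, test_str), target_paths)

print(in_order)
print(swapped)
```

Swapping the objects in the returned tuple changes which file receives which string, which is why the two orders have to agree.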
Good to know: make sure the order of the objects in the returned tuple matches the order of the prism.decorators.target decorators. In the example above,
"Hello, world" --> /Users/hello_world.txt
"foo, bar" --> /Users/foo_bar.txt
But if we switched the order of the tuple (i.e., return test_str2, test_str), then we would have the reverse:
"foo, bar" --> /Users/hello_world.txt
"Hello, world" --> /Users/foo_bar.txt
That's all you have to do to specify multiple targets. Now, tasks.ref("hello_world.py") will return a tuple of the target paths.
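As a rough sketch of how a downstream task might consume that tuple, the snippet below unpacks it and reads each file. It does not call Prism: the tuple is built by hand as a stand-in for the value of tasks.ref("hello_world.py"), the files live in a temporary directory instead of /Users/, and the assumption that the paths come back in the same order as the targets is ours.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the two targets written by the upstream task
    hello_file = Path(tmp) / "hello_world.txt"
    foo_file = Path(tmp) / "foo_bar.txt"
    hello_file.write_text("Hello, world")
    foo_file.write_text("foo, bar")

    # Stand-in for: target_paths = tasks.ref("hello_world.py")
    # Assumption: the paths are ordered like the targets.
    target_paths = (str(hello_file), str(foo_file))

    # Unpack the tuple, one path per target, and read each file back
    hello_path, foo_path = target_paths
    hello_contents = Path(hello_path).read_text()
    foo_contents = Path(foo_path).read_text()

print(hello_contents)
print(foo_contents)
```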
Using prism.decorators.target_iterator
Using multiple prism.decorators.target calls can be tedious if you need to save a dozen or more targets at a time. That's where prism.decorators.target_iterator comes in.
Here's how to use it. Let's say you have data from different clients stored in different CSVs and you want to apply the same processing to all of them.
# tasks/process_client_data.py
import prism.task
import prism.target
import prism.decorators
import prism_project
import pandas as pd


class ProcessClientData(prism.task.PrismTask):

    @prism.decorators.target_iterator(
        type=prism.target.PandasCsv, loc="/Users/"
    )
    def run(self, tasks, hooks):
        results_dict = {}
        clients = ['clientA', 'clientB', 'clientC', 'clientD']
        for cl in clients:
            df = pd.read_csv(f'{cl}.csv')
            df_processed = ...  # do some processing here
            results_dict[f'{cl}_processed.csv'] = df_processed
        return results_dict
# tasks/process_client_data.py
# Function-based version of the same task
from prism.decorators import task, target_iterator
import prism.target
import pandas as pd


@task(
    targets=[
        target_iterator(type=prism.target.PandasCsv, loc="/Users/")
    ]
)
def process_client_data(tasks, hooks):
    results_dict = {}
    clients = ['clientA', 'clientB', 'clientC', 'clientD']
    for cl in clients:
        df = pd.read_csv(f'{cl}.csv')
        df_processed = ...  # do some processing here
        results_dict[f'{cl}_processed.csv'] = df_processed
    return results_dict
Important: this decorator requires the task function to return a dictionary that maps each file name to the object you want to save under that name.
Here's what's happening under the hood:
We iterate through the different clients, process their data, and store the processed data in a dictionary (results_dict). This dictionary uses the desired target file names as keys and the objects to save as values.
The task function returns results_dict. Prism then iterates through the key-value pairs in the dictionary and saves each value (i.e., each object) to the path {loc}/{key}.
In other words, the task above will save four targets:
/Users/clientA_processed.csv
/Users/clientB_processed.csv
/Users/clientC_processed.csv
/Users/clientD_processed.csv
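The save loop described above can be sketched in plain Python. This is an illustration of the idea, not Prism's implementation: save_targets and save_text are hypothetical helpers, short CSV strings stand in for the processed DataFrames, and a temporary directory stands in for "/Users/".

```python
import tempfile
from pathlib import Path


def save_targets(results_dict, loc, save_fn):
    """Save each value to <loc>/<key> and return the list of saved paths.

    Hypothetical helper for illustration; not Prism's implementation.
    """
    saved_paths = []
    for file_name, obj in results_dict.items():
        path = Path(loc) / file_name
        save_fn(obj, path)
        saved_paths.append(path)
    return saved_paths


def save_text(obj, path):
    """Stand-in for a target type such as PandasCsv: write the object as text."""
    path.write_text(obj)


clients = ["clientA", "clientB", "clientC", "clientD"]
# Stand-in for the processed DataFrames: one small CSV string per client
results_dict = {f"{cl}_processed.csv": f"client,rows\n{cl},0\n" for cl in clients}

with tempfile.TemporaryDirectory() as loc:
    saved = save_targets(results_dict, loc, save_text)
    saved_names = sorted(p.name for p in saved)
    all_exist = all(p.exists() for p in saved)

print(saved_names)
```

Each dictionary key becomes a file name under loc, so adding a fifth client only requires adding a fifth key.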
Now, tasks.ref("process_client_data.py") will return the base loc path in which all the targets were saved (i.e., "/Users/").
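Because only the base directory comes back, a downstream task has to locate the individual files itself. One possible approach, sketched in plain Python, is to glob the directory. Here list_processed_files is a hypothetical helper, and a temporary directory stands in for the value returned by tasks.ref("process_client_data.py").

```python
import tempfile
from pathlib import Path


def list_processed_files(loc):
    """Return the sorted names of the processed CSVs found directly under loc."""
    return sorted(p.name for p in Path(loc).glob("*_processed.csv"))


with tempfile.TemporaryDirectory() as loc:
    # Stand-ins for the targets saved by the upstream task
    for cl in ["clientA", "clientB"]:
        (Path(loc) / f"{cl}_processed.csv").write_text("placeholder\n")

    # In a Prism task, loc would instead be the value of
    # tasks.ref("process_client_data.py")
    found = list_processed_files(loc)

print(found)
```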