# Tasks

In its most basic form, any data pipeline can be thought of as a series of discrete steps that run in some sort of sequence. For example, ETL pipelines generally have three steps: extract -> transform -> load.

Prism projects are no different. A Prism project is composed of a set of tasks, and these tasks contain the bulk of the project's core logic.

## What are tasks?

In Prism, tasks can be either classes or functions. Here's what they look like:

{% tabs %}
{% tab title="Class-based tasks" %}

```python
# tasks/hello_world.py

import prism.task

class HelloWorld(prism.task.PrismTask):

    def run(self):
        test_str = "Hello, world!"
        return test_str
```

{% endtab %}

{% tab title="Function-based tasks" %}

```python
# tasks/hello_world.py

from prism.decorators import task

@task()
def hello_world():
    test_str = "Hello, world!"
    return test_str
```

{% endtab %}
{% endtabs %}

We'll go into the technical details of both next.

### Class-based tasks

Class-based tasks are classes that inherit from an abstract class called `PrismTask`. All class-based tasks must meet two requirements: they must inherit from `PrismTask`, and they ***must*** have a method called `run`.

The `run` method must itself adhere to three requirements:

1. It should **not** accept any arguments (other than `self`).
2. It should encapsulate all the business logic for the task.
3. It should return a non-null output.

{% hint style="warning" %}
**Important:** the output of a task's `run` function is what's used by downstream tasks in your pipeline. The return value can be anything – a Pandas or Spark DataFrame, a Numpy array, a string, a dictionary, whatever – but *it cannot be null*. Prism will throw an error if it is.
{% endhint %}
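To make the non-null rule concrete, here is a minimal sketch in plain Python. It does not use Prism at all: `ToyTask` and `execute` are hypothetical stand-ins, and the error Prism actually raises may be of a different type.

```python
from abc import ABC, abstractmethod


class ToyTask(ABC):
    """Hypothetical stand-in for a task base class (not Prism's PrismTask)."""

    @abstractmethod
    def run(self):
        """Return the task's output; must not be None."""


def execute(task):
    """Run a task and reject a null output, mirroring the rule above."""
    output = task.run()
    if output is None:
        raise ValueError(f"{type(task).__name__}.run() returned None")
    return output


class GoodTask(ToyTask):
    def run(self):
        return "Hello, world!"


class BadTask(ToyTask):
    def run(self):
        print("side effect only")  # no return statement, so the output is None
```

A task like `BadTask`, which only performs a side effect, is the usual way to trip this rule. Returning any meaningful value, such as a status string or the path that was written, avoids it.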

Apart from these two conditions, feel free to structure and define your tasks however you'd like, e.g., by adding other class methods or class attributes:

```python
# tasks/hello_world.py

from prism.task import PrismTask

class HelloWorld(PrismTask):

    def some_other_function(self, *args, **kwargs):
        # do something
        pass

    def run(self):
        test_str = "Hello, world!"
        # Helper methods are called through `self`
        _ = self.some_other_function()
        return test_str
```

As you can see, our `HelloWorld` task lives in the `tasks` directory. It inherits from the `PrismTask` class, and it contains a `run` method that returns a non-null string.

{% hint style="warning" %}
**Critical:** Older Prism releases required the `run` function to accept two parameters, `tasks` and `hooks`. In the version described on this page, `run` takes no arguments other than `self`, and the outputs of upstream tasks are retrieved with `CurrentRun.ref(...)` instead.
{% endhint %}

And that's it! Create a class that inherits the `PrismTask` class and implement the `run` method. Prism will take care of the rest.

{% hint style="info" %}
**Good to know:** Although user-defined tasks can be arbitrarily long or complex, it is helpful to think of them as discrete steps or objectives in your pipeline. For example, if you are creating an ETL pipeline, then you may want to split your code into three tasks: an extract task, a transform task, and a load task.
{% endhint %}
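As a rough illustration of that split, the sketch below writes the three steps as plain Python functions over made-up data. It leaves Prism out entirely, and the function names and records are hypothetical. In a real project, each body would live in its own task, and because tasks take no arguments, the hand-off between steps would happen by referencing the upstream task's output rather than by passing parameters.

```python
def extract():
    """Pull raw records from a source (hard-coded here for illustration)."""
    return [{"name": "ada", "score": "90"}, {"name": "bob", "score": "75"}]


def transform(rows):
    """Clean the raw records: title-case names and cast scores to int."""
    return [{"name": r["name"].title(), "score": int(r["score"])} for r in rows]


def load(rows, destination):
    """Write the cleaned records to a destination (a list stands in for a database)."""
    destination.extend(rows)
    return len(rows)


warehouse = []
loaded = load(transform(extract()), warehouse)
```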

For additional information, consult the [API reference](https://docs.runprism.com/api-reference/prism.task.prismtask).

### Function-based tasks

You can also define tasks using functions rather than entire classes. There's no real difference between a function-based task and a class-based task; the feature exists so that you can work with whichever style you're most comfortable with.

For a function to be a task, it must be decorated with the `task` decorator from `prism.decorators`. As with the `run` method of a class-based task, a function-based task does not accept any arguments and must return a non-null value.

Let's take a look at our original example:

```python
# tasks/hello_world.py

from prism.decorators import task

@task()
def hello_world():
    test_str = "Hello, world!"
    return test_str 
```

The technical specifications for the `@task` decorator can be found in the [API reference](https://docs.runprism.com/api-reference/task-...).
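Note the parentheses in `@task()`: in Python, this means the decorator is called first, and whatever it returns is then applied to the function. The sketch below shows that pattern with a hypothetical `toy_task` decorator written in plain Python. It is not Prism's implementation; it only illustrates how a decorator of this shape could enforce the no-arguments rule and record an optional ID.

```python
import inspect


def toy_task(task_id=None):
    """Hypothetical decorator factory: toy_task(...) returns the actual decorator."""

    def decorator(func):
        # Reject functions that declare parameters, per the rule above.
        if inspect.signature(func).parameters:
            raise TypeError(f"{func.__name__} must not accept arguments")
        # Attach the ID so that a runner could look the task up later.
        func.task_id = task_id
        return func

    return decorator


@toy_task(task_id="hello-world-task-fn")
def hello_world():
    return "Hello, world!"
```

Because `toy_task` returns the original function unchanged, `hello_world()` can still be called directly, which is convenient for unit tests. Whether Prism's own decorator behaves this way is not stated here.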

### Task IDs

Every task in a Prism project must be associated with a unique ID. This ID is then referenced by downstream tasks (via `CurrentRun.ref(...)`) to grab the task's output.
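To see why uniqueness matters, it can help to model the lookup with a plain dictionary. The sketch below is a toy stand-in: `ToyRun`, `record`, and its `ref` method are hypothetical names, and none of this is Prism's `CurrentRun` implementation.

```python
class ToyRun:
    """Hypothetical stand-in for a run context (not Prism's CurrentRun)."""

    def __init__(self):
        self._outputs = {}

    def record(self, task_id, output):
        # IDs must be unique; otherwise ref() would be ambiguous.
        if task_id in self._outputs:
            raise ValueError(f"duplicate task ID: {task_id!r}")
        self._outputs[task_id] = output

    def ref(self, task_id):
        # Downstream tasks fetch an upstream output by its ID.
        return self._outputs[task_id]


run = ToyRun()
run.record("hello-world-task-fn", "Hello, world!")

# A downstream step grabs the upstream output by ID and builds on it.
greeting = run.ref("hello-world-task-fn").upper()
```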

Users can specify their own task ID when creating a task:

{% tabs %}
{% tab title="Class-based tasks" %}
When using a class-based task, you can specify a custom task ID using the `task_id` class attribute.

```python
# tasks/hello_world.py

from prism.task import PrismTask

class HelloWorld(PrismTask):
    task_id = "hello-world-task-cls"

    def run(self):
        test_str = "Hello, world!"
        return test_str
```

{% endtab %}

{% tab title="Function-based tasks" %}
When using a function-based task, you can specify a custom task ID using the `task_id` keyword argument in the `@task` decorator:

```python
# tasks/hello_world.py

from prism.decorators import task

@task(
    task_id="hello-world-task-fn"    
)
def hello_world():
    test_str = "Hello, world!"
    return test_str 
```

{% endtab %}
{% endtabs %}

If you don't specify a custom task ID, then Prism automatically creates one for you. The format of this task ID will be `<module_name>.<function or class name>`. For example:

{% tabs %}
{% tab title="Class-based tasks" %}

```python
# tasks/hello_world.py

from prism.task import PrismTask

class HelloWorld(PrismTask):

    def run(self):
        test_str = "Hello, world!"
        return test_str
```

The auto-generated ID for this task will be `hello_world.HelloWorld`.
{% endtab %}

{% tab title="Function-based tasks" %}

```python
# tasks/hello_world.py

from prism.decorators import task

@task()
def hello_world():
    test_str = "Hello, world!"
    return test_str 
```

The auto-generated ID for this task will be `hello_world.hello_world`.
{% endtab %}
{% endtabs %}
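The naming pattern in these two examples can be reproduced with a small helper. This is a sketch of the format only: `default_task_id` is a hypothetical function, and it assumes the module name is simply the file name without its extension, which matches the examples above but may not hold for modules in nested directories.

```python
from pathlib import Path


def default_task_id(file_path, object_name):
    """Return '<module_name>.<object_name>', taking the module name from the file stem."""
    module_name = Path(file_path).stem
    return f"{module_name}.{object_name}"


class_id = default_task_id("tasks/hello_world.py", "HelloWorld")
function_id = default_task_id("tasks/hello_world.py", "hello_world")
```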

{% hint style="warning" %}
**Important**: for readability purposes, we recommend ***always*** setting task IDs in your classes or functions.
{% endhint %}


