# Tasks

In its most basic form, any data pipeline can be thought of as a series of discrete steps that run in a defined sequence. For example, ETL pipelines generally have three steps: extract --> transform --> load.

Prism projects are no different. A Prism project is composed of a set of tasks, and these tasks contain the bulk of the project's core logic.

## What are tasks?

In Prism, tasks can be either classes or functions. Here's what they look like:

{% tabs %}
{% tab title="Class-based tasks" %}

```python
# modules/hello_world.py

from prism.task import PrismTask

class HelloWorld(PrismTask):
    
    def run(self, tasks, hooks):
        test_str = "Hello, world!"
        return test_str
```

{% endtab %}

{% tab title="Function-based tasks" %}

```python
# modules/hello_world.py

from prism.decorators import task

@task()
def hello_world(tasks, hooks):
    test_str = "Hello, world!"
    return test_str
```

{% endtab %}
{% endtabs %}

We'll go into the technical details of both next.

### Class-based tasks

Class-based tasks are classes that inherit from an abstract class called `PrismTask`. All tasks must adhere to two requirements:

1. Each task ***must*** have a method called `run`. This method should contain all the business logic for the task, and it should return a non-null output.
2. Tasks must live in their own `*.py` file.

{% hint style="warning" %}
**Important:** The output of a task's `run` method is what downstream tasks in your pipeline use. The return value can be anything (a pandas or Spark DataFrame, a NumPy array, a string, a dictionary, and so on), but *it cannot be null*. Prism will throw an error if it is.
{% endhint %}

Apart from these two conditions, feel free to structure and define your tasks however you'd like. For example, you can add other class methods, class attributes, and so on.

Let's take a look at our previous example:

```python
# modules/hello_world.py

from prism.task import PrismTask

class HelloWorld(PrismTask):
    
    def run(self, tasks, hooks):
        test_str = "Hello, world!"
        return test_str
```

The `HelloWorld` task is defined in its own `*.py` file in the `modules` folder. It inherits from the `PrismTask` class, and it contains a `run` method that returns a non-null string.
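Because tasks are free to define more than just `run`, the same idea can be split across a class attribute and a helper method. The sketch below is illustrative only: the file name, the `Greeting` class, `DEFAULT_NAME`, and `format_greeting` are hypothetical, and a stand-in `PrismTask` is defined so the snippet runs without Prism installed. In a real project you would import `PrismTask` from `prism.task` instead.

```python
# modules/greeting.py  (hypothetical file name)

# Stand-in base class so this sketch runs without Prism installed.
# In a real project, replace it with: from prism.task import PrismTask
class PrismTask:
    pass


class Greeting(PrismTask):
    # Class attribute shared by every instance of the task (hypothetical)
    DEFAULT_NAME = "world"

    def format_greeting(self, name):
        """Helper method that keeps formatting logic out of `run`."""
        return f"Hello, {name}!"

    def run(self, tasks, hooks):
        # `run` remains the entry point and still returns a non-null value
        return self.format_greeting(self.DEFAULT_NAME)
```

Here `run` stays short and delegates to the helper, which can make longer tasks easier to read and test.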

{% hint style="warning" %}
**Critical:** The `run` method has two mandatory parameters: [tasks](https://docs.runprism.com/v0.1.9/fundamentals/tasks/tasks) and [hooks](https://docs.runprism.com/v0.1.9/fundamentals/tasks/hooks). Prism will throw an error if it finds a `run` method without both of them.
{% endhint %}

And that's it! Create a class that inherits from `PrismTask` and implement the `run` method. Prism will take care of the rest.

{% hint style="info" %}
**Good to know:** Although user-defined tasks can be arbitrarily long or complex, it is helpful to think of them as discrete steps or objectives in your pipeline. For example, if you are creating an ETL pipeline, then you may want to split your code into three tasks: an extract task, a transform task, and a load task.
{% endhint %}

For additional information, consult the [API reference](https://docs.runprism.com/v0.1.9/api-reference/prism.task.prismtask).

### Function-based tasks (NEW!)

Starting in Prism version `0.1.9rc2`, you can define tasks using functions rather than entire classes. There's no real difference between a function-based task and a class-based task; we created the feature so that you can work with whichever style you're most comfortable with.

In order for a function to be a task, it must:

1. Be decorated with the `prism.decorators.task` function
2. Take two positional arguments: [tasks](https://docs.runprism.com/v0.1.9/fundamentals/tasks/tasks) and [hooks](https://docs.runprism.com/v0.1.9/fundamentals/tasks/hooks). Prism will throw an error if it finds a task function without both of them.

As with class-based tasks, function-based tasks must return a non-null output and must live in their own `*.py` file.
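If you'd like to catch a missing parameter before running a project, a quick local check with the standard library's `inspect` module is one option. This is a minimal sketch, not Prism's own validation logic: `has_required_params` and `missing_hooks` are hypothetical names, and the check is applied to plain, undecorated functions because how a decorated task appears to `inspect` is not covered here.

```python
import inspect


def has_required_params(func, required=("tasks", "hooks")):
    """Return True if `func` declares every name in `required` as a parameter."""
    params = inspect.signature(func).parameters
    return all(name in params for name in required)


def hello_world(tasks, hooks):
    return "Hello, world!"


def missing_hooks(tasks):
    # Lacks the `hooks` parameter, so it would not qualify as a task
    return "Hello, world!"


print(has_required_params(hello_world))    # True
print(has_required_params(missing_hooks))  # False
```

The same helper also works on a class's `run` method (for example, `has_required_params(HelloWorld.run)`), since `self` is simply ignored by the check.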

Let's take a look at our original example:

```python
# modules/hello_world.py

from prism.decorators import task

@task()
def hello_world(tasks, hooks):
    test_str = "Hello, world!"
    return test_str
```
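The non-null rule is easiest to break when a function returns a value on only some branches, because Python returns `None` when execution falls off the end of a function. The sketch below contrasts the two cases. It is an illustration under stated assumptions: `NAMES`, `risky_greeting`, and `safe_greeting` are hypothetical, and a do-nothing stand-in for the `task` decorator is defined so the snippet runs without Prism installed.

```python
# Stand-in for `prism.decorators.task` so this sketch runs without Prism
# installed. It returns the function unchanged and performs no checks.
def task():
    def decorator(func):
        return func
    return decorator


NAMES = []  # hypothetical input, left empty on purpose


@task()
def risky_greeting(tasks, hooks):
    # Problem: when NAMES is empty, no branch returns, so the result is None
    if NAMES:
        return "Hello, " + ", ".join(NAMES) + "!"


@task()
def safe_greeting(tasks, hooks):
    # Every branch returns a non-null value
    if NAMES:
        return "Hello, " + ", ".join(NAMES) + "!"
    return "Hello, world!"
```

Making sure every code path ends in an explicit `return` of a real value is a simple way to stay on the right side of this requirement.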

The technical specifications for the `@task` decorator can be found in the [API reference](https://docs.runprism.com/v0.1.9/api-reference/task-...).

## Why do tasks live in their own modules?

Other orchestration platforms, like Airflow, leave task and module organization to the user. So why do we require tasks to live in their own module?

The answer is pretty simple: **it improves readability and ensures that all members of a data team are speaking the same language.**

Different developers have different coding styles and intuitions, which can make it difficult to maintain consistency across a team. However, when you open a Prism project, you know exactly what to expect, no matter the author. You know that `prism_project.py` will contain all the configurations of the project. You know that the tasks will all live in `modules`, and you know that, for each task, the core logic will be contained in one specific place: the task's `run` method or its decorated function.

Prism's common project structure helps to keep the code organized and makes it easier to locate specific files and functionality. This can help to prevent issues such as code duplication and can improve the overall quality and reliability of the code.