Profile YML

Profiles are used to connect Prism to data warehouses (e.g., Snowflake, Google BigQuery) and open-source processing systems (e.g., PySpark, dbt). The profile YML file is used to define the configurations for these connections.

Here is an example of a valid profile YML.

# profile.yml

default:
  adapters:
    default_bigquery:
      type: bigquery
      creds: "{{ env('GOOGLE_APPLICATION_CREDENTIALS') }}"
      
    postgres_staging:
      type: postgres
      autocommit: True
      host: "{{ env('POSTGRES_STAGING_HOST') }}"
      port: 5432
      database: staging_db
      user: "{{ env('POSTGRES_STAGING_USER') }}"
      password: "{{ env('POSTGRES_STAGING_PASSWORD') }}"
      
    postgres_prod:
      type: postgres
      autocommit: True
      host: "{{ env('POSTGRES_PROD_HOST') }}"
      port: 5432
      database: prod_db
      user: "{{ env('POSTGRES_PROD_USER') }}"
      password: "{{ env('POSTGRES_PROD_PASSWORD') }}"

In this example:

A profile named default is created. The profile name is always the top-level key of the YML, and it must match the PROFILE variable in your prism_project.py.
Three adapters are created under the default profile:
- A BigQuery adapter. This requires the path to your Google application credentials. In the example above, the credentials path is stored in an environment variable and passed into the YML via the env Jinja function.
- Two PostgreSQL adapters, one that connects to our staging database and one that connects to our production database. As with BigQuery, the credentials are stored in environment variables and passed into the YML via the env Jinja function.
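
To make the env substitution concrete, here is a minimal Python sketch of what resolving a `{{ env('NAME') }}` placeholder amounts to. It is an illustration only, not Prism's implementation: the `render_env` helper, its regular expression, and the example host value are hypothetical stand-ins for the Jinja rendering that Prism performs.

```python
import os
import re

# Matches placeholders of the form {{ env('NAME') }} or {{ env("NAME") }}.
# Hypothetical pattern for illustration; Prism itself relies on Jinja.
_ENV_PATTERN = re.compile(r"\{\{\s*env\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def render_env(text: str) -> str:
    """Replace each env placeholder with the value of that environment variable."""

    def _lookup(match):
        name = match.group(1)
        if name not in os.environ:
            # Fail loudly rather than writing an empty credential into the config.
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]

    return _ENV_PATTERN.sub(_lookup, text)


# Example value, set here only so the sketch runs on its own.
os.environ["POSTGRES_STAGING_HOST"] = "staging.example.internal"

line = "host: \"{{ env('POSTGRES_STAGING_HOST') }}\""
rendered = render_env(line)
print(rendered)
```

Keeping secrets in environment variables means the profile YML can be committed to version control without exposing hosts, users, or passwords.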

Important: as the above example highlights, each adapter is a {key,value} pair, where the key is the adapter name and the value is the adapter configuration.
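
One way to picture this structure is as a nested mapping: profile name, then adapters, then adapter name, then configuration. The sketch below mirrors a trimmed version of the example as a plain Python dictionary and looks an adapter up by name. The `get_adapter` helper is hypothetical and not part of Prism's API; only the PROFILE variable and the adapter names come from the example above.

```python
# Trimmed Python mirror of the example profile (credentials omitted):
# profile name -> "adapters" -> adapter name -> adapter configuration.
profile_yml = {
    "default": {
        "adapters": {
            "default_bigquery": {"type": "bigquery"},
            "postgres_staging": {"type": "postgres", "port": 5432, "database": "staging_db"},
            "postgres_prod": {"type": "postgres", "port": 5432, "database": "prod_db"},
        }
    }
}

# In prism_project.py, PROFILE must name the top-level key of the YML.
PROFILE = "default"


def get_adapter(profiles: dict, profile_name: str, adapter_name: str) -> dict:
    """Return the configuration stored under the given adapter name (hypothetical helper)."""
    if profile_name not in profiles:
        raise KeyError(f"profile {profile_name!r} not found")
    adapters = profiles[profile_name]["adapters"]
    if adapter_name not in adapters:
        raise KeyError(f"adapter {adapter_name!r} not found in profile {profile_name!r}")
    return adapters[adapter_name]


staging = get_adapter(profile_yml, PROFILE, "postgres_staging")
print(staging["type"], staging["database"])
```

Because adapter names are keys of a mapping, each name can appear only once per profile, which is why the two PostgreSQL connections are distinguished as postgres_staging and postgres_prod.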

Prism supports the following adapters:

BigQuery
dbt
Postgres
PySpark
Snowflake
Redshift
Trino

The configuration for each adapter type is covered in the Integrations section. Alternatively, you can automatically add adapters to your profile via the prism connect command:

$ prism connect --type [bigquery | dbt | postgres | pyspark | snowflake | redshift | trino]

These connections can be accessed through the hooks argument in each task's run function. More on this later.

