Overview

Prism lets users connect to data warehouses via Connector objects. Connector objects are passed to the PrismProject at instantiation via the connectors keyword argument, and tasks can access them via CurrentRun.conn.

Note that connector objects are entirely optional. You can just as easily create a separate connection in each of your downstream tasks. However, we like using Connector instances for the following reasons:

  1. Connector instances are thread-safe. They automatically handle creating connection and cursor objects whenever you run SQL queries in a multi-threaded project.

  2. Connector instances can be shared across projects, facilitating collaboration and reducing duplicative code.
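
To make the thread-safety point concrete, here is a minimal sketch of one common way to get that behavior: keep one connection per thread and open a fresh cursor per query. This is not Prism's implementation. ThreadLocalConnector is a hypothetical stand-in, and SQLite is used only so the example runs without a warehouse.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


class ThreadLocalConnector:
    """Hypothetical stand-in for illustration; not Prism's Connector class."""

    def __init__(self, database: str):
        self._database = database
        # threading.local gives every thread its own attribute namespace.
        self._local = threading.local()

    def _connection(self) -> sqlite3.Connection:
        # Lazily open a connection the first time each thread asks for one.
        conn = getattr(self._local, "conn", None)
        if conn is None:
            conn = sqlite3.connect(self._database)
            self._local.conn = conn
        return conn

    def execute_sql(self, sql: str):
        # A fresh cursor per call, so concurrent calls never share cursor state.
        cursor = self._connection().cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()


connector = ThreadLocalConnector(":memory:")


def task(n: int) -> int:
    # n is an int produced by range(), so formatting it into the SQL is safe here.
    return connector.execute_sql(f"SELECT {n} * 2")[0][0]


# Several worker threads share the same connector object.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(5)))

print(results)
```

Each worker thread ends up with its own connection, so tasks can share a single connector object without coordinating with each other.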

Executing SQL

The Connector class has one public method: execute_sql. As the name suggests, this method allows users to execute SQL and return the results as a Python object.

Here is the full method definition:

Connector.execute_sql(
    sql: str,
    return_type: Optional[Literal["pandas"]],
) -> Union[pd.DataFrame, List[List[Any]]]:

This method takes two arguments:

  • sql: SQL query to execute

  • return_type: format of the returned data, either "pandas" or None. If "pandas", the result is converted to a Pandas DataFrame. If None, the result is returned as a list of tuples or dictionaries.
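
The effect of return_type can be sketched with a small stand-in function. This is an illustration under assumptions, not Prism's code: the free function execute_sql below, its conn parameter, the default of None for return_type, and the users table are all hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3
from typing import Literal, Optional


def execute_sql(
    conn: sqlite3.Connection,
    sql: str,
    return_type: Optional[Literal["pandas"]] = None,  # default is an assumption
):
    """Hypothetical stand-in that mirrors the two documented arguments."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        rows = cursor.fetchall()
        # cursor.description holds one entry per result column; item 0 is its name.
        columns = [col[0] for col in cursor.description]
    finally:
        cursor.close()

    if return_type == "pandas":
        # Imported lazily so pandas is only required for this branch.
        import pandas as pd

        return pd.DataFrame(rows, columns=columns)
    return rows


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])

# return_type=None: plain Python rows.
rows = execute_sql(conn, "SELECT id, name FROM users ORDER BY id")
print(rows)

# return_type="pandas" would return a DataFrame with columns id and name:
# df = execute_sql(conn, "SELECT id, name FROM users ORDER BY id", return_type="pandas")
```

In a Prism task, the equivalent call would go through the connector object obtained from CurrentRun.conn instead of a raw connection.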

The following Connector classes are currently available:

  • BigQueryConnector

  • PostgresConnector

  • RedshiftConnector

  • SnowflakeConnector

  • TrinoConnector

  • PrestoConnector
