Prism
v0.1.9rc2

Welcome to Prism!

These docs are current for version v0.1.9rc2.

Prism is the easiest way to create data pipelines in Python. With it, users can break down their data flows into modular tasks, manage dependencies, and execute complex computations in sequence.

Why use Prism?

Prism was built to streamline the development and deployment of complex data pipelines. Here are some of its main features:

  • Real-time dependency declaration: With Prism, users can declare dependencies using a simple function call. No need to explicitly keep track of the pipeline order β€” at runtime, Prism automatically parses the function calls and builds the dependency graph.

  • Intuitive logging: Prism automatically logs events for parsing the configuration files, compiling the tasks and creating the DAG, and executing the tasks. No configuration is required.

  • Flexible CLI: Users can instantiate, compile, and run projects using a simple, but powerful command-line interface.

  • β€œBatteries included”: Prism comes with all the essentials needed to get up and running quickly. Users can create and run their first DAG in less than 2 minutes.

  • Integrations: Prism integrates with several tools that are popular in the data community, including Snowflake, Google BigQuery, Redshift, PySpark, and dbt. We're adding more integrations every day, so let us know what you'd like to see!
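To make the "real-time dependency declaration" feature concrete, the sketch below shows conceptually how a dependency graph can be built from function calls alone. It is an illustrative approximation in plain Python, not Prism's actual implementation: it scans a module's source for `tasks.ref(...)` calls and collects the referenced modules.

```python
import ast

def find_refs(source: str) -> list[str]:
    """Collect the string arguments of tasks.ref(...) calls in a module's source.

    A DAG builder could call this on every module in the project and add an
    edge from each referenced module to the module that declared the ref.
    """
    refs = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "ref"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "tasks"
            and node.args
            and isinstance(node.args[0], ast.Constant)
        ):
            refs.append(node.args[0].value)
    return refs

# A hypothetical task module that depends on extract.py:
module_source = '''
class Transform(PrismTask):
    def run(self, tasks, hooks):
        df = tasks.ref("extract.py")
        return df
'''
print(find_refs(module_source))  # ['extract.py']
```

Because the declaration is an ordinary function call inside the task, the user never has to maintain a separate list of pipeline edges; the graph falls out of the code itself.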

What is a Prism project?

At minimum, a Prism project must contain two things:

  1. A directory of Python modules (named modules/), and

  2. A prism_project.py file for managing your Python environment

Here's how a typical Prism project is structured.

prism_project/
  β”œβ”€β”€ data/
  β”œβ”€β”€ dev/
  β”‚   └── dev.ipynb
  β”œβ”€β”€ output/
  β”œβ”€β”€ modules/
  β”‚   β”œβ”€β”€ extract.py
  β”‚   β”œβ”€β”€ transform.py
  β”‚   └── load.py
  β”œβ”€β”€ prism_project.py
  β”œβ”€β”€ profile.yml
  └── triggers.yml

As you can see, projects can also contain other items, like a data folder, an output folder, and a profile.yml file for connecting your project to external sources (more on this later).
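To make the extract/transform/load layout above concrete, here is a plain-Python sketch of what those three modules might conceptually do. The names and data are hypothetical, and real Prism modules define tasks rather than bare functions; this only illustrates the modular, in-sequence flow.

```python
# Illustrative stand-ins for modules/extract.py, transform.py, and load.py.
def extract():
    # Pull raw records from a source (hard-coded here for illustration).
    return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

def transform(rows):
    # Derive a new column from the raw records.
    return [{**row, "value_doubled": row["value"] * 2} for row in rows]

def load(rows):
    # "Load" by returning the final records; a real task might write them
    # to a warehouse or to a file in output/.
    return rows

# Prism would run these in dependency order; chaining them by hand here:
result = load(transform(extract()))
print(result)
```

Each stage only needs to know about the output of the stage it depends on, which is what makes the tasks modular and independently testable.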

Guides: Jump right in

Follow our handy guides to get started on the basics as quickly as possible.

If you have any feedback about the product or the docs, please let us know!


