EC2

Prerequisites

EC2 agents rely on the boto3 and botocore packages. These require you to set up authentication credentials for your AWS account. More information can be found here.

In addition, make sure your account has the privileges needed to create EC2 instances, security groups, and key pairs. Contact your account administrator if you need help with this.
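
Before creating an agent, it can help to confirm that boto3 resolves credentials for the account you expect. The sketch below is not part of Prism: `caller_identity` and `default_sts_client` are hypothetical helper names, and the code assumes boto3 is installed and credentials are configured through one of boto3's standard mechanisms (environment variables, shared credentials file, and so on).

```python
def caller_identity(sts_client):
    """Return (account_id, arn) for the credentials behind an STS client."""
    response = sts_client.get_caller_identity()
    return response["Account"], response["Arn"]


def default_sts_client():
    """Build an STS client from whatever credentials boto3 finds."""
    # Imported lazily so caller_identity works with any STS-like client.
    import boto3

    return boto3.client("sts")
```

Calling `caller_identity(default_sts_client())` should return your account ID and the ARN of the user or role in use; if boto3 cannot find credentials, the call raises an exception instead.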

IAM permissions

EC2 agents require your AWS credentials to list, modify, create, and delete certain resources (e.g., key pairs, security groups, and EC2 instances). Use the following policy document to provision your account with the necessary permissions:

IAM policy document
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetAccessKeyLastUsed",
            "Effect": "Allow",
            "Action": "iam:GetAccessKeyLastUsed",
            "Resource": "arn:aws:iam::*:user/${aws:username}"
        },
        {
            "Sid": "AllowGetUser",
            "Effect": "Allow",
            "Action": "iam:GetUser",
            "Resource": "arn:aws:iam::*:user/${aws:username}"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:Describe*",
                "ec2:GetConsole*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:RunInstances",
                "ec2:CreateTags"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:subnet/*",
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:instance/*",
                "arn:aws:ec2:*:*:volume/*",
                "arn:aws:ec2:*::image/ami-*",
                "arn:aws:ec2:*:*:key-pair/*",
                "arn:aws:ec2:*:*:security-group/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateKeyPair",
                "ec2:DeleteKeyPair",
                "ec2:CreateSecurityGroup",
                "ec2:DeleteSecurityGroup",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSecurityGroupRules",
                "ec2:DescribeTags"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:ModifySecurityGroupRules",
                "ec2:UpdateSecurityGroupRuleDescriptionsIngress",
                "ec2:UpdateSecurityGroupRuleDescriptionsEgress"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:security-group/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:ModifySecurityGroupRules"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:security-group-rule/*"
            ]
        }
    ]
}
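
If you save the policy to a file, a rough local check can tell you whether a given set of actions is covered by its Allow statements. This is only a sketch: the helper names are hypothetical, and it compares action names against wildcard patterns while ignoring Resource scoping, Condition blocks, and Deny statements, so it is not a substitute for IAM's own policy evaluation.

```python
import json
from fnmatch import fnmatchcase


def load_policy(path):
    """Read an IAM policy document from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def allowed_action_patterns(policy):
    """Collect every action pattern from the policy's Allow statements."""
    patterns = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        patterns.extend(actions)
    return patterns


def missing_actions(policy, required):
    """Return the required actions that no Allow pattern matches."""
    # IAM action names are case-insensitive, so compare in lowercase.
    patterns = [pattern.lower() for pattern in allowed_action_patterns(policy)]
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in patterns)
    ]
```

For example, `missing_actions(load_policy("policy.json"), ["ec2:RunInstances", "ec2:CreateKeyPair"])` returns an empty list when both actions appear in the saved policy (`policy.json` is a placeholder file name).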

Important: Prism will never touch any resource that it did not directly create. Rest assured, your infrastructure is safe :)

Configuring your EC2 agent

EC2 agents are configured using a YML file:

# ec2_agent.yml

agent:
  type: ec2
  instance_type: # instance type, e.g., t2.micro
  requirements:  # path to requirements.txt, relative to this file
  env:
    <env var 1>: 'env var 1 value'

The YML file has a single top-level key, agent. This key contains all the configurations needed for your agent. Specifically:

  • type: this will always be ec2 for EC2 agents

  • instance_type: the EC2 instance type. Note that different instance types are optimized for different use cases. The full list of EC2 instance types can be found here.

  • requirements: path to the requirements.txt file. This path must be relative to the path of the agent YML file.

  • env: key-value pairs representing environment variables to set on your EC2 instance
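
The keys described above can be sanity-checked before handing the file to Prism. The sketch below works on an already-parsed mapping (for instance, the result of loading the YML file with a YAML library). `validate_agent_config` is a hypothetical helper, not a Prism API, and the rule that instance_type must be non-empty is an assumption rather than documented behavior.

```python
def validate_agent_config(config):
    """Return a list of problems found in a parsed agent configuration."""
    if not isinstance(config, dict) or not isinstance(config.get("agent"), dict):
        return ["missing top-level 'agent' key"]

    agent = config["agent"]
    problems = []

    # The guide states that type is always 'ec2' for EC2 agents.
    if agent.get("type") != "ec2":
        problems.append("agent.type must be 'ec2' for EC2 agents")

    # Assumption: an empty instance_type is treated as an error here.
    if not agent.get("instance_type"):
        problems.append("agent.instance_type is missing or empty")

    requirements = agent.get("requirements")
    if requirements is not None and not isinstance(requirements, str):
        problems.append("agent.requirements must be a path string")

    env = agent.get("env")
    if env is not None and not isinstance(env, dict):
        problems.append("agent.env must map variable names to values")

    return problems
```

An empty list means the mapping has the shape this guide describes; it says nothing about whether the instance type exists or the requirements file is present.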

Example agent

For the remainder of this guide, let's assume that we have the following project.

etl_project/
  ├── prism_project.py
  ├── tasks/
  │   ├── extract.py
  │   ├── transform.py
  │   └── load.py
  ├── ec2_agent.yml
  ├── requirements.txt
  └── triggers.yml

And, let's assume that we defined our ec2_agent.yml as follows:

# ec2_agent.yml

agent:
  type: ec2
  instance_type: m7g.medium
  requirements: ./requirements.txt
  env:
    AWS_ACCESS_KEY_ID: "{{ env('AWS_ACCESS_KEY_ID') }}"
    AWS_SECRET_ACCESS_KEY: "{{ env('AWS_SECRET_ACCESS_KEY') }}"

Note that requirements is set to ./requirements.txt. This works because ec2_agent.yml and requirements.txt are in the same directory, and the path is resolved relative to the agent YML file.
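
To illustrate what "relative to the agent YML file" means, the following sketch resolves a requirements value against the directory that contains the agent file, not against the current working directory. `resolve_requirements` is a hypothetical helper; it mirrors the rule stated in this guide and is not taken from Prism's source.

```python
import os.path


def resolve_requirements(agent_yml_path, requirements):
    """Resolve a requirements path relative to the agent YML file's directory."""
    agent_dir = os.path.dirname(os.path.abspath(agent_yml_path))
    return os.path.normpath(os.path.join(agent_dir, requirements))
```

With the example project, `resolve_requirements("etl_project/ec2_agent.yml", "./requirements.txt")` points at `etl_project/requirements.txt` under the current directory, regardless of where the command is launched inside that tree.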

Creating your EC2 agent

You can build your EC2 agent with the command prism agent apply -f <path to agent YML> (note: the logs below are truncated for readability):

$ prism agent apply -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO  | Creating agent...
 
etl_project_ec2_agent[build] | Created key pair etl_project_ec2_agent
etl_project_ec2_agent[build] | Created security group with ID sg-XXXXXXXXXXXXXXXXX in VPC vpc-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Created EC2 instance with ID i-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | ssh: connect to host ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com port 22: Connection refused
etl_project_ec2_agent[build] | SSH connection failed. Retrying in 5 seconds...
The authenticity of host 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com (XX.XXX.XXX.XXX)' can't be established.
ED25519 key fingerprint is SHA256:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
etl_project_ec2_agent[build] | Warning: Permanently added 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com' (ED25519) to the list of known hosts.
etl_project_ec2_agent[build] | SSH connection succeeded!
etl_project_ec2_agent[build] |    ,     #_
etl_project_ec2_agent[build] |    ~\_  ####_        Amazon Linux 2023
etl_project_ec2_agent[build] |   ~~  \_#####\
etl_project_ec2_agent[build] |   ~~     \###|
etl_project_ec2_agent[build] |   ~~       \#/ ___   https://aws.amazon.com/linux/amazon-linux-2023
etl_project_ec2_agent[build] |    ~~       V~' '->
etl_project_ec2_agent[build] |     ~~~         /
etl_project_ec2_agent[build] |       ~~._.   _/
etl_project_ec2_agent[build] |          _/ _/
etl_project_ec2_agent[build] |        _/m/'
etl_project_ec2_agent[build] | Requirement already satisfied: pip in ./.venv/etl_project/lib/python3.9/site-packages (21.3.1)
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | Successfully installed ...
etl_project_ec2_agent[build] | Updating remote project and file paths
etl_project_ec2_agent[build] | Done updating remote project and file paths
--------------------------------------------------------------------------------

If you check your EC2 console, you'll see an EC2 instance called etl_project_ec2_agent!
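
The same check can be done from Python instead of the console. The sketch below assumes the instance name shown in the console is stored in the instance's `Name` tag, which is how the EC2 console displays names but is not confirmed by this guide for Prism specifically. `find_instances_by_name` is a hypothetical helper, and it reads only the first page of results.

```python
def find_instances_by_name(ec2_client, name):
    """Return (instance_id, state) pairs for instances whose Name tag matches."""
    response = ec2_client.describe_instances(
        Filters=[{"Name": "tag:Name", "Values": [name]}]
    )
    return [
        (instance["InstanceId"], instance["State"]["Name"])
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
```

With boto3 installed, `find_instances_by_name(boto3.client("ec2"), "etl_project_ec2_agent")` should list the agent's instance along with its current state, such as `pending` or `running`.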

Running your EC2 agent

There are two commands you can use to run your EC2 agent.

prism agent run

The prism agent run command runs your project on your EC2 instance and streams the logs back to your terminal:

$ prism agent run -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO  | Streaming agent logs...

etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | Running with prism v0.2.1...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | Found project directory at /etl_project
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'parsing prism_project.py'................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'parsing prism_project.py'............................................... [DONE in 1.42s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'task DAG'................................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'task DAG'............................................................... [DONE in 0.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'creating pipeline, DAG executor'......................................... [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'creating pipeline, DAG executor'........................................ [DONE in 0.01s]
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | ======================= tasks 'finicky-macaw-JaEvjyMWtb' =======================
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 1 of 3 RUNNING EVENT 'extract.Extract'...................................................[RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 1 of 3 FINISHED EVENT 'extract.Extract'................................................. [DONE in 120.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 2 of 3 RUNNING EVENT 'transform.Transform'.............................................. [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 2 of 3 FINISHED EVENT 'transform.Transform'............................................. [DONE in 791.38s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 3 of 3 RUNNING EVENT 'load.Load'........................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 3 of 3 FINISHED EVENT 'load.Load'....................................................... [DONE in 7.84s]

<HH:MM:DD> | INFO  | Done streaming agent logs...
--------------------------------------------------------------------------------

prism agent build

The prism agent build command updates your project files on your EC2 instance and then executes the project on it:

$ prism agent build -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO  | Creating agent...
 
etl_project_ec2_agent[build] | Created key pair etl_project_ec2_agent
etl_project_ec2_agent[build] | Created security group with ID sg-XXXXXXXXXXXXXXXXX in VPC vpc-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Created EC2 instance with ID i-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | ssh: connect to host ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com port 22: Connection refused
etl_project_ec2_agent[build] | SSH connection failed. Retrying in 5 seconds...
The authenticity of host 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com (XX.XXX.XXX.XXX)' can't be established.
ED25519 key fingerprint is SHA256:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
etl_project_ec2_agent[build] | Warning: Permanently added 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com' (ED25519) to the list of known hosts.
etl_project_ec2_agent[build] | SSH connection succeeded!
etl_project_ec2_agent[build] |    ,     #_
etl_project_ec2_agent[build] |    ~\_  ####_        Amazon Linux 2023
etl_project_ec2_agent[build] |   ~~  \_#####\
etl_project_ec2_agent[build] |   ~~     \###|
etl_project_ec2_agent[build] |   ~~       \#/ ___   https://aws.amazon.com/linux/amazon-linux-2023
etl_project_ec2_agent[build] |    ~~       V~' '->
etl_project_ec2_agent[build] |     ~~~         /
etl_project_ec2_agent[build] |       ~~._.   _/
etl_project_ec2_agent[build] |          _/ _/
etl_project_ec2_agent[build] |        _/m/'
etl_project_ec2_agent[build] | Requirement already satisfied: pip in ./.venv/etl_project/lib/python3.9/site-packages (21.3.1)
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | Successfully installed ...
etl_project_ec2_agent[build] | Updating remote project and file paths
etl_project_ec2_agent[build] | Done updating remote project and file paths

<HH:MM:DD> | INFO  | Streaming agent logs...

etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | Running with prism v0.2.1...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | Found project directory at /etl_project
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'parsing prism_project.py'................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'parsing prism_project.py'............................................... [DONE in 1.42s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'task DAG'................................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'task DAG'............................................................... [DONE in 0.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | RUNNING EVENT 'creating pipeline, DAG executor'......................................... [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | FINISHED EVENT 'creating pipeline, DAG executor'........................................ [DONE in 0.01s]
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | ======================= tasks 'finicky-macaw-JaEvjyMWtb' =======================
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 1 of 3 RUNNING EVENT 'extract.Extract'...................................................[RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 1 of 3 FINISHED EVENT 'extract.Extract'................................................. [DONE in 120.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 2 of 3 RUNNING EVENT 'transform.Transform'.............................................. [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 2 of 3 FINISHED EVENT 'transform.Transform'............................................. [DONE in 791.38s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 3 of 3 RUNNING EVENT 'load.Load'........................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO  | 3 of 3 FINISHED EVENT 'load.Load'....................................................... [DONE in 7.84s]

<HH:MM:DD> | INFO  | Done streaming agent logs...
--------------------------------------------------------------------------------
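
If you want to trigger these commands from a script (for example, a scheduled job), one option is to wrap the CLI with subprocess. This is a minimal sketch: the helper names are hypothetical, it only knows the three subcommands shown in this guide, and it assumes the prism executable is on your PATH. Whether the CLI returns a non-zero exit code on failure is not stated here, so treat the return value as informational.

```python
import subprocess

# Subcommands of `prism agent` that appear in this guide.
KNOWN_ACTIONS = ("apply", "run", "build")


def prism_agent_command(action, agent_yml):
    """Build the argument list for `prism agent <action> -f <agent_yml>`."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["prism", "agent", action, "-f", agent_yml]


def run_prism_agent(action, agent_yml):
    """Run the command, letting its logs stream to the terminal."""
    completed = subprocess.run(prism_agent_command(action, agent_yml), check=False)
    return completed.returncode
```

For example, `run_prism_agent("build", "./ec2_agent.yml")` is equivalent to typing the prism agent build command above.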
