EC2
Prerequisites
EC2 agents rely on the boto3 and botocore packages. These require you to set up authentication credentials for your AWS account. More information can be found here.
In addition, you'll need to make sure your account has the necessary privileges to create EC2 instances, security groups, and key pairs. Contact your account administrator if you need help with this.
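Before creating an agent, it can help to confirm that boto3 can actually find working credentials. The sketch below is one way to do that with the STS `get_caller_identity` call; the helper name `check_aws_credentials` is hypothetical and is not part of Prism.

```python
def check_aws_credentials(sts_client=None):
    """Return the caller's ARN if the credentials work, otherwise None.

    `sts_client` is optional so that a stub can be passed in for testing.
    """
    try:
        if sts_client is None:
            # Imported lazily so the helper can be exercised without boto3 installed.
            import boto3

            sts_client = boto3.client("sts")
        identity = sts_client.get_caller_identity()
    except Exception as exc:  # missing, expired, or invalid credentials
        print(f"AWS credentials are not usable: {exc}")
        return None
    return identity["Arn"]
```

Calling `check_aws_credentials()` with no arguments uses whatever credentials boto3 resolves from your environment, shared credentials file, or instance role.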
IAM permissions
EC2 agents require your AWS credentials to list, modify, create, and delete certain resources (e.g., key pairs, security groups, EC2 instances, etc.). Use the following policy document to provision your account with the necessary permissions:
IAM policy document
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGetAccessKeyLastUsed",
      "Effect": "Allow",
      "Action": "iam:GetAccessKeyLastUsed",
      "Resource": "arn:aws:iam::*:user/${aws:username}"
    },
    {
      "Sid": "AllowGetUser",
      "Effect": "Allow",
      "Action": "iam:GetUser",
      "Resource": "arn:aws:iam::*:user/${aws:username}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:Describe*",
        "ec2:GetConsole*"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances",
        "ec2:CreateTags"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:subnet/*",
        "arn:aws:ec2:*:*:network-interface/*",
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*",
        "arn:aws:ec2:*::image/ami-*",
        "arn:aws:ec2:*:*:key-pair/*",
        "arn:aws:ec2:*:*:security-group/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateKeyPair",
        "ec2:DeleteKeyPair",
        "ec2:CreateSecurityGroup",
        "ec2:DeleteSecurityGroup",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSecurityGroupRules",
        "ec2:DescribeTags"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:RevokeSecurityGroupEgress",
        "ec2:ModifySecurityGroupRules",
        "ec2:UpdateSecurityGroupRuleDescriptionsIngress",
        "ec2:UpdateSecurityGroupRuleDescriptionsEgress"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:security-group/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:ModifySecurityGroupRules"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:security-group-rule/*"
      ]
    }
  ]
}
Important: Prism will never touch any resource that it did not directly create. Rest assured, your infrastructure is safe :)
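If you adapt the policy document, a quick local check can catch an action that was dropped by accident. The sketch below is a rough illustration, not an IAM evaluator: it only looks at the `Action` patterns of `Allow` statements and ignores `Resource`, `Condition`, and any `Deny` statements. The function names and the sample inputs are hypothetical.

```python
import fnmatch


def allowed_action_patterns(policy):
    """Collect every Action pattern that appears in an Allow statement."""
    patterns = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        patterns.extend(actions)
    return patterns


def missing_actions(policy, required):
    """Return the required actions that no Allow pattern matches."""
    # IAM action names are case-insensitive, so compare in lowercase.
    patterns = [p.lower() for p in allowed_action_patterns(policy)]
    return [
        action
        for action in required
        if not any(fnmatch.fnmatchcase(action.lower(), p) for p in patterns)
    ]


if __name__ == "__main__":
    sample_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["ec2:Describe*"], "Resource": "*"},
            {"Effect": "Allow", "Action": "iam:GetUser", "Resource": "*"},
        ],
    }
    print(missing_actions(sample_policy, ["ec2:DescribeInstances", "ec2:CreateKeyPair"]))
```

To check a real policy, load it with `json.load` and pass the list of actions you expect it to grant.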
Configuring your EC2 agent
EC2 agents are configured using a YML file:
# ec2_agent.yml
agent:
  type: ec2
  instance_type: # instance type, e.g., t2.micro
  requirements: # path to requirements.txt, relative to this file
  env:
    <env var 1>: 'env var 1 value'
The YML file has a single top-level key, agent. This key contains all the configuration needed for your agent. Specifically:
- type: this will always be ec2 for EC2 agents.
- instance_type: the EC2 instance type. Note that different instance types are optimized for different use cases. The full list of EC2 instance types can be found here.
- requirements: path to the requirements.txt file. This path must be relative to the path of the agent YML file.
- env: {key, value} pairs representing environment variables to set on your EC2 instance.
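The structure described above can be checked mechanically once the YML file has been parsed into a dictionary. The following is a minimal sketch under stated assumptions: `validate_agent_config` is a hypothetical helper, not Prism's own validation, and it assumes that `requirements` and `env` are optional, which the documentation does not confirm.

```python
def validate_agent_config(config):
    """Return a list of problems found in a parsed agent config (empty if none)."""
    agent = config.get("agent") if isinstance(config, dict) else None
    if not isinstance(agent, dict):
        return ["missing top-level 'agent' mapping"]

    problems = []
    if agent.get("type") != "ec2":
        problems.append("'type' must be 'ec2'")

    instance_type = agent.get("instance_type")
    if not isinstance(instance_type, str) or not instance_type:
        problems.append("'instance_type' must be a non-empty string")

    # Assumption: 'requirements' and 'env' may be omitted.
    if "requirements" in agent and not isinstance(agent["requirements"], str):
        problems.append("'requirements' must be a path string")
    if not isinstance(agent.get("env") or {}, dict):
        problems.append("'env' must be a mapping of names to values")
    return problems


if __name__ == "__main__":
    parsed = {"agent": {"type": "ec2", "instance_type": "t2.micro"}}
    print(validate_agent_config(parsed))
```

The dictionary would normally come from a YAML parser; it is written out by hand here to keep the example free of third-party dependencies.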
Example agent
For the remainder of this guide, let's assume that we have the following project.
etl_project/
├── prism_project.py
├── tasks/
│ ├── extract.py
│ ├── transform.py
│ └── load.py
├── ec2_agent.yml
├── requirements.txt
└── triggers.yml
And let's assume that we defined our ec2_agent.yml as follows:
# ec2_agent.yml
agent:
  type: ec2
  instance_type: m7g.medium
  requirements: ./requirements.txt
  env:
    AWS_ACCESS_KEY_ID: "{{ env('AWS_ACCESS_KEY_ID') }}"
    AWS_SECRET_ACCESS_KEY: "{{ env('AWS_SECRET_ACCESS_KEY') }}"
Note the way we specified the requirements key. We can do this because ec2_agent.yml and requirements.txt are in the same directory.
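To make the "relative to the agent YML file" rule concrete, here is a small sketch of how such a path can be resolved with pathlib. The helper `resolve_requirements` is hypothetical and only illustrates the rule; it is not taken from Prism's source.

```python
from pathlib import Path


def resolve_requirements(agent_yml, requirements):
    """Resolve `requirements` against the directory that contains the agent YML.

    The current working directory does not matter, only the YML's location.
    """
    yml_dir = Path(agent_yml).resolve().parent
    return (yml_dir / requirements).resolve()


if __name__ == "__main__":
    # With the example project, this points at etl_project/requirements.txt.
    print(resolve_requirements("etl_project/ec2_agent.yml", "./requirements.txt"))
```

If requirements.txt lived one directory above the YML file, the value in the YML would be ../requirements.txt instead.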
Creating your EC2 agent
You can build your EC2 agent using the command prism agent apply -f <path to agent YML> (note: logs are truncated for visibility):
$ prism agent apply -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO | Creating agent...
etl_project_ec2_agent[build] | Created key pair etl_project_ec2_agent
etl_project_ec2_agent[build] | Created security group with ID sg-XXXXXXXXXXXXXXXXX in VPC vpc-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Created EC2 instance with ID i-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | ssh: connect to host ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com port 22: Connection refused
etl_project_ec2_agent[build] | SSH connection failed. Retrying in 5 seconds...
The authenticity of host 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com (XX.XXX.XXX.XXX)' can't be established.
ED25519 key fingerprint is SHA256:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
etl_project_ec2_agent[build] | Warning: Permanently added 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com' (ED25519) to the list of known hosts.
etl_project_ec2_agent[build] | SSH connection succeeded!
etl_project_ec2_agent[build] | , #_
etl_project_ec2_agent[build] | ~\_ ####_ Amazon Linux 2023
etl_project_ec2_agent[build] | ~~ \_#####\
etl_project_ec2_agent[build] | ~~ \###|
etl_project_ec2_agent[build] | ~~ \#/ ___ https://aws.amazon.com/linux/amazon-linux-2023
etl_project_ec2_agent[build] | ~~ V~' '->
etl_project_ec2_agent[build] | ~~~ /
etl_project_ec2_agent[build] | ~~._. _/
etl_project_ec2_agent[build] | _/ _/
etl_project_ec2_agent[build] | _/m/'
etl_project_ec2_agent[build] | Requirement already satisfied: pip in ./.venv/etl_project/lib/python3.9/site-packages (21.3.1)
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | Successfully installed ...
etl_project_ec2_agent[build] | Updating remote project and file paths
etl_project_ec2_agent[build] | Done updating remote project and file paths
--------------------------------------------------------------------------------
If you check your EC2 console, you'll see an EC2 instance called etl_project_ec2_agent!
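The same check can be done from Python instead of the console. The sketch below uses the boto3 EC2 client's `describe_instances` call and assumes that the name shown in the console is stored in the instance's Name tag, which the logs above do not confirm. The helper `find_instances_by_name` is hypothetical, and it reads only the first page of results.

```python
def find_instances_by_name(name, ec2_client=None):
    """Return (instance_id, state) pairs for instances whose Name tag equals `name`."""
    if ec2_client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        ec2_client = boto3.client("ec2")

    # Assumption: the agent's name is stored in the instance's Name tag.
    response = ec2_client.describe_instances(
        Filters=[{"Name": "tag:Name", "Values": [name]}]
    )
    return [
        (instance["InstanceId"], instance["State"]["Name"])
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
```

For the example project, `find_instances_by_name("etl_project_ec2_agent")` would list the matching instance IDs together with their current state, such as pending or running.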
Running your EC2 agent
There are two commands you can use to run your EC2 agent.
prism agent run
The prism agent run command runs your project on your EC2 instance and streams the logs back to your terminal:
$ prism agent run -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO | Streaming agent logs...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | Running with prism v0.2.0...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | Found project directory at /etl_project
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'parsing prism_project.py'................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'parsing prism_project.py'............................................... [DONE in 1.42s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'task DAG'................................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'task DAG'............................................................... [DONE in 0.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'creating pipeline, DAG executor'......................................... [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'creating pipeline, DAG executor'........................................ [DONE in 0.01s]
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | ======================= tasks 'finicky-macaw-JaEvjyMWtb' =======================
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 1 of 3 RUNNING EVENT 'extract.Extract'...................................................[RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 1 of 3 FINISHED EVENT 'extract.Extract'................................................. [DONE in 120.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 2 of 3 RUNNING EVENT 'transform.Transform'.............................................. [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 2 of 3 FINISHED EVENT 'transform.Transform'............................................. [DONE in 791.38s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 3 of 3 RUNNING EVENT 'load.Load'........................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 3 of 3 FINISHED EVENT 'load.Load'....................................................... [DONE in 7.84s]
<HH:MM:DD> | INFO | Done streaming agent logs...
--------------------------------------------------------------------------------
prism agent build
The prism agent build command updates your project files on your EC2 instance and then runs the project on it:
$ prism agent build -f ./ec2_agent.yml
--------------------------------------------------------------------------------
<HH:MM:DD> | INFO | Creating agent...
etl_project_ec2_agent[build] | Created key pair etl_project_ec2_agent
etl_project_ec2_agent[build] | Created security group with ID sg-XXXXXXXXXXXXXXXXX in VPC vpc-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Created EC2 instance with ID i-XXXXXXXXXXXXXXXXX
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | Instance i-XXXXXXXXXXXXXXXXX is `pending`... checking again in 5 seconds
etl_project_ec2_agent[build] | ssh: connect to host ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com port 22: Connection refused
etl_project_ec2_agent[build] | SSH connection failed. Retrying in 5 seconds...
The authenticity of host 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com (XX.XXX.XXX.XXX)' can't be established.
ED25519 key fingerprint is SHA256:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
etl_project_ec2_agent[build] | Warning: Permanently added 'ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com' (ED25519) to the list of known hosts.
etl_project_ec2_agent[build] | SSH connection succeeded!
etl_project_ec2_agent[build] | , #_
etl_project_ec2_agent[build] | ~\_ ####_ Amazon Linux 2023
etl_project_ec2_agent[build] | ~~ \_#####\
etl_project_ec2_agent[build] | ~~ \###|
etl_project_ec2_agent[build] | ~~ \#/ ___ https://aws.amazon.com/linux/amazon-linux-2023
etl_project_ec2_agent[build] | ~~ V~' '->
etl_project_ec2_agent[build] | ~~~ /
etl_project_ec2_agent[build] | ~~._. _/
etl_project_ec2_agent[build] | _/ _/
etl_project_ec2_agent[build] | _/m/'
etl_project_ec2_agent[build] | Requirement already satisfied: pip in ./.venv/etl_project/lib/python3.9/site-packages (21.3.1)
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | ...
etl_project_ec2_agent[build] | Successfully installed ...
etl_project_ec2_agent[build] | Updating remote project and file paths
etl_project_ec2_agent[build] | Done updating remote project and file paths
<HH:MM:DD> | INFO | Streaming agent logs...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | Running with prism v0.2.0...
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | Found project directory at /etl_project
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'parsing prism_project.py'................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'parsing prism_project.py'............................................... [DONE in 1.42s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'task DAG'................................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'task DAG'............................................................... [DONE in 0.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | RUNNING EVENT 'creating pipeline, DAG executor'......................................... [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | FINISHED EVENT 'creating pipeline, DAG executor'........................................ [DONE in 0.01s]
etl_project_ec2_agent[run] |
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | ======================= tasks 'finicky-macaw-JaEvjyMWtb' =======================
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 1 of 3 RUNNING EVENT 'extract.Extract'...................................................[RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 1 of 3 FINISHED EVENT 'extract.Extract'................................................. [DONE in 120.01s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 2 of 3 RUNNING EVENT 'transform.Transform'.............................................. [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 2 of 3 FINISHED EVENT 'transform.Transform'............................................. [DONE in 791.38s]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 3 of 3 RUNNING EVENT 'load.Load'........................................................ [RUN]
etl_project_ec2_agent[run] | <HH:MM:DD> | INFO | 3 of 3 FINISHED EVENT 'load.Load'....................................................... [DONE in 7.84s]
<HH:MM:DD> | INFO | Done streaming agent logs...
--------------------------------------------------------------------------------