So far, our application just stores information in a database. In this tutorial, we are going to develop the business logic that recommends placing long or short positions. Effectively, this will be an asynchronous service running in AWS Lambda, triggered by AWS CloudWatch every 30 minutes.
Extract, Process, Notify: Building an AWS Lambda Handler Function
AWS Lambda expects a function that receives an `event` and a Lambda `context`. Start by creating a `.py` file and pasting the code below:

```python
def handler(event, context):
    pass
```
We won't even use the function's input or output yet, but we must keep both arguments in the signature, or the invocation will fail when Lambda calls the function.
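If you're curious about those arguments: `event` carries the trigger's payload (for a scheduled trigger, a JSON document describing the CloudWatch event) and `context` exposes runtime metadata. A small illustrative sketch, not something our bot actually needs:

```python
def handler(event, context):
    # event: dict with the payload from whatever triggered the invocation
    # context: runtime metadata object injected by Lambda
    print(event)                                   # the raw trigger payload
    print(context.function_name)                   # the deployed function's name
    print(context.get_remaining_time_in_millis())  # ms left before the timeout
```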
Extract from DynamoDB
Initially, the handler has to read from our DynamoDB table. As you've seen in the first part, in order to call DynamoDB, start by defining the resource:

```python
import boto3

dynamodb = boto3.resource('dynamodb')
```
Now, we get all the data from our `bot` table by calling its `scan` method.

```python
# this goes inside the handler()
data = dynamodb.Table('bot').scan(Limit=1000)
all_symbols = [item["symbol"] for item in data["Items"]]
```
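One caveat worth knowing: `scan` returns at most 1 MB of data per call, so the snippet above silently truncates larger tables. Our watchlist is tiny, but if you ever store more symbols, a paginated variant would look roughly like this (a sketch, not something this project needs yet):

```python
# follow LastEvaluatedKey until the scan is exhausted
table = dynamodb.Table("bot")
response = table.scan(Limit=1000)
items = response["Items"]
while "LastEvaluatedKey" in response:
    response = table.scan(Limit=1000, ExclusiveStartKey=response["LastEvaluatedKey"])
    items.extend(response["Items"])
all_symbols = [item["symbol"] for item in items]
```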
Extract from Yahoo Finance
I'll use Yahoo Finance as our data source. It's fast, reliable, and quite simple to request data from, since we are going to use `yfinance` and `pandas_datareader`. Yeah, we'll be using `pandas`; this is going to add "a layer of complexity" to our deploy process, but it's going to be worth it in the long run.
Start with `pip install yfinance pandas_datareader` and then import them:

```python
import yfinance
from pandas_datareader import data as pdr

yfinance.pdr_override()
```
The last line might look awkward, but it just replaces the `pandas_datareader` Yahoo downloader with the `yfinance` downloader, which is faster and more flexible. Now, just add the download command to the `handler` function:

```python
df = pdr.get_data_yahoo(" ".join(all_symbols), period="1d", interval="30m")
```
The line above downloads the last day of data for every symbol in our database, in 30-minute candlesticks. A candlestick is nothing more than a tuple of open, high, low, and close prices, representing how the price behaved within a time interval.
Process the Data
I won't develop any fancy strategy here. Actually, all we are going to do is measure a single indicator and form an opinion from its value. The purpose of this tutorial is not to teach technical analysis, so I encourage you to look for a strategy yourself and implement it here instead.
I am going to start with `pip install ta`, a simple technical analysis library built on `pandas`. With it installed, paste the snippet below into the `handler` function.

```python
from ta.momentum import RSIIndicator  # this import goes at the top of the file

for symbol, close in df["Close"].iteritems():
    # 14-period RSI; newer versions of ta name this argument window= instead of n=
    rsi = RSIIndicator(close, n=14).rsi()
    last = rsi.iloc[-1]
    if last < 20:
        action = "buy"
    elif last > 80:
        action = "sell"
    else:
        continue
```
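A pitfall to watch for: with several tickers, `df["Close"]` is a DataFrame with one column per symbol, but if the table holds a single ticker, yfinance returns flat columns and `df["Close"]` is a plain Series, so the loop above would iterate timestamps instead of symbols. A defensive sketch, assuming the `all_symbols` list from earlier (the helper name is mine, not part of the project):

```python
import pandas as pd

def close_frame(df: pd.DataFrame, all_symbols: list) -> pd.DataFrame:
    """Return closing prices as a DataFrame with one column per symbol."""
    close = df["Close"]
    if isinstance(close, pd.Series):
        # single symbol: yfinance returned flat columns, so wrap the Series
        close = close.to_frame(name=all_symbols[0])
    return close
```

With this, the loop would iterate `close_frame(df, all_symbols).iteritems()` instead of `df["Close"].iteritems()`.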
Notify in Discord with Webhook
As you can see, we are doing nothing with the `action` variable yet. We want to send a notification to Discord whenever we have an action. We cannot use the current bot for this, as this service is supposed to run asynchronously while the bot is reactive. The solution is a webhook.

A webhook is a URL we can post information to, and Discord will ship that information to a channel we specify. Getting your webhook URL is really simple (here's a tutorial for it), and it looks like this:

```
https://discord.com/api/webhooks/{webhook_id}/{webhook_token}
```
This is sensitive information, so you must save it in `.env`. Now just import `decouple` to use it safely in the function:

```python
from decouple import config

DISCORD_WEBHOOK_URL = config("DISCORD_WEBHOOK_URL")
```
Next, we import `requests` and, inside the `for` loop in `handler`, make a POST request whenever there's an `action`.

```python
import requests  # this import goes at the top of the file

requests.post(DISCORD_WEBHOOK_URL, json={"content": f"time to {action} {symbol}", "tts": False})
```

Here, we didn't have to `pip install requests` because `yfinance` already installs it as a dependency, which is why we can import it directly.
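For reference, here is roughly how all the pieces above fit together in `main.py`. This is a sketch assembled from the snippets in this section; the `raise_for_status()` call is an extra safeguard I'd add so failed webhook posts show up in the Lambda logs:

```python
import boto3
import requests
import yfinance
from decouple import config
from pandas_datareader import data as pdr
from ta.momentum import RSIIndicator

yfinance.pdr_override()

dynamodb = boto3.resource("dynamodb")
DISCORD_WEBHOOK_URL = config("DISCORD_WEBHOOK_URL")


def handler(event, context):
    # 1. extract: get the watched symbols from DynamoDB
    data = dynamodb.Table("bot").scan(Limit=1000)
    all_symbols = [item["symbol"] for item in data["Items"]]

    # 2. extract: download a day of 30-minute candlesticks from Yahoo Finance
    df = pdr.get_data_yahoo(" ".join(all_symbols), period="1d", interval="30m")

    # 3. process and notify: compute the RSI and post to Discord on extremes
    for symbol, close in df["Close"].iteritems():
        rsi = RSIIndicator(close, n=14).rsi()
        last = rsi.iloc[-1]
        if last < 20:
            action = "buy"
        elif last > 80:
            action = "sell"
        else:
            continue
        resp = requests.post(
            DISCORD_WEBHOOK_URL,
            json={"content": f"time to {action} {symbol}", "tts": False},
        )
        resp.raise_for_status()  # surface webhook failures in the logs
```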
🏗️ Deploying the function to AWS Lambda
After programming the function, there are several services we have to coordinate in order to make it work as we expect. Besides the code, I am going to refactor the code structure as well, so we can use Terraform modules and Terraform Cloud in an organized manner. You don't have to follow it; just put the Terraform code in `.tf` files located in the same directory and it should work. Still, I recommend taking a look at the GitHub project if you find yourself stuck with anything.
Static Files in S3 Bucket
We are going to use AWS S3 to store the code files. This is not required, but I think it simplifies future deploys if the codebase surpasses the Lambda size limit. That said, provisioning an S3 bucket is quite simple with Terraform.
resource "aws_s3_bucket" "static_files" {
bucket = "backend-static-files"
server_side_encryption_configuration {
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
tags = merge(var.tags, {
name = "backend-bucket"
})
}
None of the fields above are required, but I’d recommend naming the bucket and using server-side encryption anyway. Tags are good practice.
IAM Policy for S3 Bucket
For increased security, we will attach a policy to our S3 bucket. This policy document tells the bucket to deny all `s3` actions that don't use SSL, which is very important for any request over the internet. We can write a policy document in a simple manner with the `aws_iam_policy_document` data source.
data "aws_iam_policy_document" "s3_bucket_policy_document" {
// denies all s3 access without ssl
// complies with s3-bucket-ssl-requests-only rule
statement {
effect = "Deny"
actions = [
"s3:*"
]
resources = ["arn:aws:s3:::${aws_s3_bucket.static_files.bucket}/*"]
# anonymous user
principals {
identifiers = [
"*"
]
type = "*"
}
condition {
test = "Bool"
values = [
false
]
variable = "aws:SecureTransport"
}
}
}
After creating the policy document, we have to attach it to our bucket using the `aws_s3_bucket_policy` resource:

```hcl
resource "aws_s3_bucket_policy" "static_files_policy" {
  bucket = aws_s3_bucket.static_files.bucket
  policy = data.aws_iam_policy_document.s3_bucket_policy_document.json
}
```
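If you want to sanity-check the attachment after `terraform apply`, here is a quick one-off with `boto3` (a sketch, assuming the bucket name provisioned above):

```python
import boto3

# fetch and print the policy terraform attached to the bucket
s3 = boto3.client("s3")
print(s3.get_bucket_policy(Bucket="backend-static-files")["Policy"])
```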
AWS Lambda Layer for Project Dependencies
AWS Lambda has a low size limit on the code we can deploy for a single function. Besides, native dependencies must be compiled for the Lambda environment (Amazon Linux), which matters for our project since it uses libs with C extensions such as `pandas`. The solution is to deploy a Lambda Layer, which is a reusable collection of code and dependencies. This really adds complexity to our deploy, but we are going forward with it anyway.
For starters, for the function we are deploying, this is the `requirements.txt` file:

```
python-decouple
pandas_datareader
requests
ta
yfinance
```
Despite the fact that we are using `boto3` in our project, we don't need to put it in the layer because AWS Lambda injects it when we deploy the function. Now, we have to make sure to install the dependencies inside a `python/lib/python3.8/site-packages/` path. I will put it inside the `deploy` directory so that we can reference it in Terraform easily.
```sh
python3 -m pip install \
    -r requirements.txt \
    -t deploy/lambda_layer/python/lib/python3.8/site-packages/
```
With the dependencies installed, create a new `lambda_layer.tf` file and let's start it by adding some local variables, just for the sake of organization:

```hcl
locals {
  name        = "python-layer"
  description = "python dependencies from requirements.txt"
  source_dir  = "${path.module}/lambda_layer/"
  filename    = "lambda_layer.zip"
}
```
Next, as we have to send a `.zip` file to AWS, we can add an `archive_file` data source to zip the dependencies directory:

```hcl
data "archive_file" "requirements_zip_package" {
  type        = "zip"
  source_dir  = local.source_dir
  output_path = local.filename
}
```
Then, we put the `lambda_layer.zip` file in our S3 bucket with an `aws_s3_bucket_object` resource:

```hcl
resource "aws_s3_bucket_object" "layer_package_s3_object" {
  bucket = aws_s3_bucket.static_files.id
  key    = data.archive_file.requirements_zip_package.output_path
  source = data.archive_file.requirements_zip_package.output_path
  etag   = data.archive_file.requirements_zip_package.output_md5
}
```
Finally, we provision the layer with `aws_lambda_layer_version`. It's important we tell it our Python runtime version, as we can define up to 5 different runtimes for a single layer:

```hcl
resource "aws_lambda_layer_version" "python_dependencies_layer" {
  layer_name  = local.name
  description = local.description
  s3_bucket   = aws_s3_bucket.static_files.bucket
  s3_key      = aws_s3_bucket_object.layer_package_s3_object.key
  # hash of the zip so terraform re-publishes the layer when dependencies change
  source_code_hash    = data.archive_file.requirements_zip_package.output_base64sha256
  compatible_runtimes = [
    "python3.8"
  ]
}
```
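After applying, you can confirm the layer was published, for instance with a `boto3` one-off (a sketch; the layer name matches `local.name` above):

```python
import boto3

# list the published versions of our dependencies layer
client = boto3.client("lambda")
for version in client.list_layer_versions(LayerName="python-layer")["LayerVersions"]:
    print(version["Version"], version["CompatibleRuntimes"])
```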
AWS Lambda Function IAM Role
When provisioning AWS services that communicate with each other, it's mandatory that we specify an IAM role with the required permissions.
First of all, the function has to "assume the role", which means it needs permission to use AWS resources on our behalf. You can read more about it here.
data "aws_iam_policy_document" "lambda_assume_role_policy" {
statement {
actions = [
"sts:AssumeRole"
]
principals {
type = "Service"
identifiers = [
"lambda.amazonaws.com"
]
}
}
}
Just like with our S3 bucket, this is only a policy document, so we have to attach it to something: our IAM role.
resource "aws_iam_role" "lambda_role" {
name = "backend-lambda-role"
path = "/system/"
assume_role_policy = data.aws_iam_policy_document.lambda_assume_role_policy.json
lifecycle {
create_before_destroy = true
}
tags = merge(var.tags, {
name = "backend-lambda-role"
})
}
If you find the `lifecycle` property strange, I agree with you, but it's there to avoid a strange bug I faced where my role would not be created. Maybe it is already fixed by the time you read this. Besides, notice I call `.json` on the `aws_iam_policy_document` to convert it to a JSON string before passing it to `assume_role_policy`.

So far, our IAM role is ready to be used, but our Lambda function still needs permission to access other AWS services, namely AWS DynamoDB for our data and AWS CloudWatch for logs. We have to create another `aws_iam_policy_document` for this.
data "aws_iam_policy_document" "lambda_execution_policy_document" {
// Logs
statement {
actions = [
"logs:CreateLogStream",
"logs:CreateLogGroup",
"logs:PutLogEvents"
]
resources = ["arn:aws:logs:*:*:*"]
}
// DynamoDB Scan
statement {
actions = [
"dynamodb:Scan"
]
resources = ["arn:aws:dynamodb:*:*:*"]
}
}
From our policy document, we make a real AWS policy with `aws_iam_policy` and attach it to our previously created role with an `aws_iam_role_policy_attachment`.

```hcl
resource "aws_iam_policy" "lambda_execution_policy" {
  name        = "backend-lambda-execution-policy"
  path        = "/"
  description = "IAM policy for backend lambda function"
  policy      = data.aws_iam_policy_document.lambda_execution_policy_document.json
}

resource "aws_iam_role_policy_attachment" "lambda_policy" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_execution_policy.arn
}
```
AWS Lambda Function
After all this (S3 Bucket, Lambda Layer, IAM Role), we are ready to provision the lambda function. Initially, we are going to follow steps similar to the AWS Lambda Layer:
- create a map variable with configs that I might customize in the future

```hcl
variable "service" {
  type = map(string)
  # service location
  default = {
    name        = "backend"
    description = "webhook discord"
    filename    = "lambda_backend.zip"
  }
}
```

- archive the code with `archive_file`

```hcl
data "archive_file" "service_zip_package" {
  type        = "zip"
  source_dir  = "${path.module}/../app"
  output_path = var.service.filename
}
```

- put the `.zip` file in S3

```hcl
resource "aws_s3_bucket_object" "service_package_s3_object" {
  bucket = aws_s3_bucket.static_files.id
  key    = data.archive_file.service_zip_package.output_path
  source = data.archive_file.service_zip_package.output_path
  etag   = data.archive_file.service_zip_package.output_md5
}
```
Be sure to specify the correct `source_dir` in the `archive_file` data source. I put my function code inside an `app/main.py` file, so adapt it to your own case.
Finally, we can define the Lambda function itself with `aws_lambda_function`:

```hcl
resource "aws_lambda_function" "service_lambda" {
  depends_on = [
    aws_s3_bucket_object.service_package_s3_object,
    aws_iam_role_policy_attachment.lambda_policy,
  ]

  layers = [aws_lambda_layer_version.python_dependencies_layer.arn]

  # S3 bucket must exist with a packaged .zip before terraform apply
  s3_bucket = aws_s3_bucket.static_files.bucket
  s3_key    = aws_s3_bucket_object.service_package_s3_object.key
  # hash of the zip so terraform redeploys the function when the code changes
  source_code_hash = data.archive_file.service_zip_package.output_base64sha256
  publish          = true

  function_name = var.service.name
  description   = var.service.description
  role          = aws_iam_role.lambda_role.arn
  handler       = "main.handler"
  memory_size   = 256
  timeout       = 6
  runtime       = "python3.8"

  environment {
    variables = {
      DISCORD_WEBHOOK_URL = var.DISCORD_WEBHOOK_URL
    }
  }

  tracing_config {
    # disables X-Ray
    mode = "PassThrough"
  }

  tags = merge(var.tags, {
    name = var.service.name
  })
}
```
From all this, I’d like to call your attention to some fields:
- `handler`: this is the function name in `main.py`, so it works more or less like a Python import.
- `memory_size`: the minimum is 128 MB, but I raised it to 256 MB because the function was running out of memory when I tested. This should be enough, but it might need more if you add too many symbols to the DB.
- `timeout`: the maximum amount of time the function is allowed to run. 6 seconds was enough for me, but you might need to raise it if you add too many symbols to the DB (or make the code async).
- `DISCORD_WEBHOOK_URL`: as you might recall, our code expects this environment variable. Just add `variable "DISCORD_WEBHOOK_URL" {}` somewhere, set the value in Terraform Cloud, and you're set.
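Before applying, it's worth a quick local smoke test that calls the handler the same way Lambda will. A sketch, assuming your code lives in `app/main.py` and `.env` holds `DISCORD_WEBHOOK_URL`; the empty event and `None` context are stand-ins for the real Lambda objects:

```python
# run from the repository root
from app.main import handler

if __name__ == "__main__":
    # our handler never touches event/context, so placeholders are enough
    handler(event={}, context=None)
```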
Lambda Scheduled Trigger with AWS Cloudwatch
The AWS Lambda function is ready, but no one is calling it. We will set a schedule in AWS CloudWatch to call the function periodically.
First of all, we create an `aws_cloudwatch_event_rule` to configure the cron job. This works like an alarm, triggering anything we want.
resource "aws_cloudwatch_event_rule" "every_half_hour" {
name = "half-hour-trigger"
description = "Fires every 30 minutes between 8:00 - 17:30 BST from Monday to Friday"
schedule_expression = "cron(0/30 11-21 ? * MON-FRI *)"
tags = merge(var.tags, {
name = "lambda function schedule rule"
})
}
I set it with a cron expression based on Brazil time (note that CloudWatch cron expressions run in UTC), so you might want to adapt it to your use case. Next, we grant CloudWatch permission to invoke our Lambda function.
resource "aws_lambda_permission" "allow_cloudwatch_to_call_function" {
statement_id = "AllowExecutionFromCloudWatch"
action = "lambda:InvokeFunction"
principal = "events.amazonaws.com"
function_name = aws_lambda_function.service_lambda.function_name
source_arn = aws_cloudwatch_event_rule.every_half_hour.arn
}
Finally, the target is defined with `aws_cloudwatch_event_target`. Now, CloudWatch is set to call our function every 30 minutes on weekdays for as long as we want.

```hcl
resource "aws_cloudwatch_event_target" "lambda_schedule_target" {
  target_id = "TriggerLambda"
  rule      = aws_cloudwatch_event_rule.every_half_hour.name
  arn       = aws_lambda_function.service_lambda.arn
}
```
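Rather than waiting for the next half-hour tick, you can also poke the deployed function once by hand. A sketch with `boto3`, using the function name from `var.service`:

```python
import boto3

# invoke the deployed function synchronously and check that it ran
client = boto3.client("lambda")
response = client.invoke(FunctionName="backend", InvocationType="RequestResponse")
print(response["StatusCode"])  # 200 means the invocation succeeded
```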
Conclusions
That sure was a lot of Terraform for so little Python! On the other hand, the project's `.py` codebase might yet grow, whereas I don't think the `.tf` files are going to change much.
With that said, just run `terraform apply`, read through the changes, and answer `yes`. In a few seconds, you'll find your function deployed and, with luck, you'll soon receive a webhook message with some hot investment tip!
Thanks for reading and, as always, Happy Coding! 🧑🏻‍💻