So far, our application just stores information in a database. In this tutorial, we are going to develop the business logic that recommends placing long or short positions. Effectively, this will be an asynchronous service running in AWS Lambda, triggered by AWS CloudWatch every 30 minutes.
Extract, Process, Notify: Building an AWS Lambda Handler Function
AWS Lambda expects a function that receives an `event` and a Lambda `context`. Start by creating a `.py` file and pasting the code below:

```python
def handler(event, context):
    pass
```
We won't even use the function's input or output yet, but we must keep both arguments in the signature, or the invocation will fail when Lambda calls the function.
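If you're curious about those arguments: `event` carries the trigger's payload (for a scheduled trigger, a JSON document describing the CloudWatch event) and `context` exposes runtime metadata. A small illustrative sketch, not something our bot actually needs:

```python
def handler(event, context):
    # event: dict with the payload from whatever triggered the invocation
    # context: runtime metadata object injected by Lambda
    print(event)                                   # the raw trigger payload
    print(context.function_name)                   # the deployed function's name
    print(context.get_remaining_time_in_millis())  # ms left before the timeout
```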
Extract from DynamoDB
Initially, the handler has to read from our DynamoDB table. As you've seen in the first part, in order to call DynamoDB, start by defining the resource:

```python
import boto3

dynamodb = boto3.resource('dynamodb')
```
Now, we get all the data from our `bot` table by calling its `scan` method.

```python
# this goes inside the handler()
data = dynamodb.Table('bot').scan(Limit=1000)
all_symbols = [item["symbol"] for item in data["Items"]]
```
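One caveat worth knowing: `scan` returns at most 1 MB of data per call, so the snippet above silently truncates larger tables. Our watchlist is tiny, but if you ever store more symbols, a paginated variant would look roughly like this (a sketch, not something this project needs yet):

```python
# follow LastEvaluatedKey until the scan is exhausted
table = dynamodb.Table("bot")
response = table.scan(Limit=1000)
items = response["Items"]
while "LastEvaluatedKey" in response:
    response = table.scan(Limit=1000, ExclusiveStartKey=response["LastEvaluatedKey"])
    items.extend(response["Items"])
all_symbols = [item["symbol"] for item in items]
```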
Extract from Yahoo Finance
I'll use Yahoo Finance as our data source. It's fast, reliable, and quite simple to request data from, since we are going to use `yfinance` and `pandas_datareader`. Yeah, we'll be using `pandas`; this is going to add "a layer of complexity" to our deploy process, but it's going to be worth it in the long run.
Start with `pip install yfinance pandas_datareader` and then import them:

```python
import yfinance
from pandas_datareader import data as pdr

yfinance.pdr_override()
```
The last line might look awkward, but it just replaces the `pandas_datareader` Yahoo downloader with the `yfinance` downloader, which is faster and more flexible. Now, just add the download command to the `handler` function:

```python
df = pdr.get_data_yahoo(" ".join(all_symbols), period="1d", interval="30m")
```
The line above downloads the last day of data for every symbol in our database, in 30-minute candlesticks. A candlestick is nothing more than a tuple of open, high, low, and close prices, representing how the price behaved within a time interval.
Process the Data
I won't develop any fancy strategy here. Actually, all we are going to do is measure a single indicator and form an opinion from its value. The purpose of this tutorial is not to teach technical analysis, so I encourage you to look for a strategy yourself and implement it here instead.
I am going to start with `pip install ta`, a simple technical analysis library built on `pandas`. With it installed, paste the snippet below into the `handler` function.

```python
from ta.momentum import RSIIndicator  # this import goes at the top of the file

for symbol, close in df["Close"].iteritems():
    # 14-period RSI; newer versions of ta name this argument window= instead of n=
    rsi = RSIIndicator(close, n=14).rsi()
    last = rsi.iloc[-1]
    if last < 20:
        action = "buy"
    elif last > 80:
        action = "sell"
    else:
        continue
```
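A pitfall to watch for: with several tickers, `df["Close"]` is a DataFrame with one column per symbol, but if the table holds a single ticker, yfinance returns flat columns and `df["Close"]` is a plain Series, so the loop above would iterate timestamps instead of symbols. A defensive sketch, assuming the `all_symbols` list from earlier (the helper name is mine, not part of the project):

```python
import pandas as pd

def close_frame(df: pd.DataFrame, all_symbols: list) -> pd.DataFrame:
    """Return closing prices as a DataFrame with one column per symbol."""
    close = df["Close"]
    if isinstance(close, pd.Series):
        # single symbol: yfinance returned flat columns, so wrap the Series
        close = close.to_frame(name=all_symbols[0])
    return close
```

With this, the loop would iterate `close_frame(df, all_symbols).iteritems()` instead of `df["Close"].iteritems()`.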
Notify in Discord with Webhook
As you can see, we are doing nothing with the `action` variable yet. We want to send a notification to Discord whenever we have an action. We cannot use the current bot for this, as this service is supposed to run asynchronously while the bot is reactive. The solution is a webhook.

A webhook is a URL we can post information to, and Discord will ship that information to a channel we specify. Getting your webhook URL is really simple (here's a tutorial for it), and it looks like this:

```
https://discord.com/api/webhooks/{webhook_id}/{webhook_token}
```
This is sensitive information, so you must save it in `.env`. Now just import `decouple` to use it safely in the function:

```python
from decouple import config

DISCORD_WEBHOOK_URL = config("DISCORD_WEBHOOK_URL")
```
Next, we import `requests` and, inside the `for` loop in `handler`, make a POST request whenever there's an `action`.

```python
import requests  # this import goes at the top of the file

requests.post(DISCORD_WEBHOOK_URL, json={"content": f"time to {action} {symbol}", "tts": False})
```

Here, we didn't have to `pip install requests` because `yfinance` already installs it as a dependency, which is why we can import it directly.
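For reference, here is roughly how all the pieces above fit together in `main.py`. This is a sketch assembled from the snippets in this section; the `raise_for_status()` call is an extra safeguard I'd add so failed webhook posts show up in the Lambda logs:

```python
import boto3
import requests
import yfinance
from decouple import config
from pandas_datareader import data as pdr
from ta.momentum import RSIIndicator

yfinance.pdr_override()

dynamodb = boto3.resource("dynamodb")
DISCORD_WEBHOOK_URL = config("DISCORD_WEBHOOK_URL")


def handler(event, context):
    # 1. extract: get the watched symbols from DynamoDB
    data = dynamodb.Table("bot").scan(Limit=1000)
    all_symbols = [item["symbol"] for item in data["Items"]]

    # 2. extract: download a day of 30-minute candlesticks from Yahoo Finance
    df = pdr.get_data_yahoo(" ".join(all_symbols), period="1d", interval="30m")

    # 3. process and notify: compute the RSI and post to Discord on extremes
    for symbol, close in df["Close"].iteritems():
        rsi = RSIIndicator(close, n=14).rsi()
        last = rsi.iloc[-1]
        if last < 20:
            action = "buy"
        elif last > 80:
            action = "sell"
        else:
            continue
        resp = requests.post(
            DISCORD_WEBHOOK_URL,
            json={"content": f"time to {action} {symbol}", "tts": False},
        )
        resp.raise_for_status()  # surface webhook failures in the logs
```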
🏗️ Deploying the function to AWS Lambda
After programming the function, there are several services we have to coordinate in order to make it work as we expect. Besides the code, I am going to refactor the code structure as well, so we can use Terraform modules and Terraform Cloud in an organized manner. You don't have to follow it; just put the Terraform code in `.tf` files located in the same directory and it should work. Still, I recommend taking a look at the GitHub project if you find yourself stuck with anything.
Static Files in S3 Bucket
We are going to use AWS S3 to store the code files. This is not required, but I think it simplifies future deploys if the codebase surpasses the Lambda size limit. That said, provisioning an S3 bucket is quite simple with Terraform.
resource "aws_s3_bucket" "static_files" {
bucket = "backend-static-files"
server_side_encryption_configuration {
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
tags = merge(var.tags, {
name = "backend-bucket"
})
}
None of the fields above are required, but I’d recommend naming the bucket and using server-side encryption anyway. Tags are good practice.
IAM Policy for S3 Bucket
For increased security, we will attach a policy to our S3 bucket. This policy document tells the bucket to deny all `s3` actions that don't use SSL, which is very important for any request over the internet. We can write a policy document in a simple manner with the `aws_iam_policy_document` data source.
data "aws_iam_policy_document" "s3_bucket_policy_document" {
// denies all s3 access without ssl
// complies with s3-bucket-ssl-requests-only rule
statement {
effect = "Deny"
actions = [
"s3:*"
]
resources = ["arn:aws:s3:::${aws_s3_bucket.static_files.bucket}/*"]
# anonymous user
principals {
identifiers = [
"*"
]
type = "*"
}
condition {
test = "Bool"
values = [
false
]
variable = "aws:SecureTransport"
}
}
}
After creating the policy document, we have to attach it to our bucket using the `aws_s3_bucket_policy` resource:

```hcl
resource "aws_s3_bucket_policy" "static_files_policy" {
  bucket = aws_s3_bucket.static_files.bucket
  policy = data.aws_iam_policy_document.s3_bucket_policy_document.json
}
```
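If you want to sanity-check the attachment after `terraform apply`, here is a quick one-off with `boto3` (a sketch, assuming the bucket name provisioned above):

```python
import boto3

# fetch and print the policy terraform attached to the bucket
s3 = boto3.client("s3")
print(s3.get_bucket_policy(Bucket="backend-static-files")["Policy"])
```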
AWS Lambda Layer for Project Dependencies
AWS Lambda has a low size limit on the code we can deploy for a single function. Besides, native dependencies must be compiled for the Lambda environment (Amazon Linux), which matters for our project since it uses libs with C extensions such as `pandas`. The solution is to deploy a Lambda Layer, which is a reusable collection of code and dependencies. This really adds complexity to our deploy, but we are going forward with it anyway.
For starters, for the function we are deploying, this is the `requirements.txt` file:

```
python-decouple
pandas_datareader
requests
ta
yfinance
```
Despite the fact that we are using `boto3` in our project, we don't need to put it in the layer because AWS Lambda injects it when we deploy the function. Now, we have to make sure to install the dependencies inside a `python/lib/python3.8/site-packages/` path. I will put it inside the `deploy` directory so that we can reference it in Terraform easily.
```sh
python3 -m pip install \
    -r requirements.txt \
    -t deploy/lambda_layer/python/lib/python3.8/site-packages/
```
With the dependencies installed, create a new `lambda_layer.tf` file and let's start it by adding some local variables, just for the sake of organization:

```hcl
locals {
  name        = "python-layer"
  description = "python dependencies from requirements.txt"
  source_dir  = "${path.module}/lambda_layer/"
  filename    = "lambda_layer.zip"
}
```
Next, as we have to send a `.zip` file to AWS, we can add an `archive_file` data source to zip the dependencies directory:

```hcl
data "archive_file" "requirements_zip_package" {
  type        = "zip"
  source_dir  = local.source_dir
  output_path = local.filename
}
```
Then, we put the `lambda_layer.zip` file in our S3 bucket with an `aws_s3_bucket_object` resource:

```hcl
resource "aws_s3_bucket_object" "layer_package_s3_object" {
  bucket = aws_s3_bucket.static_files.id
  key    = data.archive_file.requirements_zip_package.output_path
  source = data.archive_file.requirements_zip_package.output_path
  etag   = data.archive_file.requirements_zip_package.output_md5
}
```
Finally, we provision the layer with `aws_lambda_layer_version`. It's important we tell it our Python runtime version, as we can define up to 5 different runtimes for a single layer:

```hcl
resource "aws_lambda_layer_version" "python_dependencies_layer" {
  layer_name  = local.name
  description = local.description
  s3_bucket   = aws_s3_bucket.static_files.bucket
  s3_key      = aws_s3_bucket_object.layer_package_s3_object.key
  # hash of the zip so terraform re-publishes the layer when dependencies change
  source_code_hash    = data.archive_file.requirements_zip_package.output_base64sha256
  compatible_runtimes = [
    "python3.8"
  ]
}
```
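After applying, you can confirm the layer was published, for instance with a `boto3` one-off (a sketch; the layer name matches `local.name` above):

```python
import boto3

# list the published versions of our dependencies layer
client = boto3.client("lambda")
for version in client.list_layer_versions(LayerName="python-layer")["LayerVersions"]:
    print(version["Version"], version["CompatibleRuntimes"])
```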
AWS Lambda Function IAM Role
When provisioning AWS services that communicate with each other, it's mandatory that we specify an IAM role with the required permissions.
First of all, the function has to "assume the role", which means it needs permission to use AWS resources on our behalf. You can read more about it here.
data "aws_iam_policy_document" "lambda_assume_role_policy" {
statement {
actions = [
"sts:AssumeRole"
]
principals {
type = "Service"
identifiers = [
"lambda.amazonaws.com"
]
}
}
}
Just like with our S3 bucket, this is only a policy document, so we have to attach it to something: our IAM role.
resource "aws_iam_role" "lambda_role" {
name = "backend-lambda-role"
path = "/system/"
assume_role_policy = data.aws_iam_policy_document.lambda_assume_role_policy.json
lifecycle {
create_before_destroy = true
}
tags = merge(var.tags, {
name = "backend-lambda-role"
})
}
If you find the `lifecycle` property strange, I agree with you, but it's there to avoid a strange bug I faced where my role would not be created. Maybe it is already fixed by the time you read this. Besides, notice I call `.json` on the `aws_iam_policy_document` to convert it to a JSON string before passing it to `assume_role_policy`.

So far, our IAM role is ready to be used, but our Lambda function still needs permission to access other AWS services, namely AWS DynamoDB for our data and AWS CloudWatch for logs. We have to create another `aws_iam_policy_document` for this.
data "aws_iam_policy_document" "lambda_execution_policy_document" {
// Logs
statement {
actions = [
"logs:CreateLogStream",
"logs:CreateLogGroup",
"logs:PutLogEvents"
]
resources = ["arn:aws:logs:*:*:*"]
}
// DynamoDB Scan
statement {
actions = [
"dynamodb:Scan"
]
resources = ["arn:aws:dynamodb:*:*:*"]
}
}
From our policy document, we make a real AWS policy with `aws_iam_policy` and attach it to our previously created role with an `aws_iam_role_policy_attachment`.

```hcl
resource "aws_iam_policy" "lambda_execution_policy" {
  name        = "backend-lambda-execution-policy"
  path        = "/"
  description = "IAM policy for backend lambda function"
  policy      = data.aws_iam_policy_document.lambda_execution_policy_document.json
}

resource "aws_iam_role_policy_attachment" "lambda_policy" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_execution_policy.arn
}
```
AWS Lambda Function
After all this (S3 Bucket, Lambda Layer, IAM Role), we are ready to provision the lambda function. Initially, we are going to follow steps similar to the AWS Lambda Layer:
- create a map variable with configs that I might customize in the future

```hcl
variable "service" {
  type = map(string)
  # service location
  default = {
    name        = "backend"
    description = "webhook discord"
    filename    = "lambda_backend.zip"
  }
}
```

- archive the code with `archive_file`

```hcl
data "archive_file" "service_zip_package" {
  type        = "zip"
  source_dir  = "${path.module}/../app"
  output_path = var.service.filename
}
```

- put the `.zip` file in S3

```hcl
resource "aws_s3_bucket_object" "service_package_s3_object" {
  bucket = aws_s3_bucket.static_files.id
  key    = data.archive_file.service_zip_package.output_path
  source = data.archive_file.service_zip_package.output_path
  etag   = data.archive_file.service_zip_package.output_md5
}
```
Be sure to specify the correct `source_dir` in the `archive_file` data source. I put my function code inside an `app/main.py` file, so adapt it to your own case.
Finally, we can define the Lambda function itself with `aws_lambda_function`:

```hcl
resource "aws_lambda_function" "service_lambda" {
  depends_on = [
    aws_s3_bucket_object.service_package_s3_object,
    aws_iam_role_policy_attachment.lambda_policy,
  ]

  layers = [aws_lambda_layer_version.python_dependencies_layer.arn]

  # S3 bucket must exist with a packaged .zip before terraform apply
  s3_bucket = aws_s3_bucket.static_files.bucket
  s3_key    = aws_s3_bucket_object.service_package_s3_object.key
  # hash of the zip so terraform redeploys the function when the code changes
  source_code_hash = data.archive_file.service_zip_package.output_base64sha256
  publish          = true

  function_name = var.service.name
  description   = var.service.description
  role          = aws_iam_role.lambda_role.arn
  handler       = "main.handler"
  memory_size   = 256
  timeout       = 6
  runtime       = "python3.8"

  environment {
    variables = {
      DISCORD_WEBHOOK_URL = var.DISCORD_WEBHOOK_URL
    }
  }

  tracing_config {
    # disables X-Ray
    mode = "PassThrough"
  }

  tags = merge(var.tags, {
    name = var.service.name
  })
}
```
From all this, I’d like to call your attention to some fields:
- `handler`: this is the function name in `main.py`, so it works more or less like a Python import.
- `memory_size`: the minimum is 128 MB, but I raised it to 256 MB because the function was running out of memory when I tested. This should be enough, but it might need more if you add too many symbols to the DB.
- `timeout`: the maximum amount of time the function is allowed to run. 6 seconds was enough for me, but you might need to raise it if you add too many symbols to the DB (or make the code async).
- `DISCORD_WEBHOOK_URL`: as you might recall, our code expects this environment variable. Just add `variable "DISCORD_WEBHOOK_URL" {}` somewhere, set the value in Terraform Cloud, and you're set.
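Before applying, it's worth a quick local smoke test that calls the handler the same way Lambda will. A sketch, assuming your code lives in `app/main.py` and `.env` holds `DISCORD_WEBHOOK_URL`; the empty event and `None` context are stand-ins for the real Lambda objects:

```python
# run from the repository root
from app.main import handler

if __name__ == "__main__":
    # our handler never touches event/context, so placeholders are enough
    handler(event={}, context=None)
```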
Lambda Scheduled Trigger with AWS Cloudwatch
The AWS Lambda function is ready, but no one is calling it. We will set a schedule in AWS CloudWatch to call the function periodically.
First of all, we create an `aws_cloudwatch_event_rule` to configure the cron job. This works like an alarm, triggering anything we want.
resource "aws_cloudwatch_event_rule" "every_half_hour" {
name = "half-hour-trigger"
description = "Fires every 30 minutes between 8:00 - 17:30 BST from Monday to Friday"
schedule_expression = "cron(0/30 11-21 ? * MON-FRI *)"
tags = merge(var.tags, {
name = "lambda function schedule rule"
})
}
I set it with a cron expression based on Brazil time (note that CloudWatch cron expressions run in UTC), so you might want to adapt it to your use case. Next, we grant CloudWatch permission to invoke our Lambda function.
resource "aws_lambda_permission" "allow_cloudwatch_to_call_function" {
statement_id = "AllowExecutionFromCloudWatch"
action = "lambda:InvokeFunction"
principal = "events.amazonaws.com"
function_name = aws_lambda_function.service_lambda.function_name
source_arn = aws_cloudwatch_event_rule.every_half_hour.arn
}
Finally, the target is defined with `aws_cloudwatch_event_target`. Now, CloudWatch is set to call our function every 30 minutes on weekdays for as long as we want.

```hcl
resource "aws_cloudwatch_event_target" "lambda_schedule_target" {
  target_id = "TriggerLambda"
  rule      = aws_cloudwatch_event_rule.every_half_hour.name
  arn       = aws_lambda_function.service_lambda.arn
}
```
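Rather than waiting for the next half-hour tick, you can also poke the deployed function once by hand. A sketch with `boto3`, using the function name from `var.service`:

```python
import boto3

# invoke the deployed function synchronously and check that it ran
client = boto3.client("lambda")
response = client.invoke(FunctionName="backend", InvocationType="RequestResponse")
print(response["StatusCode"])  # 200 means the invocation succeeded
```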
Conclusions
That sure was a lot of Terraform for so little Python! On the other hand, the project's `.py` codebase might yet grow, whereas I don't think the `.tf` files are going to change much.
With that said, just run `terraform apply`, read through the changes, and answer `yes`. In a few seconds, you'll find your function deployed and, with luck, you'll soon receive a webhook message with some hot investment tip!
Thanks for reading and, as always, Happy Coding! 🧑🏻‍💻