Paid-by-Request Content Delivery with AWS

06.01.2016
Mathias Münscher

What happened? AWS has a new HTTPS gateway

With the release of the AWS API Gateway in July 2015, Amazon opened the gates to serverless web apps that use the Lambda service as their computation component. This opens an exciting perspective for scalable microservices, as AWS offers a wide range of other application services such as storage, transactional mail, messaging and queuing. The real game changer is the pricing model, though: you only pay for what is transferred, stored or computed, not for the provisioning of a resource.


What's the story? Content delivery with opt-in

Our client offers static content to the public, with steep demand during its introduction phase. While demand drops again after a short period, the content still needs to be served for much longer.

The content is essentially static. However, the consumer may adapt it slightly and send it by email to other customers. For legal reasons, the user needs to prove ownership of the email address used through a double opt-in via email.

What we need                     What we have (AWS service)
logic to control the workflow    Lambda
REST API                         API Gateway
store the content                S3
web delivery                     Cloudfront CDN
serve emails                     SES

Concept: The workflow with AWS

Overview elements

Overview

Load the frontend

  1. User: sends a request for a static page that points at Cloudfront.
  2. Cloudfront: resolves the request to an S3 bucket.
  3. S3: stores the data as key/value pairs (i.e. path/data) together with meta data (e.g. content type, access).
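
To make step 3 more concrete, here is a minimal sketch of an upload parameter object in the shape the AWS SDK's `s3.putObject` call accepts. The bucket name, the key and the `public-read` access setting are illustrative assumptions, not values from the project.

```javascript
// Builds the parameter object for an S3 upload (shape of s3.putObject params).
// Bucket, key and ACL are placeholders chosen for this example.
function buildPutParams(bucket, key, body, contentType) {
  return {
    Bucket: bucket,            // the bucket that Cloudfront resolves to
    Key: key,                  // the "path" part of the key/value pair
    Body: body,                // the "data" part
    ContentType: contentType,  // meta data: sent as the Content-Type header
    ACL: 'public-read'         // meta data: access (assumption)
  };
}

const params = buildPutParams(
  'example-bucket',
  'campaign/index.html',
  '<h1>Hello</h1>',
  'text/html'
);
```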

Process the user input

  1. Frontend: the static content includes a small JS app that allows the user to adapt the content.
  2. Frontend: the respective data is sent to the API Gateway.
  3. API Gateway: validates and transforms the data (into JSON) before it is passed on to the Lambda function.
  4. Lambda: processes the input, i.e. loads further data from S3, manipulates it and saves it back.

Double opt-in step 1

  1. Lambda: subsequently initiates the double opt-in by triggering an email to the user.
  2. SES: the AWS emailing service is called with only the data essential for the email.

Double opt-in step 2

  1. User: receives an email with a verification link.
  2. API Gateway: again pre-processes the data and passes it on to Lambda.
  3. Lambda: completes the double opt-in and executes the follow-up logic.
  4. Lambda: triggers the sending of emails to a third party, as the user is now authenticated. To prevent manipulation of the authentication links without requiring a database for tokens, we pass data between these steps encoded as signed JSON Web Tokens.
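
The idea of carrying state in a signed token instead of a database can be sketched with Node's built-in `crypto` module. This is a minimal HS256 illustration under stated assumptions, not the project's code: the secret, the payload fields and the function names `signToken` and `verifyToken` are made up for the example, and a production setup would more likely rely on an established JWT library and add an expiry claim.

```javascript
const crypto = require('crypto');

// base64url encoding as used by JWTs (URL-safe alphabet, no padding).
function base64url(buffer) {
  return buffer.toString('base64')
    .replace(/=/g, '')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}

// HMAC-SHA256 signature of `data`, base64url encoded.
function hmac(data, secret) {
  return base64url(crypto.createHmac('sha256', secret).update(data).digest());
}

// Creates an HS256-signed token that carries the payload.
function signToken(payload, secret) {
  const header = base64url(Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })));
  const body = base64url(Buffer.from(JSON.stringify(payload)));
  return header + '.' + body + '.' + hmac(header + '.' + body, secret);
}

// Returns the payload if the signature is valid, otherwise null.
function verifyToken(token, secret) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const expected = Buffer.from(hmac(parts[0] + '.' + parts[1], secret));
  const actual = Buffer.from(parts[2]);
  // timingSafeEqual throws on different lengths, so check the length first.
  if (expected.length !== actual.length || !crypto.timingSafeEqual(expected, actual)) {
    return null;
  }
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  return JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
}

const secret = 'replace-with-a-real-secret'; // assumption: known only to Lambda
const token = signToken({ email: 'user@example.com', step: 'opt-in' }, secret);
const payload = verifyToken(token, secret);
```

The token would travel inside the verification link; a link whose payload was altered no longer matches its signature, so `verifyToken` returns `null` for it.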

Implementation

Lambda function

Lambda functions may be written in JavaScript (which we chose), Python or Java 8. AWS provides the necessary SDK, which includes the API to all AWS services.

The code is executed through a dedicated handler function, which is passed an event and a context object. The former contains the input data (in this case provided by the API Gateway). The latter contains the execution context (failure and success handlers, etc.).

The code has essentially the same power as any Node.js process: it can load, save and process data with the SDK or any other Node library. It only interacts with the container to receive the event and to report success or failure through the context object.

However, the context offers more options (e.g. retrieving meta information). For a quick start we suggest having a look at Tim B's blog entry on developing with Lambda and grunt.

API Gateway

Currently, the API Gateway can only be configured through the AWS management console (i.e. the web interface), not through the CLI or CloudFormation.

As it is a "gateway", it essentially has two tasks:

  1. Input: control, pre-process and pass on the input.
  2. Output: receive and post-process the output.

Input

Setting up REST resources and methods is trivial. Let's have a look at the more advanced settings.

All changes start at the Method Execution panel. (BTW: the test feature, a bit hidden on the right, has proven very handy.)

Query parameters: working with URL query parameters requires a mapping template that builds the JSON object passed on to Lambda.

{
  "parameter1": "$input.params('parameter1')",
  "parameter2": "$input.params('parameter2')",
  "parameter3": "$input.params('parameter3')"
}
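
On the Lambda side, the mapped object arrives as the event. The sketch below checks it for missing values, assuming that a query parameter absent from the request shows up as an empty string; the helper name `missingParameters` is made up for this example.

```javascript
// Returns the names of expected parameters that are missing or empty in the
// event produced by the mapping template.
function missingParameters(event, expected) {
  return expected.filter(function (name) {
    return typeof event[name] !== 'string' || event[name].trim() === '';
  });
}

// Under the assumption above, ?parameter1=a&parameter3=c would be mapped to:
const mappedEvent = { parameter1: 'a', parameter2: '', parameter3: 'c' };
const missing = missingParameters(
  mappedEvent,
  ['parameter1', 'parameter2', 'parameter3']
);
```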

CORS: as API Gateway requires HTTPS, which in turn requires certificate management, we decided to use the AWS domain for API access. Since our static content originates from a different domain, we needed to enable CORS on the API Gateway.

Output

Static error response

A Lambda function fails either in a checked way (with a deliberate error code) or following an unhandled exception (with the error message Process exited before completing request). In both cases we chose to return an HTTP 404 response to avoid revealing any information about the system.

  1. To accomplish this, we add an integration response (in Method Execution / Integration Response).
  2. The gateway uses a regex to assign the HTTP response. For checked failures Lambda returns a "404" string. Hence the regex is simply 404|Process exited before completing request
  3. To suppress any values in the response body, we add a mapping template with content type application/json and a template that only holds the mandatory JSON brackets: {}. An empty template or a single space would not work.
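
The selection logic of step 2 can be imitated locally to test a pattern before deploying it. This is an illustration only, not the gateway's implementation, and it rests on two assumptions: the pattern has to match the whole error message, and an error that matches no pattern falls through to the default response.

```javascript
// Integration responses as configured in step 2 (pattern taken from the text).
const integrationResponses = [
  { pattern: '404|Process exited before completing request', status: 404 }
];

// Picks the HTTP status for a Lambda outcome.
function selectStatus(errorMessage) {
  if (errorMessage === null || errorMessage === undefined) {
    return 200; // no error: default response
  }
  for (const response of integrationResponses) {
    // Assumption: the whole message must match, hence the anchors.
    if (new RegExp('^(?:' + response.pattern + ')$').test(errorMessage)) {
      return response.status;
    }
  }
  return 200; // assumption: unmatched errors use the default response
}
```

If the assumption about whole-message matching holds, a message that carries extra text around the code would need a wider pattern such as `.*404.*`.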

HTTP redirect: when the user follows the verification link, we redirect the request to a static response.

The gateway maps the error code only, which forces Lambda to fail in order to redirect. This is not intuitive, but we haven't found anything better yet.

  1. Lambda fails with context.done("301-redirect-distribution-sent");
  2. The API Gateway maps the error accordingly and returns a 301 response.
  3. Finally, we add a static Location header (as output mapping is not yet possible for a header template). Hence, if you need to redirect dynamically, you're stuck at the moment!
  4. Error-prone: it is possible to set a different location for each of your stages by manually changing the value before you deploy.
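
Put together, the verification handler "fails" on purpose in both outcomes. In the sketch below the check on `event.token` is a placeholder for the real verification, and the function name is hypothetical.

```javascript
// Sketch of the verification step: the handler fails deliberately so that the
// gateway can map the error string to an HTTP status.
function verificationHandler(event, context) {
  const verified = Boolean(event && event.token); // placeholder for the real check
  if (!verified) {
    return context.done('404'); // mapped to the static 404 response
  }
  context.done('301-redirect-distribution-sent'); // mapped to the 301 response
}
```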

S3 and Cloudfront CDN

S3 itself comes with basic web delivery. In particular, you can set for every file:

Cloudfront links a domain to one or multiple S3 buckets. We use CloudFormation to set up a CDN for staging and production:

SES

Sending limitations

Amazon handles SES (for legal reasons) more restrictively than its other services. SES implements two quotas, which are region- and recipient-specific. You start out in the SES sandbox and may apply for increased limits. The effects of exceeding the limits are detailed e.g. in the SES blog.

Limitation                                                Sandbox   Opened account
Sending quota: max sum of recipients in the last 24h      200       Individually, usually 50k
Maximum send rate: max sum of recipients per second       1         Individually, usually 15
  (short exceeding bursts are tolerated)
Confirmed emails only: all email addresses need to be     Applies
  verified first

BTW: There's a mailbox simulator for virtual testing.

Queueing with SNS

Since we run a distributed solution with Lambda, we cannot stay within these limits through scheduling. However, the AWS toolbox offers several queueing solutions. We chose SNS.

Workflow

  1. Lambda: each attempt to send an email starts as a message to an SNS topic.
  2. SNS: calls back Lambda through the API Gateway to trigger the emailing.
  3. Lambda: hands the email over to SES.
  4. SES: (potentially) rejects a sending request.
  5. Lambda: returns a 503 error. SNS requires a 5XX error for retries.
  6. SES: now we can use the built-in back-off strategy of SNS and its requests-per-timeframe limit to avoid throttling.
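
Steps 3 to 5 could be sketched like this. `sendEmail` is a hypothetical wrapper around the SES call, injected so that the flow can be tried without AWS, and the error code `Throttling` for rejected requests is an assumption about what SES reports.

```javascript
// Hands the email to SES and translates the outcome into an error string that
// the gateway can map to an HTTP status.
function deliveryHandler(event, context, sendEmail) {
  sendEmail(event, function (error) {
    if (!error) {
      return context.done(null, 'sent');
    }
    // Assumption: rate or quota rejections carry the code "Throttling".
    if (error.code === 'Throttling') {
      return context.done('503'); // mapped to HTTP 503, so SNS retries later
    }
    context.done('404'); // any other failure is reported without a retry
  });
}
```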

Our setup

To join a topic, Lambda needs to call a confirmation link once. This is a bit cumbersome, as we need to add a POST method to the API Gateway and process a single call.
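
SNS marks the one-off confirmation request with a `Type` of `SubscriptionConfirmation` and includes a `SubscribeURL`; regular messages arrive as `Notification`. A handler could tell them apart roughly as follows. The function name and the returned action objects are made up for this sketch.

```javascript
// Classifies an incoming SNS HTTP(S) message by its Type field.
function classifySnsMessage(message) {
  if (message.Type === 'SubscriptionConfirmation') {
    // Calling SubscribeURL once confirms the subscription.
    return { action: 'confirm', url: message.SubscribeURL };
  }
  if (message.Type === 'Notification') {
    return { action: 'process', payload: message.Message };
  }
  return { action: 'ignore' };
}
```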

{
  "http": {
    "defaultHealthyRetryPolicy": {
      "minDelayTarget": 5,
      "maxDelayTarget": 60,
      "numRetries": 60,
      "numMaxDelayRetries": 60,
      "numNoDelayRetries": 0,
      "numMinDelayRetries": 0,
      "backoffFunction": "linear"
    },
    "disableSubscriptionOverrides": true,
    "defaultThrottlePolicy": {
      "maxReceivesPerSecond": 14
    }
  }
}
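
The JSON above has the shape of an SNS delivery policy. To see what its numbers amount to, the sketch below splits the retries into the four phases SNS distinguishes (no delay, minimum delay, back-off, maximum delay), assuming that `numRetries` is the total across all phases. The function name is made up, and the back-off phase is only counted, not timed.

```javascript
// Splits the retries of a delivery policy into phases and sums the delay spent
// in the two fixed-delay phases (in seconds).
function retryPhases(policy) {
  const noDelay = policy.numNoDelayRetries;
  const minDelay = policy.numMinDelayRetries;
  const maxDelay = policy.numMaxDelayRetries;
  // Assumption: whatever is left of numRetries belongs to the back-off phase.
  const backoff = policy.numRetries - noDelay - minDelay - maxDelay;
  return {
    noDelay: noDelay,
    minDelay: minDelay,
    backoff: backoff,
    maxDelay: maxDelay,
    fixedDelaySeconds:
      minDelay * policy.minDelayTarget + maxDelay * policy.maxDelayTarget
  };
}

const phases = retryPhases({
  minDelayTarget: 5,
  maxDelayTarget: 60,
  numRetries: 60,
  numMaxDelayRetries: 60,
  numNoDelayRetries: 0,
  numMinDelayRetries: 0
});
// phases.backoff is 0 and phases.fixedDelaySeconds is 3600
```

Read this way, the policy retries once a minute for about an hour, and `maxReceivesPerSecond` of 14 stays just below the send rate of 15 per second that opened accounts usually get.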

Summary

Mathias Münscher

Mathias, who holds a degree in economics (Dipl. Volkswirt), was working as a business consultant in IT landscaping when he decided to go for more technology. After a bachelor's degree in applied computer science, he worked as a DevOp on large distributed web applications with CLD at Immobilienscout.…
