Paid-by-Request Content Delivery with AWS

06.01.2016
Mathias Münscher

What happened? AWS has a new HTTPS gateway

With the release of the AWS API Gateway in July 2015, Amazon opened the gates to serverless web apps that use the Lambda service as their computation component. This opens an exciting perspective for scalable microservices, as AWS offers a wide range of other application services such as storage, transactional mail, messaging and queuing. The real game changer is the pricing model, though: you only pay for what is transferred, stored or computed, not for the provisioning of a resource.


What’s the story? Content delivery with opt-in

Our client offers static content to the public, with steep demand during its introduction phase. Although demand drops again after a short period, the content still needs to be served for much longer.

The content is essentially static. However, the consumer may adapt it slightly and send it by email to other customers. For legal reasons, the user needs to prove ownership of the email address used, via a double opt-in by email.

What we need                    What we have (AWS service)
Logic to control the workflow   Lambda
REST API                        API Gateway
Store the content               S3
Web delivery                    CloudFront CDN
Serve emails                    SES

Concept: The workflow with AWS

Overview elements

flow1

  • Left: The user operates a browser and an email client.
  • Right: The Amazon cloud with its services.

Load the frontend

flow2

  1. User: sends a request for a static page that points at CloudFront.
  2. CloudFront: resolves the request to an S3 bucket.
  3. S3: stores the data as key/value pairs (i.e. path/data) and holds metadata (e.g. content type, access rights, …).

Process the user input

flow3

  1. Frontend: The static content includes a small JS app that allows the user to adapt the content.
  2. Frontend: The respective data is sent to the API Gateway.
  3. API Gateway: The gateway validates and transforms the data (into JSON) before it is passed on to the Lambda function.
  4. Lambda: Processes the input: loads, manipulates and saves further data in S3.

Double opt-in step 1

flow4

  1. Lambda: Subsequently, Lambda initiates the double opt-in by triggering an email to the user.
  2. SES: The AWS emailing service is called with only the essential data for the email.

Double opt-in step 2

flow5

  1. User: receives an email with a verification link.
  2. API Gateway: again pre-processes the data and passes it on to Lambda.
  3. Lambda: completes the double opt-in and executes the follow-up logic.
  4. Lambda: triggers the sending of emails to a third party, as the user is now authenticated. To prevent manipulation of the authentication links without requiring a database for tokens, we pass data between these steps encoded as signed JSON Web Tokens.
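The signed-token idea from the last step can be sketched as follows. This is a minimal illustration, not the code used in the project: it assumes an HMAC-SHA256 (HS256) signature with a shared secret, and the names signToken, verifyToken and the payload fields are hypothetical. It only uses Node's built-in crypto module.

```javascript
const crypto = require('crypto');

// JWTs use the URL-safe base64 alphabet without padding.
function toBase64Url(buffer) {
  return buffer.toString('base64')
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

function fromBase64Url(text) {
  return Buffer.from(text.replace(/-/g, '+').replace(/_/g, '/'), 'base64');
}

function hmac(data, secret) {
  return crypto.createHmac('sha256', secret).update(data).digest();
}

// Builds "header.payload.signature" for the given payload object.
function signToken(payload, secret) {
  const header = toBase64Url(Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })));
  const body = toBase64Url(Buffer.from(JSON.stringify(payload)));
  const signature = toBase64Url(hmac(header + '.' + body, secret));
  return header + '.' + body + '.' + signature;
}

// Returns the payload if the signature is valid and the token has not
// expired (exp is in seconds, as in the JWT spec); otherwise returns null.
// The signature is always checked as HS256, whatever the header claims.
function verifyToken(token, secret, nowSeconds) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const expected = hmac(parts[0] + '.' + parts[1], secret);
  const actual = fromBase64Url(parts[2]);
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }
  let payload;
  try {
    payload = JSON.parse(fromBase64Url(parts[1]).toString('utf8'));
  } catch (e) {
    return null;
  }
  if (payload === null || typeof payload !== 'object') return null;
  if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null;
  return payload;
}
```

Because the verification link carries everything the second step needs, and any change to the payload invalidates the signature, no token table has to be kept between the two opt-in steps.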

Implementation

Lambda function

Lambda functions may be written in JavaScript (which we chose), Python or Java 8. AWS provides the necessary SDK, which includes the APIs of all AWS services.

The code is executed through a dedicated handler function, which gets passed an event as well as a context object. The former contains the input data (in this case provided by the API Gateway). The latter contains the execution context (success and failure handlers, etc.).

The code has essentially the same power as any Node.js process. It can load, save and process data with the SDK or any other Node library. It only interacts with the container to

  • receive input
  • return output
  • return an error

However, there are more options (e.g. retrieving meta information). For a quick start, we suggest having a look at Tim B’s blog entry on developing with Lambda and grunt.
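The three interactions listed above can be sketched with a minimal handler. This is an illustration only: the event field name is hypothetical, and the context.succeed / context.fail callbacks are the ones from the Node.js programming model Lambda offered at the time of writing. Lambda would expect the function to be exported (e.g. as exports.handler), which is left out here so that the snippet also runs locally.

```javascript
// Hypothetical handler: receives input, then returns output or an error.
function handler(event, context) {
  // Receive input: the event holds whatever the API Gateway passed on.
  if (!event || typeof event.name !== 'string' || event.name === '') {
    // Return an error: the string becomes the invocation's error message.
    context.fail('404');
    return;
  }
  // Return output: the container serializes the object to JSON.
  context.succeed({ greeting: 'Hello ' + event.name });
}
```

Locally, the handler can be exercised with a fake context object that merely records which of the two callbacks was invoked.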

API Gateway

Currently, this AWS component can only be configured through the AWS Management Console (i.e. the web interface), not through the CLI or CloudFormation.

As it is a “gateway”, there are essentially two tasks:

  1. Input: Control, pre-process and pass on the input.
  2. Output: Receive and post-process the output.

Input

Setting up REST resources and methods is trivial. Let’s have a look at the more advanced settings.

All changes start at the Method Execution panel. (BTW: The test feature, a bit hidden at the right, has proven to be very handy.)

api-gateway-panel

Query parameters: Working with URL query parameters requires a mapping to the JSON object that is passed on to Lambda.

  • Register the respective query parameters in Method Execution / Method Request

  • The content type must be set to application/json, as the result is JSON (Method Execution / Integration Request).

  • The mapping is pretty straightforward (Method Execution / Integration Request):

    • The $input and the $context object provide access to further information about the calling HTTP request.
    • Be aware of the JSON notation: the double quotes surrounding the input are needed for string values; you don’t need them for an integer.
{
  "parameter1": "$input.params('parameter1')",
  "parameter2": "$input.params('parameter2')",
  "parameter3": "$input.params('parameter3')"
}
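With the template above, the Lambda function would receive an event with the three keys parameter1 to parameter3. The following sketch shows one way to read them defensively. It assumes that a query parameter missing from the request arrives as an empty string; the function name readParameters is hypothetical.

```javascript
// Keys as defined in the mapping template.
const EXPECTED_PARAMETERS = ['parameter1', 'parameter2', 'parameter3'];

// Returns { ok: true, values } if all parameters are non-empty strings,
// otherwise { ok: false, missing } listing the absent or empty ones.
function readParameters(event) {
  const missing = EXPECTED_PARAMETERS.filter(function (name) {
    return typeof event[name] !== 'string' || event[name] === '';
  });
  if (missing.length > 0) {
    return { ok: false, missing: missing };
  }
  const values = {};
  EXPECTED_PARAMETERS.forEach(function (name) {
    values[name] = event[name];
  });
  return { ok: true, values: values };
}
```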

CORS: As API Gateway requires HTTPS, which in turn requires certificate management, we decided to use the AWS domain for API access. Since our static content originates from a different domain, we needed to enable CORS on the API Gateway.

Output

api-gw-erromapping

Static error response

A Lambda function fails either in a checked way (with a deliberate error code) or due to an unhandled exception (with the error message Process exited before completing request). In both cases we chose to return an HTTP 404 response to avoid revealing any information about the system.

  1. To accomplish this, we add an integration response (in Method Execution / Integration Response).
  2. The gateway uses a regex to assign the HTTP response. For checked failures Lambda returns a "404" string. Hence the regex is simply 404|Process exited before completing request
  3. To suppress any values in the response body, we add a mapping template with content type application/json and a template that only holds the mandatory JSON braces: {}. An empty template or a single space would not work.

HTTP Redirect: When the user follows the verification link, we redirect the request to a static response.

The gateway maps only the error code, which forces Lambda to fail in order to redirect. This is not intuitive, but we haven’t found anything better yet.

  1. Lambda fails with context.done("301-redirect-distribution-sent");
  2. The API Gateway maps this accordingly and returns a 301 response.
  3. Finally, we add a static Location header (as output mapping is not yet possible for a header template). Hence, if you need to redirect dynamically, you’re stuck at the moment!
  4. Error-prone: it is possible to set a different location for each of your stages by manually changing the value before you deploy.
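On the Lambda side, the two outcomes described above (static 404 and redirect) both surface as failures with a fixed error string. A minimal sketch, assuming hypothetical injected helpers verifyToken and onVerified and a hypothetical event.token field; only the two error strings are taken from the text:

```javascript
// Error strings that the gateway's integration responses are set up to match.
const NOT_FOUND = '404';
const REDIRECT = '301-redirect-distribution-sent';

// verifyToken(token) returns a payload or null; onVerified(payload, callback)
// runs the follow-up logic. Both are placeholders for illustration.
function makeVerificationHandler(verifyToken, onVerified) {
  return function handler(event, context) {
    const payload = verifyToken(event.token);
    if (!payload) {
      // Checked failure: the gateway maps this to the static 404 response.
      context.done(NOT_FOUND);
      return;
    }
    onVerified(payload, function (err) {
      // Even the success case "fails", so that the gateway returns the 301.
      context.done(err ? NOT_FOUND : REDIRECT);
    });
  };
}
```

Keeping the strings in constants makes it easier to keep them in sync with the patterns configured in the gateway, which have to be maintained by hand.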

S3 and Cloudfront CDN

S3 itself comes with basic web delivery. In particular, you can set the following for every file:

  • Content-Type: To resolve files in a REST-style manner we leave out the file extension:
    http://www.domain.tld/something/about/ returns the object at the path /something/about. To do this, we need to explicitly set the content type of the file, e.g. to text/html.
  • Access Rights: need to be set to public-read
  • TTL/Caching: might be set to max-age=0 for staging purposes
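The three per-file settings can be expressed as the parameter object one would hand to the putObject call of the AWS SDK's S3 client. The sketch below only builds that object, so it runs without the SDK; the function name, bucket name and key are hypothetical.

```javascript
// Builds putObject parameters for an extension-less HTML page.
function buildPutObjectParams(bucket, key, html, isStaging) {
  const params = {
    Bucket: bucket,
    Key: key,                  // no file extension, e.g. "something/about"
    Body: html,
    ContentType: 'text/html',  // must be explicit, as S3 cannot infer it from the key
    ACL: 'public-read'         // publicly readable
  };
  if (isStaging) {
    params.CacheControl = 'max-age=0'; // no caching while staging
  }
  return params;
}

// Example values (all hypothetical):
const exampleParams = buildPutObjectParams(
  'my-staging-bucket', 'something/about', '<html></html>', true
);
```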

CloudFront links a domain to one or multiple S3 buckets. We use CloudFormation to set up a CDN for staging and production:

  • Linked Buckets
  • Allowed REST Methods GET, HEAD
  • Path Patterns that resolve to assets or user-manipulated content
  • Static Error Pages are set to a /static/.. path
  • Default root file typically index.html

SES

Sending limitations

Amazon handles SES (for legal reasons) more restrictively than its other services. SES implements two quotas, which are region- and recipient-specific. You start out in the SES Sandbox and may apply for increased limits. The effects of exceeding the limits are detailed e.g. in the SES Blog.

Limitation                                                                 Sandbox   Opened Account
Sending quota (max sum of recipients, last 24h)                            200
Maximum send rate (max sum of recipients/second; short bursts tolerated)   1
Confirmed emails only (all email addresses need to be verified first)      Applies

BTW: There’s a mailbox simulator for virtual testing.

Queueing with SNS

Since we run a distributed solution with Lambda we cannot tackle the problem through scheduling. However, the AWS toolbox offers several queueing solutions. We chose SNS.

Workflow

lambda-sns-ses

  1. Lambda: Each attempt to send an email starts as a message to an SNS topic.
  2. SNS: calls Lambda back through the API Gateway to trigger the emailing.
  3. Lambda: places the email at SES.
  4. SES: (potentially) rejects a sending request.
  5. Lambda: returns a 503 error. SNS requires a 5XX error for retries.
  6. SES: Now we can use the built-in back-off strategy and the requests-per-timeframe limitation of SNS to avoid throttling.

Our setup

To join a topic, Lambda needs to call a confirmation link once. This is a bit cumbersome, as we need to add a POST method to the API Gateway and process a single call.

  • SNS We set a delivery policy on the respective SNS topic:
{
  "http": {
    "defaultHealthyRetryPolicy": {
      "minDelayTarget": 5,
      "maxDelayTarget": 60,
      "numRetries": 60,
      "numMaxDelayRetries": 60,
      "numNoDelayRetries": 0,
      "numMinDelayRetries": 0,
      "backoffFunction": "linear"
    },
    "disableSubscriptionOverrides": true,
    "defaultThrottlePolicy": {
      "maxReceivesPerSecond": 14
    }
  }
}
  • API Gateway: Add a POST method and register a 503 response.
  • Lambda: subscribing to the topic. Currently the gateway can only map input to a custom event if the Content-Type header is set to application/json. This is not the case for SNS messages, so we pass the input directly to Lambda. Consequently, the event object equals the SNS message JSON. We extract the value of event.SubscribeURL and call it.
  • Lambda: handling messages. We attempt to send the email using the SDK. If we get a throttling exception, we return a 503 response to SNS.
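The two Lambda tasks from this list can be sketched in one handler. Several points are assumptions for illustration: httpGet and sendEmail are hypothetical injected helpers (the latter standing in for the SES call of the SDK), the SNS message body is assumed to be the email data as JSON, throttling is assumed to surface as an error with the code "Throttling", and any other failure is mapped to the static "404".

```javascript
// httpGet(url, callback) and sendEmail(mail, callback) are placeholders.
function makeSnsHandler(httpGet, sendEmail) {
  return function handler(event, context) {
    // One-time subscription confirmation: call the link SNS sent us.
    if (event.Type === 'SubscriptionConfirmation') {
      httpGet(event.SubscribeURL, function (err) {
        context.done(err ? '503' : null, 'subscribed');
      });
      return;
    }

    // Regular notification: the message body carries the email data.
    let mail;
    try {
      mail = JSON.parse(event.Message);
    } catch (e) {
      context.done('404');
      return;
    }

    sendEmail(mail, function (err) {
      if (err && err.code === 'Throttling') {
        // Mapped to a 503 by the gateway, so that SNS retries later.
        context.done('503');
        return;
      }
      context.done(err ? '404' : null, 'sent');
    });
  };
}
```

With the delivery policy shown above, a message that keeps hitting the 503 path would be retried up to 60 times at 60-second intervals, i.e. for roughly an hour.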

Summary

  • Pay-Per-Request: We consider paying only for what you actually need to be the concept’s most revolutionary aspect. This is a business-friendly perspective, even if the actual cost calculation is still quite difficult.
  • API Gateway inflexible, for now: The configuration of the API Gateway is currently quite complex and hard to automate. However, we expect CloudFormation support in the foreseeable future.