Terraform 101 - protect your origins

30.08.2016

With the advent of Terraform we’ve been migrating more and more of our executable infrastructure tooling away from custom toolchains (in Ruby, Python and Node) and CloudFormation stacks. This journey recently reached another milestone when our first contribution to Terraform was merged: release 0.7.1 adds the ability to query Fastly and AWS IP ranges.

In this post I’d like to showcase this particular feature. As a followup I’ll talk a little bit more about our reasons to migrate to Terraform.

Protect your origins

In many scenarios you plan to use a content distribution network (CDN) as the primary means of accessing the content of your platform. In most of these cases, you explicitly want to prevent users from accessing your origins directly: at some point you’ll likely depend on the cache offloading of your CDN, and your origins won’t be able to handle 100% of the requests hitting the edges. When using Fastly it’s sometimes even feasible to depend on your origins having to deliver a cache miss just once.

Typically there are many avenues for leaking the URL of the origins to malicious actors (who could use it for denial of service purposes) or to regular visitors (who might just overload your origins by accident). Examples include leaked configurations, misconfigured services with absolute URLs to your origins, or DNS records which might be easily discoverable (depending on their type and zone).

The solution in Terraform is now simple. You declare the IP ranges of the edge nodes as a “Data Source”, which is a read-only kind of resource. You then use the resulting CIDR blocks as arguments for your firewall rules (security groups in AWS). Since these blocks do not change on a daily basis, they fit well into an executable infrastructure tooling environment (as opposed to being maintained by, e.g., a Lambda function).

# Read-only data source: resolves the published CloudFront IP ranges
data "aws_ip_ranges" "cloudfront" {
  services = [ "cloudfront" ]
}

resource "aws_security_group" "from_cloudfront" {
  name = "from CloudFront"

  # Allow HTTPS from the CloudFront edge nodes only
  ingress {
    from_port   = "443"
    to_port     = "443"
    protocol    = "tcp"
    cidr_blocks = [ "${data.aws_ip_ranges.cloudfront.cidr_blocks}" ]
  }

  # Allow HTTP from the CloudFront edge nodes only
  ingress {
    from_port   = "80"
    to_port     = "80"
    protocol    = "tcp"
    cidr_blocks = [ "${data.aws_ip_ranges.cloudfront.cidr_blocks}" ]
  }

  # Record which version of the IP range list this group was built from
  tags {
    CreateDate = "${data.aws_ip_ranges.cloudfront.create_date}"
    SyncToken  = "${data.aws_ip_ranges.cloudfront.sync_token}"
  }
}

More information can be found in the documentation for the respective data sources for AWS and Fastly IP ranges.
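
The same pattern should carry over to Fastly. The following is a minimal sketch rather than a tested configuration: it assumes the Fastly provider is available in your setup, and the names `fastly` and `from_fastly` are made up for illustration. It uses the same interpolation syntax as the CloudFront example above.

```hcl
# Sketch only: Fastly counterpart of the CloudFront example.
# Assumption: the Fastly provider is configured; resource names are hypothetical.
data "fastly_ip_ranges" "fastly" {}

resource "aws_security_group" "from_fastly" {
  name = "from_fastly"

  # Allow HTTPS from the Fastly edge nodes only
  ingress {
    from_port   = "443"
    to_port     = "443"
    protocol    = "tcp"
    cidr_blocks = [ "${data.fastly_ip_ranges.fastly.cidr_blocks}" ]
  }
}
```

Unlike the AWS data source, this one takes no `services` filter, since Fastly publishes a single list of ranges.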

Please note that a completely successful origin protection strategy also needs to consider how your request identifiers are built and which parameters you want to pass on to your origins, as well as potentially signing valid URLs and checking them on the edges. Good luck!

Image credits for the cover image go to Ryan Glenn.