This blog post is part of a series about the AWS Well-Architected Framework: what it is, why it makes sense, and how we at kreuzwerker do it. In this entry, we will focus on the Cost Optimization Pillar.
What it is - a quick recap
Using their architects’ and clients’ collective knowledge and experience, AWS is continuously working on a Well-Architected Framework, which consists of critical concepts, design principles, and best practices for architecting and running workloads in the AWS Cloud. AWS developed the Well-Architected Framework to understand what makes some customers succeed in the cloud while others fail. They also wanted to identify common problems, decisional and architectural patterns, and anti-patterns. In other words, they wanted to establish what is well-architected and what is not, and to make this knowledge available to all, regardless of whether someone is just considering migrating to the cloud or is already running thousands of workloads there.
The Well-Architected Framework is built on six pillars
- operational excellence 👨🏽‍💻
- security 🔒
- reliability 💪🏾
- performance efficiency 🚀
- cost optimization 💵
- sustainability 🌳
The AWS Well-Architected Review process provides a consistent approach for customers and partners to evaluate architectures and implement scalable designs.
It’s important to note that the Well-Architected Review is not an audit. It’s nothing to be afraid of; there are no penalty points for not getting things right the first time. A Well-Architected Review is a way of working together to improve your architecture. The process leads through several foundational questions and checks. It has been derived from years of experience working with the AWS cloud regarding security, cost efficiency, and performance. Hence, it provides sound advice on improvements. It helps you to build secure, high-performing, resilient, and efficient infrastructure for your applications and workloads.
The hard facts about AWS Well-Architected reviews in 2022 are:
- it consists of 58 questions in total across all pillars
- it takes around 4-6 hours for one workload (without tool support)
- the goal is to remediate 45% of the high-risk findings with a minimum of 20 questions answered.
We describe the process from our perspective in more detail here.
How we do it at kreuzwerker
Why should you do it with us?
How do we perform such a review?
For us, it’s an iterative process: we inspect and adapt every time we do it by requesting feedback from our clients and doing a short internal retrospective. As of now, we perform it as follows:
- We do it in two blocks, from 09:00 to 12:00 and from 13:00 to 15:00, with a lunch break in between. We can adapt the schedule if we are faster, e.g., by shifting the break, and we are flexible about doing it remotely or at your office.
- We do it in an interactive, story-telling mode. This means: you talk, we listen, and then dig deeper into specific areas while being able to cover multiple questions.
- Our process is supported by tools (more in the other part of the blog post series 🥳)
We do not just ask the questions but also give guidance on answering them. We can tell you how and why improvements could be made. Alright, enough for the introduction. Let’s jump right into it.
Cost Optimization Pillar
In a nutshell
For many organizations, cost-benefit is the primary factor when migrating to the Cloud. After all, we have all heard that the Cloud is “cheaper than on-premise.” But is that so?
In reality, the Cloud is what you make it. Depending on how you implement it and then how you use it, it can be cheaper, cost the same, or a lot more expensive than keeping your operations confined to your premises. Let’s face it: when it comes to cost, there is no Shared Responsibility Model. AWS is responsible for maximizing profit for its shareholders. While they will guide you on how to operate effectively and efficiently in the Cloud, they will not stop you from giving them money. The responsibility for the budget rests entirely on your shoulders as a customer. But not just yours as the person reading this blog: everyone in your organization who has access to the Cloud should know that cost optimization is their responsibility, too.
So what is the main cost driver in the Cloud? Running the Cloud in the same way as we used to run our on-premise environment.
Traditionally, we would procure resources every few years. Therefore, it made sense to over-provision for the future: to buy the biggest, the fastest, the most powerful system out there, and then enhance it over time according to our demands. If there was spare CPU or memory capacity, that was, in fact, good news because it meant that we still had room to expand before we needed to spend more money.
In the Cloud, the opposite is true: we want the smallest and cheapest resource that still does the required job.
But that’s not the only difference. Traditionally, we saw a clear connection between provisioning new resources and their cost. A lengthy procurement process often required filling out a business case, justifying the cost, researching the options, and balancing budgets. Even then, the Finance department could come back with a firm “No!”
This all changed in the Cloud. The Finance department can no longer stop the purchase before it happens. The procurement process got shortened from months to minutes. Even the way we talk about new resources has changed. We no longer buy - we create, we provision, we launch, we deploy. And those are not the words we easily identify in our heads with having to pay for something!
*Photo: We no longer buy new servers. These days we launch them. (Screenshot: own)*
So the first thing we should do is make it clear to everyone, not only to the budget holders but also to the administrators, developers, testers, project managers, data scientists - anyone within our organization with authority to create, develop or launch - that those operations are effectively purchases and incur costs. Once we’re all aware that everything we provision costs us money, it will be much easier to understand the need for cost optimization, leading to establishing and maintaining a culture of cost awareness.
AWS identified principles of cost optimization and gathered them together under the Cost Optimization Pillar. These are as follows:
- Implement Cloud financial management: To be financially successful in the Cloud, you need to bridge the gap between your Technology and Finance departments. Consider creating a cross-functional team responsible for establishing and maintaining cost awareness across your organization.
- Adopt a consumption model: Take advantage of Cloud flexibility and do not spend time trying to guess your future needs. Instead, increase and decrease usage (and thus, your spending!) in step with your business requirements.
- Measure overall efficiency: Contrast resource cost with business output and consider the general worth of the solution. It may not always be straightforward; higher bills to pay may be offset by faster time-to-market. This is where it will pay off to have a cross-functional team that can realize opportunities and trade-offs from an eagle-eye view.
- Stop spending money on undifferentiated heavy lifting: When considering the difference between running a resource in the Cloud and purchasing an equivalent solution to run on-premise, remember to consider additional costs, such as electricity, cabling, dedicated hardware room, or person-hours spent on maintaining physical infrastructure. This time and money may be better spent in the Cloud.
- Analyze and attribute expenditure: When monitoring and analyzing your Cloud spending, consider attributing individual workloads to their functions and owners. This way, you can quickly identify gains and losses and make improvements.
Practice Cloud Financial Management
It may be that your Technology department doesn’t concern itself with price tags, and your Finance department doesn’t realize the opportunities that come with the Cloud. Instead of having various business functions pulling in opposite directions, bring everyone together so they can deliver a solution that everyone in the organization is happy with. Members of this cross-functional team should meet regularly to discuss spending goals and targets, and then advocate Cloud cost optimization within their respective business functions.
Expenditure and Usage Awareness
As previously mentioned, the ease with which we can provide solutions in the Cloud often leads to forgetting that these cost money. Often the best solution is to establish guardrails that will not allow over-provisioning.
AWS Organizations and AWS Config can help you establish policies that reduce spending. You may want to implement a policy that only allows lower-tier, low-cost resources in the Development environment, reserving the costly ones for the Production environment.
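As a rough illustration of such a guardrail, the sketch below builds a service control policy (SCP) document that denies launching EC2 instances outside a small allow-list. The allowed instance types and the function name are assumptions made for this example, and attaching the policy to the Development accounts in AWS Organizations is left out.

```python
import json

# Hypothetical allow-list of low-cost instance types for Development accounts.
ALLOWED_DEV_INSTANCE_TYPES = ["t3.micro", "t3.small"]


def build_dev_instance_type_scp(allowed_types):
    """Return an SCP document that denies ec2:RunInstances for any other type."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyExpensiveInstanceTypesInDev",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # Deny the launch when the requested type is not in the allow-list.
                "Condition": {
                    "StringNotEquals": {"ec2:InstanceType": list(allowed_types)}
                },
            }
        ],
    }


policy = build_dev_instance_type_scp(ALLOWED_DEV_INSTANCE_TYPES)
print(json.dumps(policy, indent=2))
```

Keeping the allow-list in one place makes it easy to review in the same cross-functional meetings where spending targets are discussed.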
Another great piece of advice from AWS regarding cost awareness is establishing a tagging policy. Tags allow pairing Cloud solutions with their functions and owners. You can then use AWS Cost Explorer to analyze department, project, or even person spending.
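To make the idea concrete, here is a minimal sketch of attributing spend by tag. The records, tag keys, and amounts are invented for illustration; in practice the data would come from Cost Explorer or a billing export rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical cost records; services, tags, and amounts are made up.
COST_RECORDS = [
    {"service": "EC2", "cost": 120.0, "tags": {"team": "data", "project": "etl"}},
    {"service": "S3", "cost": 30.0, "tags": {"team": "data", "project": "etl"}},
    {"service": "Lambda", "cost": 12.5, "tags": {"team": "web", "project": "shop"}},
    {"service": "EC2", "cost": 45.0, "tags": {}},  # resource without any tags
]


def cost_by_tag(records, tag_key):
    """Sum cost per value of tag_key; spend without that tag goes to 'untagged'."""
    totals = defaultdict(float)
    for record in records:
        owner = record["tags"].get(tag_key, "untagged")
        totals[owner] += record["cost"]
    return dict(totals)


print(cost_by_tag(COST_RECORDS, "team"))
```

The "untagged" bucket is often the most useful line in such a report, because it shows how much spend nobody currently owns.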
Cost Effective Resources
Because of the abundance of choice, it’s sometimes challenging to pick the right size resource to run your workload. The “cheapest per hour” option isn’t always the most efficient - a low-capacity EC2 instance may require 10 hours to complete the task, which can be finished within 2 hours by a more capable and expensive virtual machine. In such a case, it would be effectively cheaper to pick a solution that is more expensive on the surface.
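The arithmetic behind this comparison can be sketched in a few lines. The hourly prices below are hypothetical and chosen only to illustrate the point; the run times are the 10 and 2 hours from the example above.

```python
def job_cost(hourly_price, hours):
    """Total cost of running one instance for the duration of the job."""
    return hourly_price * hours


# Hypothetical hourly prices, not real AWS rates.
small_total = job_cost(hourly_price=0.10, hours=10)
large_total = job_cost(hourly_price=0.40, hours=2)

print(f"small instance: ${small_total:.2f}")
print(f"large instance: ${large_total:.2f}")

# The larger instance is cheaper overall as long as its hourly price
# is less than (10 / 2) = 5 times the price of the smaller one.
break_even_ratio = 10 / 2
print(f"break-even price ratio: {break_even_ratio:.1f}x")
```

With these made-up prices, the instance that costs four times as much per hour still finishes the job for less money in total.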
AWS also recommends pairing your problem with the right solution. It may be cheaper to use Amazon Simple Email Service to send emails on a per-need basis rather than run a mail server year round. Likewise, a cron job that only triggers once daily may be a task for a Lambda function, not necessarily an EC2 instance.
While on-demand provisioning that charges by the hour will work in most cases, for long-term workloads consider Savings Plans and Reserved Instances. On the other hand, ad-hoc jobs can be great candidates for Spot Instances.
Manage Demand and Supply Resources
One of the best ways to optimize Cloud cost is to pay only for what you need and only when you need it, and AWS offers several ways to reduce the bill. Auto Scaling can add or remove resources based on a schedule (if you anticipate an increase in demand, for example, a seasonal sale that will drive more customers to your business) or on current demand. Serverless solutions such as Amazon API Gateway or Amazon SQS offer throttling, buffering, or queueing that automatically adapt to demand changes, matching supply to demand and reducing cost.
AWS Cloud is constantly evolving. New products and services are continually added, and existing ones are continuously improved. Cost optimization is therefore not a project with an end date but an ongoing process. To operate efficiently, it’s essential to regularly evaluate and reassess your services - how you use them and how much you spend on them. It’s equally essential that everybody is aware of and takes an active part in this process. And if it all sounds scary and daunting, we’re here to help! At kreuzwerker, cost optimization and FinOps are among our favorite challenges; we even gave a talk at the AWS Community Day DACH in 2020 with the title “Cloud Cost gone wild - A story about AWS costs management.” You can find the video here. So do not hesitate to reach out to us!
If you want to know more about the AWS Well-Architected Framework, here are the other parts of our series: