The customer
The customer has developed a comprehensive, manufacturer-independent farm management software solution. Their vision is to become the world’s most prominent agricultural platform, from facilitating documentation to providing operational insights to aid decision-making.
The challenge
The customer needed to implement a shift in their software architecture. Our goal was to support the customer in:
- implementing an AWS account separation in which silos are no longer present,
- empowering the teams to have full ownership of their platform, while
- enhancing the security posture and
- improving the cost tracking of the entire AWS landscape.
The solution
We implemented a Landing Zone to help our customer set up a secure, multi-account AWS environment based on AWS best practices. This work later contributed to the development of superwerker, our free and open-source jumpstart for creating a secure AWS Landing Zone.
A Landing Zone describes the setup of an AWS account structure in a way that addresses two different classes of requirements: the workload infrastructure required by engineers to orchestrate their workloads, and the infrastructure and/or infrastructural constraints required by the organization itself.
The Cloud Center of Excellence (from now on, CCoE) is the team that institutionalizes cloud best practices, governance standards and automation, and drives change throughout the organization. When done right, a CCoE inspires a cultural shift toward innovation and a change-is-normal mindset.
The CCoE is responsible for:
- providing environments in which workload stakeholders can work and deploy their software
- mapping organizational requirements into these environments
In this scenario, the CCoE is both an internal service provider and a consulting unit within the organization. This leads to two challenges: how can the CCoE maintain a position of authority in order to influence workload stakeholders to do the “right thing,” and how can it maintain enough velocity to react quickly to requests from organizational or workload stakeholders?
While training, proper process frameworks and tooling, and external consultancy support can help with the first challenge, these challenges are best addressed using a collaborative approach. To enable collaboration, the CCoE makes curated configurations available to the whole organization as infrastructure-as-code (e.g. using CloudFormation / CDK or Terraform), deployable through continuous deployment pipelines. Such pipelines allow everybody in the organization to contribute pull requests to the configurations, while the right to merge and deploy remains reserved for CCoE members. As a result:
- workload stakeholders can contribute changes, additions and bug fixes to the infrastructure
- CCoE stakeholders maintain full ownership over the platform.
One of the company’s challenges was to increase the synergies between the development and operations teams. We decided to shift from the traditional Operations team approach to a CCoE, a multidisciplinary team in charge of defining and enforcing the infrastructural boundaries between the environments used by the workload stakeholders. We started by defining these boundaries and discussing them in the group.
The next step was to define the domains of concern of the different development teams, which we then used as a baseline to define the account separation strategy.
Using AWS Control Tower allowed us to simplify the provisioning of new AWS accounts in conformance with the organization’s constraints and with AWS best practices. Control Tower works together with AWS Organizations, which simplifies the management of AWS accounts and Organizational Units.
Thanks to the smooth integration between Control Tower and AWS Single Sign-On, we were able to integrate the customer’s Azure Active Directory users and groups and to create custom Permission Sets that match our specific use cases.
Each domain (referred to as a workload) has its own Organizational Unit in AWS with its own set of AWS accounts representing the different environments (Sandbox, Stage and Prod). This allows for flexibility in establishing specific Guardrails on the different workloads on a case-by-case basis.
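The account layout described above can be sketched as a small data model. This is only an illustration, not the customer's actual setup: the workload names, the `Account` class, the `plan_organization` helper and the `<workload>-<environment>` naming convention are all hypothetical; only the three environments come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass

# The three environments named in the text.
ENVIRONMENTS = ("Sandbox", "Stage", "Prod")


@dataclass(frozen=True)
class Account:
    """One AWS account belonging to a workload's Organizational Unit."""

    workload: str
    environment: str

    @property
    def name(self) -> str:
        # Hypothetical naming convention: "<workload>-<environment>", lower case.
        return f"{self.workload}-{self.environment}".lower()


def plan_organization(workloads: list[str]) -> dict[str, list[Account]]:
    """Map each workload OU to one account per environment."""
    return {w: [Account(w, env) for env in ENVIRONMENTS] for w in workloads}


if __name__ == "__main__":
    # Hypothetical workload names, used purely for illustration.
    plan = plan_organization(["field-documentation", "machine-data"])
    for ou, accounts in plan.items():
        print(ou, [a.name for a in accounts])
```

Because every workload has its own Organizational Unit, a guardrail can be attached to a single entry of such a plan without affecting the others.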
Thanks to the Tag Policies and Cost Allocation Tags defined in the Organization, it is possible to keep better track of the costs per Service/Environment.
We also wanted to empower the development teams by giving them freedom and a sense of ownership, but we didn’t want to overwhelm them with responsibilities that were out of their scope, such as networking. The development teams became owners of their platform, while topics such as networking were owned and maintained by the CCoE and provided to the workloads “as a Service.”
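As a rough sketch of how such tagging rules could look, the snippet below builds a tag policy document in the JSON syntax used by AWS Organizations and checks a resource's tags against the required keys. The tag keys (`Service`, `Environment`), the `billing` value and both helper functions are assumptions made for illustration; the source does not list the actual tags used.

```python
from __future__ import annotations

import json


def build_tag_policy(key: str, allowed_values: list[str]) -> dict:
    """Build a tag policy that fixes the tag key's capitalization and allowed values."""
    return {
        "tags": {
            key: {
                "tag_key": {"@@assign": key},
                "tag_value": {"@@assign": allowed_values},
            }
        }
    }


def missing_tags(resource_tags: dict[str, str], required: list[str]) -> list[str]:
    """Return the required tag keys that a resource does not carry."""
    return [key for key in required if key not in resource_tags]


if __name__ == "__main__":
    # Hypothetical policy restricting the Environment tag to the three environments.
    policy = build_tag_policy("Environment", ["Sandbox", "Stage", "Prod"])
    print(json.dumps(policy, indent=2))
    # Hypothetical resource that lacks the Environment tag.
    print(missing_tags({"Service": "billing"}, ["Service", "Environment"]))
```

A policy like this only standardizes tags; for the tags to show up in cost reports they also have to be activated as cost allocation tags in the billing settings.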
Following best practices, the network layout was created in a dedicated account managed by the CCoE. The network components of each workload and environment were then shared with the corresponding accounts. Defining the network in a central location reduced complexity and increased the maintainability of the Terraform code used to define it.
Additionally, the CCoE provides the teams with a set of CDK constructs and libraries to set up commonly used resources with sensible defaults.
The next challenge ahead of us was to provide an easy, consistent and centralized way to deploy and enforce the infrastructural constraints in the workloads. We decided to adopt the AWS Deployment Framework, which offers an extensive and flexible solution for staged, parallel, multi-account, cross-region deployments of applications or resources via the structure defined in AWS Organizations. It takes advantage of services such as AWS CodePipeline, AWS CodeBuild and AWS CodeCommit to alleviate the heavy lifting and management compared to a traditional CI/CD setup.
This allowed us to implement selected guardrails and their automatic remediation in completely independent pipelines, keeping the overhead for the CCoE team to a minimum and the complexity manageable. It also provided the teams with a set of blueprints that could be extended in the future.
Lastly, we established a set of Communities of Practice around several cross-domain topics such as DevOps, Monitoring and Logging, to create room for collaboration and the exchange of ideas and thereby improve the synergies between teams.
Note: Foundational accounts can be thought of as a building block for establishing the rest of the multi-account strategy; these accounts are provisioned by Control Tower and have specific configurations and guardrails to prevent the users from executing certain actions that could jeopardize the Landing Zone configuration.
Conclusion
The core infrastructure and the organizational constraints are now centrally managed and deployed to the different workload accounts consistently and with full automation in place. Changes are peer-reviewed (4-eyes principle) and rolled out automatically to the entire Organization.
Provisioning a new workload environment is done with a few clicks, and it is ready for the development team in less than two hours.
Understanding the costs and usage of the different workloads/environments is now straightforward.
Since adopting the Landing Zone, the customer has seen increased interaction between the CCoE and the development teams. The development teams are now fully responsible for the services they are running in their own accounts and can focus on their core activities, trusting that the platform they are using is secure, reliable and homogeneous across environments.