Approximately 13.4 million people travel with Deutsche Bahn every day. Many of these passengers rely on the information provided to them on station display boards, platform train displays, at the DB Information desk, on www.bahn.de or in the DB Navigator. It is therefore particularly important that this data is reliable and consistent so that passengers are not inconvenienced.
One of the goals of the Traveler Information department at Deutsche Bahn is to ensure the quality of this data. The Group started developing digital traveler information more than twenty years ago. A lot has happened since then: from creating the central traveler information server, to connecting the DB channels to the server, to intelligent, automated evaluation, as well as the consistent development of modern platforms that supply the channels with data and services. Information for travelers should be up-to-date, reliable, consistent on all channels and available in real time. To process and provide the gigantic volume of data, which includes more than 1 billion events per day, “single source of truth” solutions were created in an application ecosystem. The traveler information application ecosystem is built and operated entirely in AWS.
Traveler Information was a pioneer within the DB Group in adopting this technology and has long benefited from the possibilities of the Cloud: storage, instances and services can be adapted and scaled flexibly and automatically to current requirements, for example during the transition from local pilot projects to Germany-wide production systems.
However, the software architecture must also be designed for this form of operation: it must not depend on specific hardware, and it must support flexible scaling in the Cloud with the appropriate technologies.
The many advantages of the Cloud come at a price, quite literally, because the fees charged by Cloud service providers become a notable budget item. This is the case not only for Deutsche Bahn but also for Cloud users in a wide range of industries. For this reason, the initial euphoria about the new technical possibilities is giving way to a pragmatic mindset that focuses increasingly on Cloud computing costs. FinOps, or Cloud Financial Management, is the term for this mindset and for the corresponding measures to optimize the cost-benefit ratio of the Cloud. To master this challenge, kreuzwerker’s FinOps team was engaged as an external partner.
Inform, optimize, control
FinOps began with the realization that many companies spend too much money on Cloud services, money that could be saved and possibly used more effectively elsewhere. FinOps, which is financial management across the software lifecycle from development to operation, can address several points. First, it is important to analyze and understand Cloud costs: Cloud services are “rented” on a per-unit basis, so a time-based, typically hourly, charge is due for each resource. The billing models are complex and dynamic.
Together with the Ops team, which manages Cloud operations, the kreuzwerker team first analyzed AWS usage and identified the “low-hanging fruit”, i.e. measures that are easy to implement and lead to significant savings.
In order to provide the cost managers with a simple overview of the AWS costs incurred, it was necessary to implement a uniform tagging structure, which made subsequent monitoring and chargeback of the costs possible in the first place.
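How a uniform tagging structure enables chargeback can be illustrated with a minimal sketch. The tag keys (team, application, environment), the line-item layout and all figures below are assumptions made up for illustration; they are not the tagging scheme actually used at Deutsche Bahn, and real AWS billing data looks different.

```python
from collections import defaultdict

# Hypothetical tagging policy: every resource must carry these tag keys.
REQUIRED_TAGS = ("team", "application", "environment")


def missing_tags(resource_tags):
    """Return the required tag keys that a resource lacks or leaves empty."""
    return [key for key in REQUIRED_TAGS if not resource_tags.get(key)]


def chargeback(line_items, tag_key="team"):
    """Sum cost per tag value; spend without the tag lands in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item["tags"].get(tag_key) or "untagged"
        totals[owner] += item["cost_eur"]
    return dict(totals)


# Made-up billing line items for illustration only.
line_items = [
    {"resource": "i-aaa", "cost_eur": 120.0,
     "tags": {"team": "displays", "application": "board", "environment": "prod"}},
    {"resource": "i-bbb", "cost_eur": 80.0,
     "tags": {"team": "displays", "application": "board", "environment": "test"}},
    {"resource": "vol-ccc", "cost_eur": 30.0,
     "tags": {"application": "api"}},
]

if __name__ == "__main__":
    print(chargeback(line_items))
    for item in line_items:
        gaps = missing_tags(item["tags"])
        if gaps:
            print(item["resource"], "is missing tags:", gaps)
```

The "untagged" bucket shows why uniform tagging comes first: any spend that lacks the agreed tags cannot be attributed to a team, so it can be neither monitored nor charged back.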
It is critical to this collaboration that the product owners and all team members are involved and take responsibility, so that usage and costs can be optimized continuously.
The next step is a critical cost-benefit analysis of each application. In this process, the teams jointly define a strategic optimum among the competing goals of cost, quality, flexibility and speed.
Steep learning curve, big savings
kreuzwerker’s FinOps measures have revealed impressive savings potential in traveler information. Since April 2019, the AWS resources in use have been reduced by around one-sixth, and the associated costs fell even more sharply, by up to 27.5 percent. This was achieved by making optimal use of managed services, utilizing resources better, and migrating many RI (Reserved Instance) services to other instance types.
AWS also has “special offers”, the so-called “Spot instances”. These are temporary overcapacities that the provider offers at a favorable price. The catch: the availability of such Spot instances can change within two minutes. Such cost reduction opportunities can therefore only be exploited with an appropriate IT architecture. In close cooperation with the development teams, the kreuzwerker team evaluated relevant services and supported the expansion of Kubernetes usage. In addition to improving resource utilization, this was intended to create a reliable basis for the use of Spot instances, which led to further savings of up to 90% compared to the on-demand price.
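The effect of Spot instances on a fleet's overall bill can be estimated with a small calculation. The sketch below is an assumption-laden illustration, not kreuzwerker's method: the function names, the fleet size, the price and the discount are all hypothetical, and real Spot prices vary over time.

```python
def blended_hourly_cost(instances, on_demand_price, spot_share, spot_discount):
    """Hourly cost of a fleet in which a share of the instances runs on Spot.

    spot_share and spot_discount are fractions between 0 and 1.
    """
    if not (0.0 <= spot_share <= 1.0 and 0.0 <= spot_discount <= 1.0):
        raise ValueError("spot_share and spot_discount must be between 0 and 1")
    spot_price = on_demand_price * (1.0 - spot_discount)
    per_instance = spot_share * spot_price + (1.0 - spot_share) * on_demand_price
    return instances * per_instance


def savings_fraction(spot_share, spot_discount):
    """Fraction saved compared with running the whole fleet on demand."""
    baseline = blended_hourly_cost(1, 1.0, 0.0, 0.0)
    return 1.0 - blended_hourly_cost(1, 1.0, spot_share, spot_discount) / baseline


if __name__ == "__main__":
    # Hypothetical fleet: 100 instances at 0.10 per hour on demand,
    # half of them moved to Spot at an assumed 70% discount.
    cost = blended_hourly_cost(100, 0.10, spot_share=0.5, spot_discount=0.7)
    print(f"blended hourly cost: {cost:.2f}")
    print(f"overall saving: {savings_fraction(0.5, 0.7):.0%}")
```

The discount only applies to the share of the workload that actually runs on Spot, so the overall saving is the Spot share multiplied by the discount. This is why an architecture that tolerates interruptions, and can therefore move more workloads to Spot, matters as much as the discount itself.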
Customized reporting and monitoring ensure that all stakeholders receive a near real-time overview of usage and costs. This makes it possible to react quickly to anomalies and cost traps in the future.
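One simple way such monitoring could flag a cost anomaly is to compare each day's spend with a trailing average. This is a minimal sketch under stated assumptions: the seven-day window, the 1.5x threshold and the daily figures are made up, and the reporting actually built for traveler information is not described in this article.

```python
from statistics import mean


def find_cost_anomalies(daily_costs, window=7, threshold=1.5):
    """Flag days whose cost exceeds `threshold` times the mean of the
    preceding `window` days. Returns (day_index, cost, baseline) tuples."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            anomalies.append((i, daily_costs[i], baseline))
    return anomalies


if __name__ == "__main__":
    # Made-up daily costs: a stable week, then one sudden spike.
    costs = [100.0] * 7 + [105.0, 240.0, 100.0]
    for day, cost, baseline in find_cost_anomalies(costs):
        print(f"day {day}: cost {cost:.2f} vs. trailing mean {baseline:.2f}")
```

A fixed multiple of a trailing mean is deliberately crude. It ignores weekly seasonality, and a spike raises the baseline for the following days, so a production setup would likely need a more robust rule.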
FinOps is not a one-off project. Rather, it requires continuity. It was particularly important to sustainably train the traveler information teams. Through lunch & learn sessions and workshops, the kreuzwerker team ensured that everyone involved understood the FinOps principles and would apply them in the future when using the Cloud.
The workshops were built on the following principles:
- Teams need to collaborate
- Business value of the Cloud drives decisions
- Everyone takes ownership of their Cloud usage
- FinOps reports should be accessible and timely
- A centralized team drives FinOps
- Take advantage of the variable cost model of the Cloud
With the help of the new FinOps practice, Deutsche Bahn’s traveler information has managed not only to reduce AWS costs but also to gain transparency into the complexity of Cloud billing through clear structures and processes. Thanks to the improved cost transparency, both developers and employees from finance and controlling are able to react quickly and proactively to changes and ensure sustainable efficiency.