Why do you need Platform Engineering?
People are at the core of the move to platform engineering.
Platform Engineering makes DevOps Attrition-proof
When it comes to DevOps, most of us want to hire DevOps unicorns: somebody who understands everything, including the whole functionality of an application from front to back, the underlying infrastructure, and how the application consumes that infrastructure. Yet DevOps unicorns like this are one in a million.
Another challenge with DevOps/DevSecOps, if we look at it in its pure sense, is that it is not homogeneous. That introduces a certain amount of risk and can slow us down, especially once we see these developers leaving. Remember, there is an industry-wide 23% attrition rate and an average developer tenure of 2 years, so we need to start looking at DevOps in a more sustainable way.
If people are constantly coming in and leaving, and their knowledge is ingrained in a single team, it is really hard to maintain continuity as a business and to keep everyone up to speed.
We need to abstract this out and say: “okay, we still want people to be able to run their own applications and whatever, but we just want to look at this at a bigger scale. We want to make sure that all of the underlying infrastructure is pretty homogenous, that it responds to the same needs and that if something goes wrong, we know that there is a single team who's capable of following this from end to end to make sure that they can either make it better or chase out any bugs, anything like that. So I don't think that DevOps is dead per se. This is just the next iteration of that and it's looking at that on a bigger scale and a bigger, more scalable sense of it.”
Centralising DevOps into a Platform Team will improve Developer Experience
One of the major pain points for engineers I've worked with, and who have been on my teams, is that when they join a new organisation it's really difficult to figure out where everything is, how to consume it, and what is and isn't allowed. Centralising all of these things, and documenting where engineers can go to consume them and how, makes that a lot easier. It reduces cognitive load and, to a certain extent, it reduces the risk of attrition, or at least the speed of it. It also means that when their two years are up with that team, they're more likely to stay, because they like how they're able to consume infrastructure and they feel like they're adding value in terms of business logic, building out applications and innovation.
How to get Started with Platform Engineering
Tobias asked Sarah how a CTO (leading 100+ FTEs with a DevOps team) would get started. This is what she said.
Understand what is running in your ecosystem and understand why things are functioning as they are. Is it because we're dealing with legacy? Is it because we have technology constraints? So where are my bottlenecks and what is causing those bottlenecks?
Once you understand the holistic picture, you can start working on strategy.
How does a platform team decide what to build?
Platform engineering is not just technical engineering; it is also people engineering and process engineering, all culminating in a platform that captures how we do all of these things together.
The Goals of Platform Engineering
The output of a platform team goes beyond the creation of a tech product. It includes a holistic blend of people, processes, and technology, which forms the platform's backbone. The platform serves as an embodiment of how these elements interconnect, driven by the directives set by the organisation and the CTO. These directives aim to deliver business value through increased velocity, cost reduction, and risk mitigation, factors that ultimately define business success.
Most platform teams' goals are about pushing velocity and becoming more efficient on the cost side. But there is a slight variation between different verticals and industries.
Different industries prioritise different goals
If you are working in a retail or start-up space, you are going to be a lot more concerned with velocity.
- How do we get to market faster?
- How do we get there faster than our competitors?
- How do we have a better product in the long run than our competitors?
Your platform team will follow that lead and improve velocity.
If you look towards the financial services industries like insurance, they are going to be a lot more risk averse.
They are going to be very concerned about making sure that these processes are all aligned so that if we have a cybersecurity incident (e.g., Log4j, SolarWinds), we can go in and respond immediately. Furthermore, the supply chain is a growing concern, especially with the rise of AI. Therefore, building these platforms should focus on modularity and adaptability, even more so than simply using the latest technology.
Strategy and Maturity in choosing the central tech stack
In the dynamic realm of platform engineering, one frequent challenge that arises is the variance in technology stacks across different teams. Some teams might lean towards one cloud provider, some towards another. Some might harness the power of Kubernetes, while others might prefer the simplicity of Heroku.
How can you respect each team's autonomy while still having a tech stack a platform team can support? If you want to create a central platform while respecting team autonomy, then you need to work together to create a 'product'. Creating a 'product' from these different services will spark a completely different conversation. It's a conversation that interweaves strategy and maturity, two key elements that should be at the forefront of your decision-making process.
First you need to be aligned on strategy.
A lot of the time, if I go into customers and I ask them what their strategy is, they'll say, “oh, we want to be multi-cloud”. The next question I will ask systematically is “why? Why do you want to be multi-cloud? What is the driver behind that? Are you an accidental multi-cloud and you need to make sure that you just have control over this? Are you using different clouds for different purposes because Google's better at analytics and AWS is more developer friendly?”
You need to really understand the why, as that is going to inform what your low-hanging fruit is. What is easiest to automate and turn into a product that is also easily consumable from the base? What are the high-value opportunities there?
It's also going to depend on the maturity of the team. Automating a platform like Kubernetes and making sure it's consumable by everyone is complex. Do you want to do that? If your team is still gaining maturity, and is only starting to understand what these different streams look like and how to make a product out of these different services, there might be easier things to automate first.
Streamline your teams
Based on your strategy, decide how you want to rearrange the teams to make sure that they're able to deliver the underlying infrastructure. For example, I want to be sure that the team who delivers compute can deliver compute from A to Z, and once it hits Z, I need to make sure that the “DevOps team” (whoever is working on that application and that business logic) understands how to consume it. This means you want to have 'specialists' in place. Specialists are really solid engineers who have great communication skills and can build trust, acting as the liaison between the infrastructure teams, the streams and the application team itself.
How to divide between application and platform teams
First, define what the autonomous application teams are using and what the centralised DevOps team is creating. Is there overlap? What can be reused for other teams? Can you see commonality across the organisation in general? This is going to help you determine what can be offered as a product in and of itself. Is it load balancing? Firewalls? All of these things are typically consumed by other applications, and there's real business value in making them more centralized and centrally consumable, if for no other reason than that, in terms of risk and velocity, we want to be able to consume them at speed.
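As a rough illustration of the overlap question above, the sketch below counts how many teams use each infrastructure service and lists the ones shared by several teams as candidates for a centrally offered product. The team names, service names and the `min_teams` threshold are all hypothetical; a real inventory would come from your own tooling.

```python
from collections import Counter

# Hypothetical inventory: which infrastructure services each application team uses.
team_services = {
    "checkout": {"load-balancer", "firewall", "postgres"},
    "search": {"load-balancer", "firewall", "elasticsearch"},
    "billing": {"load-balancer", "postgres"},
}


def shared_services(usage, min_teams=2):
    """Return (service, team_count) pairs for services used by at least min_teams teams.

    Results are ordered by team count (highest first), then by service name.
    """
    counts = Counter(
        service for services in usage.values() for service in services
    )
    candidates = [(name, n) for name, n in counts.items() if n >= min_teams]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


for name, team_count in shared_services(team_services):
    print(f"{name}: used by {team_count} teams")
```

Services that show up across many teams are only a starting point for the conversation; strategy and team maturity still decide which of them is worth turning into a product first.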
How to get the application and platform team to work together
One way to get the two team types to work better together is for them to do a joint task. In some corporate structures it is typical to have groups of people evaluate a certain technology together. For instance, you could assemble a diverse group to evaluate your architecture, asking questions like, 'How should this product be used? Does it meet your needs? Can you identify potential flaws in this setup if we construct the architecture in this way?' It is sort of like threat modeling but for architecture. These discussions can help validate the platform’s product fit for the platform team and foster cooperation among all team members. Of course, developers are diverse. About 10% of your developers will be very mature, forward thinking and want to innovate. So one way to do this is to use those people who are really about innovation as sparring partners.
Reconciling Autonomy with a Central Platform team
Platform engineering is a rapidly evolving domain, often interlinked with the development of an Internal Developer Platform (IDP) that caters to an entire organization. A question I often encounter is whether it's realistic or beneficial for a large organization to utilize a single platform across all areas, and if so, how this integrates with the concept of autonomous teams. Many perceive a conflict between a centralized platform and the autonomy of teams. However, my experiences suggest that these two concepts can indeed coexist harmoniously. Landing zones, designated environments where teams can spin up their architecture, have proven to be highly effective.
Using single entry points to reduce the risk of Shadow IT in autonomous teams
In addition to these landing zones, you'll typically want a system that connects to your Version Control System (VCS) or your IDP to manage the actual deployment of the code. If you want to be able to scale things reliably and reduce risk and cost, it is important to make sure that you have a single point of entry where you can track everything that is happening. It could be one or two entry points, or an overarching internal developer platform for the entire organization, depending on your needs.
This will reduce Shadow IT. I've observed the high costs that shadow IT can incur. I've seen cloud bills escalate astronomically because of shadow IT systems operating on the sidelines, born from an excessive degree of autonomy and ad hoc working arrangements - with no way of tracking it.
If we move that then into that more centralized entry point, we can start tagging with metadata that says “okay, all of this came in through that centralized point of entry. Therefore, anything that is not tagged via that point of entry, we're considering that shadow IT and you have to get rid of that.” When we make sure to put it all together in an automated way with policy as code and guardrails in place, the developers will have clarity on what they're allowed to do with the flexibility to follow best practices - but in a contained way.
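The tagging idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real policy engine: the tag key `provisioned-by`, its value `platform-entry-point`, and the `Resource` shape are all hypothetical stand-ins for whatever your entry point and cloud inventory actually provide.

```python
from dataclasses import dataclass, field

# Hypothetical tag that the central entry point stamps on everything it deploys.
ENTRY_POINT_TAG = "provisioned-by"
ENTRY_POINT_VALUE = "platform-entry-point"


@dataclass
class Resource:
    """A cloud resource as it might appear in an inventory export (assumed shape)."""
    resource_id: str
    tags: dict = field(default_factory=dict)


def find_shadow_it(resources):
    """Return the resources that do not carry the entry-point tag."""
    return [
        resource for resource in resources
        if resource.tags.get(ENTRY_POINT_TAG) != ENTRY_POINT_VALUE
    ]


inventory = [
    Resource("vm-001", {ENTRY_POINT_TAG: ENTRY_POINT_VALUE, "team": "checkout"}),
    Resource("vm-002", {"team": "checkout"}),  # created outside the entry point
    Resource("db-003"),                        # no tags at all
]

for resource in find_shadow_it(inventory):
    print(f"untracked resource: {resource.resource_id}")
```

In practice a check like this would run on a schedule against the real cloud inventory, and the policy-as-code guardrails would decide what happens to anything it flags.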
This approach can create a symbiotic relationship between autonomous teams and a centralized platform, reducing risks and costs, and enhancing overall efficiency.
In conclusion
The migration towards platform engineering represents an evolution in the way we approach DevOps. It combines the tenets of people, processes, and technology to foster a system that is attrition-proof, centralized, and efficient. By understanding bottlenecks, strategizing, and streamlining teams, we can tap into the full potential of platform engineering. The harmonious coexistence of autonomy and a central platform, facilitated through single-entry systems, is a testament to the scalability and versatility that platform engineering can provide. With a focus on mitigating risks, reducing costs, and enhancing overall efficiency, platform engineering is poised to be a game-changer in the tech industry.