Why You Need Self-Service Infrastructure
Engineering teams’ autonomy and agility are vital to efficient software development. However, manual infrastructure provisioning is a major source of inefficiency and a common bottleneck. While developers wait for Ops teams to provision complex infrastructure, they cannot deliver the creativity, speed, and agility expected of them. This is why successful companies are quickly adopting self-service infrastructure, which addresses many of the problems introduced by manual provisioning. Today we will discuss why you need self-service, what benefits it provides, and how Qovery can help you with self-service infrastructure.
Morgan Perry · November 30, 2022 · 6 min read
Check out our dedicated article on what self-service infrastructure is.
Self-service infrastructure is the ability of end-users (i.e., developers) to provision and deprovision infrastructure and application components independently, without relying on the operations team. Through self-service infrastructure, developers can create on-demand infrastructure based on pre-defined, approved infrastructure templates. It is typically delivered as a web application that developers use to manage the infrastructure on their own.
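To make the idea of pre-defined, approved templates concrete, here is a minimal sketch of how a portal might check a developer's request against a catalog before provisioning anything. The template names, instance sizes, and limits are hypothetical and only illustrate the pattern; they are not taken from any specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """A pre-approved infrastructure template (hypothetical schema)."""
    name: str
    allowed_instance_types: frozenset
    max_instances: int


# Hypothetical catalog that the operations team would maintain.
CATALOG = {
    "web-service": Template("web-service", frozenset({"small", "medium"}), 3),
    "batch-worker": Template("batch-worker", frozenset({"medium", "large"}), 10),
}


def validate_request(template_name: str, instance_type: str, count: int) -> list:
    """Return a list of policy violations; an empty list means the request is approved."""
    template = CATALOG.get(template_name)
    if template is None:
        return [f"unknown template: {template_name}"]

    errors = []
    if instance_type not in template.allowed_instance_types:
        errors.append(f"instance type {instance_type!r} not allowed for {template_name}")
    if not 1 <= count <= template.max_instances:
        errors.append(f"count must be between 1 and {template.max_instances}")
    return errors
```

A request such as `validate_request("web-service", "small", 2)` returns an empty list, so the developer can proceed without a ticket, while anything outside the approved template is rejected with an explanation.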
The lack of self-service cloud infrastructure brings many challenges. Some of these are:
The dependency of the development team on the operations team is a major bottleneck in the workflow. The nature of developers’ work is such that they must constantly make changes and preview the results on production or staging infrastructure. For each iteration, developers might need different infrastructure, e.g., a larger EC2 instance with different security groups. While the operations team is preparing the infrastructure, the developer is left waiting, which wastes precious time and, ultimately, money.
Provisioning a full-fledged environment takes a long time for complex applications involving many cloud services and integrations. Take the example of a fintech application with many payment integrations and highly secure infrastructure. Replicating such infrastructure by hand takes a lot of time. There should be a way to automate this so that new infrastructure, or a group of infrastructure elements, can be replicated or provisioned automatically based on a trigger.
Most IaC tools are CLI-based; as a result, it is hard to apply granular role-based access control to the team member who is provisioning the infrastructure. In some cases, developers have access not just to API keys and other secrets but to the cloud console at the organization level instead of the individual level. Also, many IaC tools do not integrate with existing SSO setups, such as SAML-based identity providers or Okta, which is another security concern because many organizations prefer to use their existing identity and user management solution for centralized security control.
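As an illustration of what granular role-based access control could look like in a self-service portal, the sketch below maps roles to the actions they may perform in each environment. The role names, environment names, and the `is_allowed` helper are assumptions made for this example, not the API of any particular tool.

```python
from enum import Enum


class Action(Enum):
    PROVISION = "provision"
    DEPROVISION = "deprovision"
    VIEW = "view"


# Hypothetical policy: role -> environment -> allowed actions.
ROLE_PERMISSIONS = {
    "developer": {
        "preview": {Action.PROVISION, Action.DEPROVISION, Action.VIEW},
        "staging": {Action.VIEW},
    },
    "ops": {
        "preview": set(Action),
        "staging": set(Action),
        "production": set(Action),
    },
}


def is_allowed(role: str, environment: str, action: Action) -> bool:
    """Deny by default: unknown roles or environments get no permissions."""
    return action in ROLE_PERMISSIONS.get(role, {}).get(environment, set())
```

The deny-by-default lookup means a developer can freely manage preview environments but cannot touch production, without ever holding organization-level cloud credentials.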
Imagine a team provisioned a complex and cost-intensive group of infrastructure components for an important demo. After the successful demo, they forgot to shut down the infrastructure. By the time they realized it, the idle infrastructure had already cost money. Not just the provisioning but also the de-provisioning of infrastructure should be automated to save costs.
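One common way to automate de-provisioning is to attach a time-to-live (TTL) to every environment and periodically sweep for expired ones. The sketch below shows only the expiry check; the `Environment` record and `find_expired` helper are hypothetical, and the actual teardown call would depend on your tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Environment:
    """Hypothetical record a portal might keep for each provisioned environment."""
    name: str
    created_at: datetime
    ttl: timedelta

    def expired(self, now: datetime) -> bool:
        return now >= self.created_at + self.ttl


def find_expired(environments, now=None):
    """Return the names of environments whose TTL has elapsed."""
    now = now or datetime.now(timezone.utc)
    return [env.name for env in environments if env.expired(now)]
```

A scheduled job could call `find_expired` every few minutes and tear down whatever it returns, so a forgotten demo environment stops costing money once its TTL elapses.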
The solution is to develop a self-service platform that allows internal teams to provision infrastructure themselves while limiting their access to resources they do not need. It will be a web application that provisions and configures all the environment resources, such as cloud security groups, Kubernetes namespaces, Kubernetes service accounts, databases, etc. This portal should have the following qualities to solve the challenges mentioned earlier:
The solution should not be confined to just one cloud provider. It should also not be limited to certain integrations. It should be extensible, so that it accepts new integrations without any impact on the existing workflow. The solution must be free of vendor lock-in; otherwise, it will accumulate technical debt.
You need to find the sweet spot between developers’ freedom and security. The more locked down the solution is, the less user-friendly it will be for developers. It goes without saying that security cannot be compromised, even though only internal teams will use the solution. Keep in mind that any security lapse in the solution will eventually reach the product that is being developed through it.
As mentioned in the challenges section, the provisioning and removal of infrastructure elements must not require manual intervention. The solution should therefore support preview and ephemeral environments. This allows developers to easily clone an existing staging, UAT, or production environment without depending on the operations team. Similarly, developers can instantly preview their code branch changes in a temporary, isolated preview environment. Preview and on-demand environments are at the heart of every successful solution for self-service infrastructure.
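A small but practical detail of per-branch preview environments is giving each one a stable, valid name. As a sketch, the function below derives a Kubernetes-style namespace name from a Git branch: lowercase alphanumerics and hyphens, at most 63 characters, with a short hash so that similar branch names do not collide. The `preview` prefix and the function name are assumptions for illustration.

```python
import hashlib
import re

MAX_LABEL_LENGTH = 63  # length limit for a DNS label, which namespace names follow


def preview_namespace(branch: str, prefix: str = "preview") -> str:
    """Derive a deterministic, DNS-label-safe namespace name for a branch."""
    # Replace every run of characters outside [a-z0-9] with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # A short hash of the raw branch name keeps "feature/a" and "feature_a" distinct.
    digest = hashlib.sha1(branch.encode("utf-8")).hexdigest()[:8]
    # Leave room for the prefix, the digest, and the two joining hyphens.
    room = MAX_LABEL_LENGTH - len(prefix) - len(digest) - 2
    slug = slug[:room].strip("-")
    parts = [prefix, slug, digest] if slug else [prefix, digest]
    return "-".join(parts)
```

Because the name is a pure function of the branch, the same branch always maps to the same environment, which makes it straightforward to update the environment on each push and delete it when the branch is merged.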
The solution should have strong monitoring and analytics capabilities. It should monitor all of its components and raise an alert when it detects an anomaly. It should also provide analytics with meaningful insights. A good solution will have built-in alerts and an analytics dashboard that cover most of your monitoring and analytics needs.
Self-service infrastructure allows teams to deploy faster. In a traditional approach, developers wait for the operations team to get the environment and infrastructure ready for deployment. With self-service infrastructure, developers can provision and deprovision the infrastructure whenever needed. As a result, they can roll out releases quickly. Reverting a deployment is also easy if something goes wrong. The ability to release faster gives your product a competitive edge.
Self-service infrastructure is the core of any automation workflow. Manual infrastructure provisioning is prone to human error. Even a small mistake in configuration or provisioning can take a long time to identify and fix. With self-service infrastructure, spinning up the infrastructure is automated, which reduces human error and minimizes operational complexity.
When developers have control over infrastructure, they can try innovative ideas and iterate quickly on new features. Because they depend less on the operations team, they have more autonomy over deployments and releases. Using a self-service portal, developers can quickly provision and de-provision infrastructure on their own, resulting in increased agility.
The on-demand model of self-service infrastructure is essential for companies that want to save costs. In a traditional approach, valuable time is spent not only on setting up infrastructure manually but also on debugging the issues that follow. With self-service, that time is saved, and so is the money it would have cost.
Self-service infrastructure relies on on-demand, ephemeral infrastructure rather than traditional, permanent infrastructure. You therefore also save on the infrastructure itself: ephemeral environments are short-lived, and all the associated infrastructure is destroyed as soon as the ephemeral environment is closed. This results in improved cost optimization.
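The cost argument can be made concrete with simple arithmetic: an environment only costs money for the hours it exists. The sketch below compares an always-on environment with an ephemeral one; the hourly rate and usage figures in the usage note are illustrative placeholders, not real prices or measurements.

```python
def environment_cost(hourly_rate: float, hours_per_day: float, days: int) -> float:
    """Cost of an environment that runs `hours_per_day` hours on each of `days` days."""
    return hourly_rate * hours_per_day * days


def savings_ratio(always_on: float, ephemeral: float) -> float:
    """Fraction of the always-on cost saved by switching to the ephemeral environment."""
    if always_on <= 0:
        raise ValueError("always_on cost must be positive")
    return 1 - ephemeral / always_on
```

For example, with a placeholder rate of 1.0 per hour, `environment_cost(1.0, 24, 30)` models a permanent environment over a 30-day month, while `environment_cost(1.0, 3, 20)` models a preview environment that lives for three hours on each of 20 working days; `savings_ratio` then expresses the difference as a fraction.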
Now that you understand the importance of self-service infrastructure, you may be wondering how to put it into practice. Qovery can help you with self-service infrastructure for your cloud-native application. Qovery’s preview environments and automatic infrastructure provisioning features let you find the sweet spot between developers’ autonomy and strict control over their access. Qovery automatically spins up the underlying infrastructure and all related Kubernetes nodes, pods, and other required components. It provides the features an efficient self-service portal for infrastructure automation needs.