Jun 30, 2021. By S V Aditya
IT Operations is maturing and automating at a faster pace than ever before. The continued growth of cloud computing, containerization, and automation orchestration systems is accelerating the speed at which software moves from development to production. With more applications being built as microservices and deployed in containers, these highly portable applications depend far less on dedicated physical infrastructure. Now is a good time to consider whether the shift from a DevOps culture to a completely NoOps culture is viable or worthwhile. If so, how does an enterprise, burdened as it is with decades of legacy infrastructure, approach this transformation?
What is NoOps?
NoOps, in essence, stands for a software environment that requires no dedicated IT Operations team to manage it. In such an environment, developers can build code, test it, and deploy it into production themselves. They don't need to file tickets with ITOps to change a production system, sit in meetings working out capacity forecasts, or request changes to the data center. Part of this transformation is already a reality, with cloud computing systems abstracting away many of these problems through PaaS models. However, this is not enough. IT Operations roles involve more than helping bring an application into production - they also cover monitoring, stability, and reliability. When something inevitably breaks, ITOps has to diagnose and even fix the issue in concert with the developers. These elements are not so easily automated - they require diagnosis and decision-making capabilities not found in traditional tools. Automating them requires the adoption of AI-based technologies like Incident Recognition and Auto-remediation. But it doesn't stop there - it also requires a cultural shift in the organization itself.
Transformation into "NoOps"
Switching to a NoOps ecosystem is not as simple as getting a cloud service provider to manage serverless computing while an AIOps solution manages the platform and remediates incidents. There are several cultural and process changes that pave the way to NoOps. We mention three of the most important here:
The first is making developers responsible for testing and release into production while also shrinking these processes to the maximum extent possible. In DevOps, the release stage is short, with Development and Operations teams collaborating on it. NoOps compresses this stage by automating the entire release configuration. Automating testing in a production-like environment shrinks the cycle even further. It also forces developers to think more in operational terms and incorporate self-healing and monitoring techniques into their code.
Second, companies need to bring AI literacy to developers as well. In our experience in building Incident Recognition, the biggest challenge proved to be preprocessing and vectorizing log data adequately, in a way that made it easy to cluster with machine learning algorithms. The computational cost of preprocessing, especially at a large scale, can be reduced immensely if logs are written in a way that makes this process easier - which comes down to the developer. By understanding how machine learning algorithms work, developers can build their applications to work better with AIOps solutions for monitoring and repair. This greatly increases the resource efficiency of auto-healing solutions powered by AIOps in production.
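To make the preprocessing point concrete, here is a minimal sketch of one common step: masking the variable parts of a log message so that messages of the same kind collapse into one template before any clustering is attempted. This is an illustration under our own assumptions, not a description of any particular product's pipeline. The masking rules, the function names to_template and template_counts, and the sample messages are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption for illustration).
# Order matters: more specific patterns must run before the generic number rule.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),  # IPv4-like addresses
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),          # hex identifiers
    (re.compile(r"\b\d+\b"), "<NUM>"),                     # standalone integers
]


def to_template(message: str) -> str:
    """Replace variable tokens with placeholders, leaving the fixed wording."""
    for pattern, placeholder in _MASKS:
        message = pattern.sub(placeholder, message)
    return message


def template_counts(messages):
    """Count how many messages fall under each template."""
    return Counter(to_template(m) for m in messages)


# Made-up sample messages, not taken from a real system.
SAMPLE_LOGS = [
    "Connection from 10.0.0.5 timed out after 30 s",
    "Connection from 10.0.0.9 timed out after 45 s",
    "Disk usage at 91 percent on volume 3",
]

if __name__ == "__main__":
    for template, count in template_counts(SAMPLE_LOGS).items():
        print(count, template)
```

The more consistently developers keep variable values separate from fixed wording (for example, by always writing numbers as standalone tokens), the fewer rules a step like this needs and the cheaper it becomes at scale.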
Third, take a zero-trust approach to security. Simply put, monitor everything. No network, device, or user is inherently trusted, and all are observed for deviations from expected behavior. AIOps Anomaly Detection paired with Auto-Remediation workflows is the natural way to handle this with a NoOps mentality.
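As a rough illustration of "observe for deviations, then remediate", the sketch below flags a value that sits far outside a historical baseline and hands it to a remediation step. It uses a simple z-score test, which is our stand-in for illustration only; production anomaly detection is usually far more sophisticated. The threshold, the remediate placeholder, the entity name, and the baseline numbers are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Return True if `observed` lies more than `threshold` sample standard
    deviations from the baseline mean. `baseline` needs at least two values."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


def remediate(entity):
    """Hypothetical placeholder for an auto-remediation workflow.
    A real workflow might revoke a session or isolate a host."""
    return f"quarantined:{entity}"


def check(entity, baseline, observed):
    """Trigger remediation only when the observation is anomalous."""
    if is_anomalous(baseline, observed):
        return remediate(entity)
    return None


# Made-up baseline: logins per hour for one user over seven hours.
BASELINE_LOGINS_PER_HOUR = [4, 5, 6, 5, 4, 6, 5]

if __name__ == "__main__":
    print(check("user-42", BASELINE_LOGINS_PER_HOUR, 40))
    print(check("user-42", BASELINE_LOGINS_PER_HOUR, 6))
```

The design point is that nothing is exempt from the check: every user, device, or network segment gets a baseline, and the remediation path is triggered by behavior rather than by identity.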
Is "NoOps" really NoOps?
Let's be clear - as technology (including AI technology) stands today, NoOps is just aspirational. There are a lot of hurdles that simply cannot be overcome with intelligent automation alone. Legacy services still play a huge part in most enterprises' IT stacks, along with associated infrastructure like data centers. Migration is one solution - but more often the costs of migration are too high, and it is simply more cost-efficient to continue managing these services in their current state. AIOps tools have progressed far from their beginnings as basic application monitoring solutions to mature applications like Incident Recognition and Auto-Remediation. But they still have a long way to go - IT Operations personnel have to oversee the tools' actions, handle edge cases, and train and retrain the AI models built into AIOps systems. They also have to use their understanding of the IT systems to create context-sensitive automated workflows that do not break anything. Moreover, ITOps teams have the right skill set to plan, design, and provision infrastructure that can meet enterprise goals - especially in areas like private clouds, VPCs, and edge computing.
What we realistically see happening is a collaborative approach between ITOps teams and AIOps tools to create a Low-touch ITOps environment. While AIOps automates the mundane and the slightly challenging elements, ITOps focuses on design, oversight, and the complex elements of the role. Enterprises that want to benefit most from this collaboration should begin training their brightest ITOps teams in designing AI solutions. This will enable them to combine the domain knowledge of ITOps with data science to create truly powerful and unique solutions that benefit the enterprise.
At Algomox, we believe in the strong potential of AIOps to bring Low-touch IT Operations to the enterprise. To learn more, please visit Algomox AIOps Platform Page.