Rakesh Malhotra July 11, 2019
Back in the late 90s, I was working with a customer who desperately needed to modernize an application responsible for scheduling television commercials at local affiliates across the country. Time was of the essence because neither the application nor the platform upon which it was built was “Y2K compliant” (remember that?). It would need to be rewritten using what were, for the time, modern tools and frameworks.
When we scoped out the work required to accomplish this, the estimated cost and effort came out an order of magnitude larger than the original cost of building the application! Twenty years later, I would venture a guess that if the same app were rebuilt from scratch today using microservices and cloud-native patterns, the cost would be much higher still.
How is it that the cost and user-facing complexity of nearly every aspect of technology have plummeted over the past 20 years, but enterprise platforms and application development continue to move in the other direction? Sure, salaries have increased, enterprises have significant legacy integration challenges, and planet-scale is now accessible to and required by almost everybody, but this still doesn’t justify the cost explosion. I believe there are many factors, but two in particular stand out.
Lack of Cost Transparency
You can’t manage what you can’t measure and most organizations struggle with tracking hard dollar costs, let alone more insidious drags on efficiencies such as downtime, employee turnover, maintenance, and attention distraction. These are huge factors in overall cost and complexity but are often neglected or inaccurately captured. With large public cloud providers, for the first time, many organizations are getting a clear view of hard dollar costs.
Frankly, most are shocked at how high those costs are relative to what they thought their internal costs were. Unless you are operating at gigantic scale (and even then I’d be skeptical), you’re kidding yourself by thinking you could run more cost-effectively than AWS, Azure, or GCP when considered from all dimensions. The holistic view of cost and value has finally come front and center as part of architecture and design discussions. Over time this should help make innovation faster and less expensive. We’re already seeing this amongst our clients who can run entire operations on major cloud providers for less than $2K per month in fixed costs using managed services.
Engineers Love Designing for Optionality
Optionality always has a cost and normally that cost is paid in the form of complexity. Having the ability to swap out your cloud provider or storage/networking subsystem without having to re-architect your platform or application is a noble goal. As a practical matter, it’s also unachievable and extremely expensive to get right and test regularly. Many people leave out the testing part. If you want proof, just observe how much of the internet becomes unavailable when a major cloud provider has an outage.
There have been debates recently in the Kubernetes community (Kubernetes provides an abstraction over cloud infrastructure) as to whether it has become too complicated and will suffer the same fate as OpenStack. It’s convenient to recast Kubernetes as a framework to build a platform rather than a platform itself (which has always been true, by the way) but it dodges the question of complexity. In a twisted way, the more complicated something is to use and manage, the more technical credibility it tends to get in the engineering community. When I was at Microsoft, we used to do usability studies on Windows Server vs Linux and invariably found that system administrators could set up DNS servers or web servers on Windows faster. However, the level of satisfaction amongst the users was higher on Linux because admins felt a sense of accomplishment in getting it to work. After all, there’s no pride in clicking through an install wizard, right?
Look, it’s okay to architect optionality into a design as long as the added cost and associated complexity are accounted for and worth it.
Change is Upon Us
The good news for customers is that new cost and operational models in cloud (such as serverless) are breaking some of these trends and bending the cost and complexity curve back in their favor. They are getting a new variety of cost and consumption models that can be applied to different contexts more efficiently depending upon the technical and business use case. Cost optionality can also create complexity, of course. Deciphering and optimizing your cloud provider bill can take an advanced degree in economics but at least you can now attack it. In a world where the risk of not innovating fast enough is bigger than the risk of failure, this is a welcome development.