The above bit of sage financial advice was offered to me by a financial professional. Certain assets make no financial sense to buy; in many cases, renting is the better option. Why should technology be any different?
I strongly believe that buying physical servers as a capex outlay is a dead business model for many enterprises. Why invest all that hard-earned money in a dead platform? Why not just rent what you need, elastically? Need more, rent more. Need less, rent less. Not only will your expenses match your requirements, but you also get better proportional use from those rented assets. Some recent reports put the average utilization of servers running virtualization hypervisors in the enterprise datacentre at between 20% and 40%. This implies that even “enterprise” virtualization is not delivering the value promised.
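To put that utilization range into money terms, here is a minimal back-of-the-envelope sketch. The function names and the monthly cost figure are my own assumptions for illustration; only the 20% and 40% figures come from the reports mentioned above.

```python
def idle_share(utilization: float) -> float:
    """Fraction of paid-for capacity that sits unused at a given average utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return 1.0 - utilization


def idle_spend(monthly_cost: float, utilization: float) -> float:
    """Portion of a monthly infrastructure bill that pays for unused capacity."""
    return monthly_cost * idle_share(utilization)


# Hypothetical monthly cost of an owned server estate (illustrative value only).
MONTHLY_COST = 100_000.0

for u in (0.20, 0.40):
    print(f"{u:.0%} utilized: {idle_share(u):.0%} of capacity idle, "
          f"roughly {idle_spend(MONTHLY_COST, u):,.0f} per month paying for nothing")
```

This is deliberately crude: it treats cost as proportional to capacity and ignores the headroom you need for peaks. Still, it shows why a 20% to 40% average is hard to defend.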
How do we solve this utilization issue? It needs to be solved, as it implies that we are spending money on resources that we do not use. But getting benefit from the rental model means that we need modern application and infrastructure management technologies, so that we can “right size” our resources. Managing tech resources needs to move beyond the “is it on or is it off” mindset and beyond technology silos. No offense, but I do have a giggle when enterprises that get tools like Microsoft’s SCOM for free in their enterprise license agreements think that these basic tools tell them anything about how the app is performing. No, today we need technology that maps our business rules and processes across the infrastructure, showing us the impact on business processes if a port on a device or a process on a server misbehaves. The issue here is cost. Most of these platforms need to gather various forms of data, including SNMP, WMI and packet-level data. The best systems will even run a small agent on your .NET, SQL and Java systems, instrumenting these down to code level. But, in South African terms, a project like this could cost anywhere from R5 million to R10 million, even for relatively small environments with around 20 app servers and around 100 servers in total.
Solving this issue has been my mission. It is one of the reasons why our cloud platform can be called “enterprise grade”. Let me explain. The systems used to monitor packet-level data are dedicated hardware devices, capable of some serious data collection and analysis. However, when buying this technology, companies have to think not only about their data rates today, but also try to guess what the data rates will be 3-5 years down the line. Typically these assets get “sweated” for a long time, so invariably an enterprise buys a bigger box than it needs. Secondly, the tech to instrument your code gets sold in license batches, so you end up having to buy another 10 licenses even if you only want to roll out another two servers, taking your total to 12. A cloud platform that has this tech built in makes it super easy for enterprises and software developers to have this technology “baked in” to their infrastructure. Now we get to a point where we can deliver the following info:
- How fast is my application for the end user? Total response time is instrumented in milliseconds, from the end-user device right down through all the tiers of my application and infrastructure.
- If my response time is below par (my SLA requires a 400ms response time, but I am delivering 900ms), where is the delay? Network, server, app, code, etc.?
- In multi-tiered applications, where a web front-end connects to an app server, which in turn talks to a database, we can see the delay and performance details between servers. So, a slow app may be slow because the connection between the web servers and the app tier is slow, as a result of a bad configuration on a load balancer.
- A new update was pushed for a .NET or Java-based app, and now certain modules of the app are slow. We can pinpoint these and help developers debug and fix performance issues, as we can see exactly which piece of the app and code is causing an issue.
- We can tie memory, CPU and storage system performance together, and see how changes in resource quantities (add more RAM, add more vCPU) are positively or negatively affecting app performance. You can also see if a bigger server is needed, or if two or three smaller servers running behind a load balancer will work better.
- The network performance can be instrumented and modelled to the n-th degree. Is adding more capacity going to improve my performance, or will switching to a lower-latency fibre optic link from my ISP do more? Is accessing the service via the Internet OK, or do I need to think about a dedicated point-to-point link to the cloud, or can I simply extend my MPLS service?
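The “where is the delay?” question from the list above can be sketched in a few lines. This is only an illustration and not how any particular monitoring product works: the per-tier timings are made-up numbers, chosen so that they add up to the 900ms response against a 400ms SLA used in the example, and the tier names and `slowest_tier` helper are my own assumptions.

```python
SLA_MS = 400  # SLA target from the example in the text

# Hypothetical per-tier timings in milliseconds (illustrative values only).
tier_ms = {
    "network": 60,
    "load balancer": 520,
    "web server": 70,
    "app server": 150,
    "database": 100,
}


def slowest_tier(timings: dict[str, int]) -> tuple[str, int]:
    """Return (tier, ms) for the tier contributing most to the response time."""
    name = max(timings, key=timings.get)
    return name, timings[name]


total = sum(tier_ms.values())   # end-to-end response time
overrun = total - SLA_MS        # how far we are over the SLA
tier, ms = slowest_tier(tier_ms)

print(f"Total response: {total}ms against an SLA of {SLA_MS}ms "
      f"(over by {overrun}ms)")
print(f"Biggest contributor: {tier} at {ms}ms ({ms / total:.0%} of the total)")
```

With the per-tier numbers in hand, the misconfigured load balancer stands out immediately. Getting real per-tier numbers is exactly what the instrumentation is for.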
Understanding the impact of resources and their behaviour is key. With the right tools, you can rent just what you need. The right-sizing job for CIO/CTO-level managers just got so much easier…