As organizations move to the cloud at scale, they face many challenges, from directly reducing cloud costs to the indirect savings that come from improving performance. While you can lift and shift your existing on-premises systems to the cloud with relative ease, you will not see the same level of performance without careful system design. In this post, we will examine specific attributes of cloud solutions and how the right choices can improve performance once you arrive.
Selecting the Right Location
In the cloud world, the first key to performance is reducing network latency, which means choosing the region closest to your customers. While the cloud provides a lot of automation, high availability, and flexibility, it is still just a lot of computers (and storage) in a data center. Technology can't beat the laws of physics; the less distance between the user and the region, the lower the latency. All the major cloud providers have regions worldwide, with best-in-class networks between those regions. While you cannot place virtual machines or services at the consumer's front door, you can at least get them as close as possible. It is essential to understand the network topology as well. Many options, like content delivery networks (CDNs) and direct network connections to the cloud provider, can help reduce application latency.
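Since latency is bounded by physics, you can estimate a best-case round-trip time from the distance to a candidate region before running any benchmarks. The sketch below assumes light travels through fiber at roughly two-thirds of its speed in a vacuum; the region distances are illustrative, not measured values for any real provider.

```python
# Estimate the best-case network round-trip time (RTT) to candidate
# cloud regions based purely on distance. Real RTTs will always be
# higher due to routing, queuing, and processing overhead.

SPEED_OF_LIGHT_KM_S = 299_792      # km/s in a vacuum
FIBER_FACTOR = 2 / 3               # light in fiber travels at ~2/3 c

def min_rtt_ms(distance_km: float) -> float:
    """Theoretical minimum round-trip time in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return one_way_s * 2 * 1000

# Illustrative distances from a customer base to candidate regions.
candidate_regions = {
    "region-near": 300,     # km; hypothetical nearby region
    "region-mid": 2_000,
    "region-far": 9_000,
}

for region, km in candidate_regions.items():
    print(f"{region}: >= {min_rtt_ms(km):.1f} ms RTT")
```

Even before provisioning anything, a calculation like this shows why a region 9,000 km away can never serve interactive traffic as well as one a few hundred kilometers from your users.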
It is also vital to ensure that the chosen region will provide the cloud services needed to support any planned deployments. Cloud providers often roll out new features to select regions first and then globally over time. If that feature is critical to the success of the cloud implementation, its availability in a region may outweigh performance concerns. Also, it is possible that a given region may not offer a particular feature due to other limitations—an excellent example of this is the various government clouds that do not support some services due to regulatory restrictions. All major cloud providers do their best to offer all their services across all their regions, but global availability is never guaranteed.
Choosing the Right Service Offering
One of the challenges of cloud architecture is that many services can often perform the same task: for example, there are 17 (or more) ways to run a container on Amazon Web Services (AWS). The overwhelming number of services can be dizzying without a deeper understanding of the various platform options. While there are nuances that affect performance specifically, the initial decision is ultimately about control. We'll briefly discuss Infrastructure as a Service (IaaS) and Platform as a Service (PaaS).
Infrastructure as a Service (IaaS)
IaaS comprises the virtual machines, networks, and storage services that public cloud providers offer. IaaS will closely mimic an on-premises virtual machine implementation. IT administrators have complete control of the operating system and choice of hardware configuration (this also means they will likely be responsible for patching and backing up those workloads). Administrators no longer need access to the data center or the physical equipment, but they can still adjust and configure networking, storage, and the versions of the operating systems and additional software installed in the environment.
This more granular control of the environment allows for more flexibility in terms of performance and compatibility with third-party software; however, IaaS can introduce additional complexity, leading to more administrative time spent on tasks like patching operating systems. Organizations must weigh each facet carefully to determine which model best suits them.
Platform as a Service (PaaS)
PaaS provides less granular control than IaaS. In most PaaS services, the service is responsible for backups, high availability, and system updates. Administrators no longer have visibility into or access to the underlying operating system. All service management tasks occur through the cloud's control plane, implemented through APIs. It is also essential to understand the hardware configuration of any PaaS service offering; some services are more transparent than others regarding the underlying hardware and performance characteristics. Another crucial facet is the scaling model for the services: while IaaS VMs can typically scale storage and compute independently, some PaaS services tie these two dimensions together, dramatically increasing costs if storage requirements grow.
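To see why a coupled scaling model matters, consider a toy cost model. All tier names, storage ceilings, and prices below are made up for illustration and do not reflect any provider's actual rates; the point is that when storage and compute are bundled, storage growth alone can force you into a larger, much more expensive compute tier.

```python
# Toy comparison of independent vs. coupled compute/storage scaling.
# All prices and tiers are hypothetical, for illustration only.

def iaas_monthly_cost(vcores: int, storage_gb: int) -> float:
    """Independent scaling: pay for compute and storage separately."""
    return vcores * 50.0 + storage_gb * 0.10

# Hypothetical coupled tiers: each bundles compute with a storage
# ceiling; exceeding the ceiling forces a jump to the next tier.
COUPLED_TIERS = [
    # (name, vcores, max storage GB, monthly price)
    ("small", 2, 250, 150.0),
    ("medium", 4, 500, 400.0),
    ("large", 8, 1000, 900.0),
]

def paas_monthly_cost(storage_gb: int) -> float:
    """Coupled scaling: storage growth alone can force a tier jump."""
    for _name, _vcores, max_gb, price in COUPLED_TIERS:
        if storage_gb <= max_gb:
            return price
    raise ValueError("storage exceeds largest tier")

# A workload that only needs 2 vCores but whose data grows to 600 GB:
print(iaas_monthly_cost(2, 600))  # compute stays small; pay per GB
print(paas_monthly_cost(600))     # forced into the "large" tier
```

Under independent scaling the 600 GB workload keeps its small compute footprint, while the coupled model pays for eight vCores it never needed, which is exactly the trap to check for before committing to a PaaS offering.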
Performance Choices
Depending on existing workloads and application requirements, you have choices regarding performance configurations. For example, in IaaS, you can choose optimal virtual machine sizes for your workload's needs. There are options for minimal workloads that offer low CPU core counts and small amounts of memory, or you can choose a virtual machine size that offers copious amounts of both CPU cores and memory. Each option comes at a cost, with the latter significantly more expensive than the former, so size your virtual machines to match actual workload requirements.
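The sizing decision can be framed as picking the cheapest instance that satisfies the workload's CPU and memory requirements. Here is a minimal sketch of that selection, using a hypothetical catalog rather than any provider's real size names or prices:

```python
# Pick the cheapest VM size that meets workload requirements.
# The catalog below is hypothetical, not a real provider's price list.

VM_CATALOG = [
    # (name, vCPUs, memory GiB, hourly price USD)
    ("tiny",    2,   4,  0.05),
    ("small",   4,   8,  0.10),
    ("medium",  8,  32,  0.40),
    ("large",  16,  64,  0.80),
    ("xlarge", 64, 256,  3.20),
]

def right_size(required_vcpus: int, required_mem_gib: int):
    """Return the cheapest size meeting both requirements, or None."""
    candidates = [
        vm for vm in VM_CATALOG
        if vm[1] >= required_vcpus and vm[2] >= required_mem_gib
    ]
    return min(candidates, key=lambda vm: vm[3], default=None)

print(right_size(4, 16))   # memory requirement rules out "small"
print(right_size(32, 64))  # only "xlarge" has enough cores
```

Note that memory can be the binding constraint even when the core count is modest, which is why right-sizing against both dimensions, rather than CPU alone, avoids overpaying.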
In the PaaS realm, there is a similar concept. In Microsoft Azure specifically, the lower service tiers of Azure SQL Database (not Managed Instance) are sized in Database Transaction Units (DTUs), a blended metric derived from a combination of CPU, memory, and I/O. You can scale the database to a higher tier if you require higher performance, providing increased CPU cores, IOPS, and memory. The General Purpose tier offers network-attached storage, while the Business Critical tier offers locally attached solid state drives (SSDs), which increases overall throughput and lowers latency. One caveat: while IaaS services are nearly always fixed-price offerings, many PaaS services use usage-based pricing, which can lead to unpleasant surprises if your utilization patterns change suddenly.
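The throughput impact of locally attached SSDs versus network-attached storage can be approximated with Little's Law: at a fixed queue depth, I/O throughput is inversely proportional to latency. The latency figures below are illustrative assumptions, not published Azure specifications.

```python
# Approximate IOPS at a fixed queue depth using Little's Law:
#   throughput = concurrency / latency
# Latency figures are illustrative assumptions, not measured specs.

def iops(queue_depth: int, latency_ms: float) -> float:
    """I/O operations per second for a given queue depth and latency."""
    return queue_depth / (latency_ms / 1000.0)

NETWORK_STORAGE_MS = 2.0   # hypothetical network-attached latency
LOCAL_SSD_MS = 0.25        # hypothetical locally attached SSD latency

for depth in (1, 8, 32):
    net = iops(depth, NETWORK_STORAGE_MS)
    local = iops(depth, LOCAL_SSD_MS)
    print(f"queue depth {depth}: network {net:,.0f} IOPS, "
          f"local SSD {local:,.0f} IOPS")
```

With these assumed latencies, local SSD delivers roughly eight times the IOPS at every queue depth, which is why latency-sensitive database workloads often justify the premium of a locally attached storage tier.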
Performance Monitoring
Understanding performance through monitoring is a critical function of any IT organization. An on-premises implementation offers the most expansive field of view regarding available metrics simply due to closer proximity to the infrastructure components. When you move to the cloud, appropriate monitoring can become more complex because you are further removed from the infrastructure. If you use PaaS services, your observability targets may have changed completely. All cloud providers offer native monitoring; however, in many cases, these offerings are very high-level and may not capture the detailed metrics you need. You should also note that collecting and storing these metrics over time may come with additional costs from your cloud provider. These cloud monitoring services frequently lag by several minutes, so if you need real-time monitoring, you may need to invest in third-party tools.
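Because collecting and retaining metrics is itself billable, it is worth estimating the data volume before enabling every counter at the finest resolution. A back-of-the-envelope sketch follows; the bytes-per-datapoint figure and per-gigabyte price are rough assumptions for illustration, not any provider's published rates.

```python
# Estimate monthly metric ingestion volume and cost. The bytes-per-
# datapoint and per-GB price are rough assumptions for illustration.

BYTES_PER_DATAPOINT = 200       # assumed: timestamp + value + labels
PRICE_PER_GB_INGESTED = 2.50    # hypothetical USD per GB ingested

def monthly_metric_cost(num_metrics: int, interval_s: int,
                        num_hosts: int, days: int = 30) -> float:
    """Estimated monthly cost of collecting metrics at an interval."""
    datapoints = num_metrics * num_hosts * (days * 86_400 // interval_s)
    gb = datapoints * BYTES_PER_DATAPOINT / 1024 ** 3
    return gb * PRICE_PER_GB_INGESTED

# 100 metrics per host across 20 hosts, at two sampling resolutions:
print(f"60s: ${monthly_metric_cost(100, 60, 20):.2f}/month")
print(f"10s: ${monthly_metric_cost(100, 10, 20):.2f}/month")
```

Sampling six times as often costs six times as much, so reserve high-resolution collection for the metrics that genuinely drive alerting or troubleshooting decisions.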
For example, when running SQL Server on an Azure virtual machine, baseline metrics like CPU, memory, and disk utilization are surfaced to the Azure control plane, allowing for performance visibility within the Azure portal. However, SQL Server-specific metrics are not available through Azure Monitor natively, so you may need additional tools or processes deployed to capture critical metrics. If you implement a PaaS solution, gathering deeper-level metrics proves even more complex because of the lack of access to the underlying operating system and database service, so you are more at the mercy of your cloud provider.
Choosing the Right Tool
Anytime organizations move into a new ecosystem, whether Azure, AWS, or Google Cloud Platform (GCP), selecting the appropriate performance monitoring tool can be cumbersome. In the early days of the cloud, the options were limited, and self-derived tools and scripts were often born out of necessity. Now that the cloud has become more broadly adopted and mature, the market offers more options with varying capabilities. Idera's SQL Diagnostic Manager for SQL Server provides in-depth monitoring functionality that covers the performance of the entire SQL Server ecosystem and delivers some of the most comprehensive diagnostics on the market.
Summary
Before leaping into the cloud, or even if you are already there, take care to choose the appropriate target location, service offering, and monitoring tools. Improper selection of any facet can lead to dismal database performance and potentially higher costs, with undesirable results for downstream consumers, namely your customers. While bad decisions can be reversed, doing so can be painful and costly, so take the time to do it right the first time. To learn more about SQL Diagnostic Manager, or to start a free trial, please visit our site.