
Introduction To Cloud Optimization

TABLE OF CONTENTS

  1. What is cloud optimization?
  2. Why should you care about cloud optimization?
  3. The five levels of cloud optimization
    1. Cloud optimization at the infrastructure level
    2. Cloud optimization at the operating system level
    3. Cloud optimization at the web server (configuration) level
    4. Cloud optimization at the database level
    5. Cloud optimization at the application level

 

With the ever-increasing amount of data being shared, used, and processed, web app performance is more important than ever. 

At the same time, users expect applications to run seamlessly, and a bad experience can potentially damage business productivity.  

Slow and unreliable web applications also mean higher resource consumption, which can lead to higher costs, diminished profits, and even serious damage to the company’s image.

However, these shortcomings can be avoided through cloud optimization, specifically through the continuous optimization of web applications at each critical level. 

This article is part of a Cloud Optimization Series where we discuss how you can configure your cloud environment to achieve the best results. 

 

What is cloud optimization?

Every web app is unique, and so are its infrastructure requirements. Plus, these requirements change over time. 

Cloud optimization refers to the process of balancing these requirements and allocating the right resources so that an application can run efficiently. 

Some of these optimizations can also be automated through AI/ML, with little to no developer intervention. This means fewer resources to manage and a more streamlined infrastructure.

 

Why should you care about cloud optimization?

There are 3 main reasons why you should care about cloud optimization:

1. Site performance & speed

Web app optimization isn’t just nice to have; it’s a necessity. 47% of internet users expect websites to load in 2 seconds or less, and 40% will abandon a page that takes 3 or more seconds to load.

An Aberdeen study also found that a 1-second delay in page load time equals 11% fewer page views, a 16% decrease in customer satisfaction, and 7% loss in conversions.  

2. Site ranking & SEO

Your website’s SEO ranking is closely related to its loading speed. There are many factors that contribute to how high or low Google ranks a page (like how much time users spend on it and whether they find what they’re looking for), but among them, page speed is one of the most important.

3. Cost optimization

On one hand, finance departments want to keep cloud spend in check. On the other hand, developers want their apps to run at full performance at all times, so simply reducing the allocated resources is not an option.

The more complex your web app is, the more resources it will need, and they don’t come cheap. The efficient use of the tools and resources you have at your disposal can translate into savings and overall better management of your IT budget.

 

The five levels of cloud optimization


Cloud optimization techniques can be applied at each level of the cloud environment, which gives us five levels of cloud optimization. Let’s dive into each of them.

Cloud optimization at the infrastructure level


What it means

Optimal application infrastructure is scalable, highly available, and specifically built for that application. 

Why it matters

Having a scalable infrastructure means that it doesn’t matter whether you have 1 active user or 1 million – any spikes in traffic will be addressed immediately and automatically. 

At the same time, when the traffic is low, some of the resources will be de-allocated so you only pay for what you actually use.

Scalable infrastructures rarely fail – they are highly available and reliable and, should a catastrophic failure occur, data loss is prevented. They are also easy to manage, as updates, modifications, and diagnosis can be performed seamlessly. 

How you can implement it

There are two ways you can scale your cloud infrastructure: vertically and horizontally.

  • Vertical scaling refers to expanding one component’s ability to handle an increase in load. In terms of hardware, this means adding processing power and memory; in terms of software, this means optimizing algorithms and application code. Its drawback is that a single machine can only grow so far, and it remains a single point of failure.
  • Horizontal scaling, on the other hand, refers to adding new machines (with the same configuration) to your current architecture. This is usually the better alternative in terms of both performance and cost. 

As cloud infrastructure encompasses both hardware and software components, optimization at infrastructure level refers to optimizing these components for efficiency, performance, and cost.

The only way DevOps can optimize an infrastructure is by using the right measuring and monitoring tools. Without clear and complete visibility, you don’t know what you need more of, and what you don’t.

Once you have measurement and monitoring systems in place, here are a few things that can help you optimize your infrastructure:

  • Load balancing and autoscaling: if loads vary significantly over time, load balancing and auto-scaling can help ensure the performance of your application is never affected, without burning through your budget;
  • Microservices and serverless architectures: newer cloud infrastructure technologies don’t require any up-front capacity and bypass environment configuration challenges;
  • Cloud governance: as you grow and your infrastructure gets more and more complicated, you cannot get away without well-defined processes.
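
To make the autoscaling idea more concrete, here is a minimal sketch of a scaling decision. It assumes a simple proportional rule (scale the instance count by observed load over target load) and an even spread of traffic across instances; the function name, the CPU metric, and the bounds are hypothetical and not taken from any particular cloud provider.

```python
import math


def desired_instances(current: int, observed_cpu: float, target_cpu: float,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Return the instance count that would bring average CPU near the target.

    Assumes the load balancer spreads traffic evenly across instances.
    """
    if current < 1 or target_cpu <= 0:
        raise ValueError("current must be >= 1 and target_cpu must be > 0")
    wanted = math.ceil(current * observed_cpu / target_cpu)
    # Clamp the result: a spike cannot exhaust the budget,
    # and a quiet period cannot scale the service down to zero.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    print(desired_instances(4, observed_cpu=90.0, target_cpu=60.0))  # scale out
    print(desired_instances(4, observed_cpu=15.0, target_cpu=60.0))  # scale in
```

The upper bound is what keeps autoscaling from burning through your budget during an unexpected spike.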

Cloud optimization at the operating system level


What it means

This type of optimization refers to how efficiently the web app uses the resources it has at its disposal. For example, computing and storage should be equally distributed instead of having a few systems running at full capacity while the others remain idle. 

Why it matters

Operating system management isn’t usually a priority for engineers. But you should focus on it because an efficient use of computing power makes a significant difference in how a web app performs.

How you can implement it

When it comes to optimizing the operating system of a production environment, there are 3 things you should keep in mind:

1. IOPS

A fundamental criterion of an efficient production environment is the speed at which data circulates between the various components. This is directly reflected in the loading speed of an application.

From an OS point of view, any type of storage device (be it HDD or SSD) is evaluated in IOPS (Input/Output Operations Per Second). This metric is greatly influenced by multiple factors such as:

  • how the data is being read from or written to the storage system; 
  • the speed of the storage network fabric and storage controller on the server-side; 
  • the efficiency of the storage software. 

Businesses have various storage needs. If data access and recording speed are critical, SSD is the more costly but more efficient solution. However, for archives or backups that do not require real-time data response, HDD may be more suited. 
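
IOPS on its own does not tell you how much data a device moves, because that also depends on the size of each operation. The sketch below shows the usual back-of-the-envelope relation (throughput = IOPS × block size). The function names and the example figures are illustrative assumptions, not measurements of any real device.

```python
def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS figure and block size in KiB."""
    return iops * block_size_kib / 1024


def iops_needed(target_mib_per_s: float, block_size_kib: float) -> float:
    """IOPS required to sustain a target throughput at a given block size."""
    return target_mib_per_s * 1024 / block_size_kib


if __name__ == "__main__":
    # Hypothetical workload: 10,000 operations per second of 4 KiB each.
    print(throughput_mib_per_s(10_000, 4))
    # Hypothetical target: 100 MiB/s using 64 KiB operations.
    print(iops_needed(100, 64))
```

This is why the same device can look fast for large sequential transfers and slow for many small random reads.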

2. RAID

Another powerful tool where data availability and performance are concerned is RAID (Redundant Array of Independent Disks). This technology combines multiple disk drives in different layouts (RAID levels), each trading off capacity, speed, and redundancy. 

In this sense, RAID can be viewed as a layer between the drives or partitions and the file system. When data availability is a priority, RAID can keep parallel copies of the same data on several drives (mirroring).
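
As a rough guide to what the common RAID levels cost in capacity, the sketch below computes usable space for an array of identical drives. It covers only levels 0, 1, 5, 6, and 10 with their standard layouts; the function name is hypothetical and real controllers may reserve additional space.

```python
def usable_capacity_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "0":   # striping only, no redundancy
        return disks * disk_tb
    if level == "1":   # every disk holds a full copy of the data
        return disk_tb
    if level == "5":   # one disk's worth of parity
        return (disks - 1) * disk_tb
    if level == "6":   # two disks' worth of parity
        return (disks - 2) * disk_tb
    if disks % 2:      # RAID 10: striped mirrors need pairs of disks
        raise ValueError("RAID 10 needs an even number of disks")
    return disks / 2 * disk_tb


if __name__ == "__main__":
    for level in ("0", "1", "5", "6", "10"):
        print(level, usable_capacity_tb(level, disks=4, disk_tb=2.0))
```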

3. File system

Last but not least comes the decision on what file system to go with. A file system manages the information on a disk, providing a structure and logic to the data by breaking it into pieces and assigning a name to each piece, thus making it available to the operating system. 

An important feature of some file systems is journaling. A journaling file system records pending changes in a log (the journal) before applying them, so after a power failure or system crash it can replay or discard the interrupted operations and return to a consistent state. Without journaling, unplanned outages can leave the file system corrupted. 
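
The idea behind journaling can be illustrated with a toy example. The sketch below is a deliberately simplified in-memory model, not how any real file system is implemented: the class, its fields, and the simulated crash flag are all hypothetical. It only shows the ordering that matters: log the intent first, apply the change second, clear the log entry last.

```python
class JournaledStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self) -> None:
        self.journal: list[tuple[str, int]] = []  # stands in for the on-disk journal
        self.data: dict[str, int] = {}            # stands in for the main structures

    def write(self, key: str, value: int, crash_before_apply: bool = False) -> None:
        self.journal.append((key, value))   # 1. record the intent
        if crash_before_apply:
            return                          # simulated power loss
        self.data[key] = value              # 2. apply the change
        self.journal.remove((key, value))   # 3. mark the entry as done

    def recover(self) -> None:
        """Replay journal entries left behind by an interrupted write."""
        while self.journal:
            key, value = self.journal.pop(0)
            self.data[key] = value


if __name__ == "__main__":
    store = JournaledStore()
    store.write("a", 1)
    store.write("b", 2, crash_before_apply=True)
    store.recover()
    print(store.data)
```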

In short, cloud optimization at the operating system level means choosing a system configuration that enables your web app to perform at full potential.

Cloud optimization at the web server (configuration) level


What it means

A growing number of users can lead to the server being overloaded. The proper configuration of the web server can ensure that the web app performs as expected no matter the number of requests.

Why it matters

Users have a very low tolerance for slow applications. That’s why developers are expected to build and optimize web apps in a way that enables HTTP requests to be serviced with the minimum response time.

How you can implement it

Two ways to measure how well the server copes are the number of HTTP requests it serves per second and the end-to-end response time during high-usage periods.
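
Both measurements are easy to derive from request logs. The sketch below assumes you already have a request count for a time window and a list of response times in milliseconds; the sample values are made up, and the nearest-rank percentile is one of several common definitions.

```python
import math


def requests_per_second(request_count: int, window_seconds: float) -> float:
    """Average throughput over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds


def percentile(latencies_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of response times."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical sample taken during a busy period.
    sample = [120, 80, 95, 400, 110, 90, 105, 85, 100, 130]
    print(requests_per_second(request_count=1800, window_seconds=60))
    print(percentile(sample, 50), percentile(sample, 95))
```

Looking at a high percentile rather than the average matters because a few very slow responses can hide behind a healthy-looking mean.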

No matter what web server you use, you should tune it to ensure application performance. Here are a few things you can optimize:

1. Access logging – buffer log entries in memory and write them to disk as a group, instead of performing a separate write for each request.

2. Buffering – buffering makes communication with the client more efficient by holding part of a response in memory until the buffer fills. Responses that don’t fit in memory are written to disk, which can slow the application down.

3. Client keepalives – keepalive connections can help reduce overhead, especially if you use SSL/TLS.

4. Upstream keepalives – connections to application servers, database servers, etc. can also be optimized using keepalive connections. Increasing the number of idle keepalive connections that remain open for each worker process can help reduce the number of new connections, thus increasing connection reuse.

5. Limits – restricting the number of resources clients can use improves performance and security.

6. Socket sharding – giving each worker process its own listening socket can reduce lock contention and improve performance.

7. Thread pools – with thread pools, slow operations (such as blocking disk reads) are handed to a separate set of threads, while the main processing loop keeps serving faster operations. Once the slow operation completes, its result goes back into the main processing loop.
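
In a web server these settings are configuration options rather than code, but the thread pool idea from the last item is easy to see in a small program. The sketch below is an analogy written with Python's asyncio and a thread pool, not a web server configuration; the function names and the simulated delays are hypothetical.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def slow_disk_read(name: str) -> str:
    """Stand-in for a blocking disk operation."""
    time.sleep(0.05)
    return f"contents of {name}"


async def fast_request(i: int) -> str:
    """Stand-in for a request the main loop can serve quickly."""
    await asyncio.sleep(0.001)
    return f"fast-{i}"


async def main() -> list[str]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # The blocking call runs on a pool thread, so the loop stays free.
        slow = loop.run_in_executor(pool, slow_disk_read, "report.csv")
        fast = [await fast_request(i) for i in range(3)]
        # The result of the slow operation re-enters the main loop here.
        return fast + [await slow]


if __name__ == "__main__":
    print(asyncio.run(main()))
```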

Cloud optimization at the database level


What it means

Along with the server response time, a high-availability database can also be optimized to respond to queries more quickly. This enables the app as a whole to be faster and perform its functions seamlessly.

Why it matters

If you’re working with large amounts of data, even the slightest change can have a huge impact. For example, you may have never considered this but even placing your app’s database, along with the rest of the infrastructure, on a different floor than your development team can slow the app down. 

Now imagine if your team and your app’s infrastructure were on different continents. If you can make your app even a tenth of a second faster, do it. 

How you can implement it

Slow performance can often be attributed to poor database design. Databases can be optimized at both the configuration level and the query level. 

From a configuration point of view, databases should be:

1. Secure

You can protect your stored data through:

  • firewalls;
  • limited access (only one IP or an IP range);
  • the use of SSH for connections;
  • SSL connections between the application and the database.

2. Well configured

Instead of using a default configuration, focus on optimizing server performance. This includes the server’s available memory, number of CPUs, and the types of queries you want to execute.

3. Highly available

Thanks to load balancing, applications usually span multiple servers and incoming requests are evenly distributed. However, no matter how many app servers you have, if you only have one database server, your application is not designed for high availability.

By duplicating your data on multiple database servers you can avoid a single point of failure. 

4. Scalable

As applications experience more traffic, database queries slow down. To avoid this, you can use read replicas that allow data for incoming requests to be served from multiple servers, while the writes are retained on the primary (master) server.
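
The routing rule behind read replicas can be sketched in a few lines. The example below assumes a very simple policy (statements starting with SELECT go to replicas in round-robin order, everything else goes to the primary) and ignores replication lag and transactions; the class and server names are hypothetical.

```python
import itertools


class ReplicaRouter:
    """Send reads to replicas in turn and writes to the primary."""

    def __init__(self, primary: str, replicas: list[str]) -> None:
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._reads = itertools.cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._reads) if is_read else self.primary


if __name__ == "__main__":
    router = ReplicaRouter("primary", ["replica-1", "replica-2"])
    for statement in ("SELECT * FROM users", "UPDATE users SET name = 'x'",
                      "SELECT 1"):
        print(router.route(statement), "<-", statement)
```

In practice, reads that must see a write made a moment earlier may still need to go to the primary, since replicas can lag behind.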

You can also use a monitoring solution together with a load testing tool to better estimate your database and application scalability and prevent crashes.

5. Reliable

Backups ensure that data can be retrieved in case of corrupted data or server loss. A good backup should include:

  • all the stored data;
  • server configuration files;
  • database user accounts;
  • certificates;
  • any other external files that may be referenced from within the database.

Make sure you also verify your backups before you use them as a restoration point.
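
One inexpensive part of verification is checking that a backup file has not changed since it was created. The sketch below assumes you recorded a SHA-256 checksum when the backup was taken; the function names are hypothetical. A matching checksum only shows that the file is intact, so a periodic test restore is still the stronger check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(backup: Path, recorded_checksum: str) -> bool:
    """Compare the file's current checksum with the one recorded at backup time."""
    return sha256_of(backup) == recorded_checksum
```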

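From a query-level point of view, you should avoid:

  • loops that process rows one at a time;
  • correlated SQL subqueries; 
  • overusing SELECT (for example, SELECT * when only a few columns are needed);
  • temporary tables;
  • using COUNT() to check whether rows exist – use EXISTS instead.

You should also monitor slow query logs and deadlocks.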
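
To illustrate the COUNT versus EXISTS advice, here is a small sketch using an in-memory SQLite database. The table, columns, and data are made up for the example. Both functions return the same answer; the difference is that EXISTS lets the database stop at the first matching row instead of counting all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 20.0), (1, 35.5), (2, 12.0)],
)


def has_orders_count(customer_id: int) -> bool:
    """Counts every matching row just to learn whether there is at least one."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return count > 0


def has_orders_exists(customer_id: int) -> bool:
    """Asks only whether a matching row exists."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = ?)",
        (customer_id,),
    ).fetchone()
    return bool(found)


if __name__ == "__main__":
    print(has_orders_count(1), has_orders_exists(1))
    print(has_orders_count(3), has_orders_exists(3))
```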

Cloud optimization at the application level


Last but not least, the optimization of a cloud environment can also be performed at the application level.

What it means

In simple terms, application optimization means monitoring and analysis. You first need to know what causes your web app to run slow, then you can address that issue.

Applications can run slowly because of many different factors – some of which we’ve covered in the previous sections. Others include the code itself not being optimized (in which case different pieces of code might need to be rewritten).

Why it matters

Application optimization is directly linked to performance, so how fast or slow an app loads its content or responds to requests made should be taken seriously.

86% of users have uninstalled or stopped using an app because of performance issues, while 38% of them have switched to a competitor. 

How you can implement it

Load time, time to first byte, perceived performance, API requests, caching, and UI responsiveness all affect a user’s experience. Each of these metrics contributes to application performance in different ways, so it is important to optimize them one by one.

Load time

Load time can be affected by a variety of factors, including network speeds, server load, long-running API requests, and how much data must be downloaded in order to display the page. 

Don’t get carried away when using images, CSS, HTML, fonts, and JavaScript – your users need to download all this data to use your application and not everyone has a fast internet connection. 

Time to first byte

TTFB measures how long it takes the client to receive the first byte of data from the moment the request was made. It’s usually included in the load time, but it can also be optimized separately. 

Time to first byte is highly dependent on server hardware, users’ network speed, and any network congestion between the user and the server. One of the most effective ways to improve it is to use a CDN, which serves content from a location closer to the user.

Perceived performance

Although load time is important, users don’t think in terms of milliseconds. If most of the page loads upfront, users will think the application has fully loaded even if it hasn’t.

API requests

Look for any requests that take more than 100 milliseconds and try to find out why. API requests are usually not affected by network conditions, so you should check whether the requests are being made concurrently or in series.
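
The difference between requests made in series and requests made concurrently is easy to demonstrate. The sketch below simulates three independent API calls with a fixed delay each; the call names and the delay are hypothetical and no real network traffic is involved.

```python
import asyncio
import time

CALLS = ("user", "orders", "preferences")


async def call_api(name: str, seconds: float = 0.1) -> str:
    """Stand-in for an API request that takes a fixed amount of time."""
    await asyncio.sleep(seconds)
    return name


async def in_series() -> list[str]:
    # Each request waits for the previous one to finish.
    return [await call_api(name) for name in CALLS]


async def concurrently() -> list[str]:
    # All requests are in flight at the same time.
    return list(await asyncio.gather(*(call_api(name) for name in CALLS)))


def timed(coro_fn):
    """Run a coroutine function and return its result and elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (in_series, concurrently):
        result, elapsed = timed(fn)
        print(f"{fn.__name__}: {result} in {elapsed:.2f}s")
```

This only helps when the requests do not depend on each other's results; dependent requests still have to run one after another.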

Caching

Optimizing cache is one of the easiest ways to improve application performance. All images, stylesheets, fonts, and JavaScript files should be cached to avoid slow loading speed. 

When possible, store in a cache (Memcached, Redis) any database query result that takes a very long time to load or puts a heavy load on the database.
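
The pattern for caching expensive query results is usually "look in the cache first, compute on a miss, store the result with an expiry time". The sketch below uses a plain in-process dictionary as a stand-in for a cache server such as Memcached or Redis; it does not use their client libraries, and the class name and TTL value are hypothetical.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._items: dict[str, tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: skip the slow query
        value = compute()            # cache miss: run the slow query once
        self._items[key] = (now + self.ttl, value)
        return value


if __name__ == "__main__":
    def slow_report_query() -> int:
        time.sleep(0.2)  # stand-in for a slow database query
        return 42

    cache = TTLCache(ttl_seconds=60)
    print(cache.get_or_compute("report", slow_report_query))  # slow
    print(cache.get_or_compute("report", slow_report_query))  # served from cache
```

The expiry time is the main design choice: a longer TTL saves more database work but serves staler data.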

UI responsiveness

The response time is crucial. Whether a user is requesting a page or interacting with the UI, we want those to respond as quickly as possible.

Touch or click should show feedback within 100 milliseconds, animations should be rendered at 60 frames per second, background work should be completed in small chunks during idle times, and pages should load in under 1 second.

 

Conclusion

Cloud computing can provide compute resources, storage resources, and applications. But only through proper management and optimization will you be able to obtain maximum satisfaction from these capabilities.

By following the cloud optimization best practices we’ve described above, you’ll be able to streamline your processes, improve visibility and security, and spend your IT budget more efficiently. And most importantly, you’ll ensure your app performance is 100%. 

Whether you’re an ecommerce business focused on hyper-growth, a SaaS company looking to increase agility by reducing delivery cycle times, or a dev agency that needs centralized project maintenance, Bunnyshell can help you conquer your most challenging obstacles. 

 
