Friday, May 10, 2024

Server-Side Caching Strategies: How Do They Work?

Cache aside, read through, write through, write back: quite simple at first glance, but let’s go a little deeper!

As I’m sure you know, caching is a technique that allows you to store data in a way that is incredibly efficient to access.

Faster reading means faster applications – so almost every application that needs to deliver high performance uses some kind of cache.

In this article, we will learn four caching strategies used at the application level: Cache-aside, Read-through, Write-through, and Write-behind. We will also learn the pros and cons of each strategy.

Caching exists at different levels

In the previous sentence I specified that we are going to tackle caching at the application level. Why did I specify it?

Well, because we tend to think of caching as a functionality that only runs on the server. It’s not.

We may cache data on the client (for example, in the browser), on the infrastructure (for example, in a CDN), and in the application. Each level has its reasons for existing.

You can cache resources in the browser to prevent excessive calls to a server. For example, if you’ve already downloaded a JavaScript file, you don’t have to download it every time you navigate to a different page; if the content is not stale you can just reuse the ones you’ve already downloaded.

CDN is a way to store images and, more generally, static assets efficiently – browsers still have to call a server to download remote resources, but instead of calling the “main application”, they call the CDN to retrieve static assets – allowing our main application to process requests that require dynamic data.

Then we are finally back to the application cache: If you need to serve data that doesn’t change often, you can cache it to prevent excessive (and avoidable) access to the data source, which is usually a database or an external API. We will focus on this type of caching.

To deal with data, we need to choose strategies for reading and writing it. Let's take a look at the most common ones.

Cache-aside: cache and DB DO NOT work together

Cache-aside, also known as lazy caching, is probably the most commonly used strategy: read from the cache; if the item does not exist, retrieve it from the source and add it to the cache, so that the next time the application tries to retrieve the same item, it is already there.

It is called Cache-aside because the cache layer does not interact directly with the database: it is kept aside.
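The flow above can be sketched in a few lines of Python. This is a minimal, illustrative sketch: the dict stands in for a real cache (such as Redis), and `db_query` is a hypothetical placeholder for a real database call.

```python
cache = {}

def db_query(key):
    # placeholder for a real database lookup
    return f"value-for-{key}"

def get(key):
    if key in cache:           # 1. try the cache first
        return cache[key]
    value = db_query(key)      # 2. cache miss: the APPLICATION goes to the source
    cache[key] = value         # 3. populate the cache for next time
    return value

print(get("user:42"))  # first call misses the cache and hits the DB
print(get("user:42"))  # second call is served straight from the cache
```

Note that the cache never talks to the database here: the application orchestrates both, which is exactly why the cache is "kept aside".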

Benefits of cache-aside

Cache-aside is ideal for read-heavy workloads: every time there is a cache miss, the missing data is added to the cache, so the next time you access the same data it will already be available.

Also, if the cache is not available, the application still works: instead of querying the cache, it accesses the database each time; the application will certainly become slower, but at least it will remain available.

Finally, remember that not everything has to be cached. When you use cache-aside, you only cache data that someone actually requests.

Disadvantages of cache-aside

Since the first time you need a key it will not be present in the cache, the first access is slower (you have to call both the cache and the DB).

More importantly, if the data in the database changes (for example, because another application updates the same tables), you end up with inconsistent data; this means you have to find a way to invalidate the entries updated by other services.

Read-through: the cache knows how to query the DB

It's similar to cache-aside, but now the cache also knows how to reach the source to obtain missing data.

The cache has a component that allows it to reach the database and retrieve the missing data.

Wait! How can the cache query the database? 😖

You need to install modules or plugins on it: for example, if you use Redis, you can install a plugin called RedisGears🔗, which allows it to communicate with a DB.

For example, if you use Hibernate to access a database, you can install a plugin called Rghibernate🔗 to configure Redis to query the database using structures defined in XML files, such as:

<?xml version="1.0" encoding="UTF-8"?>
<hibernate-mapping xmlns="http://www.hibernate.org/xsd/hibernate-mapping"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.hibernate.org/xsd/hibernate-mapping
    http://www.hibernate.org/xsd/hibernate-mapping/hibernate-mapping-4.0.xsd">

  <class entity-name="Student" table="student">
          <tuplizer entity-mode="dynamic-map" class="org.hibernate.tuple.entity.DynamicMapEntityTuplizer"/>
          <id name="id" type="integer" length="50" column="id"/>
          <property name="firstName" column="first_name" type="string"/>
          <property name="lastName" column="last_name" type="string"/>
          <property name="email" column="email" type="string" not-null="true"/>
          <property name="age" column="age" type="integer"/>
  </class>

</hibernate-mapping>

Other cache vendors, such as NCache, allow you to implement such modules in your application: with NCache you can create a class that implements IReadThruProvider and has access to the DB, and then configure NCache to use such classes when reading data.
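Independently of the vendor, the idea is the same: the cache owns a "loader" that knows how to reach the source. Here is a minimal sketch in Python, where `ReadThroughCache` and `load_from_db` are illustrative names, not a real API:

```python
class ReadThroughCache:
    """The cache holds the loader: application code only talks to the cache."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader  # the cache knows how to reach the source

    def get(self, key):
        if key not in self._store:
            # cache miss: the CACHE (not the application) queries the DB
            self._store[key] = self._loader(key)
        return self._store[key]

def load_from_db(key):
    # placeholder for a real database query
    return f"value-for-{key}"

cache = ReadThroughCache(load_from_db)
print(cache.get("student:1"))  # the cache fills itself on a miss
```

Compare this with cache-aside: the application calls only `cache.get`; it never sees `load_from_db` directly.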

Benefits of read-through

Just like cache-aside, read-through is optimal for read-heavy workloads: data is always available, and expired data is automatically refreshed.

Read-through also makes your application code smaller and easier to manage: your code no longer queries the DB; everything is managed by the cache.

Disadvantages of read-through

Obviously, since every time you need some data you have to go through the cache, we have just created a single point of failure: if the cache is unavailable for some reason, your application will not be able to retrieve that data from the database.

We have also introduced coupling between the cache and the DB: if the model in the database changes, because we added a new column or changed a data type, we need to update the way the cache queries the database.

Write-through: the cache can write to the DB SYNCHRONOUSLY

Data is written first to the cache and then to the source, synchronously.

The application asks the cache to store some data. The cache stores that data in its internal storage and then writes it to the database, using some configuration and modules (as we saw for the read-through pattern).

Only when both the cache and the DB have stored the value does control return to the application.

Again, the application does not write data directly to the database, but it is the cache itself that knows how to write data to the database.
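A minimal sketch of the synchronous write path, with illustrative names (a real cache would talk to an actual database instead of a dict):

```python
class WriteThroughCache:
    """A write updates the cache AND the source before returning."""

    def __init__(self, db):
        self._store = {}
        self._db = db  # dict standing in for the database

    def put(self, key, value):
        self._store[key] = value  # 1. write to the cache...
        self._db[key] = value     # 2. ...then to the DB, in the same call
        # only now does the caller get control back

    def get(self, key):
        return self._store[key]

db = {}
cache = WriteThroughCache(db)
cache.put("order:7", "shipped")
print(db["order:7"])  # the DB is already up to date when put() returns
```

The caller blocks until both writes succeed, which is exactly what buys consistency at the cost of latency, as described below.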

Advantages of write-through

Since all writes pass through the cache, we will not have stale data; and since all reads pass through the cache too, our application always sees the most up-to-date state.

Disadvantages of write-through

If the cache fails or is unavailable, we lose the data we are trying to write. This happens because, once again, we have introduced a single point of failure. Of course, if the cache goes down, we can't even read the data: everything is stuck!

Since these operations are performed synchronously, we have also increased latency: before continuing, the application must wait for both the cache and the DB to complete their writes.

Write-behind: the cache can write ASYNCHRONOUSLY to the DB

Write-behind is similar to write-through, but all writes to the source happen asynchronously.

Benefits of write-behind

Since we no longer have to wait for the DB to complete its writes, we have improved overall performance by reducing latency.

We can also batch the writes to the database, reducing the number of round trips to the data store.
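The batching idea can be sketched like this. It is a simplified, illustrative model: a real write-behind cache would flush the buffer from a background thread or timer, while here we flush whenever the buffer reaches a fixed size.

```python
class WriteBehindCache:
    """Writes hit the cache immediately; the DB is updated later, in batches."""

    def __init__(self, db, batch_size=3):
        self._store = {}
        self._db = db
        self._pending = []            # buffered writes not yet in the DB
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value      # fast: cache only, caller returns at once
        self._pending.append((key, value))
        if len(self._pending) >= self._batch_size:
            self.flush()              # one round trip persists many writes

    def flush(self):
        for key, value in self._pending:
            self._db[key] = value
        self._pending.clear()

db = {}
cache = WriteBehindCache(db)
cache.put("a", 1)
cache.put("b", 2)
print("a" in db)   # False: the DB hasn't seen the writes yet
cache.put("c", 3)  # third write fills the batch and triggers the flush
print(db)          # {'a': 1, 'b': 2, 'c': 3}
```

The window between `put` and `flush` is precisely where the risks described below live: anything still in the buffer is lost if the cache dies.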

Disadvantages of write-behind

Again, if the cache fails, the data will not be saved to the database. And again, we have a single point of failure.

And if the cache updates its internal state correctly, but the DB for some reason cannot store the new data, we end up with inconsistent data. And of course, the application does not know whether the data was stored correctly in the DB.

Further readings

This entire article exists thanks to another interesting blog post I recently read and decided to expand on a bit:

🔗 Caching strategies in practice | Rajeev

Both AWS and Azure offer a way to use caching on their systems.

AWS’s caching functionality is called ElastiCache. They’ve published a nice guide to best practices for using distributed caching:

🔗 Best practices for caching | AWS

Similarly, Azure gives you a Redis instance. And they also published their set of best practices:

🔗 Caching Guidance | Microsoft Docs

Then Alachisoft, the makers of NCache, have a nice article you might want to read:

🔗 Advantages of read-through and write-through compared to cache-aside | Alachisoft

Finally, we have already talked about Caching in .NET: in particular, we learned how to use the Decorator design pattern to add a caching layer to our applications:

🔗 How to add a caching layer in .NET 5 with Decorator pattern and Scrutor
