Memcache Invalidation through Zones


On a recent project, we had an ongoing issue with caching lists of data. The problem arose because users interacted with items both within a list and individually. For a list of 100 items, we cached items 1-10 and 11-20 as separate pages so we could easily load and display them. We also cached each item on its own for when a user viewed it outside of a list. Often we did not know which lists an item belonged to, so it became increasingly hard to update the cached lists when a user changed an item.

We decided to create what we called zones. A zone was a group of related cache entries: if any part of the zone changed, the whole zone had to be refreshed. This mattered when we updated or deleted an item without knowing which list it was in. The item might also have been stored in multiple related lists as well as on its own. Keeping track of every place an item was cached became increasingly difficult, and zones greatly simplified updating or invalidating the cache.

For example, if we had a list of books that users could rate, mark as owned, and share with their friends as favorites, we could create zones for these different elements: a book zone, a rating zone, an owned zone, and a favorite zone. When a user changed a book's rating, we invalidated the rating zone so that no stale copies of the old rating remained. The other zones were not invalidated because nothing in them had changed.

Using Zone Keys

When caching a list we originally used <List>-<UserID>-<Page> as our key. When caching an item we used <ItemType>-<UserID>-<ItemID> as the key.
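
As a rough illustration of these two key formats, here is a minimal Python sketch. The function names (list_key, item_key, page_number) and the sample values are hypothetical and not taken from the project. The page size of 10 follows the 1-10 and 11-20 example above.

```python
def list_key(list_name, user_id, page):
    """Build a <List>-<UserID>-<Page> key for one cached page of a list."""
    return f"{list_name}-{user_id}-{page}"


def item_key(item_type, user_id, item_id):
    """Build an <ItemType>-<UserID>-<ItemID> key for a single cached item."""
    return f"{item_type}-{user_id}-{item_id}"


def page_number(position, page_size=10):
    """Map a 1-based position in a list to its 1-based page number."""
    return (position - 1) // page_size + 1
```

With these helpers, the eleventh item of user 42's favorites would live on the page cached under list_key("FavoriteBooks", 42, page_number(11)), while the same book would also be cached on its own under something like item_key("Book", 42, 1337). Nothing in either key tells you about the other, which is the bookkeeping problem zones were meant to solve.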

To add zones we appended <ZoneName>-<ZoneVersion> to our keys.
The <ZoneVersion> was a datetime, which we stored under a separate key in the cache. When reading or saving an item, we would first take the zone name and look up <ZoneName>-Version to see whether a <ZoneVersion> was already stored there. If not, we created the zone with the current datetime. If there was one, we used that datetime as our <ZoneVersion>.
This cost one extra request to the cache per lookup, but it was much quicker than going through cached elements and trying to update old data.
When a change was made to an item in a zone, we simply invalidated the old zone by updating <ZoneName>-Version to the current datetime. From then on, any lookup used the new <ZoneName>-<ZoneVersion> suffix, so it would miss and the cache would be repopulated with fresh data from the database. The old data was never deleted from memcache and would simply expire over time.
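
The scheme described above can be sketched in a few lines of Python. This is only an illustration under some assumptions, not the code we ran: DictCache is a hypothetical in-memory stand-in for a memcached client (only get and set are used), ZonedCache and its method names are made up for this example, and the injectable clock parameter exists only so the version source can be swapped out.

```python
from datetime import datetime, timezone


class DictCache:
    """Hypothetical stand-in for a memcached client; only get and set are needed."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._data[key] = value


def _now():
    # Zone versions are datetimes, stored here as ISO 8601 strings (no spaces).
    return datetime.now(timezone.utc).isoformat()


class ZonedCache:
    """Wraps a cache so every key carries a <ZoneName>-<ZoneVersion> suffix."""

    def __init__(self, cache, clock=_now):
        self.cache = cache
        self.clock = clock

    def zone_version(self, zone):
        # The one extra cache request: look up <ZoneName>-Version.
        version_key = f"{zone}-Version"
        version = self.cache.get(version_key)
        if version is None:
            # No version stored yet, so create the zone with the current datetime.
            version = self.clock()
            self.cache.set(version_key, version)
        return version

    def _key(self, zone, base_key):
        return f"{base_key}-{zone}-{self.zone_version(zone)}"

    def get(self, zone, base_key):
        return self.cache.get(self._key(zone, base_key))

    def set(self, zone, base_key, value):
        self.cache.set(self._key(zone, base_key), value)

    def invalidate(self, zone):
        # Bump the version; entries under the old version are simply never read again.
        self.cache.set(f"{zone}-Version", self.clock())
```

In use, a page of ratings saved with set("rating", "BookRatings-42-1", ...) is returned by get until invalidate("rating") is called. After that, the same get builds a key with the new version and misses, so the caller reloads from the database and saves again. One caveat of using a datetime as the version: two invalidations that receive an identical timestamp produce the same version, so the clock needs enough resolution (or a counter could be used instead).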

This worked well for our situation. We had been caching lots of lists and items separately, and it had become far too messy to keep track of all the places an item could be cached. Zones made invalidating the cache, and eventually refreshing the data, very easy.
