We started using Memcache for a project some time ago, and the sessions of our web users were one of the first features to use it. Here is how to use Memcache to reduce database load and speed up your websites.
I prefer file-based sessions over database-based ones because they’re much faster and use fewer resources, but things get complicated as soon as more than one server is involved. Sharing session files over NFS is neither particularly good nor fast, while database-backed sessions can add a lot of load to the database.
Using Memcache sounds perfect: sessions are kept in memory, there is no high-level database protocol and no huge indexing, and they expire automatically. But there are also some drawbacks. Memcache is not a database. Any data “stored” in the cache may be lost at any time, without any crash, bug or problem, simply because the cache is full and has to evict old objects to make room for new ones. In that case users may be logged out far too early, and if the cache is unreachable they cannot log in at all.
I always tell people “never ever assume that Memcache is usable”, and this sounds quite harsh. Why should someone use a server or service which isn’t there? It is there, most of the time, but there is no storage guarantee and no guarantee that any data will stay available. Every single script or module must run without Memcache, although it may suffer a performance hit if the cache is missing. The Memcache modules for Perl, found on CPAN, never die on a Memcache server error, letting the application run without the cache.
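The same “never die on a cache error” behaviour can be sketched in a few lines for clients that do raise exceptions. This is only an illustration in Python: `SafeCache`, `BrokenClient` and the `get`/`set` signatures are hypothetical names made up for this example, not the API of any particular Memcache library.

```python
class SafeCache:
    """Wraps a cache client so that cache failures never reach the application."""

    def __init__(self, client):
        # Hypothetical client exposing get(key) and set(key, value, ttl).
        self.client = client

    def get(self, key):
        try:
            return self.client.get(key)
        except Exception:
            return None  # treat any cache error as a plain cache miss

    def set(self, key, value, ttl):
        try:
            self.client.set(key, value, ttl)
            return True
        except Exception:
            return False  # the write is lost, the application keeps running


class BrokenClient:
    """Stand-in for an unreachable Memcache server (for demonstration only)."""

    def get(self, key):
        raise ConnectionError("cache unreachable")

    def set(self, key, value, ttl):
        raise ConnectionError("cache unreachable")
```

With `SafeCache(BrokenClient())`, every `get` simply looks like a miss, so the calling code falls through to the database as if the cache were empty.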
Read an existing session
- Try to read the session from Memcache, using the session id with a prefix as the key. The prefix keeps session keys unique and prevents injection, where someone sends another Memcache key as a fake session id.
- If step 1 failed, try to read the session from the database, update Memcache on success
- If step 2 failed, consider the user as not having any session (and probably show the login page)
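The three read steps above could look roughly like this. It is a minimal sketch, not a finished implementation: plain Python dicts stand in for both Memcache and the database, the `session:` prefix and the `expires` field are assumptions chosen for the example, and the expiry check on the database row is an extra safety step that the list above does not spell out.

```python
import time

SESSION_PREFIX = "session:"  # assumed prefix; any fixed string works


def read_session(session_id, cache, db, now=None):
    """Return the session dict, or None if the user has no valid session."""
    now = time.time() if now is None else now
    key = SESSION_PREFIX + session_id

    # Step 1: try the cache. The prefix keeps a forged id from
    # addressing unrelated cache keys.
    session = cache.get(key)
    if session is not None:
        return session

    # Step 2: fall back to the database and refill the cache on success.
    # (Assumption: each stored session carries an "expires" timestamp.)
    session = db.get(session_id)
    if session is not None and session["expires"] > now:
        cache[key] = session
        return session

    # Step 3: no session anywhere, the caller should show the login page.
    return None
```

With a real Memcache client, the refill in step 2 would also pass the remaining session lifetime as the cache timeout; the dict stand-in has no notion of expiry.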
Create or update a session
- Write the new value into Memcache, the cache timeout is the remaining session timeout
- Write the new value into the database
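The write path can be sketched in the same spirit. `DictCache` is a hypothetical in-memory stand-in that merely records the timeout it was given, the database is a plain dict, and the `expires` field is again an assumption made for this example.

```python
import time

SESSION_PREFIX = "session:"  # assumed prefix, must match the one used for reads


class DictCache:
    """Tiny in-memory stand-in for a Memcache client (hypothetical API)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, ttl):
        # Only records the timeout; a real cache would expire the entry.
        self.data[key] = (value, ttl)


def write_session(session_id, session, cache, db, now=None):
    """Store the session in cache and database; return False if it already expired."""
    now = time.time() if now is None else now
    remaining = int(session["expires"] - now)
    if remaining <= 0:
        return False  # nothing worth storing

    try:
        # The cache timeout is the remaining session lifetime.
        cache.set(SESSION_PREFIX + session_id, session, remaining)
    except Exception:
        pass  # the cache is optional, the database write below still happens

    db[session_id] = session
    return True
```

The database write is the one that matters for correctness; the cache write only saves later reads a trip to the database.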
Most sessions should expire sooner or later, but any user action renews the timeout. Renewing it naively forces a database write on every single user action (the user accessing any page or using any Ajax function). Imagine a Google-like autocomplete feature: every single keypress triggers another database write, which could easily slow down the database.
I like to update the timeout timestamp only if at least 10% (or any other suitable value) of the timeout expired, for example:
- User logs in at 12:00, timeout is set to 13:30 (90 minutes)
- User clicks on a page at 12:01 and has 89 minutes left, no timeout update
- User clicks on another page at 12:10 and has 80 minutes left, which is less than 90% of the original 90-minute timeout (81 minutes), so the timeout is updated to 12:10 + 90 minutes = 13:40
This strategy limits database updates for the timeout to at most one every 9 minutes (10% of 90 minutes) per session. Parallel requests might still cause more updates (if each parallel request tries to update the timestamp at the same time), but that’s a rare case and things can’t get worse than updating on every request.
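The renewal rule boils down to one comparison. The sketch below uses the 90-minute timeout and 10% threshold from the example; the function name `maybe_renew` and its parameters are made up for illustration, and timestamps are plain seconds.

```python
TIMEOUT = 90 * 60     # session lifetime in seconds (90 minutes)
RENEW_PERCENT = 10    # renew once at least 10% of the lifetime has passed


def maybe_renew(expires, now, timeout=TIMEOUT, percent=RENEW_PERCENT):
    """Return the new expiry timestamp, or None when no update is needed."""
    remaining = expires - now
    if remaining <= 0:
        return None  # session already expired, nothing to renew

    elapsed = timeout - remaining
    # Integer arithmetic avoids rounding surprises right at the threshold.
    if elapsed * 100 >= timeout * percent:
        return now + timeout
    return None


# The example above, expressed in seconds since midnight.
login = 12 * 3600             # 12:00
expires = login + TIMEOUT     # 13:30
at_1201 = maybe_renew(expires, login + 60)    # 89 minutes left
at_1210 = maybe_renew(expires, login + 600)   # 80 minutes left
```

Here `at_1201` is `None` (no database write), while `at_1210` is 49200 seconds, which is 13:40. Only a non-`None` result needs to be written to Memcache and the database.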
- Check Memcache first, don’t check the database on success
- Copy each session update to both Memcache and the database
- Get a reduced database load and better website speed