Home

PHP Sessions with Memcached and a Database (Sessions in the Cloud Done Right)

This article was about to be published almost a year ago when we were dealing with this issue, I hope somebody will still find it useful. We have been looking for a solution and experimenting on our own for quite some time, this is the result which actually works. We have stumbled upon lots of non-working, incomplete or completely misleading approaches. This has actually been tested successfully under heavy load.

 

When dealing with horizontal scaling of your web applications in cloud environment, you need to solve the problem with sessions. You can no longer store sessions in files on one machine's filesystem as usual. Let's imagine a topology like this one.

The point of this is to have dumb and quick database and scale by adding more servers (Apache+PHP in our case) which will do all the calculations and real work. To be able to do this you also need a load balancer which divides all the traffic between all your "worker" machines.

We use Amazon Web Services which utilise Elastic Load Balancing. In 04/2010 Amazon introduced a new feature called Session Stickness which means all requests coming from the user during the session will be sent to the same instance. In this situation you can store your sessions in the filesystem like you would do in a single-server setup. But what happens if you need to divert your users away from one particular machine because of the heavy load or you simply need to shut down one instance? Sessions of those users bound to this instance will be lost and you'll have to ask your users to log in again or perform some kind of autologin, probably losing some user data in the progress. This isn't a true stateless nor comfortable solution, not mentioning a setup where you don't even have such a smart load balancer. We try to fix that.

 

Memcached

We do cache a lot to save database time and the best way to cache is using Memcached. If you don't use it (or even don't know it at all) you should. Memcached is a distributed caching system which uses free RAM provided by any number of your instances thus acting as a superfast storage for your cache data. Memcached can run on one machine as well as in a cluster, giving you a single abstraction which can be accesed equally easily from every instance. Our figure now looks like this...

 

Doing it wrong

The best way how to deal with sessions in multiple instance environment is to use a centralised database as the main sessinon storage. It doesn't have to be the database you use for the rest of your application data, you might choose something more lightweight. Compared to filesystem stored sessions the database access is pretty slow. You will need to watch the database load, as the centralised storage could be the bottleneck of your application. So do not use DB as the only session storage.

Memcached could be used as a session storage directly, but you should never do so! Like any other cache the data (session in our case) could be deleted anytime. This would lead to session loss.

Even if you combine the DB and Memcache together (which is what we are about to do in the next step), it might be pretty hard to find the right ballance between the minimal necessary reading from and writing to the database, cleaning up the garbage and making the whole thing stable.

 

The solution

The solution is to store your sessions in the database and use Memcached as a write-through cache. Memcached reduces the number of requests to the database dramatically but can not be trusted when it comes to data storage. Records both in the database and the Memcached need to be updated every time you make any changes to the session data (the $_SESSION variable). We need to adjust the lifetime of the session record in the database, used by our garbage collector, which takes care of expired records. Lifetimes of our Memcached records are updated automatically with every r/w operation.

The next figure describes what we are doing here...

After opening the session we look in the Memcached and try to find client's SID. If it's not there we need to read it from the database. If the record is found in the DB, we load the data into Memcache and PHP's $_SESSION variable (or the equivalent session abstraction of your framework). If the SID wasn't found even in the database, we need to start a completely new seesion and send the new SID to the client.

The whole session object is stored in the memcache, including the information about the last update of the expiration timestamp of the database record. With each read from the Memcached we decide, whether the lifetime of the session in the database needs to be updated because it expires soon. The interval of this update is significantly shorter that session's lifetime, yet long enough to efficiently protect the database from the unnecessary updates with every request. In the example below we use 5 minutes for lifetime update and 20 minutes of database session's lifetime itself. This means the database gets accessed only once in 5 minutes - if the session data is not changed. This mechanism gives us a protection from the database overload and keeps garbage collection in place, so we don't get millions of dead sessions in the database.

When the PHP script is being executed, it might change some data in the $_SESSION variable. When the script is about to finish, write method is called. At this point we check whether any data in the session has changed during the execution. If so, we need to write them down into the DB and put copy into Memcached aswell. If nothing changed, we don't need to do anything else.

It is obviously not a very good idea to store data changing with every client's request in the session since we would need to write them down to the database with each request.

 

The code

Let's now assume our DB table for storing sessions looks like this (it's MySQL in our case):

CREATE TABLE `sessions` (
	`sessionId` varchar(32) NOT NULL,
	`expiration` int(10) UNSIGNED NOT NULL,
	`data` text NOT NULL,
	PRIMARY KEY (`sessionId`),
	KEY `expiration` (`expiration`)
) ENGINE=InnoDB;

 

PHP will call all the necessary things for us, we just need to register our own session handler and manage the session correctly. The session handler will look like this:

<?php
/**
 * Memcache & MySQL PHP Session Handler
 * 
 * @author Jakub Matějka <jakub@keboola.com>
 * @see http://pureform.wordpress.com/2009/04/08/memcache-mysql-php-session-handler/
 */
class App_SessionHandler
{
	/**
	 * @var int
	 */
	public $lifeTime;

	/**
	 * @var Memcached
	 */
	public $memcache;
 
	/**
	 * @var string
	 */
	public $initSessionData;
 
	/**
	 * interval for session expiration update in the DB
	 * @var int
	 */
	private $_refreshTime = 300; //5 minutes
 
	/**
	 * constructor of the handler - initialises Memcached object
	 *
	 * @return bool
	 */
	function __construct()
	{
		#this ensures to write down and close the session when destroying the handler object
		register_shutdown_function("session_write_close");
 
		$this->memcache = new Memcache;
		$this->memcache->connect(MEMCACHE_HOST, MEMCACHE_PORT);
 
		$this->lifeTime = intval(ini_get("session.gc_maxlifetime"));
		$this->initSessionData = null;
 
		return true;
	}
 
	/**
	 * opening of the session - mandatory arguments won't be needed
	 * we'll get the session id and load session data, it the session exists
	 *
	 * @param string $savePath
	 * @param string $sessionName
	 * @return bool
	 */
	function open($savePath, $sessionName)
	{
		$sessionId = session_id();
		if ($sessionId !== "") {
			$this->initSessionData = $this->read($sessionId);
		}
 
		return true;
	}
 
	/**
	 * closing the session
	 *
	 * @return bool
	 */
	function close()
	{
		$this->lifeTime = null;
		$this->memcache = null;
		$this->initSessionData = null;
 
		return true;
	}
 
	/**
	 * reading of the session data
	 * if the data couldn't be found in the Memcache, we try to load it from the DB
	 * we have to update the time of data expiration in the db using _updateDbExpiration()
	 * the life time in Memcache is updated automatically by write operation
	 *
	 * @param string $sessionId
	 * @return string
	 */
	function read($sessionId)
	{
		$now = time();
		$data = $this->memcache->get($sessionId);
		if ($data===false) {
			#the record could not be found in the Memcache, loading from the db
			$sessionIdEscaped = mysql_real_escape_string($sessionId);
			$r = mysql_query("SELECT data FROM sessions WHERE sessionId='$sessionIdEscaped'");
			if ($r) {
				#record found in the db
				$row = mysql_fetch_row($r);
				$data = $row[0];
 
				$this->_updateDbExpiration($sessionId, $now);
			} else {
				#record not in the db
			}
		} else {
			#time of the expiration in the Memcache
			$expiration = $this->memcache->get('db-'.$sessionId);
			if($expiration) {
				#if we didn't write into the db for at least
				#$this->_refreshTime (5 minutes), we need to refresh the expiration time in the db
				if($now-$this->_refreshTime > $expiration-$this->lifeTime) {
					$this->_updateDbExpiration($sessionId, $now);
				}
			} else {
				$this->_updateDbExpiration($sessionId);
			}
		}
 
		$this->memcache->set($sessionId, $data, false, $this->lifeTime);
		return $data;
	}
 
	/**
	 * update of the expiration time of the db record
	 *
	 * @param string $sessionId
	 * @param int $now UNIX timestamp
	 */
	private function _updateDbExpiration($sessionId, $now=null)
	{
		if(!$now) {
			$now = time();
		}
		$sessionIdEscaped = mysql_real_escape_string($sessionId);
		$expiration = $this->lifeTime + $now;
 
		$r = mysql_query("UPDATE sessions SET expiration='$expiration' WHERE sessionId='$sessionIdEscaped'");
		#we store the time of the new expiration into the Memcache
		$this->memcache->set('db-'.$sessionId, $expiration, false, $this->lifeTime);
	}
 
	/**
	 * cache write - this is called when the script is about to finish, or when session_write_close() is called
	 * data are written only when something has changed
	 *
	 * @param string $sessionId
	 * @param string $data
	 * @return bool
	 */
	function write($sessionId, $data)
	{
		$now = time();
 
		$sessionId = mysql_real_escape_string($sessionId);
		$expiration = $this->lifeTime + $now;
 
		#we store time of the db record expiration in the Memcache
		$result = $this->memcache->set($sessionId, $data, false, $this->lifeTime);
 
		if ($this->initSessionData !== $data) {
			$sessionData = mysql_real_escape_string($data);
			$r = mysql_query("REPLACE INTO sessions (sessionId, expiration, data) VALUES('$sessionId', $expiration, '$sessionData')");
			$result = is_resource($r);
 
			$this->memcache->set('db-'.$sessionId, $expiration, false, $this->lifeTime);
		}
		return $result;
	}
 
	/**
	 * destroy of the session
	 *
	 * @param string $sessionId
	 * @return bool
	 */
	function destroy($sessionId)
	{
		$this->memcache->delete($sessionId);
		$this->memcache->delete('db-'.$sessionId);
		$sessionId = mysql_real_escape_string($sessionId);
		mysql_query("DELETE FROM sessions WHERE sessionId='$sessionId'");
 
		return true;
	}
 
	/**
	 * called by the garbage collector
	 *
	 * @param int $maxlifetime
	 * @return bool
	 */
	function gc($maxlifetime)
	{
		$r = mysql_query("SELECT sessionId FROM sessions WHERE expiration < " . time());
		if (is_resource($r) && (($rows = mysql_num_rows($r)) !== 0)) {
			for ($i=0;$i<$rows;$i++) {
				$this->destroy(mysql_result($r,$i,"sessionId"));
			}
		}
 
		return true;
	}
}

 

To use the handler in PHP you need just to register it with something like this:

require_once 'App/SessionHandler.php';
$sessionHandler = new App_SessionHandler();
session_set_save_handler(array (&$sessionHandler,"open"),array (&$sessionHandler,"close"),array (&$sessionHandler,"read"),array (&$sessionHandler,"write"),array (&$sessionHandler,"destroy"),array (&$sessionHandler,"gc"));
session_start();

 

And that's it. We hope this will help somebody. We're looking forward to your comments and possible enhancements :)

 

Our sources and inspiration:

 

Written by Mirek Burkoň, April 5, 2011