When the Experience Database (xDB) was released, there was suddenly a lot of information floating around about session state. In this blog post, I will defeat this nemesis of mine by covering:
- The role played by session state in the xDB
- What ‘private’ and ‘shared’ session state means
- The effect your choice of server architecture has on the way session state must be configured (specifically in-proc versus out-proc)
At time of writing, the most recent release of Sitecore is 8.1 Update-2.
What is session state, again?
If you already know what session state is, skip ahead – or refresh your memory by reading the MSDN article (https://msdn.microsoft.com/en-us/library/ms178581.aspx).
This is how I understand session state: HTTP is a stateless protocol. That means that when I request https://mhwelander.net/ (request #1) and subsequently request https://mhwelander.net/about/ (request #2), the server treats me as a magical and interesting stranger each time. ASP.NET session state is a mechanism by which the server remembers who I am by keeping a short-term record of me. It gives me a session ID when I first start browsing, puts it in a cookie (usually), and hey presto – the server now has a way to identify me as me each time I make a request. Additional data can be added into session state as I make requests – for example, if I use my name to fill in a form, the application might store that as a session state value so that it can add my name into marketing campaigns: “Martina, get 20% off motherboards!”
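To make that concrete, here is a minimal Python sketch of the idea. It is a toy model, not the ASP.NET API: the `sessions` dictionary, the `handle_request` function and the `session_id` cookie name are all made up for illustration.

```python
import uuid

# Toy in-memory session store: session ID -> dictionary of values.
# Every name here is hypothetical; this is not how ASP.NET exposes session state.
sessions = {}

def handle_request(cookies, form=None):
    """Return (session_id, session_data), creating a session if there is none."""
    session_id = cookies.get("session_id")
    if session_id not in sessions:
        # First request (or unknown ID): issue a fresh session ID.
        session_id = uuid.uuid4().hex
        sessions[session_id] = {}
    if form:
        # Store submitted values so that later requests can use them.
        sessions[session_id].update(form)
    return session_id, sessions[session_id]

# Request #1: no cookie yet, so the server issues a session ID.
sid, _ = handle_request(cookies={})
# Request #2: the browser sends the cookie back, along with a form value.
sid2, _ = handle_request(cookies={"session_id": sid}, form={"name": "Martina"})
# Request #3: the server still remembers the name.
_, data3 = handle_request(cookies={"session_id": sid})
print(f"{data3['name']}, get 20% off motherboards!")
```

The only thing the browser carries is the ID; everything else lives on the server, which is why it matters so much where the server keeps it.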
When we talk about session state in an xDB context, this is what we are talking about – the ASP.NET session state mechanism.
What is ‘out-proc’ and ‘in-proc’?
You can either store session state data in memory (in-proc, or ‘in process’), or you can store it somewhere else – such as in a SQL database (out-proc, or ‘out-of-process’).
In a standard ASP.NET application, there are different out-proc session state providers to choose from – you can see a comparison of the most popular providers here: https://blog.devopsguys.com/2013/07/26/best-performing-asp-net-session-state-providers-2013/.
At time of writing, there are two session state providers available for Sitecore and xDB – one that uses SQL, and one that uses MongoDB. xDB requires that session state providers support the Session_End event, which rules out Redis Cache for the version of Sitecore that is currently available (8.1 Update-2). You should choose the provider that uses a technology that you are able to support and optimize.
What role does session state play in xDB?
Before xDB, there was Sitecore’s Digital Marketing Suite – or DMS (aww, memories). The DMS was a very “chatty” application that made frequent trips to read from and write to the SQL analytics database during the course of a visitor’s session. This had performance implications – particularly for high-traffic, geographically distributed sites where large amounts of data had to travel across huge distances.
By contrast, Sitecore accesses the xDB’s collection database twice during the course of a visitor’s session – once at the start, to identify the visitor and load data into session if the visitor is already known, and once at the end, to flush session data into the collection database. Whilst the session is ongoing, all data – pages visited, goals triggered, patterns matched – are held in session.
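The access pattern described above can be sketched in a few lines of Python. This is a simplified model under my own assumptions: `CollectionDb` and `VisitorSession` are hypothetical stand-ins, not Sitecore classes, and real contact data is far richer than a list of strings.

```python
class CollectionDb:
    """Stand-in for the collection database that counts round trips."""
    def __init__(self):
        self.contacts = {}    # contact_id -> list of recorded events
        self.round_trips = 0

    def load(self, contact_id):
        self.round_trips += 1
        return list(self.contacts.get(contact_id, []))

    def save(self, contact_id, events):
        self.round_trips += 1
        self.contacts[contact_id] = events


class VisitorSession:
    """Hypothetical session: loads once, tracks in session, flushes once."""
    def __init__(self, db, contact_id):
        self.db = db
        self.contact_id = contact_id
        self.history = db.load(contact_id)   # trip 1: start of session
        self.events = []                     # held in session state only

    def track(self, event):
        self.events.append(event)            # no database access here

    def end(self):
        # trip 2: flush everything gathered during the session
        self.db.save(self.contact_id, self.history + self.events)


db = CollectionDb()
session = VisitorSession(db, "contact-1")
for event in ["page:/", "page:/about/", "goal:newsletter-signup"]:
    session.track(event)
session.end()
print(db.round_trips)  # 2, no matter how many events were tracked
```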
Relying on session state to hold important data about visitors and their interactions creates the following requirements:
- Session state should be managed out of process – this makes session state resilient against ASP.NET errors and IIS restarts, and it is a hard requirement in a clustered environment with multiple content delivery servers (I will cover this later)
- Every content delivery cluster should have a dedicated session state database server
- The session database server should be as physically close to the content delivery (CD) servers as possible, and on the same network
The following example shows two content delivery clusters, each with a dedicated session state server:
In this scenario:
- A visitor hits a cluster (e.g. Europe) – a request goes out to the collection database to identify that contact
- If they are known, data about the contact is loaded into session and a lock is placed on the contact against the current cluster until the end of the session
- The visitor browses, bouncing between content delivery servers within the cluster (sticky sessions are not required if session state is managed out of process) – data about the visitor’s interaction is written to the session state database
- After a period of inactivity, the session expires – analytics data is written to the collection database and removed from the session state database, and the lock on the contact against the cluster is released
Private and shared session state
There are two types of session state – private and shared. An easier way to think of this is that there are two types of session state data being stored: data that is private to a particular interaction, and data that is shared between interactions that overlap. Private session state contains information about the interaction – such as pages visited and goals triggered on that device. Shared session state contains information about the contact – such as their e-mail address and engagement plan state.
When you install Sitecore locally, private and shared session state are set to ‘in-proc’. Private session state is configured where you would configure regular old ASP.NET session state – in web.config. Shared session state is configured in Sitecore.Analytics.Tracking.config.
This distinction (shared vs private) is required to support a single contact with two concurrent sessions on two different devices – for example, you might start browsing a website on your laptop and continue on your phone whilst the laptop session is still ongoing (maybe you had to pee during the last few moments of an eBay bidding war). What happens if you move to a different engagement plan state during the laptop session, or update the e-mail address that is stored in xDB? The laptop session is still ongoing, so those changes have not made it to the collection database yet.
This is where shared session state comes in, flexing and looking impressive. Contact data needs to be shared across multiple device sessions, and is therefore stored in shared session state. If I moved into a different engagement plan state or changed my name on my laptop, my toilet phone needs to know about it immediately – before that information hits the collection database.
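Here is a rough Python sketch of the split, as I understand it. The names (`shared_state`, `DeviceSession`, the `engagement_state` key) are invented for illustration and do not correspond to anything in the Sitecore API.

```python
# Shared session state: one record per contact, visible to every one of
# that contact's concurrent sessions. All names are hypothetical.
shared_state = {}   # contact_id -> contact data

class DeviceSession:
    def __init__(self, contact_id, device):
        self.device = device
        # Private: belongs to this interaction on this device only.
        self.private = {"pages": []}
        # Shared: the same dictionary object for every session of this contact.
        self.contact = shared_state.setdefault(contact_id, {})

    def visit(self, page):
        self.private["pages"].append(page)

laptop = DeviceSession("contact-1", "laptop")
phone = DeviceSession("contact-1", "phone")   # starts while laptop is still active

laptop.visit("/motherboards")
laptop.contact["engagement_state"] = "Returning customer"

print(phone.contact["engagement_state"])  # Returning customer
print(phone.private["pages"])             # []
```

The phone sees the new engagement plan state straight away, but knows nothing about the pages visited on the laptop, because those belong to the laptop's interaction.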
Why is it important to share certain data between sessions? From a purely technical standpoint, you do not want data from session 1 to be lost because session 2, which expires after session 1, loaded out-of-date information from the collection database before it started. The diagram below (from http://doc.sitecore.net/) shows the lifetime of shared session:
From a usability point of view, you are offering a seamless experience across multiple devices – imagine how unimpressed you would be if the personalization rule for a particular engagement plan state was active on your laptop, but your toilet phone acted like it was living in the past.
Read more about private and shared session state on doc.sitecore.net: https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/xdb/session_state/session_state
A little note about cluster-forwarding
You may wonder what happens if your toilet phone is routed to a different cluster of content delivery servers (let’s say you are in between an east coast and west coast cluster). Worry not – at the start of your laptop session, a lock was placed on your contact data against a particular cluster. When your second session starts, Sitecore checks the xDB and forwards you to the cluster that your pre-existing session is on.
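The forwarding decision can be modelled roughly like this in Python. It is a sketch of the behaviour described above, not of how Sitecore implements it: `contact_locks`, `start_session` and `end_last_session` are hypothetical names, and the real lock lives in the xDB rather than in a dictionary.

```python
# contact_id -> name of the cluster that currently holds the lock.
# A hypothetical stand-in for the lock that is kept in the xDB.
contact_locks = {}

def start_session(contact_id, arriving_cluster):
    """Return the cluster that should serve this session."""
    # If nobody holds the lock, the arriving cluster takes it.
    # Otherwise the existing owner is returned and the visitor is forwarded.
    return contact_locks.setdefault(contact_id, arriving_cluster)

def end_last_session(contact_id):
    """Release the lock once the contact's last session has expired."""
    contact_locks.pop(contact_id, None)

laptop_cluster = start_session("contact-1", "east-coast")
phone_cluster = start_session("contact-1", "west-coast")
print(laptop_cluster, phone_cluster)  # east-coast east-coast

end_last_session("contact-1")
later_cluster = start_session("contact-1", "west-coast")
print(later_cluster)  # west-coast
```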
Read more about cluster forwarding on doc.sitecore.net: https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/xdb/server_considerations/server_clusters_and_transferring_contact_sessions
When and why do I have to use out-proc session state management?
If you are using the xDB and you have more than one content delivery server per cluster, you MUST USE OUT-PROC SESSION STATE. In-proc is categorically not an option, not even if you use sticky sessions. This is because shared session state must be available to a contact with two concurrent sessions on different devices. Let’s look more closely at why you cannot solve the problem with sticky sessions:
I visit a website from my laptop and log in. My login details happen to be my unique identifier in the xDB, and it loads my contact information from the collection database. This website is backed by three content delivery servers – my session sticks to content delivery server #1 and will stay there for the duration of my session. Session state data is managed in-proc; private and shared session state data is written to memory. So far, so good.
I abandon my laptop and switch to my phone – I log in, am identified a second time, and a new session starts. Or does it? If my phone session sticks to a different content delivery server, Sitecore has no way of knowing what is happening to my contact data on content delivery server #1 – because it has no way of accessing the memory (and therefore the shared session data) of another machine. I could have changed my name to Myrtle McMuffin; until that information makes its way into the collection database, my mobile phone has no idea that this has happened.
At time of writing, I do not know what will happen in this scenario (edit: but Tauqir Malik does!) – either the second session simply cannot get a lock on the contact and the session hangs, or you open yourself up to data inconsistencies when session B overwrites session A, and Myrtle McMuffin disappears forever.
This is why you must use out-proc session state management in an environment with multiple content delivery servers.
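The Myrtle McMuffin problem can be reproduced with a toy Python model. The `ContentDeliveryServer` class below is hypothetical; a plain dictionary plays the part of a server's memory in the in-proc case and of the session state database in the out-proc case.

```python
class ContentDeliveryServer:
    """Toy content delivery server with pluggable session storage."""
    def __init__(self, name, shared_store=None):
        self.name = name
        # In-proc: each server gets its own private dictionary (its own memory).
        # Out-proc: every server is handed the same external store.
        self.store = shared_store if shared_store is not None else {}

    def update_contact(self, contact_id, **fields):
        self.store.setdefault(contact_id, {}).update(fields)

    def read_contact(self, contact_id):
        return self.store.get(contact_id, {})

# In-proc with sticky sessions: the laptop sticks to CD1, the phone to CD2.
cd1, cd2 = ContentDeliveryServer("CD1"), ContentDeliveryServer("CD2")
cd1.update_contact("contact-1", name="Myrtle McMuffin")
in_proc_view = cd2.read_contact("contact-1")
print(in_proc_view)   # {} because CD2 cannot see CD1's memory

# Out-proc: both servers use the same session state store.
session_db = {}
cd1 = ContentDeliveryServer("CD1", session_db)
cd2 = ContentDeliveryServer("CD2", session_db)
cd1.update_contact("contact-1", name="Myrtle McMuffin")
out_proc_view = cd2.read_contact("contact-1")
print(out_proc_view)  # {'name': 'Myrtle McMuffin'}
```

Sticky sessions only pin one device to one server; they do nothing to make the second device's server aware of the first.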
Read more about session state and the xDB on doc.sitecore.net: https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/xdb/session_state/session_state.
Hey – can I mix in-proc for private and out-proc for shared?
Technically yes, since you can configure different providers for shared and private session state, but I cannot point to proof that you gain anything at all from doing so. Shared session state must always be configured to use out-proc session state in a multi-content delivery server environment, and will be your limiting factor.
Update: What happens to session state data if the collection database becomes unavailable?
Thank you in particular to Dmitry Kostenko and Todd Mitchell for helping me make sense of session state. Diagrams courtesy of SimpleDiagrams.
This is a confusing topic, and I welcome corrections and clarifications! Either post a comment or contact me on Twitter: @mhwelander