Sakai Session Clustering Problems and Ideas
Update: 10/29/08
The information on this page is provided for historical and background purposes. This page explains the other options that were explored prior to settling on an approach that uses Terracotta to enable cluster failover support. For information on the current approach, please review the Sakai Session Clustering via Terracotta High Level Design page.
That Sakai user sessions do not fail over from node to node in a Sakai cluster has been a source of some frustration for systems administrators and end users. It is also something of a blocker issue for commercial interests hoping to host content and assessment delivery engines on a Sakai platform. Recent discussion on the sakai-dev list indicates that this issue is still open. Excessively large object serialization/replication overhead seems to be the biggest obstacle. Additionally, Sakai does not use container-managed sessions either exclusively or by default, meaning that "traditional" Servlet container-managed session replication likely wouldn't prove to be a viable solution, even if session footprint could be successfully reduced. Instead, as Ian indicated, replacing Sakai sessions' attribute store(s) with a distributed cache might allow us to exercise fine-grained control over the scope and size of replication events while avoiding the need to completely re-imagine/re-write Sakai session management. Thus it seemed natural to collect notes on session replication (see also the discussion of distributing Sakai caches in general).
Currently, this document consists primarily of an under-the-covers look at "sessions" in Sakai intended to help better understand the problem space. As with most framework abstractions, Glenn Golden wrote the authoritative document on Sakai sessions. Glenn's paper on Sakai's request processing is also helpful, especially for understanding the relationship between Sakai and HTTP sessions.
Update: 7/30/08
Distributed caching and other solutions described in this document run afoul of Sakai ClassLoaders. Terracotta is one alternative and is the option we're currently pursuing. Development activity is tracked by subtasks below SAK-13324.
Sakai Sessions vs HttpSessions vs Sakai UsageSessions vs Sakai Presence
Sessions
The Sakai RequestFilter wraps container-managed HttpServletRequests with a custom decorator which caches the container-implemented HttpSession, but allows Sakai to effectively preclude direct access to the latter. Schematically (draft diagram):
A Sakai Session, then, can be thought of as an HttpSession with somewhat more ambiguous semantics. That is, depending on configuration, invoking HttpServletRequest.getSession() may return any of four session "types":
- A container-implemented HttpSession scoped to a particular ServletContext, i.e. web application (the "first one" encountered by the user, typically /portal).
- A Sakai-implemented Session (which also happens to implement HttpSession) scoped to all web application invocation paths fronted by the Sakai RequestFilter, i.e. for all of Sakai.
- A Sakai-implemented ContextSession (which also happens to implement HttpSession) scoped to a single webapp (i.e. ServletContext).
- A Sakai-implemented ToolSession (which also happens to implement HttpSession) scoped to a particular tool placement.
Note that in all cases except one ("org.sakaiproject.util.RequestFilter.http_session"=CONTAINER_SESSION), the container-provided HttpSession is effectively replaced by a Sakai-constructed Session of configurable scope. The default configuration ("org.sakaiproject.util.RequestFilter.http_session"=TOOL_SESSION) gives clients access to tool placement-scoped sessions, falling back to sessions associated with groups of tools if no placement session exists. Thus, without configuration, objects stored in user sessions are not visible to the Servlet container. As such, enabling container-managed session replication will typically have no effect, or at least not the intended effect.
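The configuration-dependent behavior described above can be sketched as a small dispatch. This is only an illustration and not Sakai's actual RequestFilter code: the class and method names are hypothetical, only CONTAINER_SESSION and TOOL_SESSION are values named in the text (the other two enum names are assumed), and the fallback for a missing placement is modeled here as a context-scoped session, which is one reading of "sessions associated with groups of tools".

```java
import java.util.Optional;

/** Hypothetical model of the four session "types"; not Sakai's real API. */
enum SessionMode {
    CONTAINER_SESSION, // named in the text
    SAKAI_SESSION,     // assumed name
    CONTEXT_SESSION,   // assumed name
    TOOL_SESSION       // named in the text; the default
}

class SessionResolverSketch {
    /** Returns a label describing which session getSession() would hand back. */
    static String resolve(SessionMode mode, Optional<String> placementId, String contextPath) {
        switch (mode) {
            case CONTAINER_SESSION:
                // Only here does the caller see the container's own HttpSession.
                return "container:" + contextPath;
            case SAKAI_SESSION:
                return "sakai";
            case CONTEXT_SESSION:
                return "context:" + contextPath;
            default:
                // TOOL_SESSION: placement-scoped, with an assumed context-scoped fallback.
                return placementId.map(id -> "tool:" + id).orElse("context:" + contextPath);
        }
    }
}
```

In three of the four modes nothing is stored in the container's session, which is why container-level replication sees almost no state.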
Forcing Container-Managed Sessions
Whether or not Sakai would behave properly under a container-managed session configuration is not entirely known. Informal experimentation indicates that such configuration does not cause Sakai to break in obvious ways. However, there's nothing "important" in container-managed sessions. For example, after creating a worksite and manipulating the Forums tool in a variety of ways, the container session has only one attribute: portalskin=defaultskin. Contrast this with the Sakai-scoped Session:
sakai.portal.site.2e3f26ab-d3bd-4881-90a2-c6d02b28c6bc=3678ff14-e7cf-4545-8145-9fb563596fd7, sakai.locale.admin=en_US, org.sakaiproject.event.api.UsageSessionService=[], Access.Copyright.Accepted=[], sakai.portal.site.94b0c44e-539f-44fa-b204-78a3318b36ec=7c5e75aa-2033-436c-8871-d7a54f5e64c1, org.sakaiproject.util.EditorConfiguration.enableResourceSearch=false, sakai.portal.site.!gateway=!gateway-100, sakai.portal.site.~admin=~admin-360, sakai.portal.site.d7f18e57-cc7b-4371-8972-aaccb0416c3a=d4bf7ff3-4cb4-4752-80c0-d3b0f6f4506f, attr_preference_is_null=true
The collection of ToolSessions is quite a bit larger yet.
This behavior occurs because Sakai-managed "session" objects of all three scopes are always available from the SessionManager API, even if the RequestFilter has not been configured for Sakai-scoped sessions. That is, under certain configurations, HttpServletRequest.getSession() and SessionManager.getCurrentSession() may not return the same object. Thus, even with container-managed sessions enabled, there is no guarantee that the container will in fact have access to all attributes relevant for a user's current interaction, thereby (potentially) negating the usefulness of container-managed session replication. Static analysis of the Sakai trunk at r41592 (plus SASH) shows 41 references to Session.setAttribute(), 7 references to ContextSession.setAttribute() (most from RSF libraries), and 307 references to ToolSession.setAttribute(). By contrast, there are only 27 references to HttpServletRequest.getSession() and only 39 references to HttpSession.setAttribute().
Session IDs and Cookies
Sessions are bound to an application server by their IDs (and Path), which are stored client-side in JSESSIONID cookies. For example, upon first accessing http://sakai.university.edu/portal, the browser will receive a Set-Cookie header such as the following, where "sakai-host-01" is the value of sakai.serverId. That property is typically configured in local.properties:
Set-Cookie: JSESSIONID=503d1bcb-c3a5-4f53-8d1f-18bec76d00b2.sakai-host-01; Path=/
The Sakai SessionManager (a SessionComponent by default) maintains an in-memory data structure to track Sessions. This data structure is keyed by Session IDs, excluding the embedded serverId. Thus, the serverId exists simply for request routing at the container level and higher. Elements in this data structure are not persisted to the database in any form. It does not contain direct references to ContextSessions or ToolSessions.
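To make the relationship between the cookie value and the in-memory key concrete, here is a minimal sketch of separating the two parts. SessionCookieValue is a hypothetical helper, not a Sakai class, and it assumes the session ID itself never contains a '.' (true of the UUID in the example cookie above).

```java
/** Hypothetical helper that splits "<sessionId>.<serverId>"; not part of Sakai. */
final class SessionCookieValue {
    final String sessionId; // the part used as the in-memory lookup key
    final String serverId;  // the routing suffix, or null if the cookie has none

    private SessionCookieValue(String sessionId, String serverId) {
        this.sessionId = sessionId;
        this.serverId = serverId;
    }

    static SessionCookieValue parse(String cookieValue) {
        // Assumption: the first '.' separates the session ID from the server ID.
        int dot = cookieValue.indexOf('.');
        if (dot < 0) {
            return new SessionCookieValue(cookieValue, null);
        }
        return new SessionCookieValue(cookieValue.substring(0, dot),
                                      cookieValue.substring(dot + 1));
    }
}
```

A load balancer would route on serverId, while the receiving node would look up the session by sessionId alone.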
A Note on JSESSIONID Handling
Because Sakai sets the JSESSIONID cookie's Path attribute to the root, the browser will send this cookie on any request to the same domain, in this case sakai.university.edu. This contrasts with "normal" usage of JSESSIONIDs in a Tomcat Servlet container, in which JSESSIONID cookies are scoped to paths representing ServletContexts. For example, consider two web applications, webapp-A and webapp-B, deployed to a single Tomcat instance. Each webapp consists of a single servlet which simply outputs a hello world message after ensuring that an HttpSession exists for the current request. Successive requests to each webapp result in the following header exchange:
http://localhost:8080/webapp-A
GET /webapp-A HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080207 Ubuntu/7.10 (gutsy) Firefox/2.0.0.12
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
HTTP/1.x 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=4443AAF15E28F35D799CBF46B583A740; Path=/webapp-A
Content-Length: 13
Date: Fri, 14 Mar 2008 00:39:46 GMT
----------------------------------------------------------
http://localhost:8080/webapp-B
GET /webapp-B HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080207 Ubuntu/7.10 (gutsy) Firefox/2.0.0.12
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
HTTP/1.x 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=A7FFE44805F9323EF0B2EBC2D5887377; Path=/webapp-B
Content-Length: 13
Date: Fri, 14 Mar 2008 00:39:55 GMT
----------------------------------------------------------
Session Attribute Storage
As implemented, each Sakai Session, ContextSession, and ToolSession maintains its own in-memory attribute storage data structure (ConcurrentHashMap). That is, although the latter two objects always have a parent Session, there is no single data structure in which all attributes relevant to an end user's current interaction state are stored.
HttpSession Contract
Although the default Session, ContextSession and ToolSession implementations all satisfy the HttpSession interface, they do not satisfy the Servlet specification's requirement that all HttpSessionAttributeListeners "in the web application" be notified of setAttribute() and removeAttribute() calls.
UsageSessions
Sakai UsageSessions are (non-serializable) attributes of Sessions. They are part of the event management API and are typically created at user login. A UsageSessionService, implemented by UsageSessionServiceAdapter, is responsible for creating and persisting UsageSessions. So, while Sessions and UsageSessions are one-to-one, it is the latter that is written to the SAKAI_SESSION table. Thus the values of SAKAI_SESSION.SESSION_ID do not correlate with JSESSIONID cookie value substrings. UsageSession persistent fields:
Field | Note |
---|---|
SESSION_ID | UUID allocated at |
SESSION_SERVER | A concatenation of the configured server ID and a timestamp generated at the last initialization of |
SESSION_USER | A pseudo-foreign key to SAKAI_USER_ID_MAP.USER_ID. Represents the end user associated with this |
SESSION_IP | The remote IP associated with this |
SESSION_USER_AGENT | A remote client descriptor, e.g. a browser identifier |
SESSION_START | |
SESSION_END | |
SESSION_ACTIVE | Optimization for queries seeking "open" sessions |
Records in SAKAI_SESSION are used to correlate SAKAI_EVENT records with the remote user's identity, location and platform. Interestingly, not all SAKAI_EVENT records seem to have corresponding SAKAI_SESSION records. For example, in a development database, this query returns 1078 rows, 558 of which have null SAKAI_SESSION.SESSION_ID values:
SELECT E.SESSION_ID event_session_id, S.SESSION_ID session_id FROM SAKAI_EVENT E left outer join SAKAI_SESSION S on E.SESSION_ID = S.SESSION_ID where S.session_id is NULL
UsageSessions are also used in administrative tools for browsing active sessions on an installation-wide basis.
"Logout" of a UsageSession invalidates the current Sakai Session (which technically may or may not actually be the Session to which the UsageSession is bound).
UsageSessions can be used to determine the "actual" user's identity for any Sakai interaction. Tools and services which allow users to impersonate other users do so by modifying fields on Session, but leave UsageSession fields untouched.
Presence
"Presence" denotes the transient intersection of a user, a site, a page, and a tool, i.e. the navigational context targeted by the user's last "click." A site, page, and tool triplet is referred to as a "location ID". Page and tool identifiers are optional fields in a location ID. Internally, the default PresenceService implements "presence" as a protected, non-static inner class on org.sakaiproject.presence.impl.BasePresenceService. Instances of BasePresenceService.Presence have references to a UsageSession and a string representation of a location ID. These Presence objects are bound into Sakai Sessions and are (sort of) persisted to the Sakai database in the SAKAI_PRESENCE table. Since Presence objects reference UsageSessions, they take on the semantics of an application server-bound object. That is, given a Presence, one could theoretically determine not only the tool with which a user is interacting, but also the application server on which a user interaction is occurring. That said, not only is Presence not actually a part of the public PresenceService API, but the Presence-to-UsageSession binding in the Presence class does not appear to be necessary, since Presence only actually has need of the UsageSession's ID. Also note that the Admin On-Line tool (sakai.online) does not report on location-to-app server relationships.
When a user logs in to Sakai and the Presence tool (sakai.presence) is enabled, an iframe below the current site's page navigation causes the browser to issue a request to a "special" PortalHandler, encoding the current site ID into the request's URL. For example, the super user's MyWorkspace page navigation includes the following markup:
<div id="presenceWrapper"> <div id="presenceTitle">Users present:</div> <iframe name="presenceIframe" id="presenceIframe" title="Users present:" frameborder="0" marginwidth="0" marginheight="0" scrolling="auto" src="http://localhost:8080/portal/presence/%7Eadmin" > </iframe> </div>
Upon receiving the browser's request for this iframe's content, the portal's PresenceHandler constructs a "virtual" Placement instance, suffixing "-presence" to the current site ID to construct the ID field, and forwards control to the ActiveTool instance representing the sakai.presence tool. The sakai.presence tool, implemented by org.sakaiproject.presence.tool.PresenceTool, then invokes PresenceService.setPresence(), passing the Placement's ID as the location ID. setPresence() finds the current UsageSession, writes its ID and the received location ID through to SAKAI_PRESENCE, constructs a new BasePresenceService.Presence and places that object into the current "sakai.presence.service" ToolSession, using the location ID as the attribute key. Subsequent calls to setPresence() with the same location ID are short-circuited if the current "sakai.presence.service" ToolSession has a Presence attribute cached under that location ID. If the check for a cached Presence object in the session is successful, the Presence's TTL is extended. In all cases, setPresence() also walks the "sakai.presence.service" ToolSession, checking for and removing "expired" Presence instances. Removing a Presence object from the session has the side-effect of deleting the corresponding SAKAI_PRESENCE record.
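The setPresence() flow above (write-through on first sight, short-circuit plus TTL extension on repeat calls, and an expiry sweep on every call) can be sketched as follows. This is a simplified model, not the BasePresenceService implementation: the class name, the map standing in for the "sakai.presence.service" ToolSession, the explicit clock parameter, and the counters standing in for SAKAI_PRESENCE inserts and deletes are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical stand-in for the presence ToolSession; not Sakai code. */
class PresenceSessionSketch {
    // locationId -> expiry time in milliseconds
    private final Map<String, Long> expiryByLocation = new HashMap<>();
    private final long ttlMillis;
    int dbWrites = 0;  // simulated SAKAI_PRESENCE inserts
    int dbDeletes = 0; // simulated SAKAI_PRESENCE deletes

    PresenceSessionSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns true when a new presence is created (where "pres.begin" would fire). */
    boolean setPresence(String locationId, long now) {
        boolean created = !expiryByLocation.containsKey(locationId);
        if (created) {
            dbWrites++; // write-through happens only for a new presence
        }
        // New or cached, the entry's TTL is (re)set relative to "now".
        expiryByLocation.put(locationId, now + ttlMillis);

        // In all cases, sweep the session for expired entries.
        Iterator<Map.Entry<String, Long>> it = expiryByLocation.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= now) {
                it.remove();
                dbDeletes++; // removal deletes the persisted record
            }
        }
        return created;
    }
}
```

Note that in this model all of the state lives in one node's session, so a failover would lose the TTLs unless the session attribute is itself replicated.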
When a new Presence object is created, BasePresenceService.setPresence() fires a "pres.begin" event. When a Presence object is removed from a session, Presence.valueUnbound() fires a "pres.end" event. These events are typically consumed by PresenceObservingCouriers the PresenceTool has placed into the ToolSession under the "observer" key. Placing PresenceObservingCouriers in the sessions allows the PresenceTool to avoid creating multiple couriers for a given user session and a given location ID. Note that the Presence and PresenceObservingCouriers objects are placed into different ToolSessions. The former are placed into a ToolSession representing the singleton, i.e. cross-tool, PresenceService. The latter are placed into ToolSessions representing locations. PresenceObservingCouriers are a means for simulating a "push" of new presence state to the client. For example, if the server "knows" that a user is no longer present at a particular location, events can effectively propagate to the client, resulting in refreshed on-screen displays of user presence.
... TODO: This probably deserves a diagram ...
Obstacles and Challenges
Custom Session Attribute Replication Mechanism
I don't think it's going to be valuable to just "try" container managed sessions to see if they'll work. Reimplementing Sakai-managed session storage to delegate to a container-managed session feels like non-trivial work, and we're not at all sure that container-managed session replication will be tunable enough to cope with Sakai's session footprint. (Although Terracotta might help.)
Instead, it seems simpler to focus exclusively on replicating attributes stored by Sakai-managed sessions. One way to do this is exactly what Ian suggested: delegate session attribute storage to the MemoryService, backed by one or more distributed ehcache cache instances. So, for example, assuming a cache structure such as:
CacheManager ->
    SessionAttributesCache ->
        123456789-attrib-A -> value-A.1
        123456789-attrib-B -> value-B.1
        987654321-attrib-A -> value-A.1
        987654321-attrib-B -> value-B.1
A simplified Session.getAttribute() implementation might look like this:
public Object getAttribute(String name) {
    Cache cache = cacheManager.getCache("SessionAttributesCache");
    Object cachedObject = cache.get(getId() + "-" + name);
    return cachedObject == null ? localAttribMap.get(name) : cachedObject;
}
A Session.setAttribute() implementation might look like this (listener callbacks, among other things, not shown):
public void setAttribute(String name, Object value) {
    if ( isCacheableSessionAttributeKey(name) ) {
        cacheManager.getCache("SessionAttributesCache").put(getId() + "-" + name, value);
    } else {
        localAttribMap.put(name, value);
    }
}
A per-session cache instance model is also possible, and without any more complicated code:
CacheManager ->
    SessionCache-123456789 ->
        attrib-A -> value-A.1
        attrib-B -> value-B.1
    SessionCache-987654321 ->
        attrib-A -> value-A.2
        attrib-B -> value-B.2
public Object getAttribute(String name) {
    // One cache per session, named after the session ID
    Cache cache = cacheManager.getCache("SessionCache-" + getId());
    Object cachedObject = cache.get(name);
    return cachedObject == null ? localAttribMap.get(name) : cachedObject;
}
public void setAttribute(String name, Object value) {
    if ( isCacheableSessionAttributeKey(name) ) {
        cacheManager.getCache("SessionCache-" + getId()).put(name, value);
    } else {
        localAttribMap.put(name, value);
    }
}
The cache instance model will probably be dictated by ehcache's behavior as the cache entry set size grows (e.g. long pauses as the data structure is resized/rehashed?) and the feasibility of dynamically instantiating distributed cache instances.
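For experimenting with the single-cache model outside of Sakai, the following self-contained sketch may help. It substitutes a plain shared Map for the distributed ehcache instance, so it shows only the key scheme and the cacheable/local split, not replication itself. The class name and the set of cacheable keys are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical session whose cacheable attributes live in a shared store. */
class CacheBackedSessionSketch {
    private final String id;
    private final Map<String, Object> sharedCache; // stand-in for a distributed cache
    private final Set<String> cacheableKeys;       // assumed whitelist of replicable attributes
    private final Map<String, Object> localAttribMap = new ConcurrentHashMap<>();

    CacheBackedSessionSketch(String id, Map<String, Object> sharedCache, Set<String> cacheableKeys) {
        this.id = id;
        this.sharedCache = sharedCache;
        this.cacheableKeys = cacheableKeys;
    }

    Object getAttribute(String name) {
        // Shared entries are keyed "<sessionId>-<attributeName>".
        Object cached = sharedCache.get(id + "-" + name);
        return cached != null ? cached : localAttribMap.get(name);
    }

    void setAttribute(String name, Object value) {
        if (cacheableKeys.contains(name)) {
            sharedCache.put(id + "-" + name, value);
        } else {
            localAttribMap.put(name, value); // stays on this node only
        }
    }
}
```

Constructing a second instance with the same ID against the same shared map approximates a failover: cacheable attributes are visible on the "new node", while attributes that were kept local are not.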
Update: 3/26/08
After much discussion of the relative merits of session attribute replication vs. a shared "session server," we've put work on the former on hold pending "lightweight" performance testing of the two approaches. Session attribute replication is appealing in that it avoids dependencies on a single point of failure and the overhead of serializing and de-serializing entire sessions on each request, but its configurational complexity and memory footprint implications are worrisome. That mutable objects are placed into the Sakai session is also problematic for a cache-backed session replication approach. A shared "session server," on the other hand, is appealing in its configurational and conceptual simplicity. There's no need to partition the cluster, there's no need to worry about flushing mutable objects in a cache, and application servers need only keep enough sessions in memory to service their current request load. This comes at the cost of reading an arbitrarily large object stream in at the beginning of a request, and writing a similar stream back out on reply. Clever diff algorithms may alleviate the per-request de/serialization overhead, but at that point we're heading in a direction which potentially adds inappropriate levels of complex, custom plumbing to Sakai. In the absence of a clear conceptual "winner," then, we've elected to pause and test each approach under load outside of a Sakai context.
Update: 4/1/08
Initial test results comparing attribute replication and session server performance have been collected into a spreadsheet. We've yet to see what kind of overhead we should expect from a truly distributed cache, so no firm conclusions can be drawn yet. So far, though, ehcache seems to be the clear winner, being roughly an order of magnitude faster than either Oracle or MySQL session storage. Ehcache still degrades rather steeply, though, as concurrent user loads increase, at least in a single node configuration.
Code under test is in contrib. More detailed Grinder output can also be found below that directory.
Update: 4/2/08
Completed testing distributed ehcache in 2- and 3-node configurations for a variety of concurrent user loads, each user having a steadily growing session size up to approximately 100KB. Raw test output is available in a tar archive, and graphs are available in a spreadsheet. Testing with Oracle did not proceed beyond a single-node, 8-user test since anything greater resulted in several hundred error transactions. Since MySQL did not perform significantly better, testing of that option was simply abandoned following a 1-node, 12-user test.
Each user transaction resulted in the creation of one map entry representing a session attribute. The combined serialized size of keys and values was approximately 100B, excluding the overhead of any corresponding data structures. In the case of database-backed session storage, the entire object representing the session is read from and written to storage on each request. Reports charting mean response time against session size take the additional overhead of serializing complete session objects into account. In the case of ehcache-backed session storage, each request generated at least one get on the cache, but only new key/value pairs were written on each request. Tomcats were restarted and caches were individually initialized and discovered prior to each test run.
Load balancing and sticky sessions were provided by Apache mod_jk.
Testing above 16 concurrent users was not feasible given available load generation hardware.
Observations, Conclusions and Outstanding Issues:
- Concurrent user load and/or total cache size has a far greater impact on ehcache performance than does the presence of additional nodes.
- Ehcache's performance signature for a fixed total cache size across varying concurrent loads is not yet known. Given that ehcache generally exhibits near-constant time performance for the duration of any given load, though, we suspect concurrency has a far greater impact than does total cache size.
- Although Ehcache's performance degrades rather steeply (slightly less than 1ms per concurrent user or 100K of maximum cache size), it is better than an order of magnitude faster than either database option and degrades much less steeply. Further, we hope that this behavior represents something of a worst-case scenario, if we assume that in general sessions are more read- than write-heavy.
- Although a cache-based approach would seem to be the clear winner, and we do not intend to continue a "session server" approach, its clear degradation with increased load is somewhat concerning, as are certain problems with cache initialization timing. For example, even with each cache configured with an RMIBootstrapCacheLoaderFactory, we repeatedly observed the following behavior, where a particular cache entry remains effectively invisible to the cluster, leading us to believe that some ehcache customization/bug-fixing may be necessary. Terracotta may be another solution:
  - Cache 1 starts up. Client puts something in it.
  - Cache 2 starts up. Discovers cache 1. Reports a cache size of 1.
  - Cache 3 starts up. Client puts something in it. put does not propagate to Cache 1 (i.e. still reports cache size of 1).
  - Client puts something else in Cache 3. put does propagate to Cache 1 (i.e. now reports cache size of 2).
Reacting to Failover
My mental model of session failover assumes "sticky failover" whereby a load balancer directs a given user to a single application server node until that node fails, at which point the balancer selects another node for subsequent requests. This (theoretically) allows us to tolerate an asynchronous replication strategy and avoids excessive disruption to UsageSession behaviors and semantics. However, when a Session is "moved" from one server to another, UsageSessions either need to come along and have their server bindings updated, or they need to be reallocated. Reallocation makes the most sense from a semantic standpoint, but this is problematic because Sessions and UsageSessions may not agree on the current user's identity. Thus it will probably be necessary to re-implement (and modify the semantics of) UsageSession to be portable between app servers and to find a mechanism for mimicking the behavior of HttpSessionActivationListener, which a Servlet container is obligated to invoke when migrating a session from one node to another.
Non-Serializable and Large Attributes
Assuming an ehcache-backed MemoryService (and therefore session attribute store), non-Serializable session attribute values (of which UsageSession is one!) will not be distributable. Those that are distributable or can be made to be distributable may or may not be designed for efficient serialization. Some such objects may not be valuable enough to be replicated, but we at least need to have some idea of the extent to which Sakai sessions are plausibly distributable as is.
Code for intercepting and logging session attribute adds is in contrib: https://source.sakaiproject.org/contrib/unicon/tool/branches/session-clustering/. Methods for retrieving and/or logging per-session setAttribute() metrics are exposed via a JMX bean named bean:name=sessionJournaler. Note that java.util.Collection and java.util.Map types are expanded such that, for example, passing a java.util.ArrayList consisting of three java.lang.String instances to Session.setAttribute() will result in the java.lang.String counter for the given attribute key and the current tool being incremented three times. java.util.Map keys are ignored. As such, these reports represent approximate data points, at best.
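The expansion rule described above can be illustrated with a small counter. This is not the contrib journaling code; the class is hypothetical, it tracks counts by type only (ignoring the attribute key and tool dimensions), and it assumes that nested containers are expanded recursively and that the container type itself is not counted.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-type counter mimicking the described Collection/Map expansion. */
class AttributeTypeJournalSketch {
    private final Map<String, Integer> countsByType = new HashMap<>();

    void record(Object value) {
        if (value instanceof Collection) {
            // Count each element rather than the collection itself.
            for (Object element : (Collection<?>) value) {
                record(element);
            }
        } else if (value instanceof Map) {
            // Only map values are counted; keys are ignored.
            for (Object mapValue : ((Map<?, ?>) value).values()) {
                record(mapValue);
            }
        } else if (value != null) {
            countsByType.merge(value.getClass().getName(), 1, Integer::sum);
        }
    }

    int count(String typeName) {
        return countsByType.getOrDefault(typeName, 0);
    }
}
```

Because container types and map keys drop out of the tally, totals produced this way are a rough guide to what is in the session rather than an exact inventory.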
I've yet to complete a full test script with Selenium, but here's an example of output from the session attribute interception collected during a sequence that involved login, worksite creation, forum creation, multiple topic creation, and a failed attempt at topic thread creation. Column legend:
(A) Session ID | (B) Tool ID | (C) Session Attribute Name | (D) Session Attribute Value Type (* == non-Serializable) | (E) Serializable | (F) Count | (G) Cumulative Serialized Size (bytes) |
---|---|---|---|---|---|---|
The serialized size listed is a simple sum which accumulates on each call to setAttribute(), even if the attribute value has not actually changed. Some objects are serializable but have zeroes in column G. This probably occurs when none of the object's fields are actually serializable or JDK-implemented serialization fails for any other reason. Calls to removeAttribute() are ignored.
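One plausible way to arrive at such numbers, shown here only as a sketch, is to serialize each value with the JDK's ObjectOutputStream and add the byte count to a running total, treating any failure as zero. The class and method names are hypothetical and this is not necessarily how the contrib code measures size.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical accumulator of JDK-serialized attribute sizes. */
class SerializedSizeSketch {
    private long cumulativeBytes = 0;

    /** Returns the serialized size in bytes, or 0 if the value cannot be serialized. */
    static long sizeOf(Object value) {
        if (!(value instanceof Serializable)) {
            return 0;
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            // Includes NotSerializableException thrown for non-serializable contents.
            return 0;
        }
        return bytes.size();
    }

    /** Adds the value's size on every call, even if the value is unchanged. */
    long record(Object value) {
        cumulativeBytes += sizeOf(value);
        return cumulativeBytes;
    }
}
```

Under this scheme a Serializable container holding a non-serializable element reports zero, which matches the kind of "serializable but zero bytes" rows described above.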
A Note on JSF
Clustering JSF presents (at least) two problems:
Replicating the component tree state from request to request
Theoretically, JSF could solve the first problem for us by serializing the component tree state into responses for client-side storage, as would occur when javax.faces.STATE_SAVING_METHOD is set to "client." For whatever reason, though (fear of tampering? bandwidth constraints? serialization bugs?), the only JSF tools currently configured for client-side state persistence are:
- Gradebook
- Roster
- Sections
Absent any new information on why this might be the case, the simplest solution seems to be to try turning on client-side state "saving" for other JSF tools. If client-side component state saving is unacceptable for some reason, we'll need to dig a bit to figure out what exactly the framework stashes in the session when configured for server-side component state saving. I assume all the javax.faces.component.UIViewRoot objects we see in the session are the result of server-side component state saving, but I don't yet understand why those objects aren't serializable or by what other means JSF expects them to be replicated by a clustered web application deployment.
Replicating session- and application-scoped backing bean state
Application-scoped managed beans: hopefully, these objects are initialized at app startup and treated as immutable thereafter. We see three application-scoped beans in the current trunk:
- Gradebook - loginAsBean (faces-test.xml) - Test configuration, so hopefully can be ignored.
- JSF "Base" Module - Components (faces-config.xml) - Bridges the java.util.Map and org.sakaiproject.component.api.ComponentManager APIs. Accesses the ComponentManager via its static cover.
- Roster - services (faces-config.xml) - Something of a Registry into which JSF injects a variety of Sakai service objects. Effectively a more fine-grained version of the application-scoped "Components" bean in the JSF "Base" Module.
Since all three of these beans are either test objects or immutable after startup, it would seem they could be safely ignored for our purposes. Even so, we should probably review the spec to understand exactly how JSF expects these beans to be treated in a clustered webapp.
I know from development experience that JSF places restrictions on the serializability of managed bean properties, but it doesn't seem as if there are any restrictions on the beans themselves, which seems odd. More spec reading would seem to be in order here. For the time being, though, I'm assuming session-scoped managed beans will need to be carefully reviewed for serializability, especially those that cache domain objects with potentially very large associated graphs. Session-scoped beans which cache Sakai service objects should mark such fields as transient and implement lazy-loading logic to retrieve service references from the ComponentManager cover in the event of null references. This allows session-scoped beans to be both dependency injected and clusterable. None of this, however, addresses the issue of flushing the cache when a session-scoped bean's state changes (we discussed a similar issue above with respect to Presence). If the framework does not set session beans into the session after each request, I believe we're left with either writing session bean state into the response (see below) or otherwise building phase awareness or request filtering into JSF tools such that session beans are flushed on every request.
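The transient-field-plus-lazy-lookup pattern recommended above might look roughly like the following. GradeServiceStub and ServiceLocatorStub are hypothetical stand-ins for a Sakai service and the ComponentManager static cover; neither is a real Sakai class, and the round-trip helper exists only to simulate a bean being replicated to another node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical non-serializable service. */
class GradeServiceStub { }

/** Hypothetical stand-in for a static service lookup such as the ComponentManager cover. */
class ServiceLocatorStub {
    static final GradeServiceStub GRADES = new GradeServiceStub();
    static GradeServiceStub getGradeService() { return GRADES; }
}

class SessionScopedBeanSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    private String selectedSiteId;              // ordinary serializable state
    private transient GradeServiceStub service; // skipped during serialization

    void setService(GradeServiceStub service) { this.service = service; } // injection point
    void setSelectedSiteId(String id) { this.selectedSiteId = id; }
    String getSelectedSiteId() { return selectedSiteId; }

    GradeServiceStub getService() {
        if (service == null) {
            // After deserialization the field is null, so look the service up again.
            service = ServiceLocatorStub.getGradeService();
        }
        return service;
    }

    /** Serializes and deserializes the bean, as replication to another node would. */
    static SessionScopedBeanSketch roundTrip(SessionScopedBeanSketch bean) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bean);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SessionScopedBeanSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The bean can still be dependency injected through setService(), but callers should always go through getService() so that a replicated copy recovers its service reference.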
The Sakai FlowState component is worth consideration as an option for effectively avoiding session-scoped beans altogether (thereby side-stepping the cache flushing problem). This approach does not, however, obviate the requirement that beans written into the component tree state implement Serializable.
To-do
The following tasks are listed in an execution order which tries to attack core framework problems first and defer tool- and/or service-specific work. This has the advantage of ensuring session re-implementation does not introduce regressions, since the presence of a session clustering capability should be completely non-disruptive if disabled. System-level performance testing is not listed since we're currently tracking it separately (internally). "Official" system-level QA is also tracked separately. The "Refactor UsageSession" and "Refactor Presence" line items are more logically aligned with "Phase 2" goals ("Per-Tool/Per-Service Refactoring"), but are so fundamental to Sakai session management (at least UsageSession, anyway) that they were included in "Phase 1" as part of the effort to deliver refactored framework session management services.
"Phase 0" - Evaluate Session Replication vs Session Persistence
See "info" block above, dated March 26, 2008.
"Phase 1" - Baselining and Refactoring Framework Session Management
Task | Notes | LOE |
---|---|---|
Script Acceptance Tests | Using Selenium, record click-through scenarios for each tool. Complete functional tests are out-of-scope, but we need some way to ensure changes to session management do not introduce regressions, and to assist with automated testing of failover scenarios. Estimate is fairly generous, but we've seen Selenium fail semi-randomly in ways that seem somehow related to Sakai framesets. | 3d - 5d |
Script Component-Level Unit Tests | This is intended to guard against regressions when re-implementing the | 2d |
Script Component-Level Performance Tests | A separate effort will verify system-level performance impact of session clustering against some baseline. This line-item is intended to establish a component-level performance baseline, primarily for the | 3d - 4d |
Clustered | Replicating session attributes is only part of the problem. The in-memory map of | 3d - 4d |
Clustered | Theoretically, we will require more fine-grained control over the | 2d |
Refactor | As mentioned elsewhere, some solution for this particular attribute will be necessary since not only is it non-serializable (and implemented by a non-static inner class), but it is by definition bound to a particular app server. One wonders if it is necessary at all, though. If it simply represents a session's current binding to a particular application server, the current user's canonical identity, and tracks a variety of other WWW-oriented attributes (remote IP, user agent, etc), is it really necessary to model this collection of attributes as a dedicated class? Perhaps it is, since objects of this class are persisted to the database, but some development work will be necessary to either: | 3d - 5d |
Refactor | Since | 2d - 3d |
"Phase 2" - Per-Tool/Per-Service Refactoring
Task | Notes | LOE |
---|---|---|
Services in Sessions | Service objects, i.e. Sakai "components," almost certainly have no business being placed into sessions, except possibly as transient properties of session-scoped properties of JSF beans. If placing a service object in the session is truly unavoidable, it is almost certainly inappropriate to cluster its attribute key. In the latter case, some mechanism must exist for lazily locating the requested service object following session migration. We've not yet fully exercised Sakai with session instrumentation enabled, so the only such object of which we're currently aware is | 2d/service |
Refactor Non-Static Inner Classes | I anticipate issues serializing instances of non-static inner classes since I believe the JVM will attempt to include the outer class in the object stream, which is almost certainly not what we want, since the outer class is almost always (if not certainly always) a singleton Sakai service object. Again, test scripts are not yet complete, but so far we've encountered the following such objects: This is potentially quite disruptive work, although we're currently working under the assumption that dependencies on concrete API implementations are minimal. | 1d - 5d/type |
Refactor Non-Serializable Objects | Such objects need to be excluded from session replication (and reasonable behavior tested for in failover situations), removed from sessions altogether, or refactored to | 1d - 5d/type |
Refactor Large Objects | This particular item is difficult to estimate and we're not yet in a position to target particular classes, but we're including it here for completeness as performance testing results may dictate that session footprint be reduced in certain areas. | ? |
Documentation | Tool developers will need some kind of guidance for coding in a cluster-friendly manner. | 2d - 3d |
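The concern behind the "Refactor Non-Static Inner Classes" item can be reproduced in isolation. The sketch below is illustrative only: `ExampleService`, `InnerRecord`, and `NestedRecord` are hypothetical names, not Sakai classes. It assumes the enclosing service is not Serializable, in which case standard Java serialization of the inner instance fails on the hidden reference to the outer object, while an otherwise identical static nested class serializes cleanly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class InnerClassSerializationSketch {

    /** Hypothetical singleton-style service; deliberately NOT Serializable. */
    public static class ExampleService {
        private final String nodeId = "node-a";

        /** Non-static inner class: every instance carries a reference to ExampleService. */
        public class InnerRecord implements Serializable {
            private static final long serialVersionUID = 1L;
            public String userId = "user-1";

            /** Touches the enclosing instance, so the outer reference is genuinely used. */
            public String owner() {
                return ExampleService.this.nodeId;
            }
        }

        /** Static nested class: no reference to the enclosing service. */
        public static class NestedRecord implements Serializable {
            private static final long serialVersionUID = 1L;
            public String userId = "user-1";
        }

        public InnerRecord newInnerRecord() {
            return new InnerRecord();
        }
    }

    /**
     * Attempts to serialize the candidate in memory.
     * Returns null on success, or the exception message naming the offending class.
     */
    public static String trySerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException("unexpected I/O failure", e);
        }
    }

    public static void main(String[] args) {
        ExampleService service = new ExampleService();
        System.out.println("inner class failure: " + trySerialize(service.newInnerRecord()));
        System.out.println("static nested failure: "
                + trySerialize(new ExampleService.NestedRecord()));
    }
}
```

If the outer class were itself Serializable, the inner instance would serialize, but the whole service object would be dragged into the stream with it, which is the footprint problem the table row describes. Converting such types to static nested or top-level classes avoids both outcomes.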