Add Caching to DbFlat - Reduce the number of queries generated by the Portal Display

Description

Displaying any portal page causes about 40 small queries to the database - many of these read site properties and page properties. Many thought that site and page properties were already cached, but as best I can tell, while the site structure is highly (and cleverly) cached, site and page properties are uncached. These properties trace their code back to

db-util/storage/src/java/org/sakaiproject/util/BaseDbFlatStorage.java

Each of the stores extends DbFlatStorage, so I will add a simple cache mechanism to DbFlat. It will keep entries for up to 300 seconds. It will not do cluster-wide invalidation, but it will do invalidation within the same server.
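To make the proposed behaviour concrete, here is a minimal sketch of a per-server cache with a fixed time-to-live and local invalidation. It is not the code that was committed to BaseDbFlatStorage.java; the class name `TtlCacheSketch`, its methods, and the injected clock are all hypothetical and exist only for illustration. It also assumes the TTL is counted from the moment an entry is loaded, which the ticket does not state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a per-server TTL cache; not the actual DbFlat code. */
class TtlCacheSketch<K, V> {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be tested without sleeping

    TtlCacheSketch(long ttlSeconds, LongSupplier clock) {
        this.ttlMillis = ttlSeconds * 1000L;
        this.clock = clock;
    }

    /** Returns the cached value, or null on a miss or once the entry has expired. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // drop the stale entry; caller reloads from the database
            return null;
        }
        return entry.value;
    }

    /** Stores a value that stays valid for the configured TTL. */
    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Local invalidation after a write on this server. Other servers are NOT notified. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

In this sketch a write on one server clears only that server's copy, so other servers in a cluster can keep serving the old value until their own entry expires.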

I will add a property that allows you to turn on or off this feature on a service by service basis.
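The e-mail thread further down quotes this property as a colon-delimited list of table names (`DbFlatPropertiesCache=:SAKAI_SITE_PROPERTY:...:`) and mentions `:all:` as a value. The following is a rough sketch of how such a value could be interpreted. The class and method names are hypothetical, and treating `all` as "cache every table" is an assumption drawn from the discussion, not from the committed code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical parser for a colon-delimited DbFlatPropertiesCache value. */
class DbFlatCacheConfigSketch {

    /** Splits ":A:B:" into the set {A, B}, ignoring empty segments. */
    static Set<String> parse(String value) {
        if (value == null) {
            return Collections.emptySet();
        }
        return Arrays.stream(value.split(":"))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    /** True if the table is listed, or if the value contains "all" (assumed wildcard). */
    static boolean isCachingEnabled(String value, String table) {
        Set<String> names = parse(value);
        return names.contains("all") || names.contains(table);
    }
}
```

With this reading, an unset or empty property disables caching everywhere, which matches the idea of a per-service switch.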

Environment

None

Test Plan

None

Attachments

3

Activity

Megan May August 14, 2008 at 11:08 AM

marking as closed based on Stephen's comments

Megan May August 14, 2008 at 11:07 AM

changing 2.5.x status to Resolved since this was already merged

Charles R Severance July 30, 2008 at 5:17 AM

Just some E-Mail interaction that might be helpful to future folks interested in this topic.

On Jul 30, 2008, at 5:24 AM, Stephen Marquard wrote:

Hi Chuck,

Why I've been pursuing the details around how this change works is that there's at least 1 use case we have where the 5 minute lag time on other servers in the cluster would be regarded as seriously broken behaviour. That is in Section Info, which uses a site property (SAKAI_SITE_PROPERTY) for toggles about section signup status (enabled/disabled).

We have classes where section signup is advertised to start at a specific time (e.g. 2pm) and is sometimes competitive (many people competing for preferred options for project groups, etc.). Whole labs are packed with students who sit refreshing the page until the exact opening time, then sign up as soon as possible to get the best choices.

— CS
Ah - very few sites use Section Info - it is too bad that this now adds such a constraint to tuning site performance, since the site structure is so deeply involved in virtually every click on the system. But it is a strong use case for cluster-wide invalidation.

— SM
So the combination of 5 min / no cluster invalidation would be a serious problem for some sites running Section Info (a regression on existing functionality in 2-5-x). It's also possible that there are other unintended consequences of caching on the other properties tables, which we haven't explored or tested.

— CS
I agree - but my experience in the past is that if we set properties to the most conservative defaults, then all the schools have to crash and burn performance-wise before they learn that setting this to ":all:" is appropriate for 95% of the schools. It is better to catch the 5% who hit the Section Info thing and tell them to set the property differently.

And waiting until cluster invalidation works is similarly a bad strategy - because this was hurting Michigan in the summer in production - September would have been a disaster.

— SM
Basically if there had not been a way to change the TTL, I would have argued that 2-5-x should have this disabled by default, or the changes backed out of 2-5-x.

— CS
The changes cannot be backed out - per the above. But what the default should be - to me that is open for debate. Again I think this is a cost/benefit thing - do we make it so 95% of the schools have to discover this by experiencing bad performance in production and then turning the feature on?

It will make Sakai look really bad for the first two weeks of the September semester.

— SM
There is an existing pattern for configuring the caches this way in other projects (c/f user), which is why I asked Aaron to add this (also he offered).

— CS
It is cool - I did not know how to do this - Aaron stepped in and helped - I will be doing the same for Alias in the next few weeks (trunk only) - and Aaron will do the bean magic for me on that one once it is done.

— SM
The TTL tradeoff between the cache hit rate and the effect of stale data is something that site admins need to test and decide. We are going to use these settings in production:

    DbFlatPropertiesCache=:SAKAI_SITE_PROPERTY:SAKAI_REALM_PROPERTY:SAKAI_USER_PROPERTY:

    # TTLs for DbFlat caches:
    # SAKAI_SITE_PROPERTY is used by Section Info, so TTL must be shorter.
    # SAKAI_REALM_PROPERTY is not used by anything to date, so TTL may be long.
    # SAKAI_USER_PROPERTY is not used by anything to date, so TTL may be long.
    timeToLive@db.cache.SAKAI_SITE_PROPERTY=20
    timeToLive@db.cache.SAKAI_REALM_PROPERTY=3600
    timeToLive@db.cache.SAKAI_USER_PROPERTY=3600

I have tested the current trunk code with various settings and will get it all merged into 2-5-x as it currently stands.

— CS
Cool - I think that you will be OK because of MySql query caching - you will burn a little CPU here and there because at 20 seconds, I am guessing that SITE_PROPERTY will have <= 50% hit rate. But you can now adjust as you gain experience.

— SM
The last consideration for 2-5-x is that the current bean-based config is setting the default TTL to 120s rather than 300s (I'm not sure why).

Personally I think that's fine, but if we want to change the default then we need to add the TTLs explicitly to the cache beans in db/db-impl/pack/src/webapp/WEB-INF/components.xml

— CS
I am OK with 120s as the default - I think that will still have 85%-90% hit rate under heavy load - and folks can increase it when they get production experience in the fall.

I would say that this is a great learning experience as we move the site object perhaps into content hosting in K2. And actually so far this discussion indicates that making site a single large object which is cluster-wide invalidated - might be a pretty good idea actually.

/Chuck
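The hit-rate guesses in the exchange above (at most 50% at 20 s, 85%-90% at 120 s) can be sanity-checked with a simple model. The sketch below assumes reads of a given key arrive as a Poisson process, that an entry expires a fixed TTL after the miss that loaded it, and that invalidation on writes is rare enough to ignore. None of this is stated in the thread. Under those assumptions each miss is followed by rate × TTL hits on average, so the hit rate is rate × TTL / (1 + rate × TTL). The read rate of one request per 20 seconds per key is a hypothetical value chosen so that the 20 s case lands on 50%.

```java
/** Hypothetical back-of-the-envelope model of TTL cache hit rates. */
class TtlHitRateSketch {

    /**
     * Expected hit rate for one key when reads arrive as a Poisson process
     * and the entry expires ttlSeconds after the miss that loaded it.
     */
    static double hitRate(double readsPerSecond, double ttlSeconds) {
        double expectedHitsPerMiss = readsPerSecond * ttlSeconds;
        return expectedHitsPerMiss / (1.0 + expectedHitsPerMiss);
    }

    public static void main(String[] args) {
        double readsPerSecond = 1.0 / 20.0; // assumed: one read per key every 20 s
        for (int ttl : new int[] {20, 120, 300, 3600}) {
            System.out.printf("TTL %4d s -> %.1f%% hits%n",
                    ttl, 100.0 * hitRate(readsPerSecond, ttl));
        }
    }
}
```

With that assumed read rate the model gives 50% at 20 s and about 86% at 120 s, which is in line with the estimates in the thread. Real traffic is burstier than a Poisson process, so this is only a rough guide for choosing a starting TTL.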

Stephen Marquard July 29, 2008 at 11:59 PM

Tested on trunk build.

Megan May July 28, 2008 at 9:54 AM

after discussion with AZ, this fix is dependent upon being resolved.

Fixed

Created June 29, 2008 at 4:07 AM
Updated October 27, 2009 at 1:42 PM
Resolved July 29, 2008 at 11:59 PM