Persistence Implementation

The Search Service and the Search Index Builder use a persistence model to maintain the transactional state of the indexing operation. This is implemented as an anemic domain model using Hibernate. The model can be found at

https://source.sakaiproject.org/svn/search/trunk/search-model/src/java/org/sakaiproject/search/model/impl/search.hbm.xml

It contains two entities: a Search Builder Item and a Search Writer Lock.

Search Builder Item

A Search Builder Item is one of two types: a control item or an entity item. A control item drives a group operation in the Search Index Builder. An entity item generates a single indexing operation on an entity. Every item is in one of a number of states, indicating that it is pending processing, has completed processing, or is in an unknown state.
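As a rough illustration of the item types and states described above, the entity could be sketched in Java as follows. The class, enum, and field names here (BuilderItemSketch, ItemType, ItemState, pending) are hypothetical and chosen for this example; the real property names are defined in search.hbm.xml and may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names are assumptions, not the Sakai classes.
class BuilderItemSketch {

    // The two item types described in the text.
    enum ItemType { CONTROL, ENTITY }

    // The states described in the text.
    enum ItemState { PENDING, COMPLETED, UNKNOWN }

    // Anemic model: the item carries data and accessors, no behaviour.
    static final class Item {
        private final String name;
        private final ItemType type;
        private ItemState state;

        Item(String name, ItemType type, ItemState state) {
            this.name = name;
            this.type = type;
            this.state = state;
        }

        String getName() { return name; }
        ItemType getType() { return type; }
        ItemState getState() { return state; }
        void setState(ItemState state) { this.state = state; }
    }

    // Logic lives outside the entity: select the items still to be processed.
    static List<Item> pending(List<Item> items) {
        List<Item> result = new ArrayList<>();
        for (Item item : items) {
            if (item.getState() == ItemState.PENDING) {
                result.add(item);
            }
        }
        return result;
    }
}
```

Keeping the selection logic in a separate static method rather than on Item reflects the anemic style: the entity is a plain data holder and the builder code operates on it.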

Search Writer Lock

All Search Index Builders compete for a lock, held in the Search Writer Lock entity. This lock is updated in a transaction-safe way using optimistic locking. The entity records the nodename that currently holds the lock, and the lockkey identifies the Index Builder thread on that node that holds the lock. In initial testing we used a large number of threads per node to simulate a large cluster.

We use optimistic locking for two reasons.

  1. It avoids persistent locking of the object in the database that could require operator intervention. Where more than one thread on one or more nodes tries to update the lock, the optimistic locking will allow only the first of them to complete the update and grab the lock.
  2. It allows automatic unlocking when a non-responsive node has taken the lock but failed (for example, through a power outage) to release it. A configurable timeout causes a lock record that has not been released or updated within that period to be released by any of the other index builders in the competition. This timeout (currently 10 minutes) should be balanced against the batch size and the expected speed of indexing. If the cluster is to index extremely large documents (e.g. video streams), the timeout may need to be increased.
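The two points above can be sketched with an in-memory stand-in for the lock record. This is a minimal sketch, not the Sakai implementation: the class and method names (WriterLockSketch, LockRow, tryAcquire, release, holder) are hypothetical, and an AtomicReference compare-and-set is used here in place of the versioned database update that Hibernate would perform.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: an in-memory model of the lock record.
class WriterLockSketch {

    // Immutable snapshot of the lock record; nodeName == null means "free".
    static final class LockRow {
        final String nodeName;
        final String lockKey;
        final long version;          // bumped on every successful update
        final long lastUpdateMillis; // when the lock was last taken or released

        LockRow(String nodeName, String lockKey, long version, long lastUpdateMillis) {
            this.nodeName = nodeName;
            this.lockKey = lockKey;
            this.version = version;
            this.lastUpdateMillis = lastUpdateMillis;
        }
    }

    private final AtomicReference<LockRow> row =
            new AtomicReference<>(new LockRow(null, null, 0L, 0L));
    private final long timeoutMillis;

    WriterLockSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Take the lock if it is free, or if the current holder has timed out.
    // The compare-and-set succeeds only if nobody changed the record since
    // it was read, so only the first of several competitors wins.
    boolean tryAcquire(String nodeName, String lockKey, long nowMillis) {
        LockRow seen = row.get();
        boolean free = seen.nodeName == null;
        boolean stale = !free && nowMillis - seen.lastUpdateMillis >= timeoutMillis;
        if (!free && !stale) {
            return false;
        }
        LockRow next = new LockRow(nodeName, lockKey, seen.version + 1, nowMillis);
        return row.compareAndSet(seen, next);
    }

    // Release the lock; only the recorded holder may do so.
    boolean release(String nodeName, String lockKey, long nowMillis) {
        LockRow seen = row.get();
        if (!nodeName.equals(seen.nodeName) || !lockKey.equals(seen.lockKey)) {
            return false;
        }
        return row.compareAndSet(seen, new LockRow(null, null, seen.version + 1, nowMillis));
    }

    // Current holder as "nodename/lockkey", or null when the lock is free.
    String holder() {
        LockRow seen = row.get();
        return seen.nodeName == null ? null : seen.nodeName + "/" + seen.lockKey;
    }
}
```

With a 10 minute timeout (600000 ms), a second node calling tryAcquire while the first still holds a fresh lock is refused, but the same call made 10 minutes after the last update takes the lock over, which is the automatic unlocking described in point 2. Passing the clock value in as a parameter is a choice made for this sketch so that the timeout behaviour can be exercised without waiting.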

The locking mechanism has been tested with 50 threads indexing 1000 documents while threads were killed at random.