Search Management
Repository OSID API
Javadoc can be found here: Repository OSID API
A quick overview of the issue
The OSID Repository specification, when applied to our application, has some holes that we're trying to figure out how to fill. Generally speaking, the consumer (e.g., Sakai) has no way to get information from the producer (the OSID implementation) about the current state of the search.
In reality, the consumer might want to know quite a few things, and we need to deal with some edge cases. To wit:
- How many results are potentially returnable? In our context, this is the "8,473 matches found" number. This is the (theoretical) number of times you can call nextAsset before hasNextAsset returns false.
- What's really going on if something fails? There are several possibilities.
  - The search just plain failed, because things on the other end are down.
  - The search failed because you've timed out, or your login information is bad/stale.
  - The search went ok, but we can't get the next asset because the network died.
  - The search went ok, and we'll be happy to give you the next asset if you just wait a few seconds for us to fetch the next batch from wherever it is we're fetching them (the asynchronous problem).
So...the client will likely want to know:
- What, if any, errors have been thrown and what do they mean?
- How many results are potentially returnable?
- How many results are available to me right now?
All these data are crucial to the creation of a search client that can deal with asynchronous searches, but the OSID gives us little opening to pass this information from the producer to the consumer.
Some things we need to think about as we do this:
- What are the semantics of the hasNextAsset method? True could mean "I know I'm supposed to have at least one more to give to you" or "I have one to give to you right now." False could mean "You've asked for all the assets the underlying system promised I could get" or "I can't give you one right now, but maybe later..."
- What are the semantics and exception-throwing behavior of the nextAsset method? When does it throw an exception – only when something is wrong, or also when we need to wait for more results? Do we absolutely require a call to hasNextAsset before every nextAsset, or will nextAsset fail gracefully (throw an exception) if you ask for an asset past the end of the iterator?
- What is the full list of possible desired exceptions to be thrown? How can we map these onto existing OSID exceptions/return values?
Possible approaches
Following are options for managing an asynchronous search.
Option 1 : AssetIterator only
Run an asynchronous search with minimal search properties and return an empty AssetIterator from getAssetsBySearch(). To get Assets, the Consumer would use AssetIterator methods hasNextAsset() and nextAsset(). The pageSize search property is used to maintain a certain number of Asset objects in an AssetIterator.
Out of Band Agreements Required
- Search Properties
  - guid :: the key to the user's session state
  - sortBy :: selected sort method (rank, title, date, etc.)
  - searchSourceIds :: identifiers for search sources to be searched
  - pageSize :: how many records to display per page
- AssetIterator method Exceptions and behavior
  - hasNextAsset() returns true if the AssetIterator cursor position is less than the total number of records found (even though that many results may not yet have been fetched and added to the AssetIterator) and false if the AssetIterator cursor position is at the total number of records found. hasNextAsset() does not throw any "out-of-band" exceptions.
  - nextAsset() returns the next Asset if it is presently in the AssetIterator. If the next Asset is not in the AssetIterator, but hasNextAsset() resolves to true, nextAsset() throws a RepositoryException with the OPERATION_FAILED message. This indicates that the Asset is available but has not yet been fetched. If hasNextAsset() resolves to false, nextAsset() returns null.
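The contract above could be sketched on the Provider side roughly as follows. This is only an illustration under stated assumptions: SketchRepositoryException and PendingAssetIterator are hypothetical stand-ins (not OSID classes), Assets are modelled as plain Strings, and the OPERATION_FAILED text is a placeholder rather than the real constant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the OSID RepositoryException, so the sketch
// compiles without the OSID jars. The message text is a placeholder.
class SketchRepositoryException extends Exception {
    static final String OPERATION_FAILED = "Operation failed";

    SketchRepositoryException(String message) {
        super(message);
    }
}

// Provider-side iterator following the Option 1 contract. Assets are plain
// Strings here; a real implementation would hold Asset objects.
class PendingAssetIterator {
    private final List<String> fetched = new ArrayList<>();
    private final int totalFound; // the "8,473 matches found" number
    private int cursor = 0;

    PendingAssetIterator(int totalFound) {
        this.totalFound = totalFound;
    }

    // Called by the provider's background fetch as each record arrives.
    synchronized void addFetched(String asset) {
        fetched.add(asset);
    }

    // True while the cursor is short of the total found, fetched or not.
    synchronized boolean hasNextAsset() {
        return cursor < totalFound;
    }

    synchronized String nextAsset() throws SketchRepositoryException {
        if (cursor < fetched.size()) {
            return fetched.get(cursor++); // already fetched: hand it over
        }
        if (cursor < totalFound) {
            // Promised by the search, but not fetched yet: ask the caller to wait.
            throw new SketchRepositoryException(SketchRepositoryException.OPERATION_FAILED);
        }
        return null; // past the end of the result set
    }
}
```

Note that in this sketch hasNextAsset() answers "is more promised?", not "is more available right now?", which is the first of the two readings discussed earlier.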
How Consumer gets an Asset
- Call getAssetsBySearch() with the appropriate search properties.
- Call hasNextAsset() on the returned AssetIterator:
  - if true - call nextAsset():
    - if RepositoryException::OPERATION_FAILED - wait for some time and restart the process by calling hasNextAsset().
    - if null - no more Assets left.
    - else - Consumer gets an Asset.
  - else - no more Assets left.
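The Consumer steps above might look something like the loop below. It is a minimal sketch, not the agreed design: AssetCursor and NotYetFetchedException are hypothetical stand-ins for AssetIterator and for a RepositoryException carrying OPERATION_FAILED, Assets are plain Strings, and the maxWaits cap and waitMillis delay are assumptions added so that the loop cannot spin forever.

```java
import java.util.ArrayList;
import java.util.List;

class Option1Consumer {
    // Hypothetical stand-in for RepositoryException with OPERATION_FAILED.
    static class NotYetFetchedException extends Exception {
    }

    // Hypothetical stand-in for the two AssetIterator methods used here.
    interface AssetCursor {
        boolean hasNextAsset();

        String nextAsset() throws NotYetFetchedException;
    }

    // Collects every Asset the cursor will give us, pausing whenever the
    // provider reports that the next Asset has not been fetched yet.
    static List<String> collect(AssetCursor cursor, int maxWaits, long waitMillis)
            throws InterruptedException {
        List<String> assets = new ArrayList<>();
        int waits = 0;
        while (cursor.hasNextAsset()) {
            try {
                String asset = cursor.nextAsset();
                if (asset == null) {
                    break; // no more Assets left
                }
                assets.add(asset);
                waits = 0; // progress was made, so reset the wait counter
            } catch (NotYetFetchedException e) {
                if (++waits > maxWaits) {
                    break; // assumption: give up after too many consecutive waits
                }
                Thread.sleep(waitMillis); // wait, then restart at hasNextAsset()
            }
        }
        return assets;
    }
}
```

Because the Consumer has to keep the returned records itself, this loop also illustrates the paging burden listed under Disadvantages.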
Advantages
- Relatively simple for the Consumer and Provider.
Disadvantages
- Consumer gets no search status information - i.e. how many records have been found? How many records have been fetched? Has an error or timeout occurred?
- Consumer is forced to handle paging (saving the records returned).
Option 2 : Search Status Assets from getAssetsBySearch()
Run an asynchronous search or get status on an already running search by specifying the getStatus flag search property. If getStatus is set, getAssetsBySearch() will return the status of the Consumer's running search in an Asset of Type SearchStatus. If no search is running, getAssetsBySearch() will return null or throw a RepositoryException. If getStatus is not set, getAssetsBySearch() will initiate a search.
Out of Band Agreements Required
- Search Properties
  - guid :: the key to the user's session state
  - sortBy :: selected sort method (rank, title, date, etc.)
  - searchSourceIds :: identifiers for search sources to be searched
  - pageSize :: how many records to display per page
  - startRecord :: starting record to display
  - numRecordsToDisplay :: number of records to display - combined with startRecord, the Provider gets the range of records to get (i.e. records 12-54).
  - getStatus :: if not set, initiate a search; if set, get status on a running search.
- SearchStatus Asset Type
  - basically a Map of fields providing status information on an asynchronous search.
  - Fields could be:
    - databaseName :: name of a database being searched
    - status :: status notification (searching, fetching, ready, error, timeout, etc) for a given database
    - numRecordsFound :: number of records found for a given database
    - numRecordsFetched :: number of records fetched for a given database
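To make the "Map of fields" idea concrete, here is one possible shape for the status of a single database. It is a sketch only: the class and method names are hypothetical, the Status enum covers just the values named above (the original list ends in "etc", so it is not exhaustive), and "example-db" in the usage note is an invented name.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class SearchStatusSketch {
    // Only the status values named in the text; the real set is open-ended.
    enum Status { SEARCHING, FETCHING, READY, ERROR, TIMEOUT }

    // Builds the status fields for one database as a plain Map.
    static Map<String, Object> statusFor(String databaseName, Status status,
                                         int numRecordsFound, int numRecordsFetched) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("databaseName", databaseName);
        fields.put("status", status.name().toLowerCase(Locale.ROOT));
        fields.put("numRecordsFound", numRecordsFound);
        fields.put("numRecordsFetched", numRecordsFetched);
        return fields;
    }
}
```

For example, SearchStatusSketch.statusFor("example-db", SearchStatusSketch.Status.FETCHING, 8473, 20) would describe a database that has reported 8,473 matches, 20 of which have been fetched so far.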
How Consumer gets an Asset
- Call getAssetsBySearch() with getStatus not set - this initiates the search.
- Call getAssetsBySearch() with getStatus set:
  - inspect the returned SearchStatus: if status is "ready" for any database, call getAssetsBySearch() with getStatus not set.
    - This returns an AssetIterator with the number of records specified through search properties.
    - The Consumer then calls (optionally) AssetIterator.hasNextAsset() followed by AssetIterator.nextAsset() to get an Asset.
  - inspect the returned SearchStatus: if status is "searching" or "fetching" for any database, you must wait for some time and try the get status process again to retrieve Assets from these databases.
  - inspect the returned SearchStatus: if status is "error" or "timeout" for any database, the search has completed with an error and there will be no results from that database.
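The status inspection above boils down to a per-database decision, which could be sketched as below. The Action enum, the class name, and the choice to reject unknown status strings are assumptions made for illustration; only the five status values and what they imply come from the steps above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Option2Polling {
    // Hypothetical names for the three outcomes described in the steps.
    enum Action { FETCH_RESULTS, WAIT_AND_RETRY, NO_RESULTS }

    // Maps one database's status string to what the Consumer should do next.
    static Action actionFor(String status) {
        switch (status) {
            case "ready":
                return Action.FETCH_RESULTS; // call getAssetsBySearch() without getStatus
            case "searching":
            case "fetching":
                return Action.WAIT_AND_RETRY; // poll the status again later
            case "error":
            case "timeout":
                return Action.NO_RESULTS; // nothing will come from this database
            default:
                // Assumption: fail loudly on statuses not covered by the agreement.
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    // Applies actionFor() to every database in a status report.
    static Map<String, Action> plan(Map<String, String> statusByDatabase) {
        Map<String, Action> actions = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : statusByDatabase.entrySet()) {
            actions.put(entry.getKey(), actionFor(entry.getValue()));
        }
        return actions;
    }
}
```

A Consumer would repeat the status call until no database maps to WAIT_AND_RETRY, fetching results for each database as soon as it becomes ready.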
Advantages
- Consumer has access to detailed search status information.
- Consumer can request specific records to display (i.e. 1-10, 12-34, 44-last, etc.).
Disadvantages
- The getAssetsBySearch() method becomes overloaded, providing both search result records (its primary purpose) and search status.
- The AssetIterator.hasNextAsset() method becomes somewhat useless.
Option 3 : Search Status Properties Type
This option would be exactly like Option 1 (AssetIterator only), with the addition of Search Status fields in the Repository Properties object. A search would be initiated by calling getAssetsBySearch() and an empty AssetIterator returned. Access to Assets would be controlled through AssetIterator methods hasNextAsset() and nextAsset(). The Provider would update the Search Status fields in the Repository's Properties object, which could then be inspected by the Consumer using Repository.getPropertiesByType(Type searchStatusPropertiesType).
Out of Band Agreements Required
- Search Properties
  - guid :: the key to the user's session state
  - sortBy :: selected sort method (rank, title, date, etc.)
  - searchSourceIds :: identifiers for search sources to be searched
  - pageSize :: how many records to display per page
  - startRecord :: starting record to display
- Search Status Properties Type
  - basically a Map of fields providing status information on an asynchronous search.
  - Fields could be:
    - databaseName :: name of a database being searched
    - status :: status notification (searching, fetching, ready, error, timeout, etc) for a given database
    - numRecordsFound :: number of records found for a given database
    - numRecordsFetched :: number of records fetched for a given database
How Consumer gets an Asset
- Call getAssetsBySearch() with the appropriate search properties.
- Call hasNextAsset() on the returned AssetIterator:
  - if true - call nextAsset():
    - if RepositoryException::OPERATION_FAILED - check Repository searchStatusProperties for search status.
      - proceed according to search status - see above.
    - if null - no more Assets left.
    - else - Consumer gets an Asset.
  - else - no more Assets left.
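One way to read the steps above is sketched below: when nextAsset() reports that the Asset is not ready, the Consumer consults the search status before deciding whether to wait. Everything named here is hypothetical. AssetCursor and PendingException stand in for AssetIterator and RepositoryException::OPERATION_FAILED, the Supplier stands in for reading the Search Status fields via Repository.getPropertiesByType(), and giving up only when every database reports "error" or "timeout" is just one interpretation of "proceed according to search status".

```java
import java.util.Map;
import java.util.function.Supplier;

class Option3Consumer {
    // Hypothetical stand-in for RepositoryException with OPERATION_FAILED.
    static class PendingException extends Exception {
    }

    // Hypothetical stand-in for the two AssetIterator methods used here.
    interface AssetCursor {
        boolean hasNextAsset();

        String nextAsset() throws PendingException;
    }

    // True when no database can still deliver results.
    static boolean allFailed(Map<String, String> statusByDatabase) {
        for (String status : statusByDatabase.values()) {
            if (!status.equals("error") && !status.equals("timeout")) {
                return false;
            }
        }
        return true;
    }

    // Returns the next Asset, or null when there is none (or none is coming).
    static String nextAssetOrNull(AssetCursor cursor,
                                  Supplier<Map<String, String>> statusLookup,
                                  int maxAttempts, long waitMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts && cursor.hasNextAsset(); attempt++) {
            try {
                return cursor.nextAsset(); // an Asset, or null at the end
            } catch (PendingException e) {
                if (allFailed(statusLookup.get())) {
                    return null; // every source ended in error or timeout
                }
                Thread.sleep(waitMillis); // still searching or fetching: wait
            }
        }
        return null; // end of results, or maxAttempts (an assumed cap) reached
    }
}
```

Unlike the Option 1 loop, the wait here is informed by the status fields, so the Consumer can stop early instead of retrying against a search that has already failed.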
Advantages
- Consumer has access to detailed search status information.
- The AssetIterator and the getAssetsBySearch() method remain "true" to their purpose.
Disadvantages
- Consumer is forced to handle paging (saving the records returned).