Configuration

Authentication/Authorization

The Consumer will need to determine user authentication and authorization and instantiate the proper Repository OSID impl, passing it the proper configuration parameters through OsidContext. An interface needs to be developed so that the Consumer can ask a "black box" (the code implementing the interface) what a given user has access to.

There are still some open issues for this authorization interface:

  • What data is required to determine authorization?
    • We should document this for IU and Michigan and then ask our partner institutions what pieces of data they might need to handle this.
  • What will the black box return?
    • An instantiated Repository object ready to be used for searching.
    • An object encapsulating what set of databases this user has access to.
    • Both? The interface can have multiple methods that return these different pieces of information.
  • Where will the black box live?
    • In the OSID?
    • In the Consumer?
    • None of the above?
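To make the "Both?" option above more concrete, here is a minimal sketch of what such an interface could look like. Every name in it (SearchAuthorization, RepositoryHandle, the method names, and the use of a plain user id as the input) is an assumption made for illustration; the real interface would return an actual OSID Repository and take whatever data the open issues above settle on.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a Repository; the real type would come from the OSID API. */
interface RepositoryHandle {
    String getDisplayName();
}

/** Hypothetical black box contract exposing both candidate return values. */
interface SearchAuthorization {
    /** Ids of the databases this user may search. */
    Set<String> getAccessibleDatabaseIds(String userId);

    /** A repository ready to be used for searching on behalf of this user. */
    RepositoryHandle getRepository(String userId);
}

/** Toy in-memory implementation, only to show the contract in use. */
class InMemorySearchAuthorization implements SearchAuthorization {
    private final Map<String, Set<String>> grants = new HashMap<>();

    /** Records that the given user may search the given database. */
    void grant(String userId, String databaseId) {
        grants.computeIfAbsent(userId, k -> new TreeSet<>()).add(databaseId);
    }

    @Override
    public Set<String> getAccessibleDatabaseIds(String userId) {
        Set<String> ids = grants.get(userId);
        if (ids == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(ids);
    }

    @Override
    public RepositoryHandle getRepository(String userId) {
        final int count = getAccessibleDatabaseIds(userId).size();
        return () -> "Repository for " + userId + " (" + count + " databases)";
    }

    public static void main(String[] args) {
        InMemorySearchAuthorization auth = new InMemorySearchAuthorization();
        auth.grant("alice", "db_a");
        System.out.println(auth.getAccessibleDatabaseIds("alice"));
        System.out.println(auth.getRepository("alice").getDisplayName());
    }
}
```

Keeping the two questions as separate methods means a Consumer that only needs the database set never has to instantiate a Repository. Where the implementation lives (OSID, Consumer, or elsewhere) is left open by this sketch.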

Search Categories/Databases

Search categories and databases will be defined in a standard XML format. Following is the structure of this XML format:

View a sample configuration file

<config>
  <categories>
    <category name="name" id="id" default="default">  <!-- top-level default (auto-selected) category -->
      <category_description>Optional category description</category_description>
      <category_databases>
        <category_database recommended="recommended">  <!-- auto-selected -->
          <id>database_id</id>
          <alt_name>Optional alternate database name for only this category</alt_name>
          <alt_description>Optional alternate database description for only this category</alt_description>
        </category_database>
        <category_database recommended="recommended">  <!-- auto-selected -->
          <id>database_id</id>
          <alt_name>Optional alternate database name for only this category</alt_name>
          <alt_description>Optional alternate database description for only this category</alt_description>
        </category_database>
        <category_database>
          <id>database_id</id>
          <alt_name>Optional alternate database name for only this category</alt_name>
          <alt_description>Optional alternate database description for only this category</alt_description>
        </category_database>
      </category_databases>
    </category>
    <category name="name" id="id">  <!-- top-level, non-default -->
      <category_description>Optional category description</category_description>
      <category name="name" id="id">  <!-- sub-category -->
        <category_description>Optional category description</category_description>
        <category_databases>  <!-- sub-category databases -->
          <category_database recommended="recommended"> <!-- auto-selected -->
             <id>database_id</id>
             <alt_name>Optional alternate database name for only this category</alt_name>
             <alt_description>Optional alternate database description for only this category</alt_description>
          </category_database>  
          <category_database> 
             <id>database_id</id>
             <alt_name>Optional alternate database name for only this category</alt_name>
             <alt_description>Optional alternate database description for only this category</alt_description>
          </category_database>  
          <category_database recommended="recommended"> <!-- auto-selected -->
             <id>database_id</id>
             <alt_name>Optional alternate database name for only this category</alt_name>
             <alt_description>Optional alternate database description for only this category</alt_description>
          </category_database>  
        </category_databases>
      </category>
    </category>
  </categories>

  <databases>
    <database name="name" id="id">
      <database_description>description</database_description>
      <database_group>UMICH-AnnArbor</database_group>  <!-- institution-specific groups -->
      <database_group>UMICH-Dearborn</database_group>
    </database>
    <database name="name" id="id">
      <database_description>description</database_description>
      <database_group>UMICH-Law</database_group>
      <database_group>UMICH-Business</database_group>
    </database>
    <!-- More databases here... -->
  </databases>
</config>

  • A category can contain either databases or sub-categories, not both.
  • There can be only one top-level default category. This category cannot contain any sub-categories and must contain at least one database.
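The two rules above are not expressed by the config structure itself, so a loader would probably need to check them separately. The following is a rough sketch of such a check using the JDK's DOM parser. The class and method names are hypothetical, and it reads "only one top-level default category" as "at most one", which is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Checks the two structural rules for categories; names here are illustrative. */
class CategoryRuleChecker {

    /** Returns a list of rule violations; an empty list means both rules hold. */
    static List<String> check(String configXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(configXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("config could not be parsed", e);
        }
        List<String> problems = new ArrayList<>();
        int defaults = 0;
        for (Element categories : children(doc.getDocumentElement(), "categories")) {
            for (Element top : children(categories, "category")) {
                if (top.hasAttribute("default")) {
                    defaults++;
                    if (!children(top, "category").isEmpty()) {
                        problems.add("default category " + top.getAttribute("id") + " has sub-categories");
                    }
                    if (countDatabases(top) == 0) {
                        problems.add("default category " + top.getAttribute("id") + " has no databases");
                    }
                }
                checkNotMixed(top, problems);
            }
        }
        if (defaults > 1) {
            problems.add("more than one top-level default category");
        }
        return problems;
    }

    /** A category holds databases or sub-categories, never both (checked recursively). */
    private static void checkNotMixed(Element category, List<String> problems) {
        List<Element> subs = children(category, "category");
        if (!subs.isEmpty() && countDatabases(category) > 0) {
            problems.add("category " + category.getAttribute("id") + " mixes sub-categories and databases");
        }
        for (Element sub : subs) {
            checkNotMixed(sub, problems);
        }
    }

    /** Number of category_database entries directly under this category. */
    private static int countDatabases(Element category) {
        int count = 0;
        for (Element holder : children(category, "category_databases")) {
            count += children(holder, "category_database").size();
        }
        return count;
    }

    /** Direct child elements of parent with the given tag name. */
    private static List<Element> children(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(((Element) n).getTagName())) {
                result.add((Element) n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<config><categories>"
                + "<category name='Art' id='1' default='default'/>"
                + "</categories></config>";
        System.out.println(check(xml));
    }
}
```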

The following XML schemas have been inferred programmatically from the above document using Trang. Corrections are welcome.

Here is a schema in RELAX NG (compact syntax):

default namespace = ""

start =
  element config {
    element categories { category+ },
    element databases {
      element database {
        attribute id { xsd:NCName },
        attribute name { text },
        element database_description { text },
        element database_group { xsd:NCName }
      }+
    }
  }
category =
  element category {
    attribute default { xsd:NCName }?,
    attribute id { xsd:integer },
    attribute name { text },
    ((category
      | element category_description { text })+,
     element category_databases {
       element category_database {
         attribute recommended { xsd:NCName }?,
         element id { xsd:NCName }
       }+
     })?
  }

And in the full RELAX NG XML syntax:

<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="config">
      <element name="categories">
        <oneOrMore>
          <ref name="category"/>
        </oneOrMore>
      </element>
      <element name="databases">
        <oneOrMore>
          <element name="database">
            <attribute name="id">
              <data type="NCName"/>
            </attribute>
            <attribute name="name"/>
            <element name="database_description">
              <text/>
            </element>
            <element name="database_group">
              <data type="NCName"/>
            </element>
          </element>
        </oneOrMore>
      </element>
    </element>
  </start>
  <define name="category">
    <element name="category">
      <optional>
        <attribute name="default">
          <data type="NCName"/>
        </attribute>
      </optional>
      <attribute name="id">
        <data type="integer"/>
      </attribute>
      <attribute name="name"/>
      <optional>
        <oneOrMore>
          <choice>
            <ref name="category"/>
            <element name="category_description">
              <text/>
            </element>
          </choice>
        </oneOrMore>
        <element name="category_databases">
          <oneOrMore>
            <element name="category_database">
              <optional>
                <attribute name="recommended">
                  <data type="NCName"/>
                </attribute>
              </optional>
              <element name="id">
                <data type="NCName"/>
              </element>
            </element>
          </oneOrMore>
        </element>
      </optional>
    </element>
  </define>
</grammar>

And in W3C XML Schema:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="config">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="categories"/>
        <xs:element ref="databases"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="categories">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="category"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="databases">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="database"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="database">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="database_description"/>
        <xs:element ref="database_group"/>
      </xs:sequence>
      <xs:attribute name="id" use="required" type="xs:NCName"/>
      <xs:attribute name="name" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="database_description" type="xs:string"/>
  <xs:element name="database_group" type="xs:NCName"/>
  <xs:element name="category">
    <xs:complexType>
      <xs:sequence minOccurs="0">
        <xs:choice maxOccurs="unbounded">
          <xs:element ref="category"/>
          <xs:element ref="category_description"/>
        </xs:choice>
        <xs:element ref="category_databases"/>
      </xs:sequence>
      <xs:attribute name="default" type="xs:NCName"/>
      <xs:attribute name="id" use="required" type="xs:integer"/>
      <xs:attribute name="name" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="category_description" type="xs:string"/>
  <xs:element name="category_databases">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="category_database"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="category_database">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="id"/>
      </xs:sequence>
      <xs:attribute name="recommended" type="xs:NCName"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="id" type="xs:NCName"/>
</xs:schema>

Or DTD:

<?xml encoding="UTF-8"?>

<!ELEMENT config (categories,databases)>
<!ATTLIST config
  xmlns CDATA #FIXED ''>

<!ELEMENT categories (category)+>
<!ATTLIST categories
  xmlns CDATA #FIXED ''>

<!ELEMENT databases (database)+>
<!ATTLIST databases
  xmlns CDATA #FIXED ''>

<!ELEMENT database (database_description,database_group)>
<!ATTLIST database
  xmlns CDATA #FIXED ''
  id NMTOKEN #REQUIRED
  name CDATA #REQUIRED>

<!ELEMENT database_description (#PCDATA)>
<!ATTLIST database_description
  xmlns CDATA #FIXED ''>

<!ELEMENT database_group (#PCDATA)>
<!ATTLIST database_group
  xmlns CDATA #FIXED ''>

<!ELEMENT category ((category|category_description)+,
                    category_databases)?>
<!ATTLIST category
  xmlns CDATA #FIXED ''
  default NMTOKEN #IMPLIED
  id CDATA #REQUIRED
  name CDATA #REQUIRED>

<!ELEMENT category_description (#PCDATA)>
<!ATTLIST category_description
  xmlns CDATA #FIXED ''>

<!ELEMENT category_databases (category_database)+>
<!ATTLIST category_databases
  xmlns CDATA #FIXED ''>

<!ELEMENT category_database (id)>
<!ATTLIST category_database
  xmlns CDATA #FIXED ''
  recommended NMTOKEN #IMPLIED>

<!ELEMENT id (#PCDATA)>
<!ATTLIST id
  xmlns CDATA #FIXED ''>

TODO: Define XSLT for transforming Metalib X-Server output into our config format.

There are still some open issues relating to search categories:

  • Should the category/database hierarchy be built in the Consumer, with only the selected databases passed to the OSID? Or should the XML file be passed to the OSID, so that the Consumer must ask the OSID for the category/database hierarchy?
    • Keeping this structure in the Consumer is advantageous because it can be easily modified over time (if the Consumer chooses to reorganize categories/databases) and it avoids redundancy (otherwise the OSID and the Consumer would both build the same hierarchy from something the Consumer already knows).
    • Keeping the structure in the black box gives the Consumer an opportunity to draw from multiple locations (e.g., if people can create personal/customized hierarchies or sets of databases).
    • Passing the XML to the OSID is advantageous because the category/database hierarchy is encapsulated as part of the OSID.

Configuration Process

Option 1: "Dumb" OSID

See the attached sequence diagram. Note that this diagram does not properly show the iterative/recursive process of building/getting the database hierarchy.

  1. User initiates a search (e.g., clicks on "Search Library Resources")
  2. The Consumer instantiates a black box by passing it this user's credentials.
  3. The black box constructor:
    1. checks the user's passed-in credentials (this could be very different for different institutions, e.g., Aleph X-Server at UM),
    2. determines which Repository OSID metasearch impl this user has access to,
    3. grabs the XML describing all resources available through this metasearch engine,
    4. filters the XML and builds a search hierarchy consisting of what this user has access to.
  4. The Consumer now asks the black box for a Repository OSID object.
  5. The black box gets the proper Repository OSID and returns it to the Consumer.
  6. The Consumer now has a Repository, but needs databases to search. The Consumer asks the black box for the database hierarchy this user has access to.
  7. The black box returns the top level of the hierarchy it built upon instantiation.
  8. The Consumer iterates through the top level of the hierarchy and displays the databases to the user in a search form.
  9. The user selects certain databases and runs a search.
  10. The Consumer gets the search request and double-checks that the selected databases are indeed within the set of databases that this user has access to by consulting the black box.
  11. If the databases are indeed accessible by this user, the Consumer runs the getAssetsBySearch Repository method.
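The Option 1 steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the hierarchy is flattened to a set of database ids, the credential check is reduced to a single group name matched against each database's database_group values, and the actual getAssetsBySearch call is replaced by a stub. Option1BlackBox and Option1Consumer are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical Option 1 black box; the hierarchy is flattened to database ids for brevity. */
class Option1BlackBox {
    private final Set<String> accessibleIds;

    /**
     * Steps 2-3: the credential check is reduced to a group name, and filtering keeps
     * every database whose groups include it (an assumed use of database_group).
     */
    Option1BlackBox(String userGroup, Map<String, Set<String>> groupsByDatabaseId) {
        Set<String> ids = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : groupsByDatabaseId.entrySet()) {
            if (entry.getValue().contains(userGroup)) {
                ids.add(entry.getKey());
            }
        }
        accessibleIds = Collections.unmodifiableSet(ids);
    }

    /** Steps 6-7: what the Consumer displays in the search form. */
    Set<String> getDatabaseIds() {
        return accessibleIds;
    }

    /** Step 10: true only if every selected database is accessible to this user. */
    boolean maySearch(Collection<String> selectedIds) {
        return accessibleIds.containsAll(selectedIds);
    }
}

class Option1Consumer {
    /** Steps 10-11: re-check the selection, then search (the search itself is stubbed). */
    static List<String> search(Option1BlackBox box, List<String> selectedIds, String query) {
        if (!box.maySearch(selectedIds)) {
            throw new SecurityException("selection includes a database this user cannot access");
        }
        // A real Consumer would call the Repository's getAssetsBySearch method here.
        List<String> results = new ArrayList<>();
        for (String id : selectedIds) {
            results.add(id + ": results for \"" + query + "\"");
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> groups = new HashMap<>();
        groups.put("db1", new HashSet<>(Collections.singleton("UMICH-AnnArbor")));
        groups.put("db2", new HashSet<>(Collections.singleton("UMICH-Law")));
        Option1BlackBox box = new Option1BlackBox("UMICH-AnnArbor", groups);
        System.out.println(search(box, new ArrayList<>(box.getDatabaseIds()), "owls"));
    }
}
```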

Option 2: "Smart" OSID

  1. User initiates a search (e.g., clicks on "Search Library Resources")
  2. The Consumer asks the black box which metasearch engine and resources this user has access to.
  3. The black box checks the user's passed-in credentials and returns an instantiated Repository OSID object that encapsulates the proper metasearch OSID impl, configured with all resources this user has access to:
    1. The black box first inspects the passed-in credentials themselves (e.g., a guest account in Sakai may automatically have access to only a small number of resources). If the credentials alone do not determine what resources this user has access to, the black box checks them against its authorization service (this could be very different for different institutions, e.g., Aleph X-Server at UM).
    2. Once the black box determines which resources this user has access to, it filters the "catch-all" search category XML for the proper metasearch engine and passes it on to the Repository OSID.
    3. The Repository OSID then builds the category/database hierarchy by parsing the XML (see diagram).
  4. The Consumer now has a Repository, but needs databases to search. The Consumer asks the Repository for the category/database hierarchy this user has access to (see diagram).
  5. With the returned categories, the Consumer builds a hierarchy of categories/databases and displays it to the user in a search form.
  6. The user selects certain databases and initiates a search.
  7. The Consumer gets the search request and double-checks that the selected databases are indeed within the set of databases that this user has access to by using the Repository OSID call, getAsset(Id assetId).
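In Option 2 the access check in the last step goes through the Repository rather than the black box. A minimal sketch of that idea follows, using a hypothetical stand-in class instead of the real Repository and plain strings instead of OSID Id objects. The choice to signal an inaccessible database with an exception is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a Repository already configured with one user's resources. */
class ConfiguredRepositoryStub {
    private final Map<String, String> namesById = new LinkedHashMap<>();

    /** Step 3: the black box hands over only the databases this user may search. */
    ConfiguredRepositoryStub(Map<String, String> accessibleDatabases) {
        namesById.putAll(accessibleDatabases);
    }

    /** Step 4: the Consumer asks the Repository what the user can search. */
    List<String> getDatabaseIds() {
        return new ArrayList<>(namesById.keySet());
    }

    /** Loosely modeled on getAsset(Id assetId): ids outside the user's set are rejected. */
    String getAsset(String assetId) {
        String name = namesById.get(assetId);
        if (name == null) {
            throw new NoSuchElementException("unknown or inaccessible database: " + assetId);
        }
        return name;
    }
}

class Option2Consumer {
    /** Step 7: double-check each selected database through getAsset before searching. */
    static boolean selectionAllowed(ConfiguredRepositoryStub repository, List<String> selectedIds) {
        for (String id : selectedIds) {
            try {
                repository.getAsset(id);
            } catch (NoSuchElementException e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> accessible = new LinkedHashMap<>();
        accessible.put("db1", "Example Database One");
        ConfiguredRepositoryStub repository = new ConfiguredRepositoryStub(accessible);
        System.out.println(selectionAllowed(repository, Arrays.asList("db1")));
        System.out.println(selectionAllowed(repository, Arrays.asList("db1", "db9")));
    }
}
```

Compared with Option 1, the Consumer here never sees the black box's filtered structure directly; everything it learns about access comes from the Repository it was handed.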

Common features

Some institutions (like IU) have multiple OpenURL link resolvers available. It might be useful for the black box to provide the base URL of the appropriate link resolver, in addition to the functionality described above.

Some basic configuration parameters that the black box should pass in through OsidContext upon RepositoryManager instantiation are:

  • metasearch username
  • metasearch password
  • metasearch base URL

Currently these parameters are passed as part of the getAssetsBySearch Repository method. Once they are supplied through OsidContext, they would no longer need to be there.
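As a rough illustration of moving these three parameters into the context, the sketch below uses a small stand-in for OsidContext rather than the real class. The assignContext/getContext method names follow the OSID v2 OsidContext as best understood and should be checked against the actual API; the context keys and the MetasearchManagerStub class are hypothetical.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Minimal stand-in for an OsidContext-like key/value holder (not the real class). */
class ContextStub {
    private final Map<String, Serializable> values = new HashMap<>();

    void assignContext(String key, Serializable value) {
        values.put(key, value);
    }

    Serializable getContext(String key) {
        return values.get(key);
    }
}

/** Hypothetical manager that reads its metasearch settings once, at instantiation time. */
class MetasearchManagerStub {
    // Hypothetical context keys; the real key names would need to be agreed on.
    static final String USERNAME = "metasearch.username";
    static final String PASSWORD = "metasearch.password";
    static final String BASE_URL = "metasearch.baseUrl";

    private String username;
    private String password;
    private String baseUrl;

    /** Pulls the three parameters out of the context, failing fast if one is missing. */
    void assignOsidContext(ContextStub context) {
        username = required(context, USERNAME);
        password = required(context, PASSWORD);
        baseUrl = required(context, BASE_URL);
    }

    private static String required(ContextStub context, String key) {
        Serializable value = context.getContext(key);
        if (!(value instanceof String) || ((String) value).isEmpty()) {
            throw new IllegalStateException("missing context value: " + key);
        }
        return (String) value;
    }

    /** Summary without the password, e.g. for logging. */
    String describe() {
        return username + " @ " + baseUrl;
    }

    public static void main(String[] args) {
        ContextStub context = new ContextStub();
        context.assignContext(USERNAME, "libuser");
        context.assignContext(PASSWORD, "secret");
        context.assignContext(BASE_URL, "http://metasearch.example.edu/X");
        MetasearchManagerStub manager = new MetasearchManagerStub();
        manager.assignOsidContext(context);
        System.out.println(manager.describe());
    }
}
```

With the settings fixed at instantiation, a search call would only need to carry the query and the selected databases.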