Sakai URL handling

URLs within Sakai

This page documents the various kinds of URLs which may be used to access resources within Sakai. Over time a number of these systems have proliferated, some entirely independent of each other, whilst others provide access within a related URL space. Each of the different URL spaces will be explained, together with commentary on the situations in which they are appropriate, and limitations on their use. After this documentation, there will be a section with recommendations for future development of URL spaces, in accordance with the overall technical goals for Sakai.

The previous documentation on this subject is the Word/PDF document Sakai Tool Destinations held within Sakai collab dated May 10, 2006, which contains some forward-looking statements and requirements which are related to those listed below.

What would we like URLs to look like?

I think that people expect them to look something like this (assuming you are at a University whose URL is www.university.edu):

Your Chemistry course:

www.sakai.university.edu/chemistry_101

Your Chemistry course discussion forums:

www.sakai.university.edu/chemistry_101/forums

A particular discussion topic so interesting that you email the URL to a friend:

www.sakai.university.edu/chemistry_101/forums/topic10003

  • they are short (and so won't get split up when emailed)
  • they are (fairly) readable, and reading them suggests pretty clearly what you're going to get by following that link
  • they are (fairly) easy to read out over the phone

Portal URLs (/portal)

The default portal used in most Sakai institutions is the Charon portal. Several other portals are available as replacements, including the very simple Mercury Portal, as well as the OSP Portal, and others maintained by individual institutions such as rSmart. When this document talks of "Portal URLs", it will primarily be referring to URLs as managed historically by the Charon portal, as well as the generalised forms they may take since the refactoring at Sakai 2.4 to include external PortalHandlers and some other URL schemes.

In the out of the box Sakai configuration, the Charon portal is mapped to the URL space beginning /portal, with the next path segment typically forming a core determiner for different URL subspaces. In the following sections we will look at a selection of these schemes.

Site/Page URLs

The most historically important Portal URLs in Sakai have been of the form /portal/site/[site-id] as general site URLs, and /portal/site/[site-id]/page/[page-id] for URLs representing a portal state showing a particular page within a site. Also possible are direct page URLs of the form /portal/page/[page-id]. In these cases, the square bracketed sections refer to particular ids of sites or pages, which in many cases are GUIDs of some form (although some sites and pages may have manually assigned IDs).

Whilst these URLs could be assembled by hand, the recommended strategy for generating a Site/Page URL in developer code is to make use of the getUrl() method on an instance of a SitePage object. Since SitePage is a form of Entity, this is actually a subcase of the general approach of acquiring a URL for an Entity. Only a few selected portal URLs can be generated by this method. For example, /portal/page/[page-id] URLs do not correspond to any Entity URL. This relationship will be commented on further below.
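To make the three URL shapes concrete, here is a minimal sketch of assembling them by hand. PortalPaths and its methods are hypothetical names invented for this illustration and are not part of Sakai; the Entity route described above remains the recommended one. The only non-obvious step is percent-encoding ids that contain spaces.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not a Sakai class): assembles site/page portal paths by hand. */
final class PortalPaths {
    private PortalPaths() {}

    /** Percent-encodes one path segment; URLEncoder emits '+' for a space, so map it to %20. */
    static String encodeSegment(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** /portal/site/[site-id] */
    static String site(String siteId) {
        return "/portal/site/" + encodeSegment(siteId);
    }

    /** /portal/site/[site-id]/page/[page-id] */
    static String sitePage(String siteId, String pageId) {
        return site(siteId) + "/page/" + encodeSegment(pageId);
    }

    /** /portal/page/[page-id] */
    static String page(String pageId) {
        return "/portal/page/" + encodeSegment(pageId);
    }
}
```

For a manually assigned site id such as "Test FlowTalk Sit", sitePage produces a path beginning /portal/site/Test%20FlowTalk%20Sit/page/.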

General Portal URLs

Since Sakai 2.4, the Skinnable Charon Portal has factored the portal URL space into a number of PortalHandlers. Whilst each handler can in theory use an arbitrary strategy to determine its space of handled URLs, in practice the great majority make use of the first path segment to determine their space, and use the convention that their Java name is derived from their recognised segment added to "Handler". The complete current set of Portal Handlers may be inspected in Sakai SVN in the Charon handlers directory. Some important handlers will now be discussed.
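The first-segment convention can be sketched as a small registry keyed on that segment. This is an illustration only: SegmentDispatcher, register and dispatch are hypothetical names, handlers are reduced to plain functions from the remaining path to a string, and the real PortalHandler interface is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of first-segment dispatch; the real portal code may differ. */
final class SegmentDispatcher {
    // Maps a first path segment (e.g. "site") to the handler owning that URL subspace.
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    void register(String segment, Function<String, String> handler) {
        handlers.put(segment, handler);
    }

    /** pathInfo is the part of the URL after the /portal mount point, e.g. "/site/abc". */
    String dispatch(String pathInfo) {
        String path = pathInfo == null ? "" : pathInfo;
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        int slash = trimmed.indexOf('/');
        String segment = slash < 0 ? trimmed : trimmed.substring(0, slash);
        String rest = slash < 0 ? "" : trimmed.substring(slash);
        Function<String, String> handler = handlers.get(segment);
        // No handler claims this segment: report it rather than guessing.
        return handler == null ? "unhandled:" + path : handler.apply(rest);
    }
}
```

A handler registered under "site" would then receive "/chem101/page/p1" for the portal path /site/chem101/page/p1.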

Tool URLs (/portal/tool)

Tool URLs are not top-level portal URLs, but instead represent the rendering of an individual tool. The rendering for these URLs is not physically performed within the handler, but is instead forwarded to a webapp elsewhere in Sakai. Charon by default (as of 2.5 release) is still an iframes-based portal, in that the aggregation method for the rendering of a particular tool in context places it in an iframe, whose URL is dispatched from the client rather than the server. Sakai also features the Experimental Frameless Portal which is gradually being verified for its capability to render all existing Sakai tools. Some will not be entirely compliant at this time, but it is hoped that a future version of Sakai could adopt this as the default. The frameless portal dispatches tool URLs from the server, but reprocesses the output.

Tool URLs are of the form /portal/tool/[placement-id]/[tool-URL-stub] where the [placement-id] is the placement id of a particular tool instance. This placement must represent a valid location within a site page, and is accessible from the getId() method of a Placement object. The [tool-URL-stub] is an arbitrary section of a URL, consisting of an optional pathInfo segment and query parameters, which will become visible to the hosting tool as extra context for the request. Some older (particularly JSF) tools do not make any good use of this information.
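As an illustration of this structure, the following sketch splits a tool URL into its placement id and stub. ToolUrl is a hypothetical class written for this page, not a Sakai API, and it assumes the placement id ends at the first "/" or "?" after the /portal/tool/ prefix.

```java
/** Hypothetical value class (not a Sakai API) holding the two parts of a tool URL. */
final class ToolUrl {
    private static final String PREFIX = "/portal/tool/";

    final String placementId;
    final String stub; // optional pathInfo plus query string; empty if absent

    private ToolUrl(String placementId, String stub) {
        this.placementId = placementId;
        this.stub = stub;
    }

    static ToolUrl parse(String url) {
        if (!url.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a tool URL: " + url);
        }
        String rest = url.substring(PREFIX.length());
        // Assumption: the placement id runs up to the first '/' or '?'.
        int cut = rest.length();
        for (int i = 0; i < rest.length(); i++) {
            char c = rest.charAt(i);
            if (c == '/' || c == '?') {
                cut = i;
                break;
            }
        }
        String placementId = rest.substring(0, cut);
        if (placementId.isEmpty()) {
            throw new IllegalArgumentException("missing placement id: " + url);
        }
        return new ToolUrl(placementId, rest.substring(cut));
    }
}
```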

Whilst Tool URLs are dispatched to individual Servlets, the "context" portion of the URL does not agree with that of the servlet to which it is dispatched, or really to any possible servlet. This is the primary reason that "off the shelf" presentation technologies do not work directly as Sakai tool implementations. The "Webapp Tool", also called the "JSP adaptor", is a special kind of Servlet that performs the task of redispatching /portal/tool requests within the target webapp to enable Sakai tools to be written using JSPs (and other largely unmodified technologies, such as Struts, SpringMVC, etc.). Strongly supported Sakai presentation technologies such as Velocity, JSF and RSF perform this work within the framework.

"Direct Tool" URLs (/portal/directtool)

The "direct tool" URL space is a compromise between the site/page and tool URL spaces. Other than the prefix directtool rather than tool, the direct tool URL space is physically identical to the tool URL space - however, rather than simply rendering an individual tool, a direct tool URL renders an entire portal state. Implicit in this rendering process is reverse lookup of the site and page that the tool lies in - as far as the portal is concerned, the state which is rendered is exactly as if the /site/page URL holding the tool had been issued to it. This form of URL is largely a convenience to developers to save them performing the lookup in code, as well as shortening and hence improving the URL space itself.
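Since the two URL spaces differ only in their prefix, converting one into the other is a plain string rewrite. The sketch below shows this; DirectToolUrls and toDirect are hypothetical names, and the reverse lookup of site and page performed by the portal is not modelled.

```java
/** Hypothetical helper (not a Sakai API): rewrites a tool URL into a direct tool URL. */
final class DirectToolUrls {
    private static final String TOOL = "/portal/tool/";
    private static final String DIRECT = "/portal/directtool/";

    private DirectToolUrls() {}

    static String toDirect(String toolUrl) {
        if (!toolUrl.startsWith(TOOL)) {
            throw new IllegalArgumentException("not a tool URL: " + toolUrl);
        }
        // Placement id and tool URL stub are carried over unchanged.
        return DIRECT + toolUrl.substring(TOOL.length());
    }
}
```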

"Portalised" Portal URLs

For complex cases where a portal URL state is required which represents the state of more than one tool, or where tools are required to be navigated to particular anchors within their iframes, /directtool URLs are not powerful enough. An API within the Sakai PortalService, encodeToolState, helps with assembling these URLs. These URLs essentially allow any URL-addressable selection of portal state to be addressed. Whilst they are complete, they are not very readable. As a schematic, these look like

https://server.com/portal/site/[site-id]?toolstate-[placement-id]=URLEncode(/modify_template_items?templateId=4489)

The actual top-level portal URL may be any "conventional" portal URL (in essence, the site/page URLs described above), but it may be followed by any number of attributes targeting particular tool placements which are visible on the page. The value of the attribute is the URLEncoded URL segment which is to be delivered to the implementing tool's Servlet when it handles the request. As a special implementation feature, this URL segment may include anchors (which will be stripped off during processing and placed on the rendered iframe URL itself) as well as any URL attributes forming part of the tool's URL space.

Here is a concrete sample:

http://localhost:38080/portal/site/Test%20FlowTalk%20Sit/page/d9fb8c19-e25d-4ed9-0089-b915ab3d329d?toolstate-ec08277e-6f86-4da9-00cd-288fe83666fb=%2Fforum%3FtargetID%3Dy1h6zmWjcng4JJdEUTC5I81e%26viewtype%3Dstandard%23BPr8gTvocgmeUZ3GYrlgcr99
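The encoding step can be sketched in a few lines of plain Java. This is not the PortalService encodeToolState implementation, whose signature is not given on this page; ToolStateUrls, withToolState and splitAnchor are hypothetical names, and only the toolstate-[placement-id] attribute convention is taken from the text above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not the Sakai PortalService): builds "portalised" URLs by hand. */
final class ToolStateUrls {
    private ToolStateUrls() {}

    /** Appends toolstate-[placement-id]=URLEncode(toolUrlSegment) to a portal URL. */
    static String withToolState(String portalUrl, String placementId, String toolUrlSegment) {
        // Use '&' if the portal URL already carries attributes, so several placements can be targeted.
        String separator = portalUrl.contains("?") ? "&" : "?";
        return portalUrl + separator + "toolstate-" + placementId + "="
                + URLEncoder.encode(toolUrlSegment, StandardCharsets.UTF_8);
    }

    /** Splits a decoded segment into { url without anchor, anchor }; the anchor is "" if absent. */
    static String[] splitAnchor(String toolUrlSegment) {
        int hash = toolUrlSegment.indexOf('#');
        if (hash < 0) {
            return new String[] { toolUrlSegment, "" };
        }
        return new String[] { toolUrlSegment.substring(0, hash), toolUrlSegment.substring(hash + 1) };
    }
}
```

Applied to the site/page URL, placement id and tool segment /forum?targetID=y1h6zmWjcng4JJdEUTC5I81e&viewtype=standard#BPr8gTvocgmeUZ3GYrlgcr99, withToolState reproduces the concrete sample shown above.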

These portalised URLs may be seen as an implementation of the "seed URLs" concept referenced in the "Sakai Tool Destinations" document linked above. They also bear a very close analogy with the top-level portal URLs constructed by traditional JSR-168 flat portals such as Pluto. Whilst providing good URL semantics and addressability, they are extremely bulky and unreadable, and would be better encoded using some form of "tinyURL" scheme before being presented to users.

These kinds of URLs are most suitable for external applications such as email notifications, where links are required which can restore and address any available Sakai portal state. They are currently not compatible with the use of the "Experimental Frameless Portal". However, for a frameless portal part of this requirement disappears since URL anchors can be supplied directly on the top-level URL.

Entity URLs

Each Sakai Entity is associated with a "natural URL" which is accessible via the getUrl() method on the Entity. The precise semantics of the URL space which is then achieved are not completely defined, but the general contract is that there will be found some form of HTML rendering of the entity which could be interpreted by the end user using a normal user agent (browser). Typically the URL space which is produced will differ, depending on the concrete type of the Entity in question.

Resource URLs (/access)

Entities which are of type Resource (in addition to a number of other core/legacy Entities which have opted into this system) are resolved into "Resource URLs" which begin with the top-level space /access. This space is hosted by the AccessServlet, a standard part of the Sakai deployment.

All resources which are stored within Sakai's ContentHosting system are hosted as part of this URL scheme. These are mapped into the space /access/content. The AccessServlet only exposes one HTTP method to backing code, HTTP GET (HTTP POST is only supported to process any required logon).

Hosting a resource URL (HttpAccess)

The end code backing the hosting of a /access resource provides access to it by implementing the HttpAccess interface. Control is passed to the object implementing HttpAccess by the AccessServlet when it receives a request for a matching entity reference. Since the HTTP environment handed to this object is so non-standard, implementors are typically held in the component layer of Sakai rather than in webapps, and since only HTTP GET is supported, the environment is typically read-only.

Access control for resources

Access control for resources served by the AccessServlet is performed by the HttpAccess instance hosting the resource - it signals that access should be denied by throwing an EntityPermissionException. If the user is not logged on, this will force a logon - if the user is logged on, it will deny access to the resource. For example, the back end for content URLs, hosted in BaseContentService.getHttpAccess(), simply performs a check for the current user to have the permission content.read for the site in which the resource is hosted.

Entity Broker URLs (/direct)

The Entity Broker is a backwards-compatible refinement and evolution of the historical Sakai Entity System. Every "Entity" managed by the Entity Broker is also a Sakai Entity and vice versa, but it is not necessary to obtain a concrete Entity object in order to manipulate, describe and reason about Entity Broker Entities. Although the Entity.getUrl() method will function for an Entity Broker Entity, the recommended scheme for acquiring the URL for an Entity Broker Entity is to call EntityBroker.getEntityUrl(String reference) with the String reference which represents the Entity.

These URLs, rather than being mapped to the /access space handled by the AccessServlet, are instead mapped to /direct, handled by the DirectServlet. The DirectServlet is very similar in function to the AccessServlet, but instead of handling only HTTP GET requests it allows all HTTP methods to be handed through, and it is much more generic in that lots of "content"-specific infrastructure, relating to Collections, Copyright, etc., has been removed.

There is specific documentation on the EntityBroker URL space in Entity URLs defined.
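As a rough picture of how a reference relates to its /direct URL, the sketch below joins a server base URL, the /direct mount point and a reference. It rests on two assumptions not stated on this page: that a reference is a path-like String beginning with "/" (for example "/myprefix/123", an invented value), and that the URL is formed by simple concatenation. DirectUrls is a hypothetical class; in real code, EntityBroker.getEntityUrl(String reference) should be called instead.

```java
/** Hypothetical helper (not the EntityBroker API): shows the shape of a /direct URL. */
final class DirectUrls {
    private DirectUrls() {}

    static String directUrl(String serverUrl, String reference) {
        // Assumption: references are path-like and start with '/'.
        if (reference == null || !reference.startsWith("/")) {
            throw new IllegalArgumentException("reference must start with '/': " + reference);
        }
        // Avoid a doubled slash when the server URL ends with one.
        String base = serverUrl.endsWith("/")
                ? serverUrl.substring(0, serverUrl.length() - 1)
                : serverUrl;
        return base + "/direct" + reference;
    }
}
```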

Hosting an EntityBroker URL "world" (HttpServletAccessProvider)

Very similar to the historic Entity HttpAccess is the Entity Broker interface HttpServletAccessProvider. Rather than simply read-only access to a single "resource", this interface is intended to be used to host an entire "application fragment", "EntityBroker World", or in other terminology, perhaps a "Helper" or "Gadget". Sakai Helpers and Gadgets are discussed on their own page.

Whilst /access URL spaces are typically small, and hosted from the Sakai component layer, /direct URL spaces are intended to be hosted from the conventional web application layer of Sakai. This typically requires special presentation technology support (as do Sakai tools). The Entity Broker system is entirely presentation technology neutral - currently hosting has been demonstrated in both RSF and Wicket.

Access control for EntityBroker worlds

The scheme is very similar to that for /access, only a little more streamlined. The DirectServlet itself applies no access control, but recognises the JDK standard SecurityException as thrown out of the access provider to indicate a lack of authentication. It will send a login redirect or an HTTP 403 Forbidden response as appropriate.
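The decision described above can be sketched without any servlet machinery. DirectAccessSketch and statusFor are hypothetical names, the access provider is reduced to a Runnable, and two assumptions are made: that "as appropriate" follows the /access rule (login redirect when nobody is logged on, denial otherwise), and that the redirect is an HTTP 302.

```java
/** Hypothetical sketch (not DirectServlet code) of mapping SecurityException to a response. */
final class DirectAccessSketch {
    static final int SC_OK = 200;
    static final int SC_FOUND = 302;     // assumed status for the login redirect
    static final int SC_FORBIDDEN = 403;

    private DirectAccessSketch() {}

    /** Runs the provider and returns the HTTP status the servlet would answer with. */
    static int statusFor(Runnable accessProvider, boolean loggedIn) {
        try {
            accessProvider.run();
            return SC_OK;
        } catch (SecurityException e) {
            // Assumption: anonymous users are sent to log in, known users are refused.
            return loggedIn ? SC_FORBIDDEN : SC_FOUND;
        }
    }
}
```

Keeping the check inside the provider means the servlet itself needs no knowledge of permissions; it only translates the exception.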

Other URL spaces

Some other URL spaces are also provided in Sakai by a couple of custom Servlets hosted at the top level.

/dav

The resources addressed by the DAV servlet map onto the same URL space as /access/content, only providing WebDAV access to these resources rather than simple read-only HTTP. Whilst the DAV servlet is currently backed by custom code forming part of the ContentHostingService (CHS), it is imagined as part of the transition to JCR that this will be moved over to a native JCR WebDAV implementation, most probably part of Jackrabbit.

/library

/library consists only of static resources, hosted by the DefaultServlet. It is typically a dumping ground for documentation, images, icons and Javascript definitions. It might one day also be a suitable location for shared HTML templates defining reusable components.

Other top-level Servlets

Some non-tool applications deployed at top-level in Sakai include authn, web, and podcasts. These are typically small applications that define a local space of non-extensible URLs for special purposes.

Commentary and Directions

This section will draw together some of the points made in the previous sections, and make some suggestions/commentary for possible future directions. This is in the context of overall directions for Sakai as defined in Sakai Technical Goals.

Relationship between Portal URLs and Entity URLs.

There is at present very little relationship between the space of portal URLs and the space of entity URLs, with the sole (?) exception that the entity URLs for objects of type SitePage are also portal URLs. It's possible, especially if usage of /direct URLs increases, that this interoperability could be improved, for example by exposing dispatch to individual PortalHandlers as kinds of lightweight entities, or conversely, by providing macros or portal concepts allowing dispatch from the portal out to entities.

URL "quality" ("clean" URLs)

"Newer" Portal URLs (PortalHandler URLs) and Entity URLs are of roughly comparable "quality", in that they are compact and fairly idiomatic. However, historical URLs, in particular site/page and tool URLs are unnecessarily long and often unreadable, and also enjoy sometimes unfortunate access control semantics. Until the introduction of the EntityBroker it was impossible, for example, to expose any public URL spaces corresponding to "application views" outside of Sakai.

Towards finer URL units

The overall navigation idiom of Sakai is generally criticised for offering an unfamiliar and also inflexible structure (strict nesting of tools within pages within sites) that is increasingly at odds with the kind of free structures that users are accustomed to in the web at large. One possible strategy for combating this is to increasingly divide up the URL space of Sakai into finer-grained entities than entire tools - essentially to break up tool "silos" into cooperating collections of helpers/gadgets.

Both PortalHandlers and Entities are suitable vehicles for this breakup, the former more suitable for lightweight units directly dispatched from the portal, the latter for more heavyweight "application fragments" representing the output of what are currently macroscopic top-level tools.

Use of TinyURLs

Extremely long URLs (such as Portalised URLs, or complex tool URLs) are candidates for condensation by forms of TinyURL schemes. However, it is worth highlighting that there is more to the provision of a "high quality" URL space than mere shortness. TinyURLs are easy to type and paste, but they are "opaque" and are no substitute for drawing up a semantic yet compact and readable URL space that is sensitive to the structure of the application. These points are generally made in the Wikipedia article referenced above. Therefore it is important to find "sympathetic" means of shortening URLs, in addition to the judicious use of TinyURLs.

Shortening URLs

One path for shortening URLs is to look into ways of mounting certain URL spaces right at the root of Sakai. Currently each scheme (/portal, /access, /direct, etc.) is protected against mutual interference, but a better user experience could be achieved if some scheme could allow members of any or all of these spaces to be moved to global / where this was non-conflicting.

Another scheme is to ensure that lengthy GUIDs do not appear in URLs. These could be either remapped from a central table (although this would absorb CPU and storage resources, and create an archiving risk), or else applications could ensure that any internal IDs that they apply were short (use of LONGs for example in generating database keys, rather than UUID.HEX).
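To give a feel for the difference in length, the sketch below renders a numeric database key as a short base-36 token. ShortIds is a hypothetical helper, and base 36 is simply one convenient choice that keeps tokens URL-safe and case-insensitive; a textual UUID, by comparison, is 36 characters long.

```java
/** Hypothetical helper: renders long database keys as compact URL tokens. */
final class ShortIds {
    private ShortIds() {}

    /** Encodes a non-negative key in base 36 (digits 0-9 then a-z). */
    static String fromLong(long key) {
        if (key < 0) {
            throw new IllegalArgumentException("key must be non-negative: " + key);
        }
        return Long.toString(key, 36);
    }

    /** Decodes a token produced by fromLong. */
    static long toLong(String token) {
        return Long.parseLong(token, 36);
    }
}
```

Even the largest long value encodes to 13 characters, roughly a third of the length of a UUID.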

Proper URL semantics

It is worth reiterating here that the most important aspect of URL usage is that resources and application states be properly URL-addressable in the first place. Historically JSF apps within Sakai have been very weak at exposing practically any useful or accurate address information in their URL space, leading to problems with back button, bookmarking and browser forking. In an increasingly mashup-driven world, where markup is reprocessed with AHAH type techniques, keeping a "clean" usage of an HTTP URL model (RESTful, with proper meanings to GET and POST, etc.) becomes even more vital.