File library notes

In Sakai 2.5.x, there are two basic approaches to storing and accessing digital content as files: the legacy Content Hosting service and Jackrabbit-based JSR-170 JCR.

Sakai's Content Hosting service is idiosyncratic, complex, and inconsistent, but also thoroughly embedded in the product suite. If we want to avoid being the first to work out integration issues and conventions, and to leave open the possibility of other Sakai applications reaching our image files, it seems best to use it. (I hope this situation will change in the very near future, but that's where it stood when I ran out of JCR research time.)

The combination of Sakai's "content" and "access" modules makes files available at URLs starting from the root "/access/content":

  • /access/content/
    • attachment/
    • group/
    • group-user/
    • private/
    • public/
    • user/
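
As a small illustration of the layout above, the following sketch picks the top-level collection out of an access URL path. The class and method names are hypothetical and nothing here calls the Sakai API; it only encodes the six folder names listed above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Sakai. */
final class ContentRootSketch {
    static final String ROOT = "/access/content/";

    // The six top-level collections listed in these notes.
    static final List<String> COLLECTIONS = Arrays.asList(
            "attachment", "group", "group-user", "private", "public", "user");

    /** Returns the top-level collection named in the path, if it is one we know. */
    static Optional<String> topLevelCollection(String urlPath) {
        if (urlPath == null || !urlPath.startsWith(ROOT)) {
            return Optional.empty();
        }
        String rest = urlPath.substring(ROOT.length());
        int slash = rest.indexOf('/');
        String first = slash >= 0 ? rest.substring(0, slash) : rest;
        return COLLECTIONS.contains(first) ? Optional.of(first) : Optional.empty();
    }
}
```

Matching on the whole first path segment (rather than a prefix test) keeps "group" and "group-user" from being confused.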

The BaseContentService code also includes a constant for "COLLECTION_MELETE_DOCS" but it's not used by anyone. Melete uses a folder under "private" instead.

It looks like the best way to get our images good and solidly stovepiped will probably be to make our own little world inside the "private" collection. However, that will require handling our own HttpAccess requests and authorization, which I need to learn more about first. To get us off to a brisk start, I'll begin in the site resources folder ("group/SITE_ID"), much as the first release of the Image Gallery did. We'll relocate in a later iteration.

1.1. attachment

It's important that an attached file keep its filename no matter how many times files with the same name have been attached to other email messages or announcements. Since Content Hosting doesn't support versioning and attachment-handling clients don't count on scoping the files, the attachment-handling methods place each new attached file in its own single-file folder named with a UUID:

/access/content/attachment/SITE_ID/FILE-SPECIFIC_UUID/FILENAME
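
A minimal sketch of how such a path could be assembled follows. It is not the Content Hosting implementation: the class and method names are hypothetical, and java.util.UUID stands in for whatever ID generator Sakai actually uses.

```java
import java.util.UUID;

/** Hypothetical helper for illustration; not part of Sakai. */
final class AttachmentPathSketch {
    static final String ATTACHMENT_ROOT = "/access/content/attachment/";

    /** Builds the URL path for a file kept in its own single-file folder. */
    static String attachmentPath(String siteId, String folderId, String filename) {
        return ATTACHMENT_ROOT + siteId + "/" + folderId + "/" + filename;
    }

    /** Same, but with a fresh random folder ID, so equal filenames never collide. */
    static String newAttachmentPath(String siteId, String filename) {
        return attachmentPath(siteId, UUID.randomUUID().toString(), filename);
    }
}
```

Because the folder ID changes on every call, two attachments both named "photo.jpg" in the same site get distinct paths while each keeps its original filename.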

This would be the fastest way for us to "stovepipe" our images, but authorization wouldn't be quite correct: given a link to an image file, any member of the site could reach any image whether it was in a released collection or not. (Although, as security-through-obscurity goes, a file-specific UUID isn't bad.)

1.2. group

In practical terms, "group" is better thought of as "site". These are the contents shown to non-superusers in a site's "Resources" tool. The naming convention (supported by some ContentHostingService methods) is:

/access/content/group/SITE_ID/FOLDER_NAMES.../FILENAME

In case of duplicate filenames, one method in the API will automatically append digits to the file basename to disambiguate it. Other methods will throw exceptions.
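
The disambiguation idea can be sketched as below. This is an assumption-laden illustration, not the ContentHostingService code: the helper name, the "-N" separator, and the starting number are all hypothetical, and the real method may format names differently.

```java
import java.util.Set;

/** Hypothetical helper for illustration; not part of Sakai. */
final class UniqueNameSketch {
    /**
     * Returns the filename unchanged if it is free; otherwise appends
     * "-1", "-2", ... to the basename until the name is unused.
     */
    static String uniqueName(String filename, Set<String> existing) {
        if (!existing.contains(filename)) {
            return filename;
        }
        // Split at the last dot; a leading dot (e.g. ".bashrc") is not an extension.
        int dot = filename.lastIndexOf('.');
        String base = dot > 0 ? filename.substring(0, dot) : filename;
        String ext = dot > 0 ? filename.substring(dot) : "";
        for (int n = 1; ; n++) {
            String candidate = base + "-" + n + ext;
            if (!existing.contains(candidate)) {
                return candidate;
            }
        }
    }
}
```

Callers of the methods that throw instead would need to catch the exception and pick a new name themselves, for example with a helper along these lines.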

1.3. group-user

Used by the "Drop Box" application to store files uploaded by students and otherwise visible only to the instructor:

/access/content/group-user/SITE_ID/USER_ID/FILENAME

The "/access/content/group-user/SITE_ID" folder is titled "SITE_ID Drop Box" in Resources.

1.4. private

By default this folder can be used to store anything, but the authz applied is just the basic Resources "members of this site" or "publicly viewable". Site-level or tool-level protection can only be managed if application code takes over HttpAccess and authorization.

At present I know of only two examples of this. First, there's a "sampleAccess" folder added to every deployment of Sakai by a sample component.

Since this is just developer documentation, does it really need to be added to every production system?

Second, Melete keeps its content there.

Do Aaron and Co. have suggestions or examples as to how best to handle this in a less noisy post-EntityBroker fashion?

1.5. public

Used as a special drop-off spot for some public-facing pages and their embedded content, notably the "Welcome" page and the "workspace.html" template.

1.6. user

Files added through the "Resources" tool in "My Workspace" end up at a location determined by the user's ID:

/access/content/user/USER_ID/FOLDER_NAMES.../FILENAME