Using JSR-170 (Java Content Repository)
This page will be a growing dump of notes from working with JSR-170, the Java Content Repository API.
The primary route for this integration for Sakai 2.5 will be the jackrabbit-service written by Ian Boston of CARET, which is currently hosted at https://source.sakaiproject.org/contrib/tfd/trunk/jackrabbitservice . This embeds an Apache Jackrabbit instance as a Sakai component, which it configures appropriately for the current Sakai database and clustering, and exposes via a standard JSR-170 API.
Experimentation is also underway with other JSR-170 implementations, such as Xythos and Alfresco.
Setting up repositories standalone for testing
Setting up Jackrabbit
Setting up Jackrabbit standalone is really quite hard due to the incredibly poor documentation. A simple "in-memory" demonstration as in their "First Hops" guide is straightforward, but setting up a realistic production environment using a database (e.g. MySQL) is extremely obscure. The bulk of the work is done in this "JCRTests" project:
https://saffron.caret.cam.ac.uk/svn/projects/amb26/trunk/jcr-tests
Many of the configuration options are not yet properly broken out but must be edited by hand in the project files - consult the README.txt for details.
Setting up Alfresco
The Alfresco content repository is another frontrunner in JSR-170 implementation. Their repository is actually part of a much wider ECM (enterprise content management) initiative, with many standalone components and servers. This information describes an attempt to set up Alfresco 2.1.0R1, at the time of writing the latest "stable" release of Alfresco. The main impediment in doing this is that Alfresco, despite being fairly enlightened Spring-wise, is still using an Ant build, with many peculiar, unlabelled and non-standard dependencies, some of which are not in the standard repositories.
MySQL instructions
The default user/password are alfresco/alfresco, database alfresco.
MySQL construction lines:
create database alfresco default character set utf8;
grant all privileges on alfresco.* to 'alfresco'@'localhost' identified by 'alfresco';
flush privileges;
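A quick way to confirm that the database and account above are usable from Java is a bare JDBC connection attempt. This is a minimal sketch, not part of the Alfresco setup itself: the class name AlfrescoDbCheck is made up for illustration, port 3306 is assumed (the MySQL default), and the MySQL Connector/J driver is assumed to be on the classpath at run time.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper: checks that the alfresco/alfresco account can reach the alfresco database.
final class AlfrescoDbCheck {

    /** Builds a MySQL JDBC URL that asks the driver to use UTF-8, matching the utf8 database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Assumption: MySQL is listening on localhost:3306.
        String url = jdbcUrl("localhost", 3306, "alfresco");
        try (Connection c = DriverManager.getConnection(url, "alfresco", "alfresco")) {
            System.out.println("Connected, server version "
                    + c.getMetaData().getDatabaseProductVersion());
        } catch (SQLException e) {
            // Also reached when no MySQL driver is on the classpath.
            System.out.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If the connection fails with an access error, the grant line above is the first thing to recheck.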
Unusual Alfresco dependencies
- OpenOffice - the packaging information was taken from this "official" looking release, although the actual OO version for this Alfresco release predates any Maven release, at 2.0.3. http://wiki.services.openoffice.org/wiki/Uno/Java/MavenBundles
- SpringModules-jbpm - the actual version of this is unlabelled. The version in the official repositories demands a jboss dependency which does not appear in the Alfresco libs. NB - this was eventually discovered to be the JAR labelled jbpm-jpdl-3.2-patched.jar in disguise - entered into the repository manually under the jboss path.
- jug - unlabelled in the distro. I chose 1.1.2 as it is in the central repo. 2.0.0 has a POM but no JAR.
- jibx - unlabelled in the distro. I chose jibx-run-1.0.1.jar, which did not work. A JiBX exception is thrown: "Binding information for class org.alfresco.repo.dictionary.M2Model must be recompiled with current binding compiler (compiled with jibx-rc0, runtime is jibx_1_0)". Note that this message is swallowed by M2Model.java at line 99:
    catch(JiBXException e) {
        throw new DictionaryException("Failed to parse model", e);
    }
  (JIRA this.) The bundled JARs were thus relabelled into the repo as jibx-rc0.
- ant - actually required at runtime because of the use of org.apache.tools.zip.ZipFile. I chose Ant 1.6.5.
- odf_util - quite a show-stopper. This JAR appears to exist nowhere; the official ODF site is at http://books.evc-cit.info/odf_utils/ and this situation is complained about at http://www.javalobby.org/java/forums/t77967.html. I placed it in catcode.com under version 05-11-29 to correspond to the only apparently visible source. This metadata never seems to have changed...
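On the jibx item above: the catch block in M2Model.java passes the JiBXException along as the cause, so the binding-compiler message should still be reachable by walking the cause chain, assuming DictionaryException preserves its cause in the usual way. A minimal sketch of that unwrapping follows. The class name RootCause is made up, and plain Exception and RuntimeException stand in for the JiBX and Alfresco exception types.

```java
// Hypothetical helper: finds the innermost exception in a cause chain.
final class RootCause {

    /** Returns the innermost cause of t, stopping if a cause refers back to itself. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for JiBXException and DictionaryException (illustration only).
        Exception inner = new Exception("Binding information must be recompiled");
        RuntimeException outer = new RuntimeException("Failed to parse model", inner);

        System.out.println(outer.getMessage());               // the wrapper's message
        System.out.println(RootCause.of(outer).getMessage()); // the underlying message
    }
}
```

Printing the full stack trace of the outer exception shows the same information under "Caused by:", so this helper is mainly useful where only getMessage() ends up in a log.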
Alfresco - current status
Alfresco has now been demonstrated starting up in a standalone configuration, with a proper Maven 2 build held at https://saffron.caret.cam.ac.uk/svn/projects/amb26/trunk/jcr-tests . The dependencies which were not available in the standard repositories have been uploaded into the Caret Maven 2 repository, which is referenced in the POM file in that project. A crucial step is bundling a copy of xercesImpl-2.8.0 into the build, which seems to include a SAX parser that is more lenient with the apparently invalid XML held in the Alfresco workflow definitions than the out-of-the-box Java 5 parser. You will receive the following messages at startup:
2007-11-07 16:48:10,406 WARN (JpdlXmlReader.java:127) - <process xml warning: swimlane 'initiator' does not have an assignment>
2007-11-07 16:48:10,578 WARN (JpdlXmlReader.java:127) - <process xml warning: swimlane 'initiator' does not have an assignment>
2007-11-07 16:48:10,671 WARN (JpdlXmlReader.java:127) - <process xml warning: swimlane 'initiator' does not have an assignment>
2007-11-07 16:48:10,718 WARN (JpdlXmlReader.java:127) - <process xml warning: swimlane 'initiator' does not have an assignment>
which seem essentially harmless. Without this JAR you will instead receive a fatal exception (see screenshots attached below).
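To check whether the bundled Xerces is actually the parser being picked up, it can help to print the JAXP implementation classes the JVM resolves. This is a minimal sketch using only the standard JAXP API; the class name WhichSaxParser is made up for illustration. With xercesImpl on the classpath I would expect names under org.apache.xerces.jaxp, and names under com.sun.org.apache.xerces.internal.jaxp for the JDK's built-in parser, but treat that as an expectation to verify rather than a guarantee.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: reports which SAX implementation JAXP resolves on this classpath.
final class WhichSaxParser {

    /** Class name of the SAXParserFactory implementation selected by JAXP. */
    static String factoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /** Class name of the SAXParser that the selected factory creates. */
    static String parserClassName() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getClass().getName();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SAXParserFactory: " + factoryClassName());
        System.out.println("SAXParser:        " + parserClassName());
    }
}
```

Run it with the same classpath as the Alfresco startup, otherwise the answer says nothing about what Alfresco itself will see.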
A further deployment issue is that Alfresco requires a native code library (a DLL under Windows) allowing it to operate its NetBIOS code. It appears that when not running under Windows it will automatically swap to another configuration where this JNI library is not required (see http://wiki.alfresco.com/wiki/File_Server_Configuration), but I have not tested this. For reference, I get this to run with the following JVM startup option on my box: -Djava.library.path=E:\Source\alfresco-sdk-2.1.0\bin
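When the native library is not found, a first check is whether the directory given in -Djava.library.path really contains it. The sketch below scans the directories on java.library.path for a given file name. The class name LibraryPathCheck is made up, and the library file name is taken as an argument because these notes do not record what the Alfresco DLL is called.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: looks for a native library file on java.library.path.
final class LibraryPathCheck {

    /** Returns the directories in libraryPath that contain a regular file called fileName. */
    static List<File> dirsContaining(String libraryPath, String fileName) {
        List<File> hits = new ArrayList<>();
        // The separator is ';' on Windows and ':' elsewhere; quote it to split literally.
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(new File(dir));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java LibraryPathCheck <library file name>");
            return;
        }
        String path = System.getProperty("java.library.path", "");
        List<File> hits = dirsContaining(path, args[0]);
        System.out.println("java.library.path = " + path);
        System.out.println(hits.isEmpty() ? "not found: " + args[0] : "found in: " + hits);
    }
}
```

It needs to be started with the same -Djava.library.path option as Alfresco for the result to be meaningful.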
Alfresco as a Jackrabbit replacement
I am now studying the extent to which Alfresco will be a "drop-in" for our current use of Jackrabbit. Some differences I have noted so far:
Dealing with Alfresco metadata and content model
- Dynamic registration of namespaces/node types is not supported. This doesn't seem to be baked into the underlying code design, but is hardwired in code at the top level of configuration for the JCR "facade". From Alfresco RepositoryImpl:
    namespaceRegistry = new NamespaceRegistryImpl(false, serviceRegistry.getNamespaceService());
  The first argument is "allowRegistration", which is set to false in this constructor and not further exposed through any setters.
- Some JCR types are not supported. An April 2006 post http://forums.alfresco.com/viewtopic.php?t=1659 suggests that only nt:base is supported, but in fact the "jcrModel.xml" file held at (config)/alfresco/model/jcrModel.xml from the recent release (2.1.0) shows that most of the core JCR types are supported (nt:base, nt:hierarchyNode, nt:file, nt:folder, nt:resource, mix:referenceable, mix:lockable). Notable by their absence are nt:unstructured (see next point) and mix:versionable. Note that the April post indicates that the Alfresco web client will not recognise JCR nodes within the repository and the data dictionary page recommends extending the core Alfresco types.
- Unstructured nodes are not supported via the JCR API. The guide to the Alfresco "Data Dictionary Scheme" is at http://wiki.alfresco.com/wiki/Data_Dictionary_Guide. This documents a 3-step procedure for adding new namespaces and node types into Alfresco using their custom XML format. I believe the Jackrabbit format is proprietary too so no real loss here. The following forum posting includes more detailed assistance plus example for a user unable to get his metadata model working: http://forums.alfresco.com/viewtopic.php?p=17897
Independent attempts at a Maven 2 build
This Google cache link http://66.102.9.104/search?q=cache:zHh5ohywA40J:forums.alfresco.com/viewtopic.php%3Ft%3D7407%26view%3Dprevious%26sid%3Dea0233cf75bd8889d278bdd2e5686c6c (the cache seems to disagree with the current page; the corresponding link is http://forums.alfresco.com/viewtopic.php?t=1017) leads to some traffic on the Alfresco forums from desperate people wanting a Maven 2 build. I tried to get in touch with a few of them but got no answer. The "m2alfresco" SourceForge project referred to at the end is a red herring: it is a dead project with no files and no traffic.