Component Manager Upgrade

New Concept

  • This is an idea for discussion, please feel free to edit, comment, etc.

Information

This page describes possible future upgrade and migration paths for the Sakai Component Manager. The Component Manager is a central part of Sakai: it ties together the core services and allows them to interact with each other and with the tools.

Component Manager

The Sakai Component Manager is the core service that ties all the various pieces of Sakai together. In particular it allows for communication between services and the use of services by webapps.
Web applications managed by a servlet container are intended to run in isolation. They cannot communicate with each other nor can they export shared services.

Sakai is meant to be a general toolbox of webapps which are running in a common space and sharing services WITHOUT requiring all the tools to share the same libraries. Sakai also enables tools to be restarted without shutting down the entire environment. Sakai gets around the servlet container limitations using the Component Manager.

In Sakai terminology, a component is any collection of resources and/or code intended to be used by or shared with Sakai tools (webapps). Shared Sakai service APIs are deployed to the shared location of the servlet container. This location is visible to all web applications and so are the service interfaces. Implementations of these interfaces must be loaded at container startup. The Sakai Component Manager is responsible for loading the service implementations and making them accessible to Sakai tools (web applications).
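
As a rough illustration of the arrangement described above, the following sketch shows a registry that holds implementations keyed by their shared API interface. All names here (GreetingService, GreetingServiceImpl, SketchComponentRegistry) are hypothetical and chosen for illustration; this is not the actual Sakai ComponentManager API, only a minimal model of "interfaces are shared, implementations are registered at startup, tools look them up by interface".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service API, standing in for an interface deployed to the shared area.
interface GreetingService {
    String greet(String name);
}

// Hypothetical implementation, standing in for code loaded from a component.
class GreetingServiceImpl implements GreetingService {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Illustrative registry only; not the Sakai ComponentManager API.
class SketchComponentRegistry {
    private final Map<String, Object> beans = new ConcurrentHashMap<>();

    // Called at container startup for each service implementation.
    <T> void register(Class<T> api, T implementation) {
        beans.put(api.getName(), implementation);
    }

    // Called by tools (webapps), which see only the shared interface.
    // Returns null when nothing is registered for the interface.
    <T> T get(Class<T> api) {
        return api.cast(beans.get(api.getName()));
    }
}
```

A tool would then call something like registry.get(GreetingService.class) and never reference the implementation class directly.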

More info on the component manager (is hard to find):

Relation to industry norms

The Sakai component manager is a non-standard environment in terms of the wider Java software industry. There are currently two major schools of approach to the same problem:

  • The EJB/"full J2EE" model as operated by full-scale (often commercial) containers such as JBoss, WebLogic etc.
  • The OSGi component model as operated by containers such as Apache Felix, Eclipse Equinox, Knopflerfish, etc.

The first alternative was available at the time of the initial design of the current solution (2002-2003) and was dismissed on the grounds that it was an enormously heavyweight and poorly performing solution that would generally impede development. Some of these concerns have since been addressed by the somewhat more lightweight EJB3 (JSR 220) model, but I believe it is fair to say the kernel group (comments?) still considers this model inappropriately heavyweight for our community model and practices.

The second alternative has been emerging gradually over the 2000s, but was certainly not at an appropriate level of development in 2003. In fact it is still doubtful that a move towards an OSGi container could be defended today (late 2007) in terms of deploying a webapp container on a body of production-hardened code of reasonable pedigree, as compared to current deployments on fairly well-understood containers such as Tomcat and WebSphere. However, OSGi is seeing use in a number of extremely widely-deployed environments such as Eclipse and various embedded containers. OSGi is also seeing widespread and increasing industry support: the forthcoming version 2.5 of the Spring framework will see Spring factored into a set of OSGi-compliant bundles, and the OSGi model is moving towards acceptance as the Sun JSR-291. OSGi is probably not ready today, but any work done on the component manager should be done with an eye towards convergence with OSGi within the next 5 years or so.

Useful links:

Spring OSGi
Status of Open Source OSGi Containers

Requirements for the Component Manager

A wider set of requirements for the Sakai component manager is listed at Component Manager Requirements; the subset which has motivated this particular roadmap is listed here.

Various aspects of the current implementation are impeding development and deployment work, as well as interoperability with code elsewhere in the industry. Some of these issues are listed in this section. The next section attempts to prioritise them, also with respect to the expected workload at various points of a proposed upgrade roadmap.

Current deficiencies

  • Components may not be stopped or started whilst the installation is running.
  • Each component is not allocated a proper Spring context or ClassLoader, which prohibits the use of many Spring idioms
    • Resources may not be loaded from a component's area (without the use of hacks like this.getClass().getClassLoader().getResource); they must be placed in the shared ClassLoader
    • Execution within a component does not set the component's context ClassLoader to an appropriate value
  • It is architecturally impossible to declare implementation-private beans to a component. Every Spring bean dependency of an implementation bean must be visible to all code within Sakai.
  • It would be desirable to perform "JAR folding" for commonly-used libraries: components and webapps should be able to share code where they are using identical versions of the same libraries. Sakai's PermGen space is currently greatly inflated since every component and webapp loads separate copies of the same libraries into different classloaders.
  • Conversely, a couple of commonly-used libraries (Spring itself, and Hibernate) are inappropriately present at the shared ClassLoader level, which restricts standard usages by webapps and components alike. For example it is impossible for Spring 2.0 webapp developers to take advantage of custom Spring bean namespaces, and it is very hard to deploy technologies such as Alfresco which depend on an independent Hibernate configuration.

Draft roadmap

This section attempts to plot a "pay as you go" roadmap taking forwards the current Component Manager codebase, maintaining the greatest possible backwards compatibility with existing APIs and file layouts, whilst trying to smooth a path towards convergence with industry norms, as well as improving performance in the areas outlined above.

Stage 1 - "simple" proxy layer reestablishing correct ClassLoader semantics.

Estimated time: 2-3 days. This would, for each bean in the shared area, dispense a proxy rather than the real bean. The proxy would simply set the correct context ClassLoader (registered against the bean itself) for the duration of each call, saving the caller's context ClassLoader beforehand and restoring it afterwards. This should restore the ability of Spring-enabled tools to resolve resources in the standard way, which would allow more Spring-enabled libraries to be deployed into components without code changes (e.g. JMX, Alfresco, etc.). It is probable that most resources (configuration files) could be moved out of the shared area at this point (e.g. Hibernate .hbm files, .properties files, etc.). No disruption is envisaged to existing code, although any code which currently relies on reflection to "forge" references to methods which are not visible in the public API/parent of an implementation bean will break at this point. There is (should be) no code of this nature in Sakai, with the possible exception of the new "Config Editor" tool (question).
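
A minimal sketch of such a proxy, using the JDK's java.lang.reflect.Proxy, might look as follows. The class name ContextClassLoaderProxy and the demonstration interface ContextProbe are assumptions made for illustration and do not exist in Sakai. The sketch also shows why reflection-based "forged" calls would break: only methods declared on the API interface are reachable through the proxy.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical API interface used only to demonstrate the proxy.
interface ContextProbe {
    ClassLoader currentContextLoader();
}

// Hypothetical helper; the name and shape are assumptions, not existing Sakai code.
class ContextClassLoaderProxy implements InvocationHandler {
    private final Object target;
    private final ClassLoader componentLoader;

    private ContextClassLoaderProxy(Object target, ClassLoader componentLoader) {
        this.target = target;
        this.componentLoader = componentLoader;
    }

    // Wraps a bean so that every call made through the API interface runs with
    // the component's ClassLoader as the thread context ClassLoader.
    static <T> T wrap(Class<T> api, T target, ClassLoader componentLoader) {
        Object proxy = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] { api },
                new ContextClassLoaderProxy(target, componentLoader));
        return api.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        thread.setContextClassLoader(componentLoader);
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the bean's own exception unchanged
        } finally {
            thread.setContextClassLoader(saved); // always restore the caller's loader
        }
    }
}
```

Under these assumptions, the component manager would hand out ContextClassLoaderProxy.wrap(api, bean, loader) wherever it currently hands out the bean itself.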

Stage 2 - "live" proxy layer capable of swapping bean references in response to component load/unload.

Estimated time: 1 week. This would detect changed datestamps on key files in the components area (probably initially only components.xml) and initiate reload of the associated component. Key abilities:

  • JARs should be copied into a "private area" before code loading to prevent obstructing file handles (primarily on Windows).
  • The proxies should be switched to act as "valves" when a component begins to reload. New invocations across the proxy boundary should block until the reload is completed, but existing calls should be allowed to terminate normally. Cases of extreme delays might be handled with a timeout and forcible termination of the thread (question).
  • Reloading should be able to be turned off selectively or completely for production instances where inadvertent triggering could be dangerous. Should default to "off" unless configured through sakai.properties.
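
The "valve" behaviour described in the list above could be approximated with a read/write lock: ordinary invocations share the read lock, and a reload takes the write lock, which waits for in-flight calls to finish and holds back new ones until the swap is done. The following is only a sketch under that assumption; ReloadValve is a hypothetical name, and datestamp detection, JAR copying, and the sakai.properties switch are left out.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical "valve" holder; names are illustrative, not existing Sakai code.
class ReloadValve<T> {
    // Fair mode, so that new callers queue behind a pending reload.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private T current;

    ReloadValve(T initial) {
        this.current = initial;
    }

    // Normal invocation path: many callers may hold the read lock at once.
    <R> R call(Function<T, R> invocation) {
        lock.readLock().lock();
        try {
            return invocation.apply(current);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Reload path: waits for in-flight calls to finish, blocks new ones,
    // then swaps the bean reference.
    void reload(Supplier<T> loader) {
        lock.writeLock().lock();
        try {
            current = loader.get();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The timeout mentioned above could be explored by acquiring the write lock with tryLock(timeout, unit) instead of lock(); what to do with a thread that never returns remains an open question in this sketch as well.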

Stage 3 - Enabling reloading for all relevant services - listeners and lifecycles via OSGi

Estimated time: 2-3 days for initial impl, gradual framework migration over months/years.

Some components in Sakai, especially those that are simple and/or cleanly written, may well reload correctly "out of the box". Many of them, unfortunately, will not, without some reworking.

  • Pieces of Spring/component technology which expect to be responsive to reloading should be given a "stub" OSGi implementation of the core listener methods (more below). In particular, every component should be allocated a (largely stub) OSGi BundleContext.

Many existing components in Sakai will not reload correctly. There are many cases of "backwash" APIs where Object references to other Spring beans are held "against the flow", that is, for example, a setter method which calls a method on its argument passing "this" as an argument. The classic case for this is the Observer/Listener pattern, which will need to be taken out of direct control of component code. In order to provide a forwards migration and convergence strategy, deferring to the OSGi semantics and APIs for this function seems wise. We need not implement any substantial quantity of an OSGi container, but we should ensure that where clients use container-specific functionality, this is provided via OSGi-compliant interfaces.

Key methods are held in the standard OSGi framework interface BundleContext such as "addBundleListener", "addServiceListener" etc.

A key piece of infrastructure for abstracting these details from newer code is a pattern such as the BeanCollector, part of the Entity Broker. This allows the Provider pattern to be implemented simply by registering Spring beans implementing a particular interface, which are automatically collected by the framework and handed to clients. Where the client does not need to react to changes in the live bean set, it could simply be handed a "volatile" kind of List/Map. Where it does need to take specific action on detecting reloading/unloading, it should instead use the standard OSGi stubs provided, probably via Spring-OSGi. Older code will need to be reworked to use the new Listener idioms where reloading functionality is required.
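
One way to picture the "volatile" List idea is a read-only view over a collection that the framework updates as components load and unload. The sketch below is illustrative only: LiveBeanSet and its method names are hypothetical and are not the Entity Broker's BeanCollector API.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical collector; it illustrates the idea only and is not the
// Entity Broker's BeanCollector API.
class LiveBeanSet<T> {
    private final List<T> beans = new CopyOnWriteArrayList<>();
    private final List<T> view = Collections.unmodifiableList(beans);

    // Called by the framework when a component exporting a matching bean loads.
    void beanAdded(T bean) {
        beans.add(bean);
    }

    // Called by the framework when that component unloads.
    void beanRemoved(T bean) {
        beans.remove(bean);
    }

    // Read-only live view: a client keeps one reference and always sees
    // the current set of beans, without registering any listener.
    List<T> liveView() {
        return view;
    }
}
```

Because the client never stores references of its own making "against the flow", nothing in the client needs to change when a provider component is reloaded.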

Stage 4 - Remove Spring/Hibernate artefacts from shared.

Estimated time: 1 week

Does not depend on Stage 3; can directly follow Stage 2 or even possibly Stage 1. However, it should generally cooperate with the strategy chosen for Stage 5 rather than being an "ad hoc" solution (at least for Hibernate).

Spring will need to be treated specially if our current programming idioms are to be preserved. A key ability which we exploit is the ability to inherit from parent Spring definitions held in the shared area into child definitions in components/webapps. The principal case in point is the "AdditionalHibernateMappingsImpl" definition used to export Hibernate mappings up into the Hibernate ClassLoader. This style is not actually even supported by Spring-OSGi, which requires formal importation of references from other bundles using explicit OSGi definitions, and we should consider whether we want to continue to support it within Sakai. It does however seem useful, and supporting it would enable us to decouple the work on removing Spring from shared and the work on removing Hibernate. In order to support this, we would need to supply a special "SkeletalApplicationContext" which would allow Spring BeanDefinitions to be serialised and deserialised across the ClassLoader boundary, thus providing the illusion that we have at least a "ConfigurableListableBeanFactory" available as our parent context.

Once this is done, the full range of standard Spring 2.0 techniques would become available to hosted webapps.

If stage 5 were achieved before moving Hibernate, moving Hibernate becomes simpler owing to the fact that a suitable, "isolated" ClassLoader for holding Hibernate would already be constructed by the system. All that would be required would be maintaining a special route via the ComponentManager/Spring for transferring the contributed Hibernate mappings to it at the proper time (AdditionalHibernateMappingsImpl).

Stage 5 - Enable "JAR folding" for frequently deployed JARs throughout the system.

Does not depend on Stages 3-4, can directly follow Stage 2 or even possibly Stage 1.

Stage 5a - Enable JAR folding for components.

Estimated time - 2 days to 1 week.

The ClassLoaders constructed in Stage 1/2 should be adjusted to read either Maven 2 manifests or OSGi manifests and therefore deduce the dependency structure which would allow JARs to be shared. In some cases this might need to be augmented by hand-coded information in the case of troublesome/standard artefacts (e.g. log4j, commons-logging, Xerces, etc.). In this case detection of a previously loaded JAR should resolve to a previously constructed ClassLoader in the graph rather than refetch the code. This will both reduce the memory image of the final app in terms of PermGen space, thus increasing performance, and also reduce startup time.
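
The core of the folding step, resolving a previously seen JAR to an already constructed ClassLoader, can be sketched as a cache keyed by artifact and version. FoldedJarLoaders is a hypothetical name, and the sketch assumes the artifact coordinates have already been read from the Maven 2 or OSGi manifest; it does not model the dependency graph between JARs or the hand-coded exceptions mentioned above.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache; coordinates are assumed to come from Maven 2 or OSGi metadata.
class FoldedJarLoaders {
    private final ClassLoader parent;
    private final Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();

    FoldedJarLoaders(ClassLoader parent) {
        this.parent = parent;
    }

    // Returns the single shared loader for a given artifact and version,
    // creating it only on the first request.
    ClassLoader loaderFor(String artifactId, String version, URL jar) {
        String key = artifactId + ":" + version;
        return loaders.computeIfAbsent(key,
                k -> new URLClassLoader(new URL[] { jar }, parent));
    }

    // Number of distinct loaders created so far.
    int loaderCount() {
        return loaders.size();
    }
}
```

Two components that both bundle the same version of a library would then receive the same loader, so its classes are defined once rather than once per component.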

Stage 5b - Enable JAR folding for webapps.

Estimated time - 1 week+

This is actually the more important use case, since the majority of duplicated/large JARs are actually held in webapps (JSF, velocity, oro, RSF, etc.). Unfortunately it is much harder to tackle, since existing Servlet containers do not allow easy and certainly not portable access to the timing and manner of construction of their webapp ClassLoaders. The proposed scheme for this is to actually embed an entirely "hosted" lightweight Servlet container (probably Jetty) within an existing component/webapp, which will inherit, for each request, the standard HttpServletRequest/Response pairs handed by the "outer container" (probably Tomcat, but perhaps WebSphere). Thus configuration changes for deployment will be kept to a minimum, with maximum portability amongst existing servlet containers.

Another beneficial effect of this change would be to remove dependencies on startup order amongst different webapps, and to re-enable the ability to deploy "single" artefacts into Sakai, efficiently and safely exporting services as well as hosting UI. This will not be desirable for larger projects but would simplify smaller ones. A webapp would be forced to load as soon as a bean it had exported was demanded by the environment.

Stage 6 - Convergence with full-scale OSGi container

Estimated time - 2-3 years+

Whilst we are committed to supporting JSP/JSF-driven tools within Sakai, our mobility will be somewhat limited to "standard containers". Currently only Apache Felix is positioning itself as a standard Servlet container in addition to an OSGi container, and support for full J2EE seems some way away. Other containers have limited or no support for J2EE constructs such as web.xml etc., supplying only the OSGi "HttpService" interface. Modern presentation technologies such as RSF or Wicket could survive in these environments, and Velocity tools could probably be hacked, but others will struggle. Ideally we would see a move away from webapp-based programming, but the JSP model is popular and easily understood and Sakai would probably lose in acceptance should it become unsupported. Our "stage 5b" container therefore at this point has an uncertain roadmap towards an industry-standard solution, but at the very least it should be converged with a fully compliant Spring-OSGi implementation so that experimentation can begin with deploying and hosting raw OSGi bundles within Sakai. It is likely that emerging developments with JSR-291 will make this path clear over this timescale.