System won't start with OSP + Content in File System + 10.1
Description
Activity

Brian Baillargeon September 11, 2014 at 10:54 AM
Attaching the fix committed by Matthew Jones, for quick reference for anybody running 10.1 who needs it.

Matthew Jones September 3, 2014 at 12:10 PM
Yeah, we only have auto.ddl=true on one instance in the cluster, all of the others are false. Usually set via local.properties or something only on that machine.

Charles Hedrick September 3, 2014 at 12:06 PM
That's odd. My understanding is that it only does schema changes where needed. We start nodes manually, but I would think the only issue would be the first startup after a schema change.

Matthew Buckett September 3, 2014 at 5:35 AM
We've experienced startup deadlocks with auto.ddl enabled, where two nodes try to do updates and deadlock, and then other nodes starting up also deadlock. The trigger for this was a VM host getting rebooted, so all the worker VMs started up at about the same time.

Hudson CI Server September 2, 2014 at 9:18 PM
UNSTABLE: Integrated in sakai-10-java-1.7 #115 (See http://builds.sakaiproject.org:8080/job/sakai-10-java-1.7/115/)
Merging - System won't start with (matthew@longsight.com: rev 312648)
KNL-1286 is missing something. Our Sakai won't start with that patch.
Without a body delete path defined, addResourceToDeleteTable will generate an error. addDeleteResource will fail, because putDeleteResource will return null. That will throw a null pointer exception. (Not quite sure why it's useful to check for null and generate the same error as if you hadn't checked. If the null check had returned, this problem wouldn't have happened.)
Since addResourceToDeleteTable is called in only one place, the best fix is probably to patch that place not to do anything.
I.e. add
    if (m_bodyPathDeleted == null)
        return;
at the beginning of addResourceToDeleteTable in BaseContentService.java.
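
For anyone who wants to see the shape of the guard in isolation, here is a minimal, self-contained sketch. It is not the real BaseContentService: only the names m_bodyPathDeleted and addResourceToDeleteTable come from this report. The class name, the constructor, the in-memory list standing in for the delete table, and the pendingDeletes helper are all hypothetical, made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BaseContentService. Only m_bodyPathDeleted and
// addResourceToDeleteTable are names taken from the report; the rest is assumed.
class DeleteTableGuardSketch {
    // Null when no body delete path is configured.
    private final String m_bodyPathDeleted;

    // Assumption: an in-memory list stands in for the real delete table.
    private final List<String> deleteTable = new ArrayList<>();

    DeleteTableGuardSketch(String bodyPathDeleted) {
        this.m_bodyPathDeleted = bodyPathDeleted;
    }

    void addResourceToDeleteTable(String resourceId) {
        // The proposed guard: with no delete path configured, do nothing
        // instead of going on to fail further down the call chain.
        if (m_bodyPathDeleted == null) {
            return;
        }
        deleteTable.add(m_bodyPathDeleted + "/" + resourceId);
    }

    // Hypothetical helper so the effect of the guard can be observed.
    int pendingDeletes() {
        return deleteTable.size();
    }

    public static void main(String[] args) {
        DeleteTableGuardSketch noPath = new DeleteTableGuardSketch(null);
        noPath.addResourceToDeleteTable("file.txt");
        System.out.println("no delete path: " + noPath.pendingDeletes());

        DeleteTableGuardSketch withPath = new DeleteTableGuardSketch("/deleted");
        withPath.addResourceToDeleteTable("file.txt");
        System.out.println("with delete path: " + withPath.pendingDeletes());
    }
}
```

With the early return in place, a call made while the delete path is unset becomes a no-op, which is the behaviour the suggested patch is after. Whether silently skipping the delete table is the right long-term behaviour is the open question raised below.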
Why doesn't this always cause startup to fail? I suspect it's because with OSP a delete is done during bean instantiation. Without OSP there's a good chance that any errors would be non-fatal. I conjecture that tests are being done without OSP.
Maybe it's valid to run with bodyPathDeleted null and maybe it isn't. But if we're going to do KNL-1286, it should be complete.