Failure to remove Collection and swallowed exception yields warning

Description

I investigated the warning below and this is what I found (this is Content Hosting from trunk).

Here is the warning:
WARN: removeCollection(): (2007-11-05 17:32:41,867 Timer-2_org.sakaiproject.content.impl.BaseContentService,removeCollection (2558))
org.sakaiproject.exception.InconsistentException id=/LOAD_TEST_ID_PREFIX_AZ_CHS_collection_small/
at org.sakaiproject.content.impl.BaseContentService.removeCollection(BaseContentService.java:2479)
at org.sakaiproject.content.impl.BaseContentService.removeCollection(BaseContentService.java:2554)
at org.sakaiproject.content.test.LoadTestContentHostingService.removeCollection(LoadTestContentHostingService.java:261)

After tracing through Content Hosting Service I think this is what is
happening when I attempt to remove a collection with items in it using
removeCollection(String id) - comment: Remove a collection and all
members of the collection, internal or deeper:

1) removeCollection(String id)
a) locates the collection
b) gets an editable version (CollectionEdit)
c) Calls clear() - this method recurses through and correctly removes
all items from the collection (using the clear method and
removeCollection(String id)), it is slow but it does actually work as
long as there are no collections in the collection, if there are then
this fails partway through
d) calls removeCollection(ContentCollectionEdit edit)

2) removeCollection(ContentCollectionEdit edit)
a) checks that the edit is active
b) gets the list of existing items for the edit using edit.getMemberResources()

3) edit.getMemberResources()
NOTE - this is where we get into trouble, this pulls the members from
a ThreadLocal based cache despite the fact that the resources are gone
from the database and getting a new edit at this point shows it to be
empty, the members list that is returned here contains all items that
used to be in the collection but have been removed at this point
a) gets the full list of members from
ThreadLocalManager.get("members@" + this.m_id); - this list contains
all the resources that were just deleted
b) calls cacheEntities(mbrs); - there is no comment on this method but
it seems to go through and store all the items from the ThreadLocal
(TL) list into their own TLs - ThreadLocalManager.set("findResource@"
+ entity.getId(), entity);
This seems odd since all these resources are removed and yet each one
is now being referenced from 2 TLs
c) List of all previously removed items is returned

4) removeCollection(ContentCollectionEdit edit)
c) control returns from getMembers with the full list of all items
that were just removed when clear() was called
d) prepares to remove the collection (which actually is empty) but
first checks that the list from getMembers is empty, it is not so this
causes an InconsistentException

5) removeCollection(String id)
e) Control passes back to this method where the exception is caught
and logged as the warning I am seeing. There does not seem to be a
clear indication that the actual removal has failed and processing
continues as if it had not failed.

It seems like the ThreadLocal caching mechanism is not updating correctly when objects are actually removed. I would recommend not
actually using a TL "cache" here and intead using a real cache but I am not really sure what the TL stuff is doing without digging in a
little deeper. I think using TLs as caches is not a good option.

I attempted to get around this by getting the collection again and removing it again but it results in the same actual failure since the
TL "cache" is still not cleared out. If I restart the server and then try to remove the collection it works, but I do not consider that
solution acceptable.

The fix is probably to update the public removeResource(String id) method so that it clears out the TL "cache" entries when they are
removed. Same thing for removeCollection(String id) since it is also called from the clear() method. Also, that exception which indicates failure to remove the collection should not be swallowed.

Attachments

1

Activity

Aaron Zeckoski November 11, 2007 at 8:31 AM

if this is still an issue then someone should reopen it and address it.

Aaron Zeckoski November 9, 2007 at 12:37 AM

We are not seeing the issue in our tests anymore. Can you give more details about how you are reproducing the error and provide a stacktrace/log?

Jim Eng November 8, 2007 at 11:45 PM

Still seeing this behavior after applying patch.

Mike Osterman November 8, 2007 at 9:53 PM

Merged to 2.5.x branch - r38058

Mike Osterman November 8, 2007 at 9:51 PM

Committed to 2.4.x branch - r38057

Fixed

Details

Priority

Affects versions

Fix versions

Assignee

Reporter

Environment

Firefox, OSX, Sakai trunk

Created November 6, 2007 at 4:48 AM
Updated July 30, 2009 at 8:05 AM
Resolved November 11, 2007 at 8:31 AM