
Detailed Description of Algorithm

IN PROGRESS


Each row in the MIGRATE_CHS_CONTENT_TO_JCR table contains the CONTENT_ID, the Status, and the Event Type. The table must be read as a timeline: any row further down the table models a content event that occurred after the rows above it, so the rows must be processed in order from top to bottom.

Status       Code
Not Started  0
Finished     1

Those are the only two statuses. If we add the ability to run the migration in parallel on multiple cluster nodes, we will certainly have to add more status types.

Event Type          Meaning
ORIGINAL_MIGRATION  The row was part of the original table copy.
content.add         The row was added to the migration table as the result of receiving a content.add event.
content.write       Same as content.add, but for a content.write event.
content.delete      Same as content.add, but for a content.delete event.
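
To make the table semantics concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the migration code itself). The names `MigrationRow`, `Status`, `seq`, and `pending_in_order` are hypothetical and introduced for illustration. Only the status codes and event type strings come from the description above, and the sketch assumes that a row's position in the table can be captured by a sequence number.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    NOT_STARTED = 0
    FINISHED = 1


EVENT_TYPES = ("ORIGINAL_MIGRATION", "content.add", "content.write", "content.delete")


@dataclass
class MigrationRow:
    seq: int          # hypothetical: position of the row in the table
    content_id: str
    status: Status
    event_type: str

    def __post_init__(self):
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type}")


def pending_in_order(rows):
    """Return the unfinished rows, oldest event first."""
    return sorted(
        (r for r in rows if r.status == Status.NOT_STARTED),
        key=lambda r: r.seq,
    )


# A collection that was copied originally, then deleted, then added again.
rows = [
    MigrationRow(3, "/group/a/", Status.NOT_STARTED, "content.add"),
    MigrationRow(1, "/group/a/", Status.FINISHED, "ORIGINAL_MIGRATION"),
    MigrationRow(2, "/group/a/", Status.NOT_STARTED, "content.delete"),
]
ordered = pending_in_order(rows)
```

Here `ordered` holds the delete (row 2) before the add (row 3), which is why processing in table order matters: replaying them the other way round would leave the collection deleted.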

  1. Starting Migration
    • Check to see if migration has ever started before. This is done by counting the rows in MIGRATE_CHS_CONTENT_TO_JCR. If there are any rows in the table it means the migration has been started previously.
    • If the migration is starting for the first time, the existing CHS data is added to the table.
      • The COLLECTION_IDs from CONTENT_COLLECTION are added to MIGRATE_CHS_CONTENT_TO_JCR with a status of 0 and an event type of ORIGINAL_MIGRATION.
      • The RESOURCE_IDs from CONTENT_RESOURCE are added to MIGRATE_CHS_CONTENT_TO_JCR with a status of 0 and an event type of ORIGINAL_MIGRATION.
  2. During Migration
    Each round of data migration starts a TimerTask, which fetches N unfinished items from the MIGRATE_CHS_CONTENT_TO_JCR table and copies them to the JCR repository. The timer tasks all share one Timer, and a task does not start until the previous one finishes. A configurable delay t specifies how long to wait between batches.
    • Fetch the next N unfinished items from the MIGRATE_CHS_CONTENT_TO_JCR table.
    • For each item:
      • If the item is a ContentCollection and the event type is ORIGINAL_MIGRATION, content.add, or content.write, copy the ContentCollection to JCR. If the collection already exists in JCR, do not delete and re-add it; just overwrite the metadata properties and remove any properties that are not in the source collection.
      • If the item is a ContentCollection and the event type is content.delete, remove the collection node from JCR. If the collection was later re-added in Resources, the content.add event for it will be further down the queue, so it will be recreated when that row is processed.
      • If the item is a ContentResource and

LEFT_OFF_HERE
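
The steps described so far can be sketched as follows. This is a rough illustration in Python with SQLite, not the actual implementation: the `ID`, `STATUS`, and `EVENT_TYPE` column names, the dict-based stand-ins for CHS (`chs`) and the JCR repository (`jcr`), the rule that collection ids end with "/", and all function names are assumptions. ContentResource handling is deliberately left out because it is not described above yet.

```python
import sqlite3
import time

NOT_STARTED, FINISHED = 0, 1
QUEUE = "MIGRATE_CHS_CONTENT_TO_JCR"


def start_migration(db):
    """Seed the queue on the first run; return True if seeding happened."""
    (count,) = db.execute(f"SELECT COUNT(*) FROM {QUEUE}").fetchone()
    if count > 0:
        return False  # rows exist, so the migration was started before
    for table, column in (("CONTENT_COLLECTION", "COLLECTION_ID"),
                          ("CONTENT_RESOURCE", "RESOURCE_ID")):
        db.execute(
            f"INSERT INTO {QUEUE} (CONTENT_ID, STATUS, EVENT_TYPE) "
            f"SELECT {column}, ?, 'ORIGINAL_MIGRATION' FROM {table}",
            (NOT_STARTED,),
        )
    db.commit()
    return True


def migrate_batch(db, jcr, chs, n):
    """Process the next n unfinished rows in table order; return the row count."""
    rows = db.execute(
        f"SELECT ID, CONTENT_ID, EVENT_TYPE FROM {QUEUE} "
        "WHERE STATUS = ? ORDER BY ID LIMIT ?",
        (NOT_STARTED, n),
    ).fetchall()
    for row_id, content_id, event_type in rows:
        if not content_id.endswith("/"):  # assumption: collection ids end with "/"
            raise NotImplementedError("ContentResource handling is not sketched")
        if event_type == "content.delete":
            jcr.pop(content_id, None)
        elif content_id in chs:  # assumption: skip if the source is already gone
            node = jcr.setdefault(content_id, {})  # keep an existing node in place
            node.clear()                           # drop properties not in the source
            node.update(chs[content_id])           # overwrite the metadata properties
        db.execute(f"UPDATE {QUEUE} SET STATUS = ? WHERE ID = ?", (FINISHED, row_id))
    db.commit()
    return len(rows)


def run_batches(db, jcr, chs, n, t):
    """Run batches one after another, waiting t seconds between them."""
    while migrate_batch(db, jcr, chs, n) > 0:
        time.sleep(t)


db = sqlite3.connect(":memory:")
db.executescript(f"""
    CREATE TABLE CONTENT_COLLECTION (COLLECTION_ID TEXT);
    CREATE TABLE CONTENT_RESOURCE (RESOURCE_ID TEXT);
    CREATE TABLE {QUEUE} (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        CONTENT_ID TEXT, STATUS INTEGER, EVENT_TYPE TEXT);
    INSERT INTO CONTENT_COLLECTION VALUES ('/group/a/'), ('/group/b/');
    INSERT INTO CONTENT_RESOURCE VALUES ('/group/a/notes.txt');
""")
chs = {"/group/a/": {"title": "A"}, "/group/b/": {"title": "B"}}
jcr = {"/group/a/": {"title": "old", "stale": "x"}}

seeded = start_migration(db)
processed = migrate_batch(db, jcr, chs, 2)
```

The sequential loop in `run_batches` stands in for the single Timer: a batch cannot start before the previous one returns, and `t` is the delay between batches. The demo calls `migrate_batch` once with n = 2 so that only the two collection rows are processed and the resource row stays unfinished.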

Edge cases

What if the server crashes?

Testing Migration Integrity

To test the integrity of the migration, a random sample of files and folders will be chosen for comparison. For each sampled item we will fetch the ContentCollection or ContentResource from both implementations, then compare the properties along with some of the methods that determine properties, such as conditional release. The byte streams will occasionally be compared as well, but perhaps less often, depending on how long each comparison takes.
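
One possible shape for such a check is sketched below in Python. It is only an illustration under stated assumptions: `check_sample`, `byte_check_rate`, and the dict layout with "props" and "bytes" keys are hypothetical stand-ins for fetching items from the two implementations, and comparing method results such as conditional release is not covered.

```python
import random


def check_sample(ids, chs, jcr, sample_size, byte_check_rate=0.25, rng=None):
    """Compare a random sample of items; return the ids whose copies differ."""
    if rng is None:
        rng = random.Random()
    candidates = sorted(ids)
    mismatches = []
    for content_id in rng.sample(candidates, min(sample_size, len(candidates))):
        old, new = chs[content_id], jcr.get(content_id)
        if new is None or old["props"] != new["props"]:
            mismatches.append(content_id)
        # Byte streams are compared only for a fraction of the sampled items.
        elif rng.random() < byte_check_rate and old.get("bytes") != new.get("bytes"):
            mismatches.append(content_id)
    return mismatches


chs = {
    "/group/a/": {"props": {"title": "A"}},
    "/group/a/x.txt": {"props": {"title": "x"}, "bytes": b"hello"},
    "/group/a/y.txt": {"props": {"title": "y"}, "bytes": b"same"},
}
jcr = {
    "/group/a/": {"props": {"title": "A"}},
    "/group/a/x.txt": {"props": {"title": "x"}, "bytes": b"hellO"},
    "/group/a/y.txt": {"props": {"title": "y"}, "bytes": b"same"},
}
bad = check_sample(chs.keys(), chs, jcr, sample_size=3,
                   byte_check_rate=1.0, rng=random.Random(0))
```

With `byte_check_rate=1.0` every sampled item also has its bytes compared, so the altered file is reported; a lower rate trades coverage for speed, which matches the idea of comparing byte streams less often.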