...

  1. Starting Migration
    • Check whether the migration has been started before by counting the rows in MIGRATE_CHS_CONTENT_TO_JCR. If the table contains any rows, the migration was started previously.
    • If the migration is starting for the first time, the existing CHS data is added to the table.
      • The COLLECTION_IDs from CONTENT_COLLECTION are added to MIGRATE_CHS_CONTENT_TO_JCR with a status of 0 and an event type of ORIGINAL_MIGRATION.
      • The RESOURCE_IDs from CONTENT_RESOURCE are added to MIGRATE_CHS_CONTENT_TO_JCR with a status of 0 and an event type of ORIGINAL_MIGRATION.
  2. During Migration
    Each round of data migration starts a TimerTask, which fetches n unfinished items from the MIGRATE_CHS_CONTENT_TO_JCR table and copies them to the JCR repository. All timer tasks share one Timer, and a task does not start until the previous one finishes. A configurable delay time t specifies how long to wait between batches.
    • Fetch the next n unfinished items from the MIGRATE_CHS_CONTENT_TO_JCR table.
    • For each item:
      • If the item is a ContentCollection and the event type is ORIGINAL_MIGRATION, content.add, or content.write, copy the ContentCollection to JCR. If the collection already exists in JCR, do not delete and re-add it; just overwrite the metadata properties and remove any properties that are not in the source collection.
      • If the item is a ContentCollection and the event type is content.delete, remove the collection node from JCR. If the collection was later re-added in Resources, the content.add event for it will be further down the queue, so the collection will be recreated when that event is processed.
      • If the item is a ContentResource and the event type is ORIGINAL_MIGRATION, content.add, or content.write, we delete the file node in JCR and recreate it by copying the resource over from CHS. This differs from the ContentCollection case, where we do not remove the node before recreating it: a collection is a folder, and we do not want to destroy the files and folders inside it. A resource file never has children. (A pure JCR repository would allow a file node to have children, but the original ContentHosting models nothing like this.)
      • If the item is a ContentResource and the event type is content.delete, then we delete the file node from JCR completely.
      • After operating on the item, we update its row in MIGRATE_CHS_CONTENT_TO_JCR and set the status to 1 (finished).
    • After finishing all the content items in the batch, we reschedule this TimerTask, setting the delay to the configurable batch delay t.
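
The two steps above can be sketched in Java roughly as follows. This is a minimal in-memory illustration, not the actual migration code: QueueRow, MigrationSketch, and their methods are hypothetical names, plain maps stand in for CHS, the JCR repository, and the MIGRATE_CHS_CONTENT_TO_JCR table, and collection IDs are assumed to end in "/" so that a folder's contents share its ID as a prefix. Because java.util.Timer rejects a TimerTask instance that has already been scheduled, the sketch creates a fresh task for each round rather than reusing one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.TreeMap;

/** Hypothetical stand-in for one row of MIGRATE_CHS_CONTENT_TO_JCR. */
final class QueueRow {
    final String contentId;
    final boolean collection;  // true = ContentCollection, false = ContentResource
    final String eventType;    // ORIGINAL_MIGRATION, content.add, content.write, content.delete
    int status = 0;            // 0 = unfinished, 1 = finished

    QueueRow(String contentId, boolean collection, String eventType) {
        this.contentId = contentId;
        this.collection = collection;
        this.eventType = eventType;
    }
}

final class MigrationSketch {
    final List<QueueRow> queue = new ArrayList<>();                // the queue table
    final Map<String, Map<String, String>> chs = new TreeMap<>();  // source: id -> properties
    final Map<String, Map<String, String>> jcr = new TreeMap<>();  // target: id -> properties

    /** Step 1: seed the queue only if it has no rows yet. Returns true if seeded. */
    boolean startMigration(List<String> collectionIds, List<String> resourceIds) {
        if (!queue.isEmpty()) {
            return false; // rows exist, so the migration was started previously
        }
        for (String id : collectionIds) queue.add(new QueueRow(id, true, "ORIGINAL_MIGRATION"));
        for (String id : resourceIds) queue.add(new QueueRow(id, false, "ORIGINAL_MIGRATION"));
        return true;
    }

    /** Step 2: process the next n unfinished rows in queue order. Returns how many were handled. */
    int runBatch(int n) {
        int handled = 0;
        for (QueueRow row : queue) {
            if (handled == n) break;
            if (row.status != 0) continue;
            apply(row);
            row.status = 1; // mark the row finished
            handled++;
        }
        return handled;
    }

    private void apply(QueueRow row) {
        String id = row.contentId;
        if ("content.delete".equals(row.eventType)) {
            if (row.collection) {
                // Assumption: removing a folder node also removes everything under it.
                jcr.keySet().removeIf(path -> path.startsWith(id));
            } else {
                jcr.remove(id);
            }
            return;
        }
        // Assumption: a source item that no longer exists is treated as having no properties.
        Map<String, String> src = chs.getOrDefault(id, Collections.emptyMap());
        if (row.collection) {
            // Keep the existing node (children are separate entries); sync only its properties.
            Map<String, String> node = jcr.computeIfAbsent(id, k -> new TreeMap<>());
            node.keySet().retainAll(src.keySet());
            node.putAll(src);
        } else {
            // A resource never has children, so delete and recreate it.
            jcr.remove(id);
            jcr.put(id, new TreeMap<>(src));
        }
    }

    /** Runs one batch after delayMillis, then schedules the next round on the same Timer. */
    void schedule(Timer timer, int n, long delayMillis) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // Assumption: stop once a batch finds nothing left to do.
                if (runBatch(n) > 0) schedule(timer, n, delayMillis);
                else timer.cancel();
            }
        }, delayMillis);
    }
}
```

Scheduling the next round only from inside the running task means a new batch cannot begin before the previous one has finished, which matches the one-Timer behaviour described above.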

Edge cases

What if the server crashes?

Suppose the server crashes during a batch of copies. When it starts up again, the copy that was in progress is still marked 0 in the table, so it is picked up again in a later batch. The copier always handles the case where the node already exists in JCR: it simply overwrites the node and continues.
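
As a rough illustration of why this is safe, the sketch below simulates a crash between copying an item and marking its row finished. CrashRecoverySketch and its fields are hypothetical, and plain maps stand in for the status column, CHS, and JCR. The point is only that repeating an overwrite leaves the target in the same state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CrashRecoverySketch {
    final Map<String, Integer> status = new LinkedHashMap<>(); // item id -> 0 (unfinished) or 1 (finished)
    final Map<String, String> source = new LinkedHashMap<>();  // stand-in for CHS content
    final Map<String, String> jcr = new LinkedHashMap<>();     // stand-in for the JCR repository

    /**
     * Copies every unfinished item. If crashAfterCopyOf names an item, the run
     * stops right after copying it and before its row is marked finished.
     */
    void run(String crashAfterCopyOf) {
        for (Map.Entry<String, Integer> row : status.entrySet()) {
            if (row.getValue() != 0) continue;
            String id = row.getKey();
            jcr.put(id, source.get(id));             // overwrites the node if it already exists
            if (id.equals(crashAfterCopyOf)) return; // simulated crash
            row.setValue(1);                         // mark finished
        }
    }
}
```

Calling run("a") and then run(null) copies item a twice, but the repository ends up in the same state as after a single uninterrupted run.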

...