Important Considerations While Migrating Content from a CMS to AEM Using Package Manager – Blog Series – Blog 1

23 / Jul / 2016 by Prabhdeep Puri 4 comments

Recently, I wrote scripts that convert the XML output of a source CMS into content XML that can be installed in AEM through AEM’s Package Manager.

Below are some of the scenarios and challenges I ran into while migrating to AEM from another CMS:

  • Node Name: The node name should be mapped from an XML element that is:
    1. Unique for every page at the same level of the hierarchy
    2. Never empty or null for any page
    3. Free, or nearly free, of special characters, and in particular free of the characters restricted in JCR names: “/”, “:”, “[”, “]”, “|”, “*”
    4. Not language-dependent: It shouldn’t contain characters specific to other languages. Preferably, it should be the same in all languages so that the page hierarchy matches exactly across all language copies.
  • Rich Text: Rich Text should be added to the content XML only after its special characters and tags have been HTML-escaped, because Package Manager reports errors for characters such as less than (<), greater than (>), double quote (") and ampersand (&). These must be encoded as &lt;, &gt;, &quot; and &amp; respectively.
  • Tags: Before adding any tags to the ‘cq:tags’ property in the content pages’ XML, make sure those tags already exist in the instance where the package will be installed. Otherwise, Package Manager removes from the pages every tag that is not found in the instance.
  • XML Formatting : I noticed that unformatted XML elements sometimes lead to unsuccessful installation through Package Manager. Thus, content XMLs should be formatted before installation.
  • DAM Migration Packages: While installing large content packages such as images and documents, keep the following in mind:
    1. Package size should be < 2 GB: Package Manager doesn’t allow a package larger than 2 GB, so image and document packages should be installed in batches. This also reduces the performance hit on the server.
    2. Modes in filter.xml: When multiple packages are installed under the same hierarchy, Package Manager overwrites the parent node, so child nodes installed by earlier packages are deleted or overwritten by the last installed package. This is because the default ‘mode’ for filters is ‘replace’. To merge or update the nodes under a hierarchy instead, set ‘mode’ to ‘merge’ or ‘update’ for the relevant filters in filter.xml.
      You can find more info about filter.xml and different modes here.
    3. Images
      • AEM provides an OOTB workflow named DAM Update Asset, which automatically creates renditions for images uploaded to DAM. If you already have renditions of the images from the previous CMS, disable the 2 launchers (for the 2 events, created and modified) of the DAM Update Asset Workflow before installing the migrated images through Package Manager or uploading them directly to DAM. This reduces the load on the server.
    4. Documents and PDFs :
      • When a PDF is uploaded to DAM, AEM launches the DAM Update Asset Workflow for page extraction, which in turn launches one instance of the DAM Update Asset Workflow for each page of the PDF. If individual pages are not required, or you already have them from the previous CMS, you should disable the 2 launchers for the DAM Update Asset Workflow.
      • Disabling the 2 launchers also helps when you already have thumbnails for those PDFs from the previous CMS.
    5. Launchers can be disabled from the Launchers tab at /libs/cq/workflow/content/console.html, as shown in the picture below.

AEM Workflow Launchers
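
The batching and filter-mode points under DAM Migration Packages can be sketched in Python. This is only an illustration: the helper names (batch_assets, filter_xml) and the example paths are hypothetical, and the sum of asset sizes is used as a rough stand-in for the size of the final package, which will differ once the package is zipped.

```python
from xml.sax.saxutils import quoteattr

# The 2 GB ceiling mentioned above; asset sizes only approximate package size.
MAX_PACKAGE_BYTES = 2 * 1024**3


def batch_assets(assets, limit=MAX_PACKAGE_BYTES):
    """Group (path, size_in_bytes) pairs into batches that stay under `limit`.

    Assets are taken in input order. An asset that alone reaches the limit
    still gets a batch of its own and needs separate handling.
    """
    batches, current, current_size = [], [], 0
    for path, size in assets:
        # Close the current batch before it would reach the limit.
        if current and current_size + size >= limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def filter_xml(roots, mode="merge"):
    """Render a filter.xml whose filters carry an explicit mode."""
    if mode not in ("replace", "merge", "update"):
        raise ValueError(f"unknown filter mode: {mode}")
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<workspaceFilter version="1.0">',
    ]
    for root in roots:
        lines.append(f'    <filter root={quoteattr(root)} mode="{mode}"/>')
    lines.append("</workspaceFilter>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical assets: (repository path, size in bytes).
    assets = [
        ("/content/dam/site/images/a.jpg", 900 * 1024**2),
        ("/content/dam/site/images/b.jpg", 800 * 1024**2),
        ("/content/dam/site/docs/c.pdf", 700 * 1024**2),
    ]
    for batch in batch_assets(assets):
        print(filter_xml(batch))
```

Each batch gets its own filter.xml with mode set to ‘merge’, so a later package installed under the same hierarchy should not wipe out nodes installed by an earlier one.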

  • Move/Rename Pages in Migrated Content:
    1. If, once the pages are migrated, you realize that some of them need to be moved or renamed, do it with Siteadmin’s Move/Rename feature, which updates all references to that page in other pages/components.
    2. In Classic UI mode, the Move/Rename feature updates a maximum of 150 references by default; if more references are found, it just moves/renames the page without updating any of them. In Touch UI, although references beyond 150 are not shown, all of them are still updated.
      This default value can be changed by:

      1. Classic UI: overlaying /libs/cq/ui/widgets/source/widgets/wcm/HeavyMoveDialog.js and changing the value of maxRefNo.
      2. Touch UI: changing the value of the maxreferences property in /libs/wcm/core/content/sites/movepagewizard/jcr:content/body/content/items/referencesStep/items/references
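
Returning to the Node Name and Rich Text points at the top of the list, here is a minimal Python sketch of how a migration script might derive node names and escape Rich Text before writing it into content XML. The function names and the fallback name "page" are hypothetical. The sanitising rule is also deliberately stricter than JCR requires: it keeps only ASCII lowercase letters, digits and hyphens.

```python
import re
import unicodedata
from xml.sax.saxutils import escape


def jcr_node_name(raw, fallback="page"):
    """Derive a conservative node name from a source value."""
    # Decompose accented letters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", raw or "")
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of other characters (including / : [ ] | *) into one hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    # An empty result (empty input, or text with no ASCII letters) gets the fallback.
    return name or fallback


def unique_node_names(raw_names):
    """Return names for sibling pages, adding -1, -2, ... where they collide."""
    used, result = set(), []
    for raw in raw_names:
        base = jcr_node_name(raw)
        name, suffix = base, 1
        while name in used:
            name = f"{base}-{suffix}"
            suffix += 1
        used.add(name)
        result.append(name)
    return result


def escape_rich_text(html):
    """Escape &, <, > and double quotes so the HTML can sit in an XML attribute."""
    return escape(html, {'"': "&quot;"})


if __name__ == "__main__":
    print(unique_node_names(["Über uns / Team [2016]", "News", "news"]))
    print(escape_rich_text('<p class="intro">Terms & Conditions</p>'))
```

Source values written entirely in a non-Latin script collapse to the fallback name here, which is one more reason to map the node name from a language-independent element.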

A follow-up post continues this one and outlines more considerations for migrating content to AEM. Here’s the second blog of the blog series.


comments (4)

  1. Content Migrator

    Wow. Hard work is quite evident. It saved lot of my time as you have captured so many points. Thanks Mr. Puri.

