Importing into an Adobe Experience Manager environment

This document describes the setup and usage of the AEM connector scripts. The connector was implemented to migrate content from a content management system of choice to AEM. It is capable of creating the required Unified Data Model content types in a running MongoDB instance and of loading the stored objects into the AEM environment. The extraction of content from the source system is outside the scope of this document. Content is imported using the Apache Sling servlet that AEM uses. With AEM's CRXDE Lite tool, the upload of the assets/pages with all their metadata can be monitored.

The workflow is as follows:

  • Set up the project
  • Set up Unified Data Model
  • Generate JSON files with the page information
  • Resolve the links inside the JSON files
  • Import images
  • Import binaries
  • Import parent pages
  • Import child pages

Prerequisites

In order to run the AEM connector scripts successfully, the following requirements must be met (a quick connectivity check is sketched after this list):

  • The scripts are written for Xill IDE 3.0
  • A MongoDB instance must be running locally on the standard port (127.0.0.1:27017)
  • AEM must be running locally via the AEM Quickstart jar file
  • The required components must be configured in AEM
  • The target site(s) must have been created in AEM
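
A minimal connectivity check for the two local services, assuming MongoDB on its standard port and the AEM Quickstart author instance on its default port 4502:

  import socket

  def port_open(host, port):
      # Returns True if a TCP connection to host:port succeeds.
      with socket.socket() as s:
          s.settimeout(2)
          return s.connect_ex((host, port)) == 0

  assert port_open("127.0.0.1", 27017), "MongoDB is not reachable"
  assert port_open("127.0.0.1", 4502), "AEM Quickstart is not reachable"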

Set up the project

Edit the 'settings.xill' robot so that it matches your project. Check the database name/port and the project path. Check that you have downloaded tidy.exe and put it in the 'tools' folder inside your project folder. Check that the site(s) are correctly defined in the 'settings.sites' object; a hypothetical illustration of these settings is sketched below.
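
The following sketch shows the kind of values 'settings.xill' is expected to hold. The actual robot is written in Xill; the keys and paths here are hypothetical and depend on your project:

  # Hypothetical settings, mirrored in Python for illustration only.
  settings = {
      "database": {"host": "127.0.0.1", "port": 27017, "name": "udm_project"},
      "projectPath": "C:/projects/aem-connector/",
      "tidy": "C:/projects/aem-connector/tools/tidy.exe",
      "sites": {
          # One entry per site that has been created in AEM.
          "corporate": {"root": "/content/corporate", "language": "en"},
      },
  }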

Make sure that the project folder has at least the following structure:

  > Extraction
     > current
  > Transformation
     > current

Set up Unified Data Model

By using the Unified Data Model, we make sure that every item of a given content type is handled the same way as all other items of that type. Every content type is predefined; for an AEM environment we only need a file and a folder content type. Every document in MongoDB needs at least a document decorator, and every type of document has its own collection of decorators. A file saved in MongoDB, for example, needs two decorators: a file decorator and a document decorator. The document decorator, which every entry in MongoDB has, keeps information about the asset such as the creator, creation date, last modifier, modification date, and title. Every other decorator has to be defined explicitly. Because we are working towards standardizing our work, these decorators need to be saved in a central place, so that everybody can reuse them for other projects as well.

By running the "doctypes.xill" robot, the content types are saved to MongoDB. First the decorators are saved; then the different content types are saved by combining the right decorators. A sketch of this step is shown below.
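
A minimal sketch of what "doctypes.xill" effectively does, assuming a local MongoDB on the standard port and hypothetical database and collection names ("udm_project", "decorators", "contenttypes"); the actual Unified Data Model schema is project-specific:

  from pymongo import MongoClient

  db = MongoClient("127.0.0.1", 27017)["udm_project"]  # hypothetical database name

  # The document decorator that every entry in MongoDB carries.
  document_decorator = {
      "name": "document",
      "fields": ["creator", "created", "modifier", "modified", "title"],
  }
  # Files additionally need a file decorator.
  file_decorator = {"name": "file", "fields": ["filename", "size", "mimetype"]}

  db.decorators.insert_many([document_decorator, file_decorator])

  # Content types combine the right decorators; AEM only needs these two.
  db.contenttypes.insert_many([
      {"name": "file", "decorators": ["document", "file"]},
      {"name": "folder", "decorators": ["document"]},
  ])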

Generate JSON files with the page information

Next, the pages are looped over, per site. Every page has a template; by attaching the template to the page, AEM knows how to build up the HTML for the page. Once the template is known, the content for the template is loaded and the page-specific values replace the placeholder texts. When this robot is done running, we are left with a collection of JSON files inside the projects_folder/Transformation/current/json/ folder, where each file represents a page inside AEM. A sketch of this step is shown below.
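
A minimal sketch of this generation step, assuming hypothetical collection and field names: each page document references a template whose content contains placeholders (written here as ${title}-style markers) that are replaced by the page-specific values:

  from pathlib import Path
  from string import Template
  from pymongo import MongoClient

  db = MongoClient("127.0.0.1", 27017)["udm_project"]  # hypothetical database name
  out_dir = Path("Transformation/current/json")
  out_dir.mkdir(parents=True, exist_ok=True)

  for page in db.pages.find({"site": "corporate"}):  # loop the pages per site
      template = db.templates.find_one({"name": page["template"]})
      # Replace the placeholder texts with the page-specific values.
      content = Template(template["content"]).safe_substitute(page["values"])
      (out_dir / (page["name"] + ".json")).write_text(content, encoding="utf-8")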

Resolve the links inside the JSON files

Because content may be located elsewhere on the target system than it was on the source system, we need to resolve all the links. For instance, if the images were not grouped inside one folder on the source system but are in the target system, the image URLs have to be rewritten, otherwise the images will not show on the target system. The same goes for pages, templates, binaries, etc. This robot fixes these links; a sketch is shown below.
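
A minimal sketch of the link resolving, assuming a hypothetical mapping from source URLs to target AEM paths (the real robot derives this mapping from the documents in MongoDB):

  from pathlib import Path

  # Hypothetical example mapping from source URLs to target paths.
  url_map = {
      "http://source.example.com/img/logo.png": "/content/dam/import/logo.png",
  }

  for json_file in Path("Transformation/current/json").glob("*.json"):
      text = json_file.read_text(encoding="utf-8")
      for source_url, target_path in url_map.items():
          text = text.replace(source_url, target_path)
      json_file.write_text(text, encoding="utf-8")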

Import images

This is a straightforward action: fetch the information about the image documents from MongoDB and post the files, together with that information, over REST to AEM. This way the images are stored as assets, so that AEM's DAM (Digital Asset Management) can be used. A hedged sketch of such an upload is shown below.
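
A minimal upload sketch, assuming the default author instance on localhost:4502 with the default admin credentials and a hypothetical target folder /content/dam/import. The createasset.html servlet is one common way to create DAM assets over REST; the robots may use a different endpoint:

  import requests

  AEM = "http://localhost:4502"
  AUTH = ("admin", "admin")  # default Quickstart credentials

  def upload_asset(local_path, file_name, dam_folder="/content/dam/import"):
      # POST the binary as a multipart upload; AEM creates the DAM asset.
      with open(local_path, "rb") as f:
          response = requests.post(
              AEM + dam_folder + ".createasset.html",
              auth=AUTH,
              files={"file": (file_name, f)},
          )
      response.raise_for_status()

  upload_asset("Extraction/current/images/logo.png", "logo.png")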

Import binaries

Just like the importing of the images, this robot is very straightforward: fetch the information about the binary files from MongoDB and upload the binaries together with the right information, so that they can be used from the DAM. The upload sketch shown above for images applies here as well.

Import parent pages

For the project in which we used these robots, we used blueprinting. This means that pages are created at a higher level as a 'skeleton'; these pages are the foundation of the actual content pages. We used parent pages, each of which holds the content for a separate language. In many countries people speak more than one language, and the owners of the website decided that each language should have its own pages per country. For example, in Belgium people speak French and/or Dutch, so for the Belgian web pages we used the French and Dutch skeleton pages. On top of that there was country-specific content, which was also imported on top of the skeleton copies.

Before you can import the parent pages, make sure that you have created the site for the pages: you need a copy of the parent language site. Clean up the page package and prepare it to import the parent pages. Now you can use the robot to insert the pages, which are based on the JSON files that were created by the first of these robots. A sketch of how a single page can be posted is shown below.
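
A minimal sketch of posting one page through the Sling POST servlet, assuming the default author instance and a hypothetical structure for the JSON files produced earlier (real page JSON will carry more properties, such as the component content):

  import json
  import requests

  AEM = "http://localhost:4502"
  AUTH = ("admin", "admin")  # default Quickstart credentials

  def import_page(json_path):
      with open(json_path, encoding="utf-8") as f:
          page = json.load(f)  # hypothetical shape: {"path": ..., "title": ..., "template": ...}
      # The Sling POST servlet creates the cq:Page node and its jcr:content child.
      requests.post(
          AEM + page["path"],
          auth=AUTH,
          data={
              "jcr:primaryType": "cq:Page",
              "jcr:content/jcr:primaryType": "cq:PageContent",
              "jcr:content/jcr:title": page["title"],
              "jcr:content/cq:template": page["template"],
          },
      ).raise_for_status()

  import_page("Transformation/current/json/home.json")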

Import child pages

Now that the parent pages are loaded, it is time to load the country-specific pages. Different countries need different content: not only may people speak multiple languages, there can also be country-specific content, for example because of different laws and rules per country. So, again, copy an already loaded site as a new child set, clean this set up, and execute the last robot, which imports the child pages. Because of the link resolving done earlier, we already know where the binaries and images ended up, so they should now be properly linked to the content. With that, the website has been correctly imported into AEM!