Introduction

We use the Xill Project Convention to structure projects, which makes it easy to drop in and reuse a ready-made connector. The convention prescribes a certain folder setup, so a connector can only be used this way if it follows that structure.

  1. Structuring a connector:
    1. General default settings for the connector are stored in defaults.properties in the connector root
    2. API Access is done in a system specific, reusable script (API.xill)
    3. Common functions are kept in a system specific, extensible library (Commons.xill)
    4. An extraction script to get the data out of the source system
    5. A transformation script that applies the business rules and/or enrichment (this script Mapper.xill should be generated using the Mapper library)
    6. An import script to get data into the system
    7. A demo/ folder that includes the scripts, settings, mapping templates and content type definitions for an example project
      1. A defaults.properties file which contains all settings needed for a complete run
      2. Bots that run the demo project are present in the demo root (Export.xill, Import.xill, Prepare.xill and Run.xill)
      3. ContentTypes are defined in a separate file UDMContentTypes.xill in a config/ folder
      4. Mapping templates are defined in a config/export/mapping/ or config/import/mapping/ folder

In a concurrent connector you will have:

  1. An export script using Concurrency.run() to configure the bots that do the actual work:
    1. A provider that extracts the data from the source system 
    2. A transformer that maps the source documents to UDM using standard decorators and unified content types
    3. A consumer that puts the transformed documents into UDM
  2. A transformation script that applies the business rules and/or enrichment (Mapper.xill)
  3. An import script using Concurrency.run() to configure the bots that do the actual work:
    1. A provider that extracts the data from UDM
    2. A transformer that maps the documents to the target system using custom content types
    3. A consumer that puts the transformed documents into the target system
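
The provider, transformer and consumer roles listed above can be illustrated outside Xill. The following is a minimal Python sketch of the pattern only; it does not show how Concurrency.run() works internally. All function names, the sentinel and the sample document shape are hypothetical.

```python
from queue import Queue
from threading import Thread

_DONE = object()  # sentinel that marks the end of the stream


def provider(out_queue, source_items):
    """Extract items from the (hypothetical) source system."""
    for item in source_items:
        out_queue.put(item)
    out_queue.put(_DONE)


def transformer(in_queue, out_queue):
    """Map every source item to a target structure."""
    while True:
        item = in_queue.get()
        if item is _DONE:
            out_queue.put(_DONE)
            return
        out_queue.put({"document": {"title": item["title"].strip()}})


def consumer(in_queue, store):
    """Put every transformed item into the (hypothetical) target store."""
    while True:
        item = in_queue.get()
        if item is _DONE:
            return
        store.append(item)


def run_pipeline(source_items):
    """Run one provider, one transformer and one consumer concurrently."""
    extracted, transformed, store = Queue(), Queue(), []
    workers = [
        Thread(target=provider, args=(extracted, source_items)),
        Thread(target=transformer, args=(extracted, transformed)),
        Thread(target=consumer, args=(transformed, store)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return store
```

Each stage only knows its input and output queue, which is what makes the three bots independently replaceable.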

Usage

Let's first explain how we intend to use a standard connector and what steps we need to follow to do a successful import/export. Then we explain how to build the connector.

To use a connector which was built for the Xill Project Convention, we need several parts in place:

  1. Create a project folder using the empty-project zip from the Xill Project Convention page
  2. Put the connector into connector/[connector-name] in your project
  3. Download the necessary libraries from the corresponding GitHub repositories (currently util, mapper and decorators) and put them in lib/ in your project
  4. Set source/target addresses and other necessary constants in xill.properties in your project root
  5. Define the UDM content types in config/UDMContentTypes.xill
  6. Define the export mappings you want to use in separate files (one per content type in UDM) in config/export/mapping/
  7. Define the import mappings you want to use in separate files (one per content type in the target system) in config/import/mapping/
  8. By running lib/mapper/build.xill, generate a Mapper.xill robot which includes all mappings and defines two functions for the connector to use (mapSourceToTargetType() and mapObject())
  9. Create an Export.xill, Import.xill and Prepare.xill bot in project/
  10. Create a bot Main.xill that runs the entire project in your project root
  11. Follow the README.md delivered with the connector to set up your source/target system
  12. Run the project
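
Before running the project, it can help to check that the parts named in the steps above are in place. The following Python sketch is only an illustration and is not part of the Xill Project Convention; the function name missing_parts is hypothetical, and the list of paths is an assumption taken from the steps above.

```python
from pathlib import Path


def missing_parts(project_root, connector_name):
    """Return the expected paths that do not exist yet, relative to the root."""
    root = Path(project_root)
    expected = [
        Path("connector") / connector_name,       # step 2
        Path("lib"),                              # step 3
        Path("xill.properties"),                  # step 4
        Path("config") / "UDMContentTypes.xill",  # step 5
        Path("config") / "export" / "mapping",    # step 6
        Path("config") / "import" / "mapping",    # step 7
    ]
    return [path.as_posix() for path in expected if not (root / path).exists()]
```

An empty list means that every expected folder and file was found.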

defaults.properties

In this course module we will build a connector which enables us to extract files from a Drupal server. First we need a file to centralise our settings, to make the scripts more configurable and reusable. To achieve this we use the Properties plugin. 

The Properties plugin makes it possible to create a hierarchy of properties, with defaults that can be overridden using a single xill.properties file in the root of your project. You should ALWAYS have a xill.properties file where you set your project properties. Only if you develop a connector or other library should you use a defaults.properties file to provide usable defaults for that library.
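
The override idea can be modelled in a few lines. This is a Python sketch of the concept only, assuming a simple "project value wins over default" rule; it is not the Properties plugin's actual resolution logic, and the override value custom/mapping/ is made up for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def resolve(defaults_text, project_text):
    """Merge both files; project properties win over connector defaults."""
    merged = parse_properties(defaults_text)
    merged.update(parse_properties(project_text))
    return merged


DEFAULTS = """
#connector defaults
connector.export.mappingPath=config/export/mapping/
connector.import.mappingPath=config/import/mapping/
"""

PROJECT = """
# project override (hypothetical value)
connector.export.mappingPath=custom/mapping/
"""

settings = resolve(DEFAULTS, PROJECT)
print(settings["connector.export.mappingPath"])  # prints custom/mapping/
print(settings["connector.import.mappingPath"])  # prints config/import/mapping/
```

The second property is not overridden, so the connector default stays in effect.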

An example defaults.properties file would look something like this:

#connector defaults
connector.export.mappingPath=config/export/mapping/
connector.import.mappingPath=config/import/mapping/
connector.export.mapper=config/export/Mapper.xill
connector.import.mapper=config/import/Mapper.xill


This file contains the minimum settings the connector needs to function at all. For a project to work, you will need a few more properties. The corresponding xill.properties file, which is included in the demo/ folder and which you would copy into your project root folder, would look something like this:

# Xill Drupal Export demo
#
# This file specifies custom settings for this demo.
 
#udm
connector.drupal.demo.db.name=udm_demo_drupal_aem
connector.drupal.demo.udm.identity=demo_drupal_aem
 
#paths
connector.drupal.demo.exportPath=export/
connector.drupal.demo.tempPath=temp/
connector.drupal.demo.binaryPath=data/binaries/
connector.drupal.demo.imagePath=data/images/
 
#export
connector.drupal.demo.source.url=http://192.168.0.140/drupal7/
 
#import
connector.drupal.demo.target.url=http://localhost:4502
connector.drupal.demo.target.user=admin
connector.drupal.demo.target.password=fake
 
#mapper
connector.drupal.demo.mappingPath=connector/drupal/demo/config/export/mapping/
connector.drupal.demo.mapper=config/export/Mapper.xill

API.xill

The main (and 100% reusable) part of a connector is the API robot:

  

/**
 * Project: Drupal 7 REST Connector
 * Author: Titus Nachbauer
 * Date: 2016-06-15
 *
 * API.xill provides functions for handling any API calls towards Drupal
 * This first draft only implements GET requests, and no authentication is supported!
 * Also it does not support paging yet, so if there are more nodes than fit on
 * one page, they will not be returned.
 */
 
use System, XURL, Assert, File, Stream;
 
//-------------------------------------------------------------------------
//                        FUNCTIONS
//-------------------------------------------------------------------------
 
/**
* Returns a list containing all nodes from Drupal of a certain type.
* Does not consider paging, so if there are more nodes than fit on
* one page, they will not be returned.
*/
function getAll(server, type) {
    var response = get(server, type :: ".json", {});
    var nodes = [];
     
    foreach (node in response.body.list) {
        nodes[] = node;
    }
    return nodes;
}
 
/**
* Downloads a file from Drupal with a maximum size of 100MB
*/
function downloadFile(url, downloadPath) {
    System.print("Downloading: " :: url :: " to " :: downloadPath);
    var response = XURL.get(url, {"responseContentType" : "stream"});
    var file = File.openWrite(downloadPath);
    //write file, limited to 100 MB
    Stream.write(response.body, file, 100000000);
}
     
/**
* Returns an object from Drupal, which might be a content node or a user
* type specifies the object's type.
*/
function getObject(server, type, objectId) {
    return get(server, type :: "/" :: objectId :: ".json", {});
}
 
/**
* Returns the authentication cookie for user with password on server
*/
function authenticate(server, user, password) {
    var body = {
            "headers": {
                "Content-Type": "application/json"
            },
            "data": {
                "username": user,
                "password": password
            }
    };
 
    var response = post(server, "user/login", body, {});
    return response;
}
 
/**
* Performs a GET request to the Drupal Server.
* Throws an error if the get request is not successful.
*/
private function get(server, url, options) {
    var requestUrl = server.apiUrl :: url;
    var result = XURL.get(requestUrl, options);
     
    Assert.equal(
        result.status.code,
        200,
        "Call to GET:" :: requestUrl :: " resulted in " :: result.status.code :: ": " :: result.status.phrase
    );
     
    return result;
}
 
/**
* Performs a POST request to the Drupal Server.
* Throws an error if the POST request is not successful.
*/
private function post(server, url, body, options) {
    var requestUrl = server.apiUrl :: url;
    var result = XURL.post(requestUrl, body, options);
     
    Assert.equal(
        result.status.code,
        200,
        "Call to POST:" :: requestUrl :: " resulted in " :: result.status.code :: ": " :: result.status.phrase
    );
     
    return result;
}

This library encapsulates all calls to Drupal and adds some error handling. Our code should never access the Drupal REST API in any other way.

You might notice a new call in the get() and post() functions. This is the one from post():

Assert.equal(
        result.status.code, 
        200, 
        "Call to POST:" :: requestUrl :: " resulted in " :: result.status.code :: ": " :: result.status.phrase
    );

Here we use the Assert plugin package to check the result of the call. You should use Assert sparingly in production code, but it is very useful for preventing fatal errors and for checking conditions that should always be true. Using Assert in a library forces the programmer who uses the library either to enclose the call to the library function in a do/fail block or to make sure that the error condition never occurs.
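
The division of responsibility between library and caller can be sketched as follows. This is a Python analogy, not Xill: try/except stands in for a do/fail block, and ApiError, assert_equal, get_or_default and the fetch callable (which returns a status code and a body) are all hypothetical names.

```python
class ApiError(Exception):
    """Raised when a call does not return the expected status."""


def assert_equal(actual, expected, message):
    """Fail loudly when a condition that should always hold does not."""
    if actual != expected:
        raise ApiError(message)


def get(fetch, request_url):
    """Library side: any status other than 200 is treated as an error."""
    status, body = fetch(request_url)
    assert_equal(
        status,
        200,
        "Call to GET:" + request_url + " resulted in " + str(status),
    )
    return body


def get_or_default(fetch, request_url, default=None):
    """Caller side: guard the library call, as a do/fail block would."""
    try:
        return get(fetch, request_url)
    except ApiError as error:
        print("Request failed: " + str(error))
        return default
```

The library stays strict, and each caller decides whether a failed request should stop the run or be handled.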

UDMContentTypes.xill

The content type definitions are kept in a separate file in the config/ folder of a project. When creating a connector, you have to provide an example of this configuration in the demo/config/ folder:

 

/**
 * Project: Drupal 7 REST connector
 * Author: Titus Nachbauer
 * Date: 2016-06-23
 *
 * UDMContentTypes.xill creates the needed content types in UDM. Run before project/Prepare.xill
 */
 
use ContentType, Properties;
 
include lib.decorators.StandardDecorators;
 
//-------------------------------------------------------------------------
//                        SETUP
//-------------------------------------------------------------------------
 
var identity = Properties.get("udm.identity");
 
var migration = {
     "sourceSystem" : {
          "type" : "STRING",
          "required" : false
     },
      
     "targetSystem" : {
         "type" : "STRING",
         "required" : false
     },
      
     "action" : {
          "type" : "STRING",
          "required" : false
     },
      
     "timestamp" : {
          "type" : "DATE",
          "required" : false
     }
};
 
var user = {
    "id" : {
        "type" : "NUMBER",
        "required" : true
    },
     
    "firstName" : {
        "type" : "STRING",
        "required" : false
    },
     
    "lastName" : {
        "type" : "STRING",
        "required" : false
    },
     
    "name" : {
        "type" : "STRING",
        "required" : true
    },
     
    "groupId" : {
        "type" : "STRING",
        "required" : false
    },
     
    "email": {
        "type" : "STRING",
        "required" : false
    }
};
 
var web = {
    "content" : {
        "type" : "LIST",
        "required" : true
    }
};
 
var custom = {
    "industry" : {
        "type" : "STRING",
        "required" : true
    },
    "document_type" : {
        "type" : "STRING",
        "required" : true
    }
};
 
//-------------------------------------------------------------------------
//                        MAIN
//-------------------------------------------------------------------------
 
registerStandardDecorators(identity);
registerContentTypes(identity);
 
//-------------------------------------------------------------------------
//                        FUNCTIONS
//-------------------------------------------------------------------------
 
 
function registerContentTypes(identity) {
    ContentType.decorator("migration", migration, identity);
    ContentType.decorator("user", user, identity);
    ContentType.decorator("web", web, identity);
    ContentType.decorator("custom", custom, identity);
    ContentType.save("Asset", [
        "file", "mimeType", "created", "modified",
        "parent", "document", "migration", "custom"], identity);
    ContentType.save("Article", [
        "web", "document", "created", "modified",
        "parent", "migration"], identity);
    ContentType.save("User", ["user", "migration"], identity);
}


Generally, decorators should be short (only a few fields) and should contain no system specific fields. Any fields that really are system specific will be put into a special decorator that has the name of the system. The content types will differ largely between connectors, but over time you should see reusable patterns in connectors as well as more or less universal content types.

A library of standard decorators is available in the decorators library, which you must download when you start a new project: https://github.com/xillio/decorators

Once the UDM content types have been defined, you can set them up by running the Prepare.xill robot to create them, provide a UDM connection and define some indexes:

 

/**
 * Project: Drupal 7 REST Connector
 * Author: Titus Nachbauer
 * Date: 2016-06-23
 *
 * Prepare.xill prepares project specific set up, like the indexes in MongoDB
 */
use Mongo, Properties;
 
include connector.drupal.Commons;
 
//-------------------------------------------------------------------------
//                        MAIN
//-------------------------------------------------------------------------
 
callbot ("connector/drupal/demo/config/UDMContentTypes.xill");
 
var dbName = Properties.get("connector.drupal.demo.db.name");
 
var database = getUDMConnection(dbName);
 
createIndexes(database);
 
//-------------------------------------------------------------------------
//                        FUNCTIONS
//-------------------------------------------------------------------------
 
 
function createIndexes(database){
    // parent path should be indexed for quick resolving of parent-child relationships. it is not unique.
    Mongo.createIndex("documents", {'source.current.parent.path' : 1}, {"unique" : false, "background" : true}, database);
    Mongo.createIndex("documents", {'contentType' : 1}, {"background" : true}, database);
    Mongo.createIndex("documents", {'source.current.parent.id' : 1}, {"background" : true}, database);
    Mongo.createIndex("documents", {'source.current.file.path' : 1}, {"background" : true}, database);
    Mongo.createIndex("documents", {'source.current.migration.action' : 1}, {"background" : true}, database);
    Mongo.createIndex("documents", {'source.current.migration.sourceSystem' : 1}, {"background" : true}, database);
    Mongo.createIndex("documents", {'source.current.migration.targetSystem' : 1}, {"background" : true}, database);
}


You should generally create indexes in Mongo for any field that you are going to use for lookups. In the example above, that includes the source and target system fields, the contentType and the migration action, which might not be obvious at first glance.

Extract from Source System

To run the actual extraction, we run or call the Export.xill robot, which in turn calls the Prepare.xill, ExportNodes.xill and DownloadBinaries.xill bots:

/**
 * Project: Drupal 7 REST Connector
 * Author: Titus Nachbauer
 * Date: 2016-06-01
 *
 * Export.xill exports all users and content from Drupal
 * BEFORE FIRST RUN:
 * - Make sure all necessary library and connector bots are present in lib/ and connector/
 * - Configure server and folder settings in xill.properties in the project root
 * - Create a mapping robot for every ContentType in the mappings/ folder with the name
 *   [ContentType].xill
 */
 
use System, Properties;
 
include connector.drupal.Commons;
 
//-------------------------------------------------------------------------
//                        MAIN
//-------------------------------------------------------------------------
 
var database = getUDMConnection(Properties.get("connector.drupal.demo.db.name"));
 
//Clear documents for demo purposes, remove this line when importing from multiple sources
clearUDMDocuments(database);
 
callbot ("connector/drupal/demo/Prepare.xill");
callbot ("connector/drupal/export/ExportNodes.xill", {
    "server" : {
        "apiUrl" : Properties.get("connector.drupal.demo.source.url")
    },
    "identity" : Properties.get("connector.drupal.demo.udm.identity")
});
callbot ("connector/drupal/export/DownloadBinaries.xill", {
    "database" : getUDMConnection(Properties.get("connector.drupal.demo.db.name")),
    "binaryPath" : Properties.get("connector.drupal.demo.binaryPath")
});

The export itself looks very simple:

/**
 * Project: Drupal 7 REST Connector
 * Author: Titus Nachbauer
 * Date: 2016-06-23
 *
 * ExportNodes.xill is responsible for getting all nodes out of Drupal and inserting them into UDM
 *
 * BEFORE FIRST RUN:
 * - Configure server and folder settings in xill.properties in project root folder
 * - Create a mapping robot for every ContentType in the mappings/ folder with the name
 *   [ContentType].xill
 * - Run Build.xill
 */
 
use System, Document, Properties, Collection;
 
include connector.drupal.Commons;
 
//-------------------------------------------------------------------------
//                        SETUP
//-------------------------------------------------------------------------
 
argument settings = {
    "server" : {"apiUrl" : Properties.get("source.url")},
    "identity" : Properties.get("udm.identity")
};
 
//-------------------------------------------------------------------------
//                        MAIN
//-------------------------------------------------------------------------
 
var nodes = getAll(settings.server, "node");
foreach (node in nodes) {
    saveObject(node, settings.server, settings.identity);
}

This code is very compact because we removed all opportunities for code duplication. All reusable functions for the connector are kept in Commons.xill, and all mapping is organized by the robot Mapper.xill (which is generated by the lib.mapper.build robot).

 

/**
 * Project: Drupal 7 REST Connector
 * Author: Titus Nachbauer
 * Date: 2016-06-23
 *
 * Commons.xill contains functions common to Drupal connector robots
 *
 */
  
use String, Mongo, Date, Collection, Document, Properties, System;
 
include lib.util.AllUtil;
include config.export.Mapper;
 
 
function getContentType(node) {
    return node.type;
}
 
function getFileId(file) {
    return file.id;
}
 
function annotateObject(node, serverUrl) {
    if (!Collection.containsKey(node, "drupal")) {
        node["drupal"] = {"server" : serverUrl};
    } else {
        System.print("Tried to overwrite existing field 'drupal' in node: " :: node, "error");
    }
    return node;
}
 
function saveObject(node, serverUrl, identity) {
    node = annotateObject(node, serverUrl);
    var type = mapSourceToTargetType(getContentType(node), node);
    var document = Document.new(type, mapObject(node, type));
    System.print("Exporting node of type ":: getContentType(node) :: " mapped to " :: type :: " >> " :: document.source.current.document.title);
    Document.save(document, identity);
}

The main function of the entire export is the little saveObject() function. It uses the functions provided by Mapper.xill to:

  1. Map the content type of a node from Drupal to a content type defined in UDM (mapSourceToTargetType())
  2. Map the contents (fields) of a Drupal node to converted fields in UDM (mapObject())

After that it saves the document in UDM, using Document.save().

The mapping templates used look like this:

 

/**
 * Project: demo_drupal_aem
 * Author: Titus Nachbauer
 * Date: 2016-06-15
 *
 * Default.xill converts common fields for any Drupal node to UDM
 */
 
use Date, Properties;
 
include lib.util.AllUtil;
 
//-------------------------------------------------------------------------
//                        PUBLIC FUNCTIONS
//-------------------------------------------------------------------------
 
function getDefaultSourceType(data) {
    return null;
}
 
function getDefaultParentTemplate() {
    return null;
}
 
function mapDefault(document, data) {
    document += {
        "document" : {
            "title" : data.title,
            "author" : data.author.id
        },
        "modified" : {
            "date" : parseTimestamp(data.changed)
        },
        "created" : {
            "date" : parseTimestamp(data.created)
        },
        "parent" : {
            "id" : 0,
            "path" : pathToParent(data.url)
        },
        "migration" : {
            "sourceSystem" : "Drupal",
            "timestamp" : Date.now(),
            "action" : Properties.get("migration.action.exported")
        }
    };
    return document;
}


 

/**
 * Project: demo_drupal_aem
 * Author: Titus Nachbauer
 * Date: 2016-06-15
 *
 * Article.xill converts a Drupal node of ContentType 'Article' to UDM
 */
 
//-------------------------------------------------------------------------
//                        PUBLIC FUNCTIONS
//-------------------------------------------------------------------------
 
function getArticleSourceType(data) {
    return "article";
}
 
function getArticleParentTemplate() {
    return "Default";
}
 
function mapArticle(document, data) {
    document += {
        "web" : {
            "content" : [
                {
                    "paragraph" : {
                        "text" : data.body.value
                    }
                }
            ]
        }
    };
     
    return document;
}

   

/**
 * Project: demo_drupal_aem
 * Author: Titus Nachbauer
 * Date: 2016-06-15
 *
 * Asset.xill converts a Drupal node of ContentType 'Asset' to UDM
 */
  
//-------------------------------------------------------------------------
//                        PUBLIC FUNCTIONS
//-------------------------------------------------------------------------
 
function getAssetSourceType(data) {
    return "document";
}
 
function getAssetParentTemplate() {
    return ["Default", "File"];
}
 
function mapAsset(document, data) {
    document += {
        "custom" : {
            "industry" : data.field_industry,
            "document_type" : data.field_document_type
        }
    };
     
    return document;
}
 

 

/**
 * Project: demo_drupal_aem
 * Author: Titus Nachbauer
 * Date: 2016-06-16
 *
 * File.xill converts a Drupal node of ContentType 'File' to UDM
 * For performance reasons, fetching the file is also implemented here
 */
  
include connector.drupal.Commons;
include connector.drupal.API;
 
//-------------------------------------------------------------------------
//                        PUBLIC FUNCTIONS
//-------------------------------------------------------------------------
 
function getFileSourceType(data) {
    return null;
}
 
function getFileParentTemplate() {
    //Return the content type of the parent (typically "Default") or null
    //when there is no parent template
    return null;
}
 
//Change the name of this function to map[Name_of_content_type]
function mapFile(document, data) {
    var file = getObject(data.drupal.server, "file", getFileId(data.field_document.file)).body;
     
    document += {
        "file" : {
            "name" : pathToName(file.url),
            "extension" : pathToExtension(file.url),
            "path" : file.url,
            "size" : file.size
        },
        "mimeType" : {
            "type" : file.mime
        }
    };
     
    return document;
}

Every mapping template provides a get[...]SourceType() function, a get[...]ParentTemplate() function and a map[...]() function, where [...] is the name of the content type in UDM.

  1. get[...]SourceType() returns the name or list of names of the Drupal types that will be handled by this template
  2. get[...]ParentTemplate() returns the name or list of names of the template files that should be run first before this mapping is applied
  3. map[...]() does the actual mapping and defines how the fields of the source data should be handled

By using this mechanism, we can emulate complex mixin or inheritance structures in source and target systems, merge content types and add special content-based rules to the mapping. Defining these mappings might take weeks in a real-world project, and it should be the main focus of a consultant's efforts.
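
How the three template functions could work together is sketched below in Python. This is an assumption about the general idea (parent templates run first, then the template's own mapping) and not the code that lib/mapper/build.xill generates; the TEMPLATES registry and all function names are hypothetical.

```python
def map_default(document, data):
    """Common fields shared by every content type."""
    document["document"] = {"title": data["title"]}
    return document


def map_article(document, data):
    """Fields that only apply to articles."""
    document["web"] = {"content": [{"paragraph": {"text": data["body"]}}]}
    return document


# Hypothetical registry:
# target type -> (source types, parent templates, map function)
TEMPLATES = {
    "Default": ([], [], map_default),
    "Article": (["article"], ["Default"], map_article),
}


def map_source_to_target_type(source_type):
    """Find the target type whose template handles this source type."""
    for target_type, (source_types, _, _) in TEMPLATES.items():
        if source_type in source_types:
            return target_type
    return None


def map_object(data, target_type, document=None):
    """Apply all parent templates first, then the template itself."""
    document = {} if document is None else document
    _, parents, map_function = TEMPLATES[target_type]
    for parent in parents:
        document = map_object(data, parent, document)
    return map_function(document, data)
```

For a node of source type "article", map_source_to_target_type() returns "Article", and map_object() then fills the document with the fields from the Default template followed by the Article template.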