[Article] Understanding Data Mapper in WSO2 Enterprise Service Bus

  • By Nuwan Pallewela
  • 3 Oct, 2016


Applies to

WSO2 Enterprise Service Bus Version 5.0 and above
WSO2 Enterprise Service Bus Tooling Version 5.0 and above



Introduction

WSO2 Enterprise Service Bus (WSO2 ESB) is a lightweight, high-performance, and comprehensive ESB. 100% open source, WSO2 ESB effectively addresses integration standards and supports all integration patterns, enabling interoperability among various heterogeneous systems and business applications. It provides message transformation capabilities through many ready-made mediators; the data mapper mediator is the latest addition to the mix.



Data transformation

In computing, data transformation converts a set of data values from the format of a source system into the format required by a destination system. This capability is valuable in SOA, where systems are built by integrating and combining many services, each a self-contained unit of functionality, to provide the functionality of a large software application. When we integrate these services into a single application, we may need to transform data produced by one service before another can use it.

Data transformation can be divided into two steps:

  1. Data mapping, which maps data elements from the source to the destination and captures any transformation that must occur
  2. Code generation, which creates the actual transformation program

Data element to data element mapping is frequently complicated by complex transformations that require one-to-many and many-to-one transformation rules.
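
The two rule types can be sketched in JavaScript, the language the WSO2 data mapper uses for its mapping scripts. The function mapCustomer and all field names below are hypothetical and chosen only for illustration.

```javascript
// Hypothetical source-to-destination mapping; all field names are illustrative.
function mapCustomer(source) {
    // One-to-many: a single source field feeds two destination fields.
    var nameParts = source.fullName.split(" ");

    return {
        firstName: nameParts[0],
        lastName: nameParts.slice(1).join(" "),
        // Many-to-one: several source fields combine into one destination field.
        address: [source.street, source.city, source.country].join(", ")
    };
}

var customer = mapCustomer({
    fullName: "Jane Doe",
    street: "20 Palm Grove",
    city: "Colombo",
    country: "Sri Lanka"
});
console.log(JSON.stringify(customer));
```

Even in this small case, the mapping has to record both which elements are linked and what happens to the values along the way.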



WSO2 data mapping solution

To integrate data mapping with WSO2 ESB we need to use the data mapper mediator in the ESB configuration and deploy it in the server. The mapping can be done by visually dragging and dropping operations and mapping elements by linking them. Hence, no programming knowledge is required to do a mapping using the data mapper.

ESB Tooling 5.0.x comes with a Data Mapper Editor so you can configure the mapping visually (Figure 1).

Figure 1

To get an understanding of how to use data mapper mediator with the ESB please refer to the following blog post and the documentation.

WSO2 Data Mapper consists of two components: the data mapper engine and the data mapper tooling component.



Data mapper tooling component

The data mapper tooling component (Data Mapper Diagram Editor) is the interface used to create the configuration files that the engine requires to execute the mapping. The data mapper engine needs three configuration files:

  1. Input schema file
  2. Output schema file
  3. Mapping configuration file

The input schema and output schema define the format of the input message and of the required output message, respectively. Each is essentially a custom-defined JSON schema, generated by the data mapper tool when you load the input and output files.

The mapping configuration file is a JavaScript file generated from the mapping diagram that the user draws in the data mapper diagram editor by connecting input elements to output elements. Every operation the user defines in the diagram is converted to a JavaScript operation.
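
As a rough illustration of how diagram operations might translate into script statements, the sketch below maps a list of employees to a list of engineers. The function name, variable names, and message structure are assumptions made for this example; they are not copied from real tool output, and the code the editor actually generates may look different.

```javascript
// Illustrative mapping script; names and structure are assumptions,
// not actual output of the Data Mapper Diagram Editor.
function mapEmployeesToEngineers(inputEmployees) {
    var outputEngineers = { engineer: [] };

    for (var i = 0; i < inputEmployees.employee.length; i++) {
        var employee = inputEmployees.employee[i];
        outputEngineers.engineer.push({
            // A direct link between an input element and an output element.
            id: employee.id,
            // A concatenation operation joining two inputs into one output.
            fullName: employee.firstName + " " + employee.lastName,
            // An uppercase operation applied to a single input.
            department: employee.department.toUpperCase()
        });
    }
    return outputEngineers;
}

var engineers = mapEmployeesToEngineers({
    employee: [
        { id: 1, firstName: "Ann", lastName: "Perera", department: "integration" }
    ]
});
console.log(JSON.stringify(engineers));
```

Each link or operator drawn in the diagram corresponds to one assignment or expression in the script, which is why the diagram and the script can be kept in step by the tool.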

These three files will be generated by the data mapper tool and saved in a registry resource project to be deployed in a WSO2 server (Figure 2).

Figure 2

The ".datamapper" and ".datamapper_diagram" files contain metadata related to the data mapper diagram. The data mapper engine does not need them to execute the mapping operation. Therefore, only the two schema files and the .dmc (data mapper configuration) file are deployed.

The code generation part of the data transformation is done by the tooling component. Mapping at runtime is done by the data mapper engine.

Code generation behavior and how operations can be used is a separate discussion and will not be covered in this article. Here, we focus on the overall design and architecture of the data mapping solution provided by WSO2.



Data mapper mediator

WSO2 Data Mapper is an independent component. It is not dependent on any other WSO2 product; however, other products can use it to offer data mapping capabilities. To do so, a product needs an intermediate component, and in WSO2 ESB the data mapper mediator serves as this intermediate component (Figure 3).

Figure 3

The data mapper mediator finds the configuration files in the registry and configures the data mapper engine with the input message type (XML/JSON/CSV) and the output message type (XML/JSON/CSV). It then takes a request message from the ESB message flow, uses the configured data mapper engine to execute the transformation, and adds the output message to the ESB message flow. It also updates the ESB axis2 properties messageType and contentType according to the selected output message type, as follows:

  • XML - application/xml
  • JSON - application/json
  • CSV - text/xml
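
The list above amounts to a simple lookup. The mediator itself is written in Java, so the following JavaScript is only a sketch of the idea; the helper outputProperties and the shape of its return value are hypothetical.

```javascript
// Output message type to content type, as listed above.
const CONTENT_TYPE_BY_OUTPUT = {
    XML: "application/xml",
    JSON: "application/json",
    CSV: "text/xml"
};

// Hypothetical helper: returns the two property values for a given output type.
function outputProperties(outputType) {
    if (!Object.prototype.hasOwnProperty.call(CONTENT_TYPE_BY_OUTPUT, outputType)) {
        throw new Error("Unsupported output type: " + outputType);
    }
    const contentType = CONTENT_TYPE_BY_OUTPUT[outputType];
    return { messageType: contentType, contentType: contentType };
}

console.log(outputProperties("JSON"));
```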

At deployment time, the data mapper mediator is initialized only with the values in its own mediator configuration. The input/output schemas and the mapping configuration file are not loaded until the first request arrives. When it does, the files are loaded and the input message is sent to the data mapper engine along with the configuration files and runtime variable values.
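
This load-on-first-request behavior can be sketched as follows. The real mediator is Java code reading from the registry; createLazyMediator, the settings keys, and the stub loader here are all hypothetical stand-ins.

```javascript
// Sketch of lazy initialization: configuration files are loaded on the first request only.
function createLazyMediator(settings, loadResource) {
    let loaded = null; // stays null from deployment until the first request

    return {
        mediate(message) {
            if (loaded === null) {
                loaded = {
                    inputSchema: loadResource(settings.inputSchemaKey),
                    outputSchema: loadResource(settings.outputSchemaKey),
                    mappingConfig: loadResource(settings.configKey)
                };
            }
            // A real mediator would now hand the message and files to the engine.
            return { message: message, resources: loaded };
        }
    };
}

// Usage with a stub loader that counts registry reads.
let reads = 0;
const mediator = createLazyMediator(
    { inputSchemaKey: "in.json", outputSchemaKey: "out.json", configKey: "map.dmc" },
    (key) => { reads += 1; return "contents of " + key; }
);
const readsAfterDeploy = reads;   // nothing has been loaded yet
mediator.mediate("<first/>");
const readsAfterFirst = reads;    // both schemas and the mapping file were read
mediator.mediate("<second/>");
const readsAfterSecond = reads;   // the loaded files are reused
```

One consequence of this design is that the first request pays the loading cost, while later requests reuse the already loaded files.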



Getting ESB runtime variables

The data mapper engine can use product-specific runtime variables in a mapping. The intermediate component constructs a map object containing these variables and sends it to the data mapper engine, so that they are available when the mapping is executed. For example, the data mapper mediator provides ESB axis2/transport/synapse/axis2client/operation/.. properties in this way. In the data mapper diagram, the user can add the Property operator, define the scope and property name, and use the value in the mapping (Figure 4). The data mapper mediator identifies the properties required to execute the mapping, populates a map with them, and sends it to the engine.

Figure 4
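
A minimal sketch of how such a map might be populated is shown below. The scope names come from the list above, but the message context layout, the property names, and the scope-to-name-to-value structure of the resulting map are assumptions made for illustration.

```javascript
// Hypothetical message context: property values grouped by scope.
const messageContext = {
    synapse: { orderSource: "web", traceId: "abc-123" },
    axis2: { requestPath: "/orders" },
    transport: { "Content-Type": "application/xml" }
};

// Properties the mapping is assumed to reference through Property operators.
const requiredProperties = [
    { scope: "synapse", name: "orderSource" },
    { scope: "transport", name: "Content-Type" }
];

// Collect only the required properties into a scope -> name -> value map.
function buildPropertiesMap(context, required) {
    const properties = {};
    for (const { scope, name } of required) {
        const scoped = context[scope] || {};
        if (!Object.prototype.hasOwnProperty.call(properties, scope)) {
            properties[scope] = {};
        }
        properties[scope][name] = scoped[name];
    }
    return properties;
}

const propertiesMap = buildPropertiesMap(messageContext, requiredProperties);
console.log(JSON.stringify(propertiesMap));
```

Passing only the referenced properties keeps the map small and avoids exposing unrelated values to the mapping script.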



Data mapper engine design

The data mapper engine is the component that executes, at runtime, the mapping generated by the tooling. As mentioned above, the data mapper uses JavaScript to define the mapping; the generated JavaScript is executed through Java's scripting API. Because Java 7 ships with the Rhino engine and Java 8 with the Nashorn engine, the data mapper uses Rhino on Java 7 and Nashorn on Java 8. This difference can affect which functions and operators work in a mapping.

For example, the StartsWith and EndsWith operators will not work when the ESB is running on Java 7. When you define custom functions, make sure the methods you use are compatible with the Rhino engine if the ESB is running on Java 7.
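
String.prototype.startsWith and String.prototype.endsWith were added to the language in ECMAScript 2015, so a script engine that implements an earlier edition will not provide them. If the same checks are needed in a custom function, they can be written with older string methods. The helper names below are hypothetical; this is a sketch of one compatible approach, not part of the data mapper itself.

```javascript
// Stand-ins for startsWith / endsWith that rely only on older string methods.
function startsWith(text, prefix) {
    // Searching backwards from index 0 can only match at position 0.
    return text.lastIndexOf(prefix, 0) === 0;
}

function endsWith(text, suffix) {
    var start = text.length - suffix.length;
    return start >= 0 && text.indexOf(suffix, start) === start;
}

console.log(startsWith("datamapper", "data"));
console.log(endsWith("mapping.dmc", ".dmc"));
```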

The data mapper engine needs the following information to be configured:

  • Input message type
  • Output message type
  • Input schema
  • Output schema
  • Mapping configuration

At runtime it will get the input message and runtime variable map object and output the transformed message.

Figure 5

As illustrated in Figure 5, the data mapper engine has different readers that read the input message according to the “input type” defined in the data mapper mediator configuration. If the input type is not configured correctly, the data mapper engine will use the wrong reader for your input and throw exceptions at runtime because the message cannot be parsed. In the same way, output writers build the output message and return the expected output.

These input readers and output writers use the respective schemas as required. That means not every reader relies on the schemas to the same extent.

For example, little processing is needed to handle JSON messages, since JavaScript engines are used underneath. Therefore, a JSON input message is injected directly into the script executor, and a JSON output is obtained directly as a JSON object. For XML, however, the engine needs to process the schemas and build the JS object to do the mapping. Since XML has namespaces, attributes, and many other XML-specific items that are not in the JSON specification, the preprocessing and postprocessing costs are high for XML.
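
The reader selection and the effect of a wrong input type can be illustrated with a small sketch. The readers below are simplified stand-ins written for this example (the CSV reader assumes a header row and no quoted fields), not the engine's actual implementation.

```javascript
// Minimal readers, each returning plain objects for the mapping script.
const readers = {
    JSON: (text) => JSON.parse(text),
    CSV: (text) => {
        const [header, ...rows] = text.trim().split("\n");
        const columns = header.split(",");
        return rows.map((row) => {
            const cells = row.split(",");
            const record = {};
            columns.forEach((column, i) => { record[column] = cells[i]; });
            return record;
        });
    }
};

// Pick the reader that matches the configured input type.
function readInput(inputType, text) {
    if (!Object.prototype.hasOwnProperty.call(readers, inputType)) {
        throw new Error("No reader for input type: " + inputType);
    }
    return readers[inputType](text);
}

// Correct configuration: CSV text read by the CSV reader.
const records = readInput("CSV", "id,name\n7,Ann");

// Misconfiguration: the same CSV text handed to the JSON reader fails to parse.
let misconfigurationError = null;
try {
    readInput("JSON", "id,name\n7,Ann");
} catch (e) {
    misconfigurationError = e;
}
```

The failure surfaces only when a message arrives, which matches the runtime exceptions described above.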

Using the input readers, the data mapper engine builds the JS object in the script engine; the runtime properties map is also converted to a JS object and injected into the script engine. The mapping script is then executed, the output object is retrieved, and the output writer builds the output message from it.
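
The sequence can be sketched end to end as follows. In WSO2 ESB this role is played by Java code driving a Rhino or Nashorn script engine; here the Function constructor stands in for the script engine, and executeMapping, its configuration object, and the sample script are hypothetical.

```javascript
// Stand-in for the engine's read -> inject -> execute -> write sequence.
function executeMapping(config, inputText, runtimeProperties) {
    // 1. The input reader builds the object the script will see.
    const input = config.reader(inputText);
    // 2. Compile the mapping script; input and properties are injected as arguments.
    const mapping = new Function("input", "properties", config.mappingScript);
    // 3. Run the script and collect the output object.
    const output = mapping(input, runtimeProperties);
    // 4. The output writer turns the object into the outgoing message.
    return config.writer(output);
}

const outputMessage = executeMapping(
    {
        reader: JSON.parse,
        writer: JSON.stringify,
        mappingScript:
            "return { orderId: input.id, source: properties.synapse.orderSource };"
    },
    '{"id": 42}',
    { synapse: { orderSource: "web" } }
);
console.log(outputMessage);
```

With JSON on both sides the reader and writer are trivial; for XML or CSV, most of the work would sit in steps 1 and 4.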



Conclusion

This article provided a high-level explanation of the implementation of the WSO2 data mapping solution. It described how the tooling component, data mapper mediator, and data mapper engine combine to produce the required transformation. This feature enables users to transform the payload structure or type (XML/JSON/CSV) without using an advanced transformation configuration language or a programming language; it can be done just by configuring the mapping logic in a visual diagram.

The advantage of mapping through a diagram becomes clear with complex transformations on payload data. With a textual transformation language, the user needs in-depth knowledge of the syntax, and the result is eventually a complex configuration file. With the data mapper, we can see which operations are performed on which data, along with the payload structure of the input and output, so anyone can understand the transformation defined in the configuration. Therefore, the data mapper is helpful when performing a complex transformation with many operations, or when a type conversion of the payload is required along with the transformation of data.



About Author

  • Nuwan Pallewela
  • Software Engineer
  • WSO2