
WSO2 ESB by Example - File Processing

  • By Supun Kamburugamuva
  • 16 Feb, 2011

Applies To

WSO2 ESB 3.0.0 and above

Introduction

A common requirement is to pick up a file from a directory and process it within the ESB. This article walks through a scenario in which the ESB picks up a file from a directory, inserts the records in the file into a database, sends an e-mail notification about the file, and writes the contents of the file to another directory.

This article is aimed at users who have a good knowledge of WSO2 ESB basics. If you are new to the WSO2 ESB, the Configuration Language and Samples sections of the WSO2 ESB documentation will help you become familiar with the basic concepts.

The complete sample is included in the article as an attachment ESB-Sample-01.zip. Inside the .zip, there is a sample-01.pdf document which describes how to set up the sample.

In this article, we will go through the complete configuration step-by-step. First, let's go through the basic configuration of the ESB, which must be done prior to starting the ESB.

Base Configuration

For this scenario, we need to enable the e-mail transport and the file (VFS) transport. We also need to copy a database driver and a Smooks library into the ESB and configure a message formatter. The two transports and the message formatter are configured in axis2.xml.

axis2.xml

This important configuration file is in the /repository/conf directory of the ESB installation. It configures core functionality of the ESB: transports, message builders and formatters, and execution phases are all defined here.

VFS Transport

This is the file transport of the ESB. This transport can be used to pick up a file from a file/ftp location and inject the contents into the ESB. Also this transport can be used to create a file in a local directory/ftp location using the message in the ESB.

To enable the transport, un-comment the VFS transport receiver and sender.

Receiver

<transportReceiver name="vfs" class="org.apache.synapse.transport.vfs.VFSTransportListener"/>

Sender

<transportSender name="vfs" class="org.apache.synapse.transport.vfs.VFSTransportSender"/>

Mail Transport

This transport can be used to send and receive e-mail messages.

To enable the transport, un-comment the mailto transport sender. Then configure the mail transport sender to use a mailbox for sending the messages. In this sample, we are not retrieving mails from a mailbox. So there is no need to enable the mail transport receiver.

Sender

<transportSender name="mailto" class="org.apache.axis2.transport.mail.MailTransportSender">
        <parameter name="mail.smtp.host">smtp.gmail.com</parameter>
        <parameter name="mail.smtp.port">587</parameter>
        <parameter name="mail.smtp.starttls.enable">true</parameter>
        <parameter name="mail.smtp.auth">true</parameter>
        <parameter name="mail.smtp.user">synapse.demo.0</parameter>
        <parameter name="mail.smtp.password">mailpassword</parameter>
        <parameter name="mail.smtp.from">synapse.demo.0@gmail.com</parameter>
</transportSender>

Message Builders/Formatters

When a message comes through the wire, it first goes through a message builder, which is responsible for converting the message into a SOAP message. Message formatters do the reverse: they determine the outgoing wire format of a SOAP message inside the ESB.

For this scenario, the user has to add the following message formatter to axis2.xml. Message formatters are configured in axis2.xml under the section:

<messageFormatters></messageFormatters>

We need to add the org.apache.axis2.transport.http.ApplicationXMLFormatter under the content type “text/html”.

<messageFormatter contentType="text/html" class="org.apache.axis2.transport.http.ApplicationXMLFormatter"/>

Database Drivers

This example uses MySQL as the database, so the MySQL JDBC driver must be copied into the /repository/components/lib directory. The driver can be found in the lib folder of ESB-Sample-01.zip.

Smooks Libraries

This example uses the Smooks CSV library. The library, milyn-smooks-csv-1.2.4.jar, must be copied into the /repository/components/lib directory. It can be found in the lib folder of ESB-Sample-01.zip.

These changes are system-wide, and the ESB has to be restarted for them to take effect. Next, we must configure the ESB to listen to the file system and process the file.

Mediation Configuration

Proxy Service

In this sample, we listen to a file system directory using the VFS transport listener. When a file is placed in this directory, we need to deliver it to an ESB proxy service and process it from there. VFS is configured at the level of individual proxy services, so different proxy services can listen to different directories and different file types. These transport-specific settings are given as parameters of the proxy service.

Our proxy name is FileProxy and it is exposed through the VFS transport.

     
<proxy name="FileProxy" transports="vfs" startOnLoad="true" trace="disable">
    <target>
        <inSequence>
            <log level="full"/>
            <clone>
                <target sequence="fileWriteSequence"/>
                <target sequence="sendMailSequence"/>
                <target sequence="databaseSequence"/>
            </clone>
        </inSequence>
    </target>
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.PollInterval">15</parameter>
    <parameter name="transport.vfs.MoveAfterProcess">file:///Users/supun/quick-starts/original</parameter>
    <parameter name="transport.vfs.FileURI">file:///Users/supun/quick-starts/in</parameter>
    <parameter name="transport.vfs.MoveAfterFailure">file:///Users/supun/quick-starts/failure</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.txt</parameter>
    <parameter name="transport.vfs.ContentType">text/plain</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
</proxy>

To expose the proxy service over a particular transport, the "transports" attribute of the proxy service must contain the name of that transport. In this case, it is vfs.

There are several parameters in the FileProxy for configuring the VFS transport. Some of the important parameters are:

  • transport.vfs.FileURI - The directory that the proxy polls for files
  • transport.vfs.FileNamePattern - A regular expression that a file name must match for the file to be picked up
  • transport.vfs.ContentType - The content type assigned to the file content when it is processed
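
The remaining parameters in FileProxy control how often the directory is polled and what happens to a file once it has been processed. As a rough sketch, assuming you would rather delete successfully processed files than keep the originals, the parameter set could look like the following. The directory paths are the ones used in this article, the polling interval is in seconds, and the file name pattern escapes the dot so that it matches only a literal ".txt" suffix.

```xml
<!-- Sketch only: a variation on the FileProxy parameters, not part of the sample -->
<parameter name="transport.vfs.FileURI">file:///Users/supun/quick-starts/in</parameter>
<parameter name="transport.vfs.FileNamePattern">.*\.txt</parameter>
<parameter name="transport.vfs.ContentType">text/plain</parameter>
<!-- Poll the directory every 15 seconds -->
<parameter name="transport.PollInterval">15</parameter>
<!-- Remove the file after successful processing instead of moving it -->
<parameter name="transport.vfs.ActionAfterProcess">DELETE</parameter>
<!-- Keep failed files in a separate directory for inspection -->
<parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
<parameter name="transport.vfs.MoveAfterFailure">file:///Users/supun/quick-starts/failure</parameter>
```

With DELETE as the action after processing, transport.vfs.MoveAfterProcess is no longer needed, which is why it is left out of this sketch.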

When a .txt file is placed in the directory given by transport.vfs.FileURI, the file is injected into the ESB as a message. The message goes to the inSequence of FileProxy, where the clone mediator creates three copies of it and sends each copy to a separate sequence. These three sequences are:

  • fileWriteSequence
  • sendMailSequence
  • databaseSequence

Now, let's look at what these three sequences do.

fileWriteSequence

In this sequence, we send the content of the message to a file in the local file system.

    <sequence name="fileWriteSequence">
        <log level="custom">
            <property name="sequence" value="fileWriteSequence"/>
        </log>
        <property name="transport.vfs.ReplyFileName" 
               expression="fn:concat(fn:substring-after(get-property('MessageID'), 'urn:uuid:'), '.txt')" scope="transport"/>
        <property name="OUT_ONLY" value="true"/>
        <send>
            <endpoint name="FileEpr">
                <address uri="vfs:file:///Users/supun/quick-starts/out"/>
            </endpoint>
        </send>
    </sequence>
  1. First, we use the log mediator to log the sequence name
  2. Then we set a property called transport.vfs.ReplyFileName at transport scope. This sets the name of the output file. The name is calculated with an XPath expression that takes the message ID, strips the urn:uuid: prefix and appends .txt, which gives each output file a unique name
  3. Because this is an "out only" operation, we set the OUT_ONLY property to true. Otherwise, the ESB will expect a response for the send
  4. Finally, we use the send mediator with an endpoint to write the content of the message into a file. The address of the endpoint is vfs:file:///Users/supun/quick-starts/out. This endpoint address specifies the directory to which the message should be sent
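
If the output file should carry the name of the input file rather than a generated one, the file name property can be set from the incoming message instead. The fragment below is only a sketch: it assumes that the VFS listener in your ESB version exposes the original file name as a transport header called FILE_NAME, which is worth verifying against your version before relying on it.

```xml
<!-- Sketch only: reuse the incoming file name instead of the message ID.
     Assumption: the VFS listener sets a FILE_NAME transport header. -->
<property name="transport.vfs.ReplyFileName"
          expression="get-property('transport', 'FILE_NAME')"
          scope="transport"/>
```

The trade-off is that names are no longer guaranteed to be unique, so a second file with the same name would collide with the first in the output directory.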

sendMailSequence

In this sequence, we send a mail.

<sequence name="sendMailSequence">
    <log level="custom">
        <property name="sequence" value="sendMailSequence"/>
    </log>
    <property name="messageType" value="text/html" scope="axis2"/>
    <property name="ContentType" value="text/html" scope="axis2"/>
    <property name="Subject" value="File Received" scope="transport"/>
    <property name="OUT_ONLY" value="true"/>
    <send>
        <endpoint name="FileEpr">
            <address uri="mailto:supun@wso2.com"/>
        </endpoint>
    </send>
</sequence>
  1. Log the sequence name using the log mediator
  2. Set the messageType property to text/html to format the message correctly. This property is used by the ESB to select the correct message formatter for outgoing messages. The sequence also sets the ContentType property to the same value
  3. Set the property Subject at transport scope to set the subject of the e-mail
  4. Set the OUT_ONLY property to make the invocation one-way
  5. Send the e-mail using the given address
<send>
    <endpoint name="FileEpr">
        <address uri="mailto:supun@wso2.com"/>
    </endpoint>
</send>

databaseSequence

<sequence name="databaseSequence" xmlns="http://ws.apache.org/ns/synapse">
    <log level="full">
        <property name="sequence" value="before-smooks"/>
    </log>
    <smooks config-key="smooks"/>
    <log level="full">
        <property name="sequence" value="after-smooks"/>
    </log>
    <iterate xmlns:ns2="http://org.apache.synapse/xsd" 
                xmlns:sec="http://secservice.samples.esb.wso2.org" expression="//csv-set/csv-record">
        <target>
            <sequence>
                <log level="full">
                    <property name="State" value="Iteration"/>
                </log>
                <dbreport>
                    <connection>
                        <pool>
                            <password>wso2carbon</password>
                            <user>wso2carbon</user>
                            <url>jdbc:mysql://localhost:3306/DEMO</url>
                            <driver>com.mysql.jdbc.Driver</driver>
                        </pool>
                    </connection>
                    <statement>
                        <sql>insert into INFO values (?, ?)</sql>
                        <parameter expression="//csv-record/name/text()" type="VARCHAR"/>
                        <parameter expression="//csv-record/value/text()" type="VARCHAR"/>
                    </statement>
                </dbreport>
            </sequence>
        </target>
    </iterate>
</sequence>

The content of the file enters the ESB as a message like the following:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
    <soapenv:Body>
        <text xmlns="http://ws.apache.org/commons/ns/payload">name_1,value_1
name_2,value_2
        </text>
     </soapenv:Body>
</soapenv:Envelope>
  • Log the sequence name and the full message before applying the Smooks transformation
  • Apply the Smooks transformation, which creates an XML structure from the incoming text message
<smooks config-key="smooks"/>

The smooks mediator expects a Smooks configuration. Here, the configuration is stored in the file system and referenced through a local entry.

<localEntry key="smooks" src="file:resources/smooks-config.xml"/>

Here is the Smooks configuration. It runs the Smooks CSV parser and creates an XML document from the input text message.

<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.0.xsd">
        <resource-config selector="org.xml.sax.driver">
                <resource>org.milyn.csv.CSVParser</resource>
                <param name="fields" type="string-list">name,value</param>
        </resource-config>
</smooks-resource-list>

After the smooks transformation the message is as follows:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
    <soapenv:Body>
        <csv-set>
            <csv-record>
                <name>name_1</name>
                <value>value_1</value>
            </csv-record>
            <csv-record>
                <name>name_2</name>
                <value>value_2</value>
            </csv-record>
         </csv-set>
    </soapenv:Body>
</soapenv:Envelope>
  • Log the message after the Smooks transformation
  • Then iterate through this XML structure. The iteration happens with the XPath expression //csv-set/csv-record. For each iteration, the message in the iteration target is as follows:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
     <soapenv:Body>
         <csv-record>
              <name>name_1</name>
              <value>value_1</value>
         </csv-record>
     </soapenv:Body>
</soapenv:Envelope>
  • Each iteration updates a database table

For each iteration, it executes a dbreport mediator that extracts values from the current iteration message and updates a database.

The dbreport mediator is configured to connect to a database using the given information in the configuration. The database connection information is given under the connection XML tag of the dbreport mediator.

The dbreport mediator is configured assuming a database named DEMO, as given in the JDBC URL of the configuration above, and a table named INFO. The INFO table has two VARCHAR columns. The user has to create the database and the table manually.

The dbreport mediator executes a SQL statement with parameters. The sample SQL statement accepts two parameters and these parameters are calculated using the XPath expressions given in the sample. The XPath expressions in the sample simply get the “name” and “value” XML element’s values from the message in each iteration.

Improvements

  • In this sample, error handling is minimal. In particular, the proxy service doesn't have a fault sequence for handling errors. In a robust deployment, error handling is very important.
  • Configuration artifacts should be stored in the registry. We can store the endpoints and smooks configuration file in the registry.
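
As an illustration of the first point, the proxy could be given a fault sequence that at least logs what went wrong. This is a minimal sketch rather than a complete error-handling strategy; the log label FileProxy-fault is a name chosen here for illustration, and the VFS parameters are the same as in the FileProxy configuration earlier in the article.

```xml
<!-- Sketch only: FileProxy with a hypothetical fault sequence added -->
<proxy name="FileProxy" transports="vfs" startOnLoad="true" trace="disable">
    <target>
        <inSequence>
            <log level="full"/>
            <clone>
                <target sequence="fileWriteSequence"/>
                <target sequence="sendMailSequence"/>
                <target sequence="databaseSequence"/>
            </clone>
        </inSequence>
        <faultSequence>
            <!-- Record which proxy failed and the error reported by the ESB -->
            <log level="custom">
                <property name="sequence" value="FileProxy-fault"/>
                <property name="error" expression="get-property('ERROR_MESSAGE')"/>
            </log>
            <!-- Stop processing the failed message -->
            <drop/>
        </faultSequence>
    </target>
    <!-- transport.vfs.* parameters unchanged from the FileProxy configuration -->
</proxy>
```

Named sequences such as databaseSequence also accept an onError attribute that points to another named sequence, which may be a better fit for errors raised inside one of the cloned branches.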

Author

Supun Kamburugamuva

Product Manager- Enterprise Service Bus, and Technical Lead; WSO2 Inc.

supun@wso2.com

 
