2012/02/12

WSO2 ESB by Example - File Exchanging Hub

  • Rajika Kumarasiri
  • Senior Software Engineer - WSO2

Applies To

WSO2 ESB 4.0.3 and above

Introduction

Moving files from one location to another is a very common task. The IT infrastructure of any reasonably sized organization should provide the facilities required to move files between its departments or other parties. The same infrastructure should also be capable of processing file content available from various locations, such as an FTP server, an SFTP server or a Windows file share. Whether the requirement arises in a Service Oriented Architecture or in your corporate IT infrastructure, it is a very common one.

This article describes how you can use WSO2 ESB as a file exchanging hub, that is, how to configure WSO2 ESB to connect to various types of file systems, such as remote FTP servers, SFTP servers or even a remote mail server, and download a file from that location. The initial parts concentrate on how to use WSO2 ESB to fetch or read a file or a directory that resides on some of the common file systems, such as the local file system, a remote FTP server or a remote SFTP server, or a file that is available as a zip archive, among others.

WSO2 ESB has a feature-rich file transport known as the VFS transport. This transport is currently used by many customers to implement various file transfer and processing scenarios. The VFS transport uses the versatile Apache Commons VFS[1] library as the underlying library for file system related operations. This library provides a rich set of features that allow WSO2 ESB's VFS transport to connect to various types of remote file systems. The library identifies each file system by a URI, and the relevant URI schemes for the various file systems can be found here[5]. In addition, WSO2 ESB's VFS transport has a set of failure recovery mechanisms which, among other things, prevent the same file from being processed twice. This article mainly concentrates on the features that help to build a file transferring hub using WSO2 ESB. The scenarios described in this article can be used to implement various file exchange operations, and they act as the building blocks of a larger integration or file moving requirement.

WSO2 ESB's VFS transport treats all file names as URIs. As is usual with URIs, any meta characters in a name are encoded automatically.

Figure 1 illustrates a set of file exchange scenarios using WSO2 ESB. Here WSO2 ESB acts as the intermediary for transferring and exchanging files between various types of destinations.

Figure 1: Using WSO2 ESB as a file transferring hub.

All scenarios are explained with full working configurations, and some knowledge of WSO2 ESB is assumed. A newcomer is encouraged to read the quick start guide[2] of WSO2 ESB and the VFS transport documentation[3]. Each scenario has a short description followed by a full working configuration.

Encrypting Passwords in Connection URLs

Most remote locations (this article uses the term remote locations to denote file systems such as FTP servers, SFTP servers, etc.) require credentials when accessing content on them. The user is therefore required to provide the password in the connection URI, and storing it there in plain text is not good practice. To avoid this, the Commons VFS library provides a tool to encrypt the passwords in connection URIs. Once the password is encrypted using the tool, the encrypted text can be inserted into the connection URI inside '{' and '}'. When the Commons VFS library detects a password enclosed in '{' and '}', it treats the password as encrypted and decrypts it before use.

Download the WSO2 ESB distribution and unzip it to a location. The article refers to that location as $ESB_HOME henceforth. Use the following command to generate the encrypted text for a password.

java -cp $ESB_HOME/repository/components/plugins/commons-vfs2-2.0.0.wso2v3.jar org.apache.commons.vfs2.util.EncryptUtil encrypt mypassword

where 'mypassword' is the password that needs to be encrypted. This command generates an encrypted text that looks similar to 'E67507BE086ADB75D055E8209A11BD76'. This text can be copied and pasted into the connection URL, enclosed within {}. Once that is done, an example URL looks like the one below.

https://username:{E67507BE086ADB75D055E8209A11BD76}@svnserver.com/svn/repos/project1/trunk

This technique can be used to encrypt the password in any of the file system URIs in this article.
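
As a minimal sketch, the encrypted text could be used in a VFS proxy parameter as follows. The user name, FTP host and folder are the illustrative values from the FTP scenario in this article, and the encrypted string is the example output shown above, not a real credential.

```xml
<!-- Illustrative only: substitute your own user, host, path and the
     encrypted text generated for your own password -->
<parameter name="transport.vfs.FileURI">vfs:ftp://user:{E67507BE086ADB75D055E8209A11BD76}@ftp.server.org/test?vfs.passive=true</parameter>
```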

File Exchange Scenarios

This article will focus on the first five scenarios.

  1. Reading a file from the local file system
  2. Reading a file from an FTP server
  3. Reading a file from an FTP server over SSL
  4. Reading a file from an SFTP (that is, SSH or SCP) server
  5. Fetching a large (~2 GB) file or an archived file (*.bz2, *.jar, etc.)
  6. Fetching a bz2 compressed file archive
  7. Fetching a gzip compressed file archive
  8. Reading a file from an HTTP server
  9. Reading a file from an HTTP server over SSL
  10. Fetching from a zip, jar or tar file
  11. Reading from a temporary file
  12. Reading a file in the local file system using ClassLoader.getResource()
  13. Reading a file from a WebDAV server
  14. Reading a file from a CIFS server (such as a Samba server)
  15. Reading mails and their attachments, such as archives

Each of the following sections describes a separate file exchange scenario. The VFS transport listener should be enabled in $ESB_HOME/repository/conf/axis2.xml. Since all these scenarios read from a remote location, a VFS proxy is deployed in each case. Given the polling nature of the VFS transport, it starts to poll the remote file system as soon as the VFS proxy is deployed. Once a file is received, the proxy uses a log mediator to print the content of the file (this logic can be extended with whatever mediation and routing logic you need to perform on the content of the file).
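
The transport is typically enabled by uncommenting (or adding) the VFS listener entry in axis2.xml. The sketch below assumes the class names of the Apache Synapse VFS transport; check them against the axis2.xml in your own distribution. The sender entry is only needed if a proxy also writes files out through the VFS transport.

```xml
<!-- Listener: lets VFS proxies poll file systems for incoming files -->
<transportReceiver name="vfs" class="org.apache.synapse.transport.vfs.VFSTransportListener"/>

<!-- Sender: only required when files are written out via a vfs: endpoint -->
<transportSender name="vfs" class="org.apache.synapse.transport.vfs.VFSTransportSender"/>
```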

The VFS transport has a wide variety of configuration parameters[3] that help to read various files from various sources. The following table describes some of the common parameters that are used in each of the proxy configurations.

Parameter                        Description
transport.vfs.FileURI            Location of the input source (a file or a directory)
transport.vfs.ContentType        Content type of the file that will be read by the VFS transport
transport.vfs.FileNamePattern    Pattern (a regular expression) that the names of the files to be read must match
transport.vfs.MoveAfterProcess   Location to move the file to after processing
When it comes to file exchanging scenarios, there are two types of files that can be transferred using WSO2 ESB's VFS transport. One is normal files (or the content of a folder), and the other is archived files (such as zip, tar.gz, bz2, etc.) or other special files such as *.pdf files. Depending on the file type, the ESB can either read the content and fetch the file, or just fetch the file as it is without reading the content. A normal text file or an XML file falls into the first category, and archived files and *.pdf files fall into the second. Generally, what we want is to move an archived file from one location to another so that it can be processed later (after extraction). Some of the following scenarios are configured to fetch and read the file content (*.xml or *.txt), while the others are configured just to fetch the file (such as *.zip, *.jar, *.pdf). The technique used to fetch archived files and special files without reading the content is described in detail under the heading "Fetching a large (~2 GB) file".

Reading a file from the local file system

Among the file exchanging scenarios that can be implemented with WSO2 ESB, this is the most basic usage pattern: reading a file from the local file system. The absolute path of the file should be provided in the configuration. If a directory is given instead, the VFS transport processes all the files in it that match the given file name pattern.

Proxy configuration for reading a file from the local file system

<proxy name="FileSystemVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <!-- CHANGE: input location (absolute path) that is polled for files -->
    <parameter name="transport.vfs.FileURI">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/in
    </parameter>
    <!-- Content type and file name pattern (regular expression) of the files to read -->
    <parameter name="transport.vfs.ContentType">text/xml</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.xml</parameter>
    <!-- Polling interval, in seconds -->
    <parameter name="transport.PollInterval">15</parameter>
    <!-- CHANGE: location to move the file to after successful processing -->
    <parameter name="transport.vfs.MoveAfterProcess">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/original
    </parameter>
    <!-- CHANGE: location to move the file to if processing fails -->
    <parameter name="transport.vfs.MoveAfterFailure">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/original
    </parameter>
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
    <target>
        <inSequence>
            <log level="full"/>
            <drop/>
        </inSequence>
    </target>
</proxy>

Copy the proxy configuration, FileSystemVFSProxy.xml, from the sample folder into $ESB_HOME/repository/deployment/server/synapse-configs/default/proxy-services. Start the WSO2 ESB server (using the wso2server.sh or wso2server.bat script) and note how it picks up an XML file in the location ~/docs/ot-articles/in-progress/vfs-file-transferring-hub/in. A log of the file content can also be seen on the console of the ESB server. The two parameters 'transport.vfs.FileNamePattern' and 'transport.vfs.ContentType' can be altered depending on the type of file that needs to be read.

Reading a file from an FTP server

This scenario describes how to use WSO2 ESB to read a file from a remote FTP server. The configuration is given below. One additional parameter needs to be passed in the connection URL: 'vfs.passive=true'. This puts the FTP client into passive mode, in which the client rather than the server opens the data connection; this tends to work better through firewalls and helps when transferring a fairly large file. The following configuration fetches any XML file that resides in the test folder on the FTP server ftp.server.org. As described previously, the password can be encrypted before it is placed in the connection URL. When running the sample configuration, change the FTP server connection URL appropriately.

Proxy configuration for reading a file from an FTP server

<proxy name="FTPVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <parameter name="transport.vfs.FileURI">vfs:ftp://user:password@ftp.server.org/test?vfs.passive=true</parameter>
    <parameter name="transport.vfs.ContentType">text/xml</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.xml</parameter>
    <parameter name="transport.PollInterval">15</parameter>

    <target>
        <inSequence>
            <log level="full"/>
            <drop/>
        </inSequence>
    </target>
</proxy>

Copy the proxy configuration, FTPVFSProxy.xml, from the sample folder into $ESB_HOME/repository/deployment/server/synapse-configs/default/proxy-services. Start the server in the usual way.

Reading a file from an FTP server over SSL

In some cases, access to the FTP server is only allowed through an SSL connection. An FTPS connection URL can be used to access an FTP server over SSL. The only change in the configuration is to replace ftp with ftps in the connection URL. Note that in WSO2 ESB 4.0.3 only the control channel is secured with TLS/SSL; data transmission takes place over a non-SSL connection. This is because the VFS transport needs further improvements in order to support SSL over the data channel. If your FTP server requires SSL on the data channel as well, the following configuration will not work and you will need to turn off SSL for the data channel. See VFS-412.

Proxy configuration for reading a file from an FTP server over SSL

<proxy name="FTPVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <parameter name="transport.vfs.FileURI">vfs:ftps://user:password@ftp.server.org/test?vfs.passive=true</parameter>
    <parameter name="transport.vfs.ContentType">text/xml</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.xml</parameter>
    <parameter name="transport.PollInterval">15</parameter>

    <target>
        <inSequence>
            <log level="full"/>
            <drop/>
        </inSequence>
    </target>
</proxy>

Reading a file from an SFTP (that is, SSH or SCP) server

This feature allows users to access a file or a directory on a remote SFTP server. This is very similar to accessing a remote resource using SSH or SCP. As in the other configurations, the credentials can be provided (in encrypted form if required) in the connection URL, together with the absolute path of the resource.

Public key authentication for SFTP support

WSO2 ESB's VFS transport can authenticate using a public key for SFTP. You only need to specify the user name in the connection URL; the transport automatically looks in ~/.ssh for the private key. To get this scenario working, you need to add your public key to ~/.ssh/authorized_keys on the remote server.

Following is an example configuration for using public key authentication for sftp.

            <parameter name="transport.vfs.FileURI">vfs:sftp://user@host/home/user/test.xml</parameter>
        

Proxy configuration for reading a file from an SFTP server

<proxy name="FTPSVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <parameter name="transport.vfs.FileURI">vfs:sftp://user:password@host/home/user/test.xml</parameter>
    <parameter name="transport.vfs.ContentType">text/xml</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.xml</parameter>
    <parameter name="transport.PollInterval">15</parameter>

    <target>
        <inSequence>
            <log level="full"/>
            <drop/>
        </inSequence>
    </target>
</proxy>

Fetching a large (~2 GB) file

This is different from the other scenarios. For example, consider that you need to transfer a large (nearly 2 GB) pdf (or zip) file from one location (a remote SFTP server) to another (the local file system). For this purpose, the message relay[4] feature can be used together with the VFS transport. When the ESB operates in message relay mode, it does not parse or otherwise read the content of the payload; it just transfers whatever it receives. This lets us fetch a large file without worrying about its content. To enable the message relay feature, we need to enable the appropriate message builder/formatter pairs in $ESB_HOME/repository/conf/axis2.xml. For example, to fetch a pdf document, the application/pdf content type can be used with the following formatter and builder configuration. As you may have figured out, any of the document types described earlier (xml, txt and others) can also be fetched using this technique without reading the content.

Binary relay formatter and builder configurations

   <messageBuilders>
            <messageBuilder contentType="application/pdf" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
            <messageBuilder contentType="application/xml" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
            <messageBuilder contentType="application/x-www-form-urlencoded" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
            <messageBuilder contentType="multipart/form-data" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
            <messageBuilder contentType="text/plain" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
            <messageBuilder contentType="text/xml" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>
   </messageBuilders>

   <messageFormatters>
            <messageFormatter contentType="application/pdf" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
            <messageFormatter contentType="application/x-www-form-urlencoded" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
            <messageFormatter contentType="multipart/form-data" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
            <messageFormatter contentType="application/xml" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
            <messageFormatter contentType="text/xml" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
            <messageFormatter contentType="application/soap+xml" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>
   </messageFormatters>
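
The same pairing could be applied to other content types. As a sketch, assuming zip archives are read with the content type application/zip (an assumption of this example, not a configuration taken from the sample), the entries to place inside the existing messageBuilders and messageFormatters elements, and the matching proxy parameters, might look like this:

```xml
<!-- In axis2.xml, inside <messageBuilders> -->
<messageBuilder contentType="application/zip" class="org.wso2.carbon.relay.BinaryRelayBuilder"/>

<!-- In axis2.xml, inside <messageFormatters> -->
<messageFormatter contentType="application/zip" class="org.wso2.carbon.relay.ExpandingMessageFormatter"/>

<!-- In the proxy: the content type must match the builder entry above -->
<parameter name="transport.vfs.ContentType">application/zip</parameter>
<parameter name="transport.vfs.FileNamePattern">.*\.zip</parameter>
```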
        

Proxy configuration for fetching a large file

<proxy name="BRPDFVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <parameter name="transport.vfs.FileURI">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/in
    </parameter>
    <!--CHANGE-->
    <parameter name="transport.vfs.ContentType">application/pdf</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.pdf</parameter>
    <parameter name="transport.PollInterval">15</parameter>
    <parameter name="transport.vfs.MoveAfterProcess">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/out
    </parameter>
    <!--CHANGE-->
    <parameter name="transport.vfs.MoveAfterFailure">
        file:///home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/failure
    </parameter>
    <!--CHANGE-->
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
    <target>
        <inSequence>
            <log level="custom">
                <property name="status" value="pdf received"/>
            </log>
            <drop/>
        </inSequence>
    </target>
</proxy>

This configuration describes how to fetch a pdf file from one location and place it in another. The same technique can be used to fetch zip or jar files from a different file system, such as an FTP server or an SFTP server. To run the sample, first enable the builder and formatter pairs shown above in $ESB_HOME/repository/conf/axis2.xml, then copy the proxy to $ESB_HOME/repository/deployment/server/synapse-configs/default/proxy-services and start the server. This fetches a pdf file that resides in /home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/in and drops it into /home/rajika/docs/ot-articles/in-progress/vfs-file-transferring-hub/out (the location given by the transport.vfs.MoveAfterProcess parameter).
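
If the file should be written to its destination by the mediation flow itself, rather than by the MoveAfterProcess parameter, the inSequence could forward the relayed payload to a VFS endpoint. The following is only a sketch: it assumes the VFS transport sender is enabled in axis2.xml, the target directory file:///tmp/vfs-out is hypothetical, and the use of the FILE_NAME transport header to keep the original file name is an assumption that should be checked against the VFS transport documentation[3] for your version.

```xml
<inSequence>
    <!-- One-way transfer: no response is expected back from the file endpoint -->
    <property name="OUT_ONLY" value="true"/>
    <!-- Assumption: reuse the name of the incoming file for the file that is written -->
    <property name="transport.vfs.ReplyFileName"
              expression="get-property('transport', 'FILE_NAME')"
              scope="transport"/>
    <send>
        <endpoint>
            <!-- Hypothetical target directory; replace with your own location -->
            <address uri="vfs:file:///tmp/vfs-out"/>
        </endpoint>
    </send>
</inSequence>
```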

Running All Scenarios as a Single File Transferring Gateway

The configurations given above define a separate proxy for each type of operation, and the proxy configurations differ from each other. Although it is not possible to define a single proxy configuration that fetches or reads files from all types of file systems, it is possible to define a general sequence that is shared by all the proxies; this sequence should be defined to suit your requirements. In this way there is logically a single file transferring gateway that connects to (and, if required, processes files from) all types of file systems, such as FTP, SFTP, WebDAV, etc.
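
A minimal sketch of this arrangement is shown below. The sequence name FileProcessingSequence is hypothetical, and the log and drop mediators stand in for whatever mediation logic is required. The sequence would be deployed separately from the proxies (in a stock distribution, typically in the sequences directory next to proxy-services); each proxy keeps its own transport parameters and only refers to the sequence by name.

```xml
<!-- Shared sequence (hypothetical name), deployed as its own artifact -->
<sequence name="FileProcessingSequence" xmlns="http://ws.apache.org/ns/synapse">
    <log level="full"/>
    <drop/>
</sequence>

<!-- One of several proxies that delegate to the shared sequence -->
<proxy name="FTPVFSProxy" transports="vfs" xmlns="http://ws.apache.org/ns/synapse">
    <parameter name="transport.vfs.FileURI">vfs:ftp://user:password@ftp.server.org/test?vfs.passive=true</parameter>
    <parameter name="transport.vfs.ContentType">text/xml</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.xml</parameter>
    <parameter name="transport.PollInterval">15</parameter>
    <!-- Refer to the shared sequence instead of defining an inline inSequence -->
    <target inSequence="FileProcessingSequence"/>
</proxy>
```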

Conclusion

This article describes most of the file transfer scenarios (more specifically, reading from a remote location) that can be implemented using WSO2 ESB. These file moving requirements are not tied to SOA implementations; it is common to encounter them in day-to-day work. This guide also provides full working configurations to implement the scenarios mentioned.

Future Work

This article only provides the instructions needed to move a file from one location to another (more specifically, to read from a remote location). Those locations can vary from a local file system to a remote FTP server. Most real-world requirements are not limited to moving a file from one location to another; they also require further processing of the content of the file. For example, it is not uncommon to read a file from an FTP server and send the content as an email attachment. The concepts and examples in this article can be extended to meet such requirements using the extensive set of features provided by WSO2 ESB.

References:

  1. https://commons.apache.org/vfs/
  2. https://wso2.org/project/esb/java/4.0.3/docs/quickstart_guide.html
  3. https://wso2.org/project/esb/java/4.0.3/docs/transports/transports-catalog.html#VfsTrp
  4. https://wso2.org/project/esb/java/4.0.3/docs/message_relay.html
  5. https://commons.apache.org/vfs/filesystems.html

About Author

  • Rajika Kumarasiri
  • Senior Software Engineer
  • WSO2 Inc.