Demystifying WSO2 ESB Pass-Through Transport - Part II

  • By Kasun Indrasiri
  • 21 Apr, 2014

Applies to

WSO2 ESB 4.8.x

Introduction

WSO2 ESB is the world’s fastest open source ESB and has been proven in thousands of deployments worldwide, including
production use cases that handle billions of transactions. Its non-blocking HTTP transport, known as the Pass-Through Transport, is the key component behind this success.

This series of articles will provide an in-depth analysis of WSO2 ESB’s Pass-Through Transport architecture for any advanced user
who is interested in learning about it in detail.

The main objective of Part II is to provide an in-depth understanding of the Pass-Through Transport
architecture. In addition, it gives a complete overview of the message flow in WSO2 ESB.

WSO2 ESB in a nutshell

The enterprise service bus (ESB) is the heart of any SOA implementation and is used extensively in most modern
integration use cases. An ESB acts as the centralized bus that connects disparate systems, services, and clients. Message
routing, filtering, enriching, transforming, and protocol switching are the key message processing activities
performed inside an ESB. In general, we refer to all the message processing capabilities of an ESB as ‘message mediation’ capabilities.

WSO2 ESB offers a very rich set of mediation capabilities along with 100% EIP support. At the same time,
WSO2 ESB ranks at the top in the ESB performance comparisons conducted by both WSO2 and its competitors.

WSO2 ESB - conventional message flow

The mediation run-time of WSO2 ESB contains several major components that are depicted in the following diagram.

With respect to message flows, when we send a request/message to WSO2 ESB, it first hits the transport listener.
The primary function of the transport listener is to interface the ESB with one or more application layer protocols for
serving inbound requests. For instance, for HTTP we have a Pass-through HTTP transport listener implemented.

Then the transport listener hands over the message to the Axis2 layer once it is done with the transport level message
processing. In the Axis2 layer, inbound messages are handled by a component called the message builder. As the ESB receives disparate
message formats from different clients and systems, it needs a canonical message format internally to carry out all
the message mediation tasks. The main function of the message builder is to convert incoming messages into
this canonical format (e.g. JSON --> SOAP).

Once the message is canonicalized and processed by Axis2, it is handed over to the mediation layer (in fact the
interfacing between the Axis2 and the Synapse mediation engine is done with the use of Axis2 Message Receivers i.e. ProxyService
Message Receiver, Synapse Message Receiver, etc.). The mediation layer mediates the message based on a given mediation configuration and
then hands it over to the Axis2 layer to send the message out to the specified back-end service.

For the outbound message, there is a specific message formatter at the Axis2 layer that converts the message
from the canonical format to the required wire format of the back-end service (e.g. SOAP --> JSON). Finally, the message goes
to the transport sender, which is responsible for delivering the message to the back-end service with the required
application layer protocol.
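
The stages described above can be summarised as a small pipeline. The following is a minimal sketch in plain Java, not WSO2 ESB or Axis2 code: the class and method names (ConventionalFlowSketch, build, format, handle) are hypothetical, and a simple string envelope stands in for the real canonical format.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

/** Illustrative only: builder -> mediation -> formatter, as in the conventional flow. */
final class ConventionalFlowSketch {

    /** Message builder: wire bytes -> canonical form (an envelope string stands in for SOAP). */
    static String build(byte[] wire) {
        String payload = new String(wire, StandardCharsets.UTF_8);
        return "<envelope>" + payload + "</envelope>";
    }

    /** Message formatter: canonical form -> wire bytes expected by the back-end service. */
    static byte[] format(String canonical) {
        String payload = canonical
                .replace("<envelope>", "")
                .replace("</envelope>", "");
        return payload.getBytes(StandardCharsets.UTF_8);
    }

    /** One pass through the ESB: the listener hands in bytes, the sender receives bytes. */
    static byte[] handle(byte[] inbound, UnaryOperator<String> mediation) {
        String canonical = build(inbound);             // Axis2 layer, inbound side
        String mediated = mediation.apply(canonical);  // mediation layer
        return format(mediated);                       // Axis2 layer, outbound side
    }

    public static void main(String[] args) {
        byte[] out = handle("hello".getBytes(StandardCharsets.UTF_8),
                canonical -> canonical.replace("hello", "HELLO"));
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

Note that every message is converted to the canonical form and back, even when the mediation step never looks at the payload. That cost is what the rest of this article is concerned with.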

Now that we have a clear understanding of the ESB’s message flow, let's move on to the main discussion point of this
article: the high performance HTTP transports of WSO2 ESB. It's useful to have an idea about the non-blocking
HTTP transport we had prior to introducing the Pass-Through Transport. So, let's begin our discussion with the old NHTTP transport.

Architecture of NHTTP transport

The non-blocking HTTP transport implementation with which WSO2 ESB was initially developed is known as the NHTTP transport
(package org.apache.synapse.transport.nhttp). This is a high performance HTTP transport implementation based on
Apache HTTPCore NIO (which means it is based on the readiness selection concepts that we discussed in Part I of this article). The typical message flow of the NHTTP transport is depicted in the following diagram.

  • A message received over the HTTP transport is read from the HTTPCore NIO layer, and the transport listener writes the message content into the input buffer.
  • Then the message is built by the message builder, which resides in the Axis2 layer (i.e. the transport layer invokes the HTTPTransportUtils). The builder reads the content from the input buffer and creates the XML-based object model. This is the canonical object model used across the entire mediation flow.
  • Once the message is processed at the Axis2 layer, it is handed over to the mediation engine (by using message receivers residing in Synapse). The mediation engine (Synapse) processes the message based on the canonical object model and at the end of the mediation flow it hands over the message back to the Axis2 layer.
  • The message formatter serializes the XML-based object model into the output buffer for delivery.
  • The transport sender reads the content from the output buffer and writes it into the wire.
  • This model is known as the ‘dual buffer architecture’ as it involves both the input and output buffers.
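
The steps listed above can be sketched as follows. This is an illustrative model in plain Java under stated assumptions, not the NHTTP transport code: DualBufferSketch and CanonicalMessage are hypothetical names, a string stands in for the XML-based object model, and the I/O is simplified to in-memory byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative only: the payload passes through an input buffer AND an output buffer. */
final class DualBufferSketch {

    /** Stand-in for the XML-based canonical object model. */
    static final class CanonicalMessage {
        final String body;
        CanonicalMessage(String body) { this.body = body; }
    }

    static byte[] relay(byte[] wireIn) {
        // 1. Transport listener: copy the bytes from the wire into the input buffer.
        ByteBuffer inputBuffer = ByteBuffer.allocate(wireIn.length);
        inputBuffer.put(wireIn);
        inputBuffer.flip();

        // 2. Message builder: drain the input buffer into the object model.
        byte[] raw = new byte[inputBuffer.remaining()];
        inputBuffer.get(raw);
        CanonicalMessage message = new CanonicalMessage(new String(raw, StandardCharsets.UTF_8));

        // 3. The mediation engine would work on 'message' here.

        // 4. Message formatter: serialize the object model into the output buffer.
        byte[] serialized = message.body.getBytes(StandardCharsets.UTF_8);
        ByteBuffer outputBuffer = ByteBuffer.allocate(serialized.length);
        outputBuffer.put(serialized);
        outputBuffer.flip();

        // 5. Transport sender: drain the output buffer onto the wire.
        ByteArrayOutputStream wireOut = new ByteArrayOutputStream();
        while (outputBuffer.hasRemaining()) {
            wireOut.write(outputBuffer.get());
        }
        return wireOut.toByteArray();
    }

    public static void main(String[] args) {
        byte[] out = relay("<order><id>42</id></order>".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

Even when the message is forwarded unchanged, the payload is copied into two buffers and converted to and from the object model.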

When integrating diverse message formats, the NHTTP transport architecture is very useful as it uses a canonical format for each and every request. However, the main drawback is that the canonicalization of every request introduces a huge performance overhead. In particular, this is a major concern when we are routing messages based on URLs and other transport headers without considering the message payload.

Binary Relay - Omitting message building and formatting

The idea is to improve performance by omitting message building for selected content types. In this model, the incoming message is represented using the meta model, the message context. Although the message context is used in the typical NHTTP-based architecture as well, with binary relay we populate the message context with the transport headers but keep the payload as a raw byte stream, without converting it into the XML-based canonical format. By doing this, we avoid the overhead of converting the input message into the XML canonical format on the receiving side and converting the XML format back to raw bytes on the sending side.

The binary relay-based model significantly improved WSO2 ESB's performance, but it had major limitations. When binary relay is enabled, we cannot support any mediation scenario that requires access to the message payload. Hence, it was only used when the use case was strictly oriented towards header-based mediation. Moreover, this model still uses the dual buffer architecture, which still involves a critical performance overhead.
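
To make the header-based restriction concrete, here is a minimal sketch in plain Java. It is not the binary relay implementation: BinaryRelaySketch, RelayedMessage, the X-Target header, and the backend.example URLs are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: routing decisions use transport headers, the payload stays opaque. */
final class BinaryRelaySketch {

    /** Stand-in for the message context: headers are populated, the body is raw bytes. */
    static final class RelayedMessage {
        final Map<String, String> transportHeaders;
        final byte[] rawPayload; // kept as raw bytes, never converted to XML

        RelayedMessage(Map<String, String> transportHeaders, byte[] rawPayload) {
            this.transportHeaders = transportHeaders;
            this.rawPayload = rawPayload;
        }
    }

    /** Header-based mediation: only the transport headers are inspected. */
    static String selectEndpoint(RelayedMessage message) {
        String target = message.transportHeaders.get("X-Target"); // hypothetical header
        if ("orders".equals(target)) {
            return "http://backend.example/orders";  // hypothetical endpoint
        }
        return "http://backend.example/default";     // hypothetical endpoint
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("X-Target", "orders");
        RelayedMessage message = new RelayedMessage(headers, new byte[] {0x01, 0x02, 0x03});
        System.out.println(selectEndpoint(message));
    }
}
```

Because rawPayload is never parsed, a mediator that needs to read or modify the body has nothing to work with, which is the limitation described above.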

For more information about binary relay, refer to https://wso2.com/library/articles/binary-relay-efficient-way-pass-both-xm....

Architecture of Pass-Through Transport

The dual buffer architecture has inherent limitations that severely affect performance in modern performance-critical use cases, such as handling billions of transactions per day. The revolutionary step that we took in WSO2 ESB was to come up with a brand new architecture that overcomes the limitations of the dual buffer architecture (i.e. NHTTP transport). The new design, which is known as ‘Pass-Through Transport’ (org.apache.synapse.transport.passthru), is based on a single buffer model in which we have completely eliminated the output buffer.

Based on the new model, we have conducted a performance comparison between the new Pass-Through Transport and the binary relay-based NHTTP transport. We have observed a significant performance gain in the Pass-Through Transport compared to binary relay.

A single shared buffer is used for content decoding and encoding. This model works when we are routing the message without processing or inspecting the message payload. For this very reason, we could not make the Pass-Through Transport the default HTTP transport of WSO2 ESB prior to ESB 4.6. Therefore, one of our main challenges was to support the Pass-Through Transport both in scenarios that do not need to access the message content and in scenarios that do. The concept of ‘Conditional Canonicalization’ (or content awareness) is used to overcome this problem.
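
The single buffer idea can be sketched with standard java.nio channels. This is a simplified, blocking illustration under stated assumptions, not the Pass-Through Transport code: the real transport is non-blocking and event driven, and the names SingleBufferSketch, passThrough, and relay are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

/** Illustrative only: one shared buffer sits between the reading and the writing side. */
final class SingleBufferSketch {

    /** Moves bytes from source to target through a single shared buffer. */
    static long passThrough(ReadableByteChannel source, WritableByteChannel target,
                            int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        ByteBuffer shared = ByteBuffer.allocate(bufferSize);
        long total = 0;
        while (source.read(shared) != -1) {   // reading side fills the buffer
            shared.flip();                    // switch the same buffer to draining mode
            while (shared.hasRemaining()) {
                total += target.write(shared); // writing side drains it
            }
            shared.clear();                   // reuse the buffer for the next chunk
        }
        return total;
    }

    /** In-memory convenience wrapper so that the sketch can be run without sockets. */
    static byte[] relay(byte[] inbound, int bufferSize) {
        ByteArrayOutputStream wireOut = new ByteArrayOutputStream();
        try (ReadableByteChannel source = Channels.newChannel(new ByteArrayInputStream(inbound));
             WritableByteChannel target = Channels.newChannel(wireOut)) {
            passThrough(source, target, bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wireOut.toByteArray();
    }

    public static void main(String[] args) {
        byte[] out = relay("<order><id>42</id></order>".getBytes(StandardCharsets.UTF_8), 8);
        System.out.println(new String(out, StandardCharsets.UTF_8));
    }
}
```

No object model is created and no second buffer is allocated, so the payload is never interpreted. That is also why this path alone cannot serve mediators that need the message content.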

Conditional canonicalization

Across all possible message mediation and integration scenarios, we have identified two possible behaviors of a given mediator: it can be either a content aware mediator or a non-content aware mediator. For instance, the enrich mediator is content aware while the send mediator is not. Sometimes content awareness depends on the configuration of the mediator (a filter mediator that looks for a transport property is not content aware, but a filter mediator with a regular XPath expression that accesses the message body is content aware). Whenever a given mediator is content aware, the message is subjected to canonicalization.

We have leveraged this conditional canonicalization concept so that the Pass-Through Transport can be used in all possible integration scenarios. The mediation engine is modified such that it checks for content awareness prior to dispatching the message to a given configuration. If we have used any content aware mediators, or we have designed the mediation logic such that we need to access the message payload at some point, the scenario becomes content aware. So, in a content aware scenario, we bring the conventional message flow (with message builders/formatters and AXIOM) right back into the picture. The message flow related to the content aware scenario is also depicted above.
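
The content awareness check can be illustrated with a short sketch in plain Java. This is not the Synapse implementation: the Mediator, Message, filter, and dispatch names are hypothetical, and the rule that an expression starting with $trp: only touches transport headers is a simplifying assumption made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: the message is canonicalized only if some mediator needs the payload. */
final class ConditionalCanonicalizationSketch {

    /** Stand-in for a configured mediator. */
    static final class Mediator {
        final String name;
        final boolean contentAware;
        Mediator(String name, boolean contentAware) {
            this.name = name;
            this.contentAware = contentAware;
        }
    }

    /** A filter's content awareness depends on its configured expression. */
    static Mediator filter(String sourceExpression) {
        // Simplifying assumption: "$trp:" expressions read transport headers only;
        // any other expression is treated as reading the message body.
        boolean readsBody = !sourceExpression.startsWith("$trp:");
        return new Mediator("filter[" + sourceExpression + "]", readsBody);
    }

    /** Raw bytes plus a canonical form that stays null unless it is needed. */
    static final class Message {
        final byte[] raw;
        String canonical;
        Message(byte[] raw) { this.raw = raw; }
    }

    /** Checks content awareness before dispatch and builds the message only if required. */
    static boolean dispatch(List<Mediator> sequence, Message message) {
        boolean contentAware = false;
        for (Mediator mediator : sequence) {
            if (mediator.contentAware) {
                contentAware = true;
                break;
            }
        }
        if (contentAware) {
            // Stand-in for invoking the message builder.
            message.canonical = new String(message.raw, StandardCharsets.UTF_8);
        }
        return contentAware;
    }

    public static void main(String[] args) {
        Mediator send = new Mediator("send", false);
        Mediator enrich = new Mediator("enrich", true);
        byte[] payload = "<order><id>42</id></order>".getBytes(StandardCharsets.UTF_8);

        System.out.println(dispatch(Arrays.asList(filter("$trp:Content-Type"), send), new Message(payload)));
        System.out.println(dispatch(Arrays.asList(filter("//order/id"), send), new Message(payload)));
        System.out.println(dispatch(Arrays.asList(enrich, send), new Message(payload)));
    }
}
```

In this sketch the first sequence stays on the pass-through path, while the other two trigger message building because at least one mediator needs the payload.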

Now that we have a solid understanding of the background, motivations, and architecture of the Pass-Through Transport, it's time to take a closer look at its implementation. Part III of this article will give you a comprehensive explanation of the implementation of the Pass-Through Transport and how it integrates with other layers in WSO2 ESB.

Summary

Part II of the article has provided you with an in-depth overview of the ESB message flow and of the NHTTP and Pass-Through Transport architectures. In Part III, we will provide a detailed description of the Pass-Through Transport design and implementation.

About Author

  • Kasun Indrasiri
  • Director - Integration Technologies
  • WSO2 Inc