2010/06/16

WSO2 ESB Mediation Statistics: What Can Numbers Tell About Your SOA?

  • Hiranya Jayathilaka
  • PhD student - WSO2

Introduction

WSO2 Enterprise Service Bus (ESB) provides a variety of options to monitor and manage the server runtime through its Web-based management console. Of these monitoring capabilities, the mediation statistics feature stands out as one of the most useful and widely used. The mediation configuration of WSO2 ESB is composed of the following functional elements:

  • Sequences
  • Proxy services
  • Endpoints
  • Scheduled tasks
  • Event sources
  • Priority executors

Of these functional components, sequences, proxy services and endpoints are directly involved in processing messages. The mediation statistics feature enables a server administrator to collect runtime statistical information on these three types of functional components and view them through the ESB management console. This feature provides a simple but effective way of determining the runtime performance of the ESB. The collected statistical data can be used to analyze network traffic patterns and draw rough estimates of how the service bus will be used in the future. Perhaps the most enticing attribute of the mediation statistics feature is that it can be configured in a fine-grained manner. In other words, it allows a user to collect data on only a specified set of sequences, proxy services and endpoints.

Applies To

Product Versions
WSO2 ESB 2.1.3, 3.0

What’s New?

The mediation statistics feature has recently gone through a number of architectural and functional upgrades. These changes were made with a long list of performance and deployment requirements in mind.

In older releases of WSO2 ESB (before v2.1.3), the collected mediation statistics were stored in the WSO2 Governance Registry instance that comes embedded with the ESB. Starting from WSO2 ESB v2.1.3, mediation statistics are not saved to the registry, due to a number of reasons:

  • Writing data to the registry on a regular basis imposes a performance hit on the ESB
  • A registry-based data store takes time to respond to queries, which results in a sluggish UI experience
  • Mediation statistics feature cannot be used with a read-only registry

Mediation statistics component in ESB v2.1.3 and later releases uses an in-memory data store to keep the collected data. This data store implementation has been optimized to support fast querying and easy updating, while using the minimum amount of memory possible. The memory usage of the data store is restricted by the total number of statistics collecting proxy services, sequences and endpoints. If the number of proxy services, sequences and endpoints is constant, the memory usage of the mediation statistics component stays constant.

The new and improved mediation statistics implementation supports persistence as an optional feature. That is, one may configure the ESB to save gathered statistical information to the registry. The newly introduced mediation statistics API allows collected data to be shared easily among Carbon components. Currently, WSO2 Business Activity Monitor (BAM) makes use of this API to access the data collected by the service bus. Using this API, users can also develop custom data consumers to retrieve mediation statistics from the in-memory data store and process them further.

Simple Usage

The mediation statistics feature is available with WSO2 ESB out of the box. You do not have to configure anything to use it. Simply sign in to the management console and enable statistics on the sequences, proxy services and endpoints you want to monitor, and the service bus will start collecting data on them. For example, to enable statistics for a sequence, click on “Sequences” in the left menu and then click on the “Enable Statistics” icon corresponding to the appropriate sequence.

If you want to enable statistics for a mediation component using the Synapse configuration language, you have to set the value of the “statistics” attribute to “enabled” on the sequence, proxy or endpoint configuration.

To view the gathered data, click on the “Mediation Statistics” option in the “Monitor” menu. You will be shown a graphical summary of all collected statistics.

Starting from this view, you can drill down to see more fine grained data such as statistics pertaining to a single sequence or a proxy service. In WSO2 ESB 3.0, you can even get statistics for different operations on the same endpoint.


Understanding Mediation Statistics

Now that you have some experience working with the mediation statistics feature, it is a good time to understand what each statistic means, with respect to the service bus. First of all, note that for sequences and proxy services, there are two sets of statistics collected by the ESB. These are called in-statistics and out-statistics. In-statistics are data related to request mediation and out-statistics are data related to response mediation. To further clarify this, let’s consider an example sequence.

<sequence name="foo" statistics="enabled">
    <in>
        <log level="full"/>
        <send>
          <endpoint>
            <address uri="https://myserver/services/myservice"/>
          </endpoint>
        </send>
    </in>
    <out>
        <send/>
    </out>
</sequence>

This sequence logs the requests sent by service consumers, and it forwards them to the backend service hosted at https://myserver/services/myservice. The responses coming back from the backend service are routed back to the service consumers. So clearly, this sequence has to deal with two mediation channels, one for requests and the other for responses. The in-statistics collected on the sequence are related to the request processing channel whereas the out-statistics are associated with the response channel. Proxy services also have two built-in mediation channels, namely the in-sequence and the out-sequence. In that case in-statistics are associated with the in-sequence, and the out-statistics are associated with the out-sequence.

Each set of statistics consists of the following data items:

  • Total count: The total number of messages received and mediated through the mediation channel
  • Fault count: The number of messages that triggered faults while being mediated through the channel
  • Minimum time: The least amount of time taken by the mediation channel to mediate a message (in milliseconds)
  • Maximum time: The greatest amount of time taken by the mediation channel to mediate a message (in milliseconds)
  • Average time: The average amount of time taken by the mediation channel to mediate a message (in milliseconds)

The average time for a particular mediation channel is calculated using the following formula:

$$T_n = \frac{(n - 1)\,T_{n-1} + t_n}{n}$$

  • T_n - Average processing time after mediating n messages
  • n - Number of messages mediated (Total count)
  • t_n - Time taken to mediate the nth message through the channel (aka mediation time - more on this later)
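
An average of this kind can be maintained incrementally, without storing the individual mediation times. The following is a minimal Java sketch of that idea, consistent with the definitions above. The class name RunningAverage and its methods are hypothetical and are not part of the ESB API; how the ESB implements the calculation internally is not shown here.

```java
// Incremental (running) average of mediation times, in milliseconds.
// Illustrative only: this class is hypothetical and not part of WSO2 ESB.
final class RunningAverage {
    private long count;      // n: number of messages mediated so far
    private double average;  // T_n: average mediation time after n messages

    // Folds the mediation time of the next message (t_n) into the average.
    void record(long mediationTimeMillis) {
        count++;
        average = ((count - 1) * average + mediationTimeMillis) / count;
    }

    long getCount() {
        return count;
    }

    double getAverage() {
        return average;
    }

    public static void main(String[] args) {
        RunningAverage in = new RunningAverage();
        in.record(269);
        in.record(10);
        System.out.println(in.getAverage()); // prints 139.5
    }
}
```

Only the previous average and the message count need to be kept, which fits a data store whose memory usage does not grow with the number of messages.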

WSO2 ESB collects only in-statistics for endpoints. In WSO2 ESB, an endpoint represents an entity to which messages can be sent. The ESB core is designed to be completely asynchronous and non-blocking, and therefore responses are not associated with a particular endpoint. The ESB handles responses separately, using a callback receiver mechanism. As a result, an endpoint has only one mediation channel associated with it, namely the request channel. Endpoint in-statistics consist of the same set of data items as the statistics of sequences and proxy services, but for endpoints the definitions are slightly different. The revised definitions are as follows:

  • Total count: Total number of messages forwarded to the endpoint
  • Fault count: Number of messages that triggered faults while being forwarded to the endpoint
  • Minimum time: The least amount of time taken to send a request to the endpoint and receive a response
  • Maximum time: The greatest amount of time taken to send a request to the endpoint and receive a response
  • Average time: The average time taken to forward a request to the endpoint and receive a response

Endpoint average times are computed using the same formula as the one used to calculate average times for sequences and proxy services.

Mediation Time

Mediation time can be defined as the time taken to mediate a single message through a given component. It is one of the key variables captured and processed by the ESB. When you have enabled statistics on a sequence, proxy service or an endpoint, ESB measures the mediation time for that component, for each message passing through it. In case of sequences and proxy services, request mediation time and response mediation time will be measured separately. Based on the measured values, ESB calculates three useful statistics, namely minimum time, maximum time and average time. Considering the functional differences between the three types of mediation components, WSO2 ESB follows different strategies to measure the mediation time for sequences, proxy services and endpoints.

The request mediation time for a sequence is the time taken to process a request through all the mediators in the sequence. If the sequence contains a send mediator, which sends the request to a remote service, the time taken to send the message and receive a response is also added to the request mediation time. In the case of a proxy service, the request mediation time is the time spent on mediating a message through the in-sequence plus the time taken to forward the request to the target endpoint and receive a response. The request mediation time for an endpoint is simply the time taken to send the message to the remote server and receive a response. Because of the way the endpoint mediation time is measured, the resulting values serve as performance indicators of the backend services.

The response mediation times for sequences and proxy services are measured from the moment the responses are received by the mediation engine. Response mediation time is generally the time taken to process a response message through the relevant mediators, or in case of proxy services, through the out-sequence.

Typically, in proxy services and sequences, the requests are sent to a backend service for further processing, and the responses are simply routed back to the client. In such scenarios, the request mediation time will include the time taken to send the message to the backend service and receive a response. The response mediation time, however, will only consist of the time taken to process the response through the service bus alone. As a result, in most practical scenarios, the request mediation time is much greater than the corresponding response mediation time. This concept is further illustrated by the following equations:

Request mediation time = Time to mediate the request through the ESB + Time to send the request 
to the backend service + Time taken by the backend service to process the request + Time to send 
the response back to the ESB

Response mediation time = Time to mediate the response through the ESB

In case of simple sequences and proxy services, the mediation time could be very small. It could even take values less than 1 ms. When this happens, the statistics collector will report the mediation time as 0 ms.
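
A reading of 0 ms is what whole-millisecond resolution produces for any duration shorter than 1 ms. The short Java sketch below illustrates this under the assumption that elapsed times are truncated to whole milliseconds. The class and method names are hypothetical, and the sketch does not describe how the ESB takes its timestamps internally.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: shows how sub-millisecond durations collapse to 0 ms
// when elapsed time is expressed in whole milliseconds.
final class MillisecondResolution {

    // Converts an elapsed time in nanoseconds to whole milliseconds,
    // discarding the fractional part.
    static long toWholeMillis(long elapsedNanos) {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        System.out.println(toWholeMillis(400_000L));   // 0.4 ms, prints 0
        System.out.println(toWholeMillis(1_700_000L)); // 1.7 ms, prints 1
    }
}
```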

Faults

Fault count is one of the key statistics associated with sequences, proxy services and endpoints. The fault counting logic of the WSO2 ESB statistics module considers the following situations as fault conditions:

  • Receiving a SOAP fault from a backend service
  • Encountering a communication error while sending a request out
  • Encountering a runtime exception while mediating a message through a mediator

The total count for a particular entity includes its fault count. Therefore the fault count is always less than or equal to the corresponding total count. In situations where the total number of messages is very large and faults are unavoidable, the ratio between the fault count and the total count can become a very useful statistic.
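
The ratio can be derived directly from the two counters. A minimal Java sketch follows. The class FaultRatio is hypothetical and not part of the ESB API, and it simply enforces the rule that the fault count cannot exceed the total count.

```java
// Illustrative only: computes the fault ratio from a fault count and a total count.
final class FaultRatio {

    // Returns faultCount / totalCount, or 0.0 when no messages have been counted yet.
    static double of(long faultCount, long totalCount) {
        if (faultCount < 0 || faultCount > totalCount) {
            throw new IllegalArgumentException(
                    "fault count must be between 0 and the total count");
        }
        return totalCount == 0 ? 0.0 : (double) faultCount / totalCount;
    }

    public static void main(String[] args) {
        System.out.println(of(5, 20)); // prints 0.25
    }
}
```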

To get the most accurate fault counts, one should enable statistics on the relevant fault handler sequences. Since all the faulty messages are dispatched to the fault handlers, this ensures that every fault is counted by the fault counting logic.

Configuring the Statistics Collector

Starting from WSO2 ESB v2.1.3, the runtime behavior of the mediation statistics component can be customized to suit your actual data collection requirements. This is achieved by adding some entries to the carbon.xml file which can be found in the repository/conf directory of the ESB installation (In ESB v2.1.3 this file is in the conf directory). The following sample XML snippet shows all the available settings:

<MediationStat>
	<ReportingInterval>5000</ReportingInterval>
	<Persistence>enabled</Persistence>
	<RegistryLocation>/stats/mediation</RegistryLocation>
	<Observers>com.test.data.Collector</Observers>
</MediationStat>

The following table describes each of the configurable parameters:

Parameter/Setting | Description | Default Value
ReportingInterval | The mediation statistics component uses a worker thread to collect data from the mediation engine. This thread runs continuously as long as there is data available in the bus. If there is no data to be collected, the worker thread goes to sleep for a specified amount of time. This parameter is used to specify the duration of the sleep, in milliseconds. Once the sleep duration has expired, the reporter thread will wake up and continue operations. A lower ReportingInterval value results in a proactive statistics collector and a very dynamic statistics UI. | 5000
Persistence | Set this parameter to “enabled” if you want the statistics component to save the collected data to the governance registry. | disabled
RegistryLocation | This parameter is required only if persistence is enabled for mediation statistics. It is used to specify the location in the governance registry, where the data should be saved. |
Observers | Use this parameter to engage custom statistics consumers. More on this later. |

Putting Numbers to Use

Keeping a close eye on the mediation statistics collected by the ESB could be crucial in a production environment. These numbers can help a server administrator identify potential bottlenecks in an ESB configuration, understand traffic flow patterns and even discover various software and network errors.

If the average mediation time of an endpoint increases unusually, that is an indication of the backend service performing poorly or the network link between the ESB and the service getting congested. If a proxy service or a sequence reports a very high fault count compared to its total count, that could be a sign of backend services not functioning as expected or of the ESB experiencing communication errors while sending messages. If the average time, minimum time and maximum time for a particular item have close values, it can be deduced that the network, the ESB and the backend servers are all functioning smoothly. However, if the difference between the minimum time and the maximum time is unusually large, that indicates that the network or some application is functioning poorly at times, causing considerable delays.

Ideally, the average mediation time for all sequences, endpoints and proxy services should be at a minimum, and so should the fault count. The difference between the minimum times and the corresponding maximum times should also be low. While it is possible to tune ESB configurations and backend services to provide low average times and fault counts, it is almost impossible to maintain a low difference between the maximum and minimum times. This is because networks tend to operate in burst mode most of the time.
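
The rules of thumb above can be turned into simple automated checks. The Java sketch below is one possible way to do so. The class name and both threshold values are hypothetical and chosen only for illustration, so they should be tuned against the normal behavior of your own deployment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: flags statistics that look unhealthy.
// The class and its thresholds are hypothetical, not part of WSO2 ESB.
final class StatisticsHealthCheck {
    // Assumed limits; pick values that match your own environment.
    static final double MAX_FAULT_RATIO = 0.05;
    static final double MAX_SPREAD_FACTOR = 10.0;

    static List<String> evaluate(long totalCount, long faultCount,
                                 long minTime, long maxTime, double avgTime) {
        List<String> warnings = new ArrayList<>();

        // A high fault count relative to the total count suggests backend
        // or communication problems.
        if (totalCount > 0 && (double) faultCount / totalCount > MAX_FAULT_RATIO) {
            warnings.add("high fault ratio");
        }

        // A large gap between the fastest and slowest message, relative to
        // the average, suggests intermittent delays.
        if (avgTime > 0 && (maxTime - minTime) > MAX_SPREAD_FACTOR * avgTime) {
            warnings.add("large gap between minimum and maximum time");
        }
        return warnings;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(100, 20, 5, 900, 40.0));
    }
}
```

Because some spread between minimum and maximum times is expected on bursty networks, the second check is deliberately expressed relative to the average rather than as an absolute limit.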

Extending Mediation Statistics

The latest WSO2 ESB releases introduce the concept of custom statistics consumers (CSC). A CSC is a user-developed extension of the mediation statistics component. It can be deployed on the ESB to get statistics from the in-memory data store and process them further. One of the most common use cases of a CSC is to get the mediation statistics collected by the bus and write them to an external database. Another is to make the statistics available to an external monitoring application. The rest of this section explains how to develop and deploy a CSC on WSO2 ESB. While the instructions mainly target WSO2 ESB v3.0, the same concepts can be used with WSO2 ESB v2.1.3. Note, however, that the APIs provided by the two ESB versions are slightly different.

Step 01: Implement the MediationStatisticsObserver interface

All custom statistics consumers must implement the MediationStatisticsObserver interface. This interface can be found in the org.wso2.carbon.mediation.statistics component. In WSO2 ESB v3.0 this jar file can be found in the repository/components/plugins directory.
The interface consists of two methods. A simple implementation of the interface is given below as an example. It does not do anything interesting other than simply printing out some information using the statistics records, but it clearly demonstrates the power of the API.

package org.wso2.esb.demo;

import org.wso2.carbon.mediation.statistics.MediationStatisticsObserver;
import org.wso2.carbon.mediation.statistics.MediationStatisticsSnapshot;
import org.wso2.carbon.mediation.statistics.StatisticsRecord;

import java.util.Date;

public class MyStatisticsConsumer implements MediationStatisticsObserver {

    public void destroy() {
        System.out.println("Destroying the custom statistics consumer");
    }

    public void updateStatistics(MediationStatisticsSnapshot snapshot) {
        System.out.println("Received statistics update at : " + new Date());

        StatisticsRecord latestRecord;
        if (snapshot.getEntitySnapshot() == null) {
            // Entity snapshot is null for the very first update
            latestRecord = snapshot.getUpdate();
        } else {
            // If the entity snapshot is not null combine it with the current update
            // to obtain the latest cumulative record
            latestRecord = new StatisticsRecord(snapshot.getEntitySnapshot());
            latestRecord.updateRecord(snapshot.getUpdate());
        }

        String direction = latestRecord.isInStatistic() ? "In" : "Out";

        System.out.println("Latest statistics for " + latestRecord.getType() + " " +
                latestRecord.getResourceId() + "(" + direction + ")");
        
        System.out.println("\tTotal count: " + latestRecord.getTotalCount());
        System.out.println("\tFault count:" + latestRecord.getFaultCount());
        System.out.println("\tMax time:" + latestRecord.getMaxTime());
        System.out.println("\tMin time:" + latestRecord.getMinTime());
        System.out.println("\tAverage time:" + latestRecord.getAvgTime() + "\n");
    }
}

The updateStatistics method is invoked by the mediation statistics component whenever a record in the in-memory data store is updated. This method takes a MediationStatisticsSnapshot object as an argument. The following methods can be invoked on this object to retrieve the latest statistical information:

Method | Description
getUpdate | Get the latest update received from the mediation engine
getEntitySnapshot | Get the cumulative record for the proxy, sequence or endpoint to which the latest update relates
getCategorySnapshot | Get the cumulative record for the category to which the latest update relates (available categories are all proxy services, all sequences and all endpoints)
getErrorLogs | Get the list of error log instances, which give detailed information about the faults that have occurred

For the first update received from the mediation engine, entity snapshot and category snapshot will be null. The name of the proxy, sequence or endpoint being updated can be retrieved as follows:

snapshot.getUpdate().getResourceId();

or

snapshot.getEntitySnapshot().getResourceId();

The type of the resource can be determined by one of the following methods:

snapshot.getUpdate().getType();
snapshot.getEntitySnapshot().getType();
snapshot.getCategorySnapshot().getType();

The returned value is of type org.apache.synapse.aspects.ComponentType which is a Java enum. The following elements are defined in this enum:

  • PROXYSERVICE
  • ENDPOINT
  • SEQUENCE
  • ANY (not relevant to CSC)

The destroy method of the interface is called at system shutdown. This method should be used to clean up and release any resources used by the CSC.

Step 02: Deploy the Custom Statistics Consumer

Compile the Java code and package it into a jar file. To compile the above sample code you have to add the Synapse core component to the classpath. In WSO2 ESB v3.0 this component is named org.apache.synapse.synapse-core and it can be found in the repository/components/plugins directory.

Having built the jar file containing the CSC implementation, place it in the repository/components/lib directory of the ESB. Now we should instruct the ESB to load the CSC at runtime. To do this open up the carbon.xml file in the repository/conf directory and add the following XML snippet into it:

<MediationStat>
        <Observers>org.wso2.esb.demo.MyStatisticsConsumer</Observers>
</MediationStat>

The text under the Observers element should represent a class name or a comma separated list of class names.
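
As a sketch of the comma-separated form, the entry below registers the MyStatisticsConsumer class from Step 01 together with a second consumer. The name org.wso2.esb.demo.AnotherConsumer is hypothetical and used only for illustration.

```xml
<MediationStat>
        <!-- AnotherConsumer is a hypothetical second consumer class -->
        <Observers>org.wso2.esb.demo.MyStatisticsConsumer,org.wso2.esb.demo.AnotherConsumer</Observers>
</MediationStat>
```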

Step 03: Start the ESB and Test the Consumer

Now start the ESB. Deploy a sequence or a proxy service, and enable statistics on it using the UI. Send a few requests to the ESB so that some data will be collected by the data store. You will be able to see something similar to the following being printed on the console, by the custom consumer:

Received statistics update at : Sun Jun 06 15:41:57 IST 2010
Latest statistics for PROXYSERVICE FooProxy(In)
	Total count: 1
	Fault count:0
	Max time:269
	Min time:269
	Average time:269.0

Received statistics update at : Sun Jun 06 15:41:57 IST 2010
Latest statistics for PROXYSERVICE FooProxy(Out)
	Total count: 1
	Fault count:0
	Max time:1
	Min time:1
	Average time:1.0

Received statistics update at : Sun Jun 06 15:42:17 IST 2010
Latest statistics for PROXYSERVICE FooProxy(In)
	Total count: 2
	Fault count:0
	Max time:269
	Min time:10
	Average time:139.5

Received statistics update at : Sun Jun 06 15:42:17 IST 2010
Latest statistics for PROXYSERVICE FooProxy(Out)
	Total count: 2
	Fault count:0
	Max time:1
	Min time:0
	Average time:0.5

Best Practices for Writing Custom Statistics Consumers

  • Implementations of the MediationStatisticsObserver interface should not attempt to modify the values of given StatisticsRecord instances, within the updateStatistics method. Doing so will modify the values stored in the in-memory data store. If it is required to modify the values, a copy of the relevant StatisticsRecord should be obtained by invoking the copy constructor:
    StatisticsRecord copy = new StatisticsRecord(original);
  • The updateStatistics method should not take a long time to execute. Otherwise subsequent updates will be delayed. Time consuming tasks such as database lookups should not be performed within the updateStatistics method. Such operations should be carried out by separate threads. PersistingStatisticsObserver, one of the built-in statistics consumers, takes this approach. This consumer writes statistics to the registry via the WSO2 Registry API. Registry operations could be time consuming due to the database access overhead and the network latency. Therefore PersistingStatisticsObserver queues up all updates in an in-memory queue. The registry operations are then carried out by a separate thread.
  • Always use a package name when developing a CSC. This is important to avoid any OSGi class loading issues.
  • Integrate logging into the CSC implementation. This will help greatly while debugging and testing. One could easily use the Apache Commons Logging framework which is the default used in WSO2 ESB and WSO2 Carbon.
  • If it is required to merge two records into one, use the updateRecord method of the StatisticsRecord class. For example, to merge the record ‘b’ into the record ‘a’, you can invoke the following statement:
    a.updateRecord(b);
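
The advice to keep updateStatistics short can be followed by handing each update to a queue and letting a separate thread do the slow work. The Java sketch below illustrates that pattern in isolation. It is not the code of PersistingStatisticsObserver, and the names QueueingConsumer, Update, onUpdate and slowWrite are hypothetical. In a real CSC, onUpdate would be called from updateStatistics with a copy of the record's values, and slowWrite would talk to the database or registry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: queue updates on the caller's thread and process them
// on a worker thread. All names here are hypothetical.
final class QueueingConsumer {

    // Stand-in for the values copied out of a statistics record.
    static final class Update {
        final String resourceId;
        final long totalCount;

        Update(String resourceId, long totalCount) {
            this.resourceId = resourceId;
            this.totalCount = totalCount;
        }
    }

    // Marker object that tells the worker thread to stop.
    private static final Update STOP = new Update("", -1);

    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<>();
    private final List<Update> written =
            Collections.synchronizedList(new ArrayList<Update>());
    private final Thread worker;

    QueueingConsumer() {
        worker = new Thread(this::drain, "stats-writer");
        worker.setDaemon(true);
        worker.start();
    }

    // Called for every update: only enqueues, so it returns quickly.
    void onUpdate(Update update) {
        queue.offer(update);
    }

    // Called at shutdown: lets the worker finish queued updates, then stops it.
    void destroy() {
        queue.offer(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<Update> written() {
        return written;
    }

    // Worker loop: takes updates off the queue in arrival order.
    private void drain() {
        try {
            while (true) {
                Update next = queue.take();
                if (next == STOP) {
                    return;
                }
                slowWrite(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Placeholder for a time-consuming operation such as a database write.
    private void slowWrite(Update update) {
        written.add(update);
    }
}
```

An unbounded queue is used here for brevity. If updates can arrive faster than they are written, a bounded queue with an explicit overflow policy would limit memory usage.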

Registering Custom Statistics Consumers

Previously we used the carbon.xml file to register the custom consumer with the mediation statistics component (step 02). It is also possible to do this programmatically as shown below:

MediationStatisticsStore.getInstance().registerObserver(customConsumer);

If you are developing a Carbon component, the MediationStatisticsStore can be accessed via OSGi declarative services. Use the Maven SCR plug-in to generate the service descriptors and add the following annotation to the service component class.

/**
* @scr.reference name="mediation.statistics"
* interface="org.wso2.carbon.mediation.statistics.services.MediationStatisticsService"
* cardinality="1..1" policy="dynamic"
* bind="setMediationStatisticsService" unbind="unsetMediationStatisticsService"
*/

Now in the Java code one could do the following:

mediationStatisticsService.getStatisticsStore().registerObserver(customObserver);

This is the best way of registering custom statistics consumers programmatically since this method additionally ensures consistent ordering in class loading.

BAM Integration

The mediation statistics collected by WSO2 ESB can be monitored via WSO2 Business Activity Monitor (WSO2 BAM). For this WSO2 BAM provides the mediation data publisher component. This component can be installed in the ESB, and then it can be configured to expose mediation statistics to the BAM server, using an event driven mechanism. The mediation data publisher component uses a CSC to get the data records from the service bus.

WSO2 BAM provides a convenient and comprehensive way for monitoring runtime mediation statistics. BAM users can monitor the latest statistics through gadgets and drill down on the records, if required, to obtain more fine grained data.

More information on how to integrate WSO2 ESB with WSO2 BAM can be found in the BAM documentation.

Conclusion

WSO2 ESB provides a powerful framework for collecting statistical information on mediation sequences, endpoints and proxy services. It is a very convenient feature to use with ample UI support. It allows the user to select the sequences, endpoints and proxy services on which data should be collected. In addition to that, the runtime behavior of the statistics collector can be configured by editing the carbon.xml file. The user can develop their own custom statistics consumers and deploy them in the ESB to execute custom data processing routines on the mediation statistics gathered by the service bus. If more advanced and sophisticated monitoring capabilities are required, WSO2 ESB mediation statistics can be easily linked up with WSO2 BAM. Mediation statistics are very useful in evaluating system performance, troubleshooting integration systems and understanding usage patterns.

Author: Hiranya Jayathilaka

Senior Software Engineer, WSO2 Inc.

 

About Author

  • Hiranya Jayathilaka
  • PhD student
  • Department of Computer Science at UC Santa Barbara