30 Jan, 2013

ESB Performance Round 6.5

Dushan Abeyruwan
Director - WSO2 Inc

See here for latest ESB Performance Round 7.5

Introduction
Performance Enhancements Introduced in WSO2 ESB 4.6.0
Setup Test Environment

WSO2 ESB 4.6.0 Configuration
AMI EC2 Information
Back End Echo Services Information
Front End Information

Testing Scenarios
Running the Load Test
Observations
Average Latency
Google Spreadsheet for Complete Analysis
WSO2 ESB 4.6.0 Memory Usage Analysis (After Long Run)
Conclusion
Notices
Appendix

Introduction

This article presents the latest (as of January 2013) performance study conducted to compare the performance of WSO2 ESB 4.6.0, WSO2 ESB 4.5.1, Mule 3.3.0, Talend-SE-5.1.1, and UltraESB 1.7.1 - Enhanced. WSO2 has been performing and publishing ESB performance data since 2007, and this paper continues our ongoing work to ensure that we evaluate and publish useful performance numbers for our customers. In addition, the WSO2 ESB 4.6.0 incorporates significant performance updates to ensure that it provides class-leading performance in a number of scenarios.

WSO2 ESB 4.6.0 is the latest version of the ESB at the time of writing, and the data below shows several performance improvements over ESB 4.5.x. The performance benchmarks are performed in Amazon EC2 as done in the ESB performance Round 6, so that they can be independently verified and repeated.

We have chosen to call this test “Round 6.5” because we are comparing against the same versions and results of the ESBs used in the Round 6 performance benchmark published on the UltraESB-managed performance site esbperformance.org. For the sake of comparison, we have included the Round 6 results for UltraESB 1.7.1, Mule ESB 3.3.0, and Talend-SE-5.1.1.

Performance Enhancements Introduced in WSO2 ESB 4.6.0

WSO2 ESB uses the Passthrough Transport (PTT) as the default transport. The PTT has improved performance over the Non-Blocking HTTP Transport (NHTTP), which was the previous default transport available with WSO2 ESB. Both transports utilize the Apache HTTPCore NIO project to provide high-performance, low-latency support for HTTP.

The main difference between PTT and NHTTP is that when the PTT is the default and both the incoming transport and outgoing transport are HTTP, the PTT can utilize a single shared buffer for relaying network data from input network connections to output connections. In similar circumstances, the NHTTP transport utilizes two buffers. This improvement is particularly useful for the performance of pass-through mediation. You can find more details on performance in this article.
We have improved the mediation engine to be content aware, so it can differentiate between use cases that do read and do not read the message being mediated. Consequently, it can intelligently identify scenarios where the engine can safely stream data from the incoming connection to outgoing connections without reading them to memory. With earlier WSO2 releases, users had to identify and engage pass-through support as needed.
WSO2 ESB 4.6.0 particularly focuses on XPath support for content-based routing. The new XPath support includes a partial compilation strategy to target specific scenarios. The following article goes into much more detail about these improvements:

Streaming XPath Parser for WSO2 ESB by Andun Sameera
A significantly improved XSLT capability called “Fast XSLT” provides much faster streaming transformation when used with the PTT. The new streaming XSLT mediation capability provides significant performance gains in many cases (especially for messages that fit into the default buffer, which are processed faster than with the previous XSLT mediator) and also results in significantly better memory usage. See the XSLTEnhancedProxy results below to evaluate the differences.
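The difference between relaying through one shared buffer and relaying through two buffers, described above for the PTT and NHTTP, can be illustrated with a minimal Python sketch. This is an analogy only, not WSO2 or HttpCore code: the function names are hypothetical, in-memory streams stand in for network connections, and the 16384-byte default simply mirrors the io_buffer_size value used in the configuration later in this article.

```python
import io


def relay_single_buffer(src, dst, buf_size=16384):
    """Copy src to dst through one reusable buffer (pass-through style)."""
    buf = bytearray(buf_size)
    view = memoryview(buf)
    total = 0
    while True:
        n = src.readinto(buf)      # fill the shared buffer from the input side
        if not n:
            break
        dst.write(view[:n])        # drain the same buffer to the output side
        total += n
    return total


def relay_two_buffers(src, dst, buf_size=16384):
    """Copy src to dst via separate input and output buffers (one extra copy)."""
    in_buf = bytearray(buf_size)
    out_buf = bytearray(buf_size)
    total = 0
    while True:
        n = src.readinto(in_buf)
        if not n:
            break
        out_buf[:n] = in_buf[:n]   # the additional copy a shared buffer avoids
        dst.write(out_buf[:n])
        total += n
    return total


payload = b"x" * 100_000
sink = io.BytesIO()
copied = relay_single_buffer(io.BytesIO(payload), sink)
print(copied, sink.getvalue() == payload)
```

Both functions deliver the same bytes; the sketch only shows where the extra copy sits when input and output use separate buffers.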

Setup Test Environment

Following are the tuning parameters we used with WSO2 ESB 4.6.0. For the other ESBs, we used the same configurations as in the Round 6 performance benchmark.

WSO2 ESB 4.6.0 Configuration

The memory allocated for the ESB was increased by modifying /bin/wso2server.sh with the following setting:

Default setting for WSO2 ESB 4.6.0: -Xms256m -Xmx512m -XX:MaxPermSize=256m

New setting used for benchmarking: -Xms2048m -Xmx2048m -XX:MaxPermSize=1024m
In /repository/conf/passthru-http.properties we set the following parameters:

http.socket.timeout=120000

worker_pool_size_core=400

worker_pool_size_max=500

io_buffer_size=16384
In the /repository/conf/synapse.properties there are two new parameters that are defined to enable Streaming XPath. We uncommented these:

synapse.streaming.xpath.enabled=true

synapse.temp_data.chunk.size=3072
Disable access logs in /repository/conf/log4j.properties as follows:

log4j.logger.org.apache.synapse.transport.nhttp.access=WARN

log4j.logger.org.apache.synapse.transport.passthru.access=WARN
OS level optimizations - in /etc/sysctl.conf:

net.ipv4.ip_local_port_range = 1024 65535

net.ipv4.tcp_fin_timeout = 30

fs.file-max = 2097152

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_tw_reuse = 1

And in /etc/security/limits.conf:

* soft nofile 4096

* hard nofile 65535
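Before starting a run, it can be useful to confirm that the open-file limits from /etc/security/limits.conf are actually in effect for the process that launches the ESB. The following is a minimal sketch using Python's standard resource module (Unix only); the function names are hypothetical and the thresholds are the soft and hard nofile values listed above.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_benchmark_limits(soft, hard, want_soft=4096, want_hard=65535):
    """True when both limits are at least the limits.conf values above.

    RLIM_INFINITY (unlimited) always satisfies the requirement.
    """
    def ok(actual, wanted):
        return actual == resource.RLIM_INFINITY or actual >= wanted

    return ok(soft, want_soft) and ok(hard, want_hard)


soft, hard = nofile_limits()
print(f"nofile soft={soft} hard={hard} ok={meets_benchmark_limits(soft, hard)}")
```

The limits are inherited from the login session, so the check is only meaningful when run from the same shell that will start wso2server.sh.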

All the load tests were carried out using Amazon AWS EC2 with the instance type "High-CPU Extra Large" (c1.xlarge, 7GB). The configuration used is described below. The back-end echo service used for testing was hosted in a fully optimized Apache Tomcat 7.0.29.

AMI EC2 Information

vendor_id	GenuineIntel
cpu family	6
model	23
model name	Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
No. of Cores	8
RAM Size	7GB (High-CPU Extra Large)

Back End Echo Services Information

Server : apache-tomcat-7.0.2

Front End Information

Client : apache-ab

Testing Scenarios

The test setup includes a back-end service, clients, and ESB sitting in between and performing the following mediation scenarios. The back-end service is a minimal implementation that introduces minimal overhead.

DirectProxy
CBRProxy
CBRSOAPHeaderProxy
XSLTProxy
XSLTEnhancedProxy (uses the Fast XSLT mediator, written to work with the Passthrough Transport)
SecureProxy

For each scenario, we tested using different concurrency values and different message sizes. Sample message sizes ranged from 512 bytes to 100K bytes, with concurrency levels of 20, 40, 80, 160, 320, 640, 1280, and 2560 users.

Running the Load Test

Set Up WSO2 ESB 4.6.0

Download ESB 4.6.0.
Set up the ESB and OS as described above.
To start the ESB, execute "./wso2server.sh" from the "[WSO2ESB]/wso2esb-4.6.0/bin" directory.
Download benchmark-client-wso2.zip and extract it.
Download synapse_config.zip and extract it to [ESB_HOME]/repository/deployment/server (overwriting the existing synapse configuration directory).
Download esb_config.zip and extract it to [ESB_HOME]/repository/conf (overwriting the existing files).
Download service_war.zip, extract it, and place the WAR file in TOMCAT_HOME/webapps.
To run the load test, execute the load generator as follows from the "~/benchmark-client-wso2" directory:
- ubuntu@wso2:~/benchmark-client-wso2/loadtest.sh https://localhost:8280/services > wso2-4.6.0.txt
- Note: You may be required to alter the SOURCE_PATH=[HOME_BENCHMARK_DIRECTORY]/benchmark-client-wso2 in the loadtest.sh accordingly
Once the test is over, to export collected data in CSV format, run the following command:
- java -jar [HOME_BENCHMARK_CLIENT]/ab-to-csv.jar wso2-4.6.0.txt wso2-4.6.0.csv
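For readers who want to spot-check the raw output file before converting it, the throughput figure can be pulled out of ab's text report with a few lines of Python. This sketch is not the implementation of ab-to-csv.jar, which is not shown in this article; the function name is hypothetical and the sample text is a made-up fragment in ab's usual report format, not measured data.

```python
import re

# ab reports throughput on a line such as:
#   Requests per second:    1234.56 [#/sec] (mean)
RPS_PATTERN = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)


def requests_per_second(ab_output):
    """Return every 'Requests per second' value found in ab's text output."""
    return [float(value) for value in RPS_PATTERN.findall(ab_output)]


# Illustrative sample only; the numbers are invented for the example.
sample = """\
Concurrency Level:      20
Complete requests:      20000
Requests per second:    1234.56 [#/sec] (mean)
Time per request:       16.200 [ms] (mean)
"""

print(requests_per_second(sample))
```

A file that holds several concatenated ab reports, as the redirected loadtest.sh output would, yields one value per run in file order.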

Observations

The following table and graph show the results of the performance test. Note that this is similar to the graph presented in the earlier Round 6 article, except that it uses a bar graph instead of a line graph because the X axis is not continuous. This graph takes the average across all message sizes.

First, we ran WSO2 ESB 4.6.0 using the same setup and configurations explained in Round 6. The results of that run are listed in Appendix [1].

However, we observed that the data in this test was perhaps unreliable. We investigated this in detail and observed that in the published test, the high concurrency results were showing unexpectedly high values. For the Round 6 tests, there was a very low number of messages per client (n=10) for high concurrency. This does not allow the JVM time to warm up and also does not represent a long-running test.

Therefore, we chose to re-run the test using 200 messages per client with higher concurrencies, and then the anomalous data was no longer observed. We also re-ran the UltraESB in an EC2 image with 200 messages from each client for the higher concurrencies following the instructions given in the Round 6 article. Our observations for n=200 and high concurrency with UltraESB were much lower than those previously published. We gave the UltraESB team the benefit of the doubt, and we have published our lower numbers (with n=200 for high concurrency) against UltraESB’s existing published numbers (with n=10 for high concurrency). We also publish the existing numbers for Mule and Talend. We did not rerun the Mule or Talend tests using n=200. The results are shown in the following graph and table.

We would encourage future test runs to utilize n=200 for higher concurrencies to ensure reasonable results.
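The effect of the per-client message count on run length can be made concrete with a little arithmetic. The sketch below assumes that 1280 and 2560 are the "high" concurrency levels in question and uses a round, purely illustrative throughput of 5,000 requests per second to estimate how long each run would last; neither assumption is a measured value.

```python
def total_requests(concurrency, per_client):
    """Total requests in a run where every client sends per_client messages."""
    return concurrency * per_client


def run_seconds(total, requests_per_second):
    """Approximate run duration at a given sustained throughput."""
    return total / requests_per_second


ASSUMED_RPS = 5000  # illustrative round figure, not a measurement

for concurrency in (1280, 2560):
    for per_client in (10, 200):
        total = total_requests(concurrency, per_client)
        seconds = run_seconds(total, ASSUMED_RPS)
        print(f"c={concurrency} n={per_client}: "
              f"{total} requests, about {seconds:.1f} s")
```

At this assumed rate, a run with n=10 finishes within a few seconds, while n=200 runs for roughly a minute or more, which leaves far more time for the JVM to warm up.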

Higher number of requests yielded more accurate results

(Number of messages per client n=1000 up to 40 concurrency, n=100 for 80 to 320 concurrency and n=1000 for 1280 /2560 higher concurrency)

	Mule 3.3.0	Talend-SE-5.1.1	UltraESB 1.7.1-Enhanced	WSO2 ESB 4.5.1	WSO2 ESB 4.6.0
DirectProxy	3,375	3,315	4,839	3,311	5,278
CBRProxy	1,458	3,108	4,703	2,920	4,078
CBRSOAPHeaderProxy	2,262	3,185	5,063	3,270	4,634
CBRTransportHeaderProxy	2,225	3,751	5,523	3,849	5,998
XSLTProxy	2,225	2,333	3,387	1,856	1,800
XSLTEnhancedProxy (Fast XSLT mediator)	2,225	2,333	3,387	N/A	3,456
Security	458	534	603	529	566

The above results demonstrate that WSO2 ESB 4.6.0 is faster than ESB 4.5.1 in every scenario except XSLTProxy, and performs competitively with UltraESB across the scenarios. While the XSLTProxy remains significantly less performant than the equivalent UltraESB test, this is addressed by the inclusion in WSO2 ESB 4.6.0 of the new FastXSLT option, which provides performance on par with UltraESB.
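The relative change between the two WSO2 ESB versions can be computed directly from the table. The following sketch copies the ESB 4.5.1 and ESB 4.6.0 columns as published and treats higher values as better; XSLTEnhancedProxy is left out because no 4.5.1 figure exists for it. The dictionary and function names are illustrative only.

```python
# (ESB 4.5.1, ESB 4.6.0) values copied from the table above.
RESULTS = {
    "DirectProxy": (3311, 5278),
    "CBRProxy": (2920, 4078),
    "CBRSOAPHeaderProxy": (3270, 4634),
    "CBRTransportHeaderProxy": (3849, 5998),
    "XSLTProxy": (1856, 1800),
    "Security": (529, 566),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for scenario, (v451, v460) in RESULTS.items():
    print(f"{scenario}: {percent_change(v451, v460):+.1f}%")
```

The gain is largest for DirectProxy (about 59%) and CBRTransportHeaderProxy (about 56%), while XSLTProxy is the one scenario where 4.6.0 is slightly lower (about 3%).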

Average Latency

To measure the latency added by WSO2 ESB versions 4.6.0 and 4.5.1, we calculated the latency for a concurrency of 20 and a message size of 1K using the following formula: [Latency = Round Trip via ESB - Direct Backend Call]. The following table lists the results:

Latency for 1K message (ms)	ESB 4.6.0	ESB 4.5.1
DirectProxy	0.582	0.828
CBRProxy	0.742	0.979
CBRSOAPHeaderProxy	0.691	0.857
CBRTransportHeaderProxy	0.827	0.861
XSLTProxy	1.987	1.818
XSLTEnhancedProxy	1.345	N/A
SecureProxy	8.867	9.497

As indicated in the table and graph, WSO2 ESB 4.6.0 adds less than one millisecond of latency in the DirectProxy and the three CBR scenarios; only the XSLT and SecureProxy scenarios exceed that.
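The latency calculation can be sketched as follows, taking the added latency as the round trip through the ESB minus the direct back-end call, which is the direction that yields the positive values in the table. The ESB 4.6.0 column is copied from the table and assumed to be in milliseconds, consistent with the sub-millisecond remark above; the function name and the example timings passed to it are hypothetical.

```python
def added_latency(via_esb_ms, direct_ms):
    """Latency attributable to the ESB: round trip via ESB minus direct call."""
    return via_esb_ms - direct_ms


# ESB 4.6.0 column from the table above (assumed to be milliseconds).
LATENCY_460 = {
    "DirectProxy": 0.582,
    "CBRProxy": 0.742,
    "CBRSOAPHeaderProxy": 0.691,
    "CBRTransportHeaderProxy": 0.827,
    "XSLTProxy": 1.987,
    "XSLTEnhancedProxy": 1.345,
    "SecureProxy": 8.867,
}

sub_millisecond = [name for name, ms in LATENCY_460.items() if ms < 1.0]
print(f"{len(sub_millisecond)} of {len(LATENCY_460)} scenarios under 1 ms:")
print(", ".join(sub_millisecond))

# Hypothetical timings, only to show the direction of the subtraction.
print(f"{added_latency(via_esb_ms=2.5, direct_ms=2.0):.3f} ms")
```

Four of the seven ESB 4.6.0 scenarios fall under one millisecond by this measure.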

Google Spreadsheet for Complete Analysis

For complete analysis, you can view the Google spreadsheet of collected data.

WSO2 ESB 4.6.0 Memory Usage Analysis (After Long Run)

The following graphs illustrate the memory usage after 40 hours. As you can see, heap memory utilization is stable.

Conclusion

This article presents a performance study comparing WSO2 ESB 4.6.0, WSO2 ESB 4.5.1, Mule 3.3.0, Talend-SE 5.1.1, and UltraESB 1.7.1. The results indicate that WSO2 ESB 4.6.0 has significantly improved performance over previous versions and performs better than the other compared ESBs in most scenarios.

Notices

WSO2 and WSO2 ESB are trademarks of WSO2 Inc. UltraESB and AdroitLogic are trademarks of AdroitLogic Private Ltd. Mule ESB and MuleSoft are trademarks of MuleSoft, Inc. Talend and Talend ESB are trademarks of Talend, Inc. All other product and company names and marks mentioned are the property of their respective owners and are mentioned for identification purposes only.

Appendix

[1] Unrealistic data observed for lower number of requests

(Number of messages per client n=1000 up to 40 concurrency, n=100 for 80 to 640 concurrency and n=1000 for 1280/2560 higher concurrency)

	Mule 3.3.0	Talend-SE-5.1.1	UltraESB 1.7.1-Enhanced	WSO2 ESB 4.5.1	WSO2 ESB 4.6.0
DirectProxy	3,375	3,315	4,839	2,879	8,278
CBRProxy	1,458	3,108	4,703	2,694	7,765
CBRSOAPHeaderProxy	2,262	3,185	5,063	3,118	7,573
CBRTransportHeaderProxy	2,225	3,751	5,523	3,502	11,024
XSLTProxy	2,225	2,333	3,387	1,735	2,504
XSLTEnhancedProxy (Fast XSLT mediator)	2,225	2,333	3,387	N/A	5,473
Security	458	534	603	486	560
