[Blog Post] CEP performance: Processing 100k to millions of events per second using WSO2 Complex Event Processing (CEP) server
By wso2 wso2
- 5 Aug, 2013
With WSO2 CEP, you can use SQL-style queries to detect interesting patterns across many data streams. The standalone version of the CEP engine is called Siddhi, and that is what you should use if you need to embed a CEP engine within a Java program. WSO2 CEP, on the other hand, provides CEP query support running as a server, and you can send events to it using Thrift, Web Services, REST calls, JMS, and emails.
WSO2 CEP can handle a few hundred thousand events per second over the network and a few million events per second within the JVM. We have published those numbers before. In this post, I will try to put them all together and give some context to the different numbers.
In the following scenarios, each event includes multiple properties, and the queries match those events against the given conditions.
Same JVM events performance
Setup: We used an Intel(R) Xeon(R) X3440 @ 2.53GHz with 4 cores, 8M cache, and 8GB RAM, running Debian with the 2.6.32-5-amd64 kernel. We generated events from the same JVM.
Case 1: Simple filter (from StockTick[prize >6] return symbol, prize)
Case 2: Window (from StockTick[symbol=‘IBM’]#win.time(0.005) return symbol, avg(prize))
Case 3: Event patterns A->B (A followed by B).
From f=FraudWarningEvent ->
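To make the semantics of the first two cases concrete, here is a minimal plain-Python sketch of what a filter and a sliding time-window average compute. It is not Siddhi code and says nothing about how Siddhi implements these operators. The event layout (dicts with `symbol`, `prize`, and a `ts` timestamp), the function and class names, and the window unit are all assumptions made for illustration.

```python
from collections import deque


def simple_filter(events, threshold=6):
    """Case 1 semantics: keep events with prize > threshold, project (symbol, prize)."""
    return [(e["symbol"], e["prize"]) for e in events if e["prize"] > threshold]


class TimeWindowAvg:
    """Case 2 semantics: average prize of one symbol over a sliding time window."""

    def __init__(self, symbol, window):
        self.symbol = symbol
        self.window = window   # window length, in the same unit as e["ts"] (assumed)
        self.items = deque()   # (timestamp, prize) pairs currently inside the window
        self.total = 0.0       # running sum of the prizes in the window

    def on_event(self, e):
        if e["symbol"] != self.symbol:
            return None        # filtered out, nothing is emitted
        # Expire events that have fallen out of the window.
        while self.items and self.items[0][0] <= e["ts"] - self.window:
            _, old_prize = self.items.popleft()
            self.total -= old_prize
        self.items.append((e["ts"], e["prize"]))
        self.total += e["prize"]
        return (self.symbol, self.total / len(self.items))
```

Keeping a running sum means each event is added once and removed once, so the per-event cost stays constant on average regardless of the window size.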
Performance over the network
Setup: For running CEP, we used an Intel® Core™ i7-2630QM CPU @ 2.00GHz with 8 cores and 8GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel. For the three client nodes, we used Intel® Core™ i3-2350M CPUs @ 2.30GHz with 4 cores and 4GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel.
The following results are for a simple filter; we sent events over the network using Thrift.
Performance of a complex scenario
Finally, the following is the performance for the DEBS Grand Challenge. The challenge was to detect the following scenarios from the events generated from a real football game.
Usecase 1: Running analysis
The first usecase measures each player’s running speed and calculates how long he spent in each speed range. For example, the results will show that the player "Martin" was running fast from 27 minutes and 1 second into the game until 27 minutes and 35 seconds into the game.
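The running analysis can be sketched as classifying each speed sample into a range and merging consecutive samples that fall into the same range. The following Python sketch is only an illustration: the function name, the sample layout, and the range labels and bounds used in the usage note are hypothetical, since the post does not list the actual speed ranges.

```python
def speed_intervals(samples, ranges):
    """Merge consecutive samples that fall into the same speed range.

    samples: list of (ts_seconds, speed) ordered by time.
    ranges:  list of (label, lower_bound) sorted by ascending lower_bound
             (labels and bounds are hypothetical, not from the challenge).
    Returns a list of (label, start_ts, end_ts).
    """
    def classify(speed):
        label = ranges[0][0]
        for name, lower in ranges:
            if speed >= lower:
                label = name   # keep the highest range whose bound is reached
        return label

    intervals = []
    current, start, last_ts = None, None, None
    for ts, speed in samples:
        label = classify(speed)
        if label != current:
            if current is not None:
                # The previous range ends where the new one begins.
                intervals.append((current, start, ts))
            current, start = label, ts
        last_ts = ts
    if current is not None:
        intervals.append((current, start, last_ts))
    return intervals
```

For example, with the hypothetical ranges `[("standing", 0.0), ("trot", 1.0), ("fast", 5.0)]`, each player's samples become a list of intervals such as "fast from second 2 to second 4".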
Usecase 2 and 4: Ball possession and shots on goal
For the second usecase, we need to calculate how long each player controlled the ball (ball possession). A player controls the ball from the time he hits it until someone else hits it, the ball goes out of the ground, or the game is stopped. We identify a hit when the ball is within one meter of a player and its acceleration increases by more than 55 m/s².
Usecase 4 is to detect hits and emit events if the ball is heading toward the goal.
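The hit rule and the possession bookkeeping described above can be sketched in a few lines of Python. This is a simplified illustration, not the queries used for the challenge: the two thresholds come from the text, while the function names, the event tuples, and the `'stop'` event kind (standing for "ball out of the ground" or "game stopped") are assumptions.

```python
import math

HIT_DISTANCE_M = 1.0      # from the post: ball within one meter of a player
HIT_ACCEL_DELTA = 55.0    # from the post: acceleration rises by more than 55 m/s^2


def is_hit(ball_pos, player_pos, prev_accel, accel):
    """True when the ball is close to the player and its acceleration jumps."""
    close = math.dist(ball_pos, player_pos) <= HIT_DISTANCE_M
    return close and (accel - prev_accel) > HIT_ACCEL_DELTA


def possession_times(events):
    """Total possession time per player.

    events: ordered list of (ts, kind, player); kind is 'hit' or 'stop'
            (hypothetical layout; player is None for 'stop').
    """
    totals = {}
    owner, since = None, None
    for ts, kind, player in events:
        if owner is not None:
            # Any hit or stop closes the current possession interval.
            totals[owner] = totals.get(owner, 0.0) + (ts - since)
        if kind == "hit":
            owner, since = player, ts
        else:
            owner, since = None, None
    return totals
```

A repeated hit by the same player closes one interval and immediately opens another, so his total is the same as if the possession had never been interrupted.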
Usecase 3: Heatmap of activity
Usecase 3 divides the ground into a grid and calculates the time a player spends in each cell. However, this usecase needs updates once every second. We can solve the first part just as in the first usecase, but to make sure we get an event once a second, we had to couple it with a timer.
You can find more information at http://www.orgs.ttu.edu/debs2013/index.php?goto=cfchallengedetails , and you can find the queries in this blog post.
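One way to picture the heatmap logic is a per-player accumulator that credits elapsed time to the current grid cell, plus a timer callback that emits a snapshot. The Python sketch below is an assumption-laden illustration: the class and method names, the ground dimensions, and the grid size are hypothetical, and positions are assumed to lie between 0 and the ground width and height.

```python
def cell_of(x, y, width, height, cols, rows):
    """Map a position to a (row, col) grid cell; the far edges map to the last cell."""
    col = min(int(x / width * cols), cols - 1)
    row = min(int(y / height * rows), rows - 1)
    return (row, col)


class PlayerHeatmap:
    """Accumulates the time one player spends in each cell (hypothetical helper)."""

    def __init__(self, width, height, cols, rows):
        self.dims = (width, height, cols, rows)
        self.time_in_cell = {}   # (row, col) -> seconds spent there
        self.cell = None         # cell the player is currently in
        self.since = None        # timestamp up to which time has been credited

    def _credit(self, ts):
        # Attribute the time elapsed since the last update to the current cell.
        if self.cell is not None:
            spent = self.time_in_cell.get(self.cell, 0.0) + (ts - self.since)
            self.time_in_cell[self.cell] = spent
        self.since = ts

    def on_position(self, ts, x, y):
        self._credit(ts)
        self.cell = cell_of(x, y, *self.dims)

    def on_tick(self, ts):
        """Called by a timer once a second; returns a snapshot of the heatmap."""
        self._credit(ts)
        return dict(self.time_in_cell)
```

Because `on_tick` also credits time, a snapshot is produced every second even when no new position event arrives in between.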
Setup: A VM with 4 cores (@2.8 GHz), 4 GB RAM, an SSD, and 1GB Ethernet; we replayed events from the same JVM.