User level familiarity with WSO2 ESB, WSO2 BAM, monitoring the ESB with BAM and message formats used in ESB (XML and SOAP) will be useful. Refer to this article to gain some knowledge about monitoring WSO2 ESB with WSO2 BAM.
Activity inside a WSO2 ESB can be monitored by WSO2 BAM through two main Data Agents packaged with the ESB.
- Mediation Statistics Data Agent
- BAM Mediator
Mediation Statistics Data Agent is used to monitor overall ESB-related statistics such as message count and fault message count passed through Proxy Services, Sequences and Endpoints. This data agent is not suitable for monitoring individual mediation sequences separately.
BAM Mediator can be included in a mediation sequence, where it intercepts the data passing through it and publishes that data to BAM. As this mediator can be placed at multiple points within a mediation sequence, and in several sequences, this Data Agent is used for monitoring activity in each sequence. To get started with BAM Mediator based web service monitoring via ESB, read this article.
Measurements of data flow rates that take place inside the WSO2 ESB are a very important set of parameters in many fields.
WSO2 Stratos provides a billing and metering feature out of the box that monitors the bandwidth consumption of each tenant in each payment period (e.g. on a monthly basis), but it does not monitor messages at host level, service level, or operation level. These parameters help the tenant figure out the most expensive hosts, services, and operations. The tenant may also be interested in knowing the days and times at which most clients use the system, which requires time-level, drilled-down data monitored from BAM. Such data also shows the tenant whether high platform usage is caused by too many user requests or by large messages. Eventually, the tenant gets a descriptive set of analyzed information as feedback that helps fine-tune the business.
One of the most common problems system deployers face is predicting the traffic passing through each message flow. Most commonly, user requests, response messages, and other messages are responsible for increasing the processing and memory requirements of the system. Analyzing the maximum possible message flow rates is also very important for achieving system reliability. WSO2 ESB is a useful solution when the system needs scalability and load balancing across small hardware systems. In such cases, load and throughput monitoring in testing environments helps fine-tune the system and identify bottlenecks.
Real-time throttling is a basic security and reliability requirement for every system. BAM is not recommended for hard real-time throttling because it is inherently designed for batch-wise analysis, which unavoidably introduces significant latency in delivering results. WSO2 CEP is recommended for highly responsive, real-time throttling based on analytics. However, BAM can be scaled so that the latency is minimized, which enables it to behave as a long-term throttling monitor. For example, system overload caused by an increased number of subscribers during a certain period of a month can be identified, and actions can be taken based on the analyzed data.
Although ESB load monitoring is possible with BAM using BAM Mediator, there are some drawbacks.
- As discussed above, WSO2 BAM is not a real-time tool, so results are available only after some latency.
- BAM Mediator cannot publish all the data mediated in the ESB, such as Message Context Properties and other metadata. The measured data flow rate will therefore be much lower than the real flow rate.
- BAM Mediator does not log direct information about the CPU power used for processing messages, and the size of a message does not accurately represent the processing power used.
- Introducing the BAM Mediator increases CPU usage and memory consumption beyond their normal levels. The Data Agent also consumes network bandwidth, so the message flow rate inside the ESB drops below the value it would have without the Data Agent (BAM Mediator).
From here onwards, let's discuss a scenario of monitoring the ESB data flow rate in a mediation sequence when a sample web service is proxied and data is published by a BAM Mediator. For ease of explanation, we use a pseudo data generator that publishes a sample set of messages similar to the messages received from an ESB. For a real deployment that monitors a web service through an ESB and publishes events to BAM, refer to the article recommended previously.
For this exercise, a binary distribution of WSO2 BAM version 2.3.0 can be downloaded from here. The Activity Monitoring sample packed with the distribution is used as the data generator for the scenario, as it generates events in a format similar to those actually generated by a BAM Mediator. Hive scripts are provided step by step, and finally some gadgets are generated in the BAM dashboard. The demonstration uses a *nix environment, so note that all the console commands are *nix commands.
Now, let's start with the BAM distribution. Unzip the archive, wso2bam-2.3.0.zip, and start the server.
$ unzip wso2bam-2.3.0.zip
$ cd wso2bam-2.3.0/bin
$ sh wso2server.sh
After the server has started successfully, open a new terminal, go to the home directory of the BAM pack, and run the Activity Monitoring sample. Note that Apache Ant must be installed on your local machine to run the sample.
$ cd <BAM_HOME>
$ cd samples/activity-monitoring
$ ant
Wait until all 100 events are published and the BUILD SUCCESSFUL message is printed. Now, 100 sample events similar to events from the ESB have been published into the Cassandra database in BAM. The events can be seen by logging into the BAM Management Console and going to Main -> Cassandra Explorer -> Explore Cluster. Then click on org_wso2_bam_activity_monitoring in the EVENT_KS keyspace, and click on the View More button to see the content of each event.
Now it is time to analyze the data and create the summary table of received data loads. In this demonstration, we aggregate the load at operation level. That means the total load is calculated for each operation of each service in each host. Time-level aggregation is not explained here as it would increase the complexity. If you are interested, you can get an idea of time-level drill-down by studying the Hive scripts in the Service Statistics and Mediation Statistics toolboxes.
Now, let's deploy the Hive script given below in BAM as described in this document.
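Although time-level aggregation is left out of the Hive script in this article, the idea can be illustrated outside BAM. The following is a minimal Python sketch, not the toolbox implementation: it assumes each event carries a timestamp in epoch milliseconds and a pre-computed size in characters, and the dictionary keys (host, sent_timestamp, size) are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_load(events):
    """Sum message sizes per (host, UTC hour) bucket.

    Each event is a dict with hypothetical keys:
      host            - host name or address
      sent_timestamp  - epoch milliseconds (an assumption about the unit)
      size            - message size in characters
    """
    buckets = defaultdict(int)
    for event in events:
        ts = datetime.fromtimestamp(event["sent_timestamp"] / 1000, tz=timezone.utc)
        hour = ts.strftime("%Y-%m-%d %H:00")  # truncate the timestamp to the hour
        buckets[(event["host"], hour)] += event["size"]
    return dict(buckets)


sample_events = [
    {"host": "10.0.0.1", "sent_timestamp": 0, "size": 120},          # 00:00 UTC
    {"host": "10.0.0.1", "sent_timestamp": 1_800_000, "size": 80},   # 00:30 UTC
    {"host": "10.0.0.1", "sent_timestamp": 3_600_000, "size": 50},   # 01:00 UTC
]

hourly = hourly_load(sample_events)
print(hourly)
```

Bucketing by hour keeps the summary small; a finer or coarser bucket (minute, day, month) only changes the format string.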
CREATE EXTERNAL TABLE IF NOT EXISTS ActivityDataTable
(messageID STRING, sentTimestamp BIGINT, activityID STRING, version STRING, soapHeader STRING,
soapBody STRING, host STRING, serviceName STRING, operationName STRING)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES (
"wso2.carbon.datasource.name" = "WSO2BAM_CASSANDRA_DATASOURCE" ,
"cassandra.cf.name" = "org_wso2_bam_activity_monitoring" ,
"cassandra.columns.mapping" =
":key, payload_timestamp, correlation_bam_activity_id, Version, payload_soap_header, payload_soap_body, meta_host, payload_service_name, payload_operation_name" );

CREATE EXTERNAL TABLE IF NOT EXISTS CategoryTable(host STRING, service_name STRING,
operation_name STRING, totalLoad BIGINT, totalMessages BIGINT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
'wso2.carbon.datasource.name' = 'WSO2BAM_DATASOURCE',
'hive.jdbc.update.on.duplicate' = 'true',
'hive.jdbc.primary.key.fields' = 'host,service_name,operation_name',
'hive.jdbc.table.create.query' =
'CREATE TABLE CATEGORY ( host VARCHAR(100) NOT NULL, service_name VARCHAR(150),
operation_name VARCHAR(150), totalLoad BIGINT, totalMessages BIGINT )' );

INSERT OVERWRITE TABLE CategoryTable SELECT host, serviceName, operationName,
SUM(LENGTH(soapHeader) + LENGTH(soapBody)), COUNT(DISTINCT messageID)
FROM ActivityDataTable GROUP BY host, serviceName, operationName;
Note that, in this example, we assume that all the data flowing through the mediation sequence consists of the message header and the message body. Properties and other data are neglected for simplicity. If required, they too can be extracted with the help of other mediators such as the Property mediator, as explained in the referred OT article. In this scenario, the number of characters is used as the measure of message size, instead of bits or bytes, for simplicity. Hence, the data size passed per message is taken to be the sum of the number of characters in the synapse message header and the synapse message body. To get started with Hive queries in BAM, read this document.
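To make the size measure concrete, the following Python sketch mimics the aggregation described above on a few in-memory events. It is only an illustration under stated assumptions, not how BAM executes the Hive script: the event dictionaries and their keys are hypothetical, and the load is the character count of header plus body, grouped by host, service, and operation.

```python
from collections import defaultdict


def summarize(events):
    """Return {(host, service, operation): (total_load, distinct_message_count)}.

    total_load is the summed character count of SOAP header and body;
    the message count uses distinct message IDs, as in the article's query.
    """
    load = defaultdict(int)
    message_ids = defaultdict(set)
    for e in events:
        key = (e["host"], e["service_name"], e["operation_name"])
        load[key] += len(e["soap_header"]) + len(e["soap_body"])
        message_ids[key].add(e["message_id"])
    return {key: (load[key], len(message_ids[key])) for key in load}


# Hypothetical sample events, for illustration only.
events = [
    {"message_id": "m1", "host": "h1", "service_name": "S", "operation_name": "op",
     "soap_header": "<h/>", "soap_body": "<b>é</b>"},
    {"message_id": "m2", "host": "h1", "service_name": "S", "operation_name": "op",
     "soap_header": "<h/>", "soap_body": "<b/>"},
]

summary = summarize(events)
print(summary)

# Characters and bytes differ for non-ASCII content, so a character count
# can underestimate the size on the wire.
body = events[0]["soap_body"]
chars, utf8_bytes = len(body), len(body.encode("utf-8"))
print(chars, utf8_bytes)
```

The last two lines show why counting characters is a simplification: a body containing non-ASCII text occupies more bytes in UTF-8 than it has characters.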
Now run or schedule the Hive script. The summarized data will then be available in the H2 embedded database in BAM. The following screenshot shows the data in the H2 client.
Note that the total load (TOTALLOAD) and the total count (TOTALMESSAGES) are computed for each tuple, <HOST, SERVICE_NAME, OPERATION_NAME>. If the requirement is to get the aggregated values for HOST only, the service_name and operation_name fields can be removed from the Hive table, CategoryTable, and the SQL table, CATEGORY. If only the services in each host need to be aggregated, operation_name can be removed. These optimizations reduce the size of the summary table (i.e. the RDBMS table) and also reduce the processing needed by SQL queries on the summary table.
The next step is to visualize the data on the BAM dashboard using the summary table, CATEGORY. This document provides step-by-step instructions to generate the gadget. For this visualization, we use only bar charts, and we generate one bar chart for each selected scenario. Each of these scenarios has its own importance. Three such use cases and their uses are listed below (each image was taken from the H2 client).
Total data load received for each host - used to identify highly and lowly loaded servers for load balancing purposes
SELECT HOST, SUM(TOTALLOAD) AS TOT_MESSAGE_LOAD FROM CATEGORY GROUP BY HOST
Number of messages received for each service - used to identify popular and useful services for sales perspectives
SELECT HOST, SERVICE_NAME, SUM(TOTALMESSAGES) AS TOT_MESSAGE_COUNT
FROM CATEGORY GROUP BY HOST, SERVICE_NAME
Average data load received for each operation - used to identify heavily loaded operations and analyze overloading risk of that operation
SELECT HOST, SERVICE_NAME, OPERATION_NAME, TOTALLOAD/TOTALMESSAGES AS AVG_MESSAGE_LOAD
FROM CATEGORY ORDER BY AVG_MESSAGE_LOAD DESC
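The three queries can be tried without a BAM installation. The sketch below runs them against an in-memory SQLite table from Python, each with an explicit FROM CATEGORY clause. SQLite stands in for the H2 database here, the rows are made-up sample values, and an ORDER BY was added to the first two queries only to make the printed output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CATEGORY (host TEXT NOT NULL, service_name TEXT, "
    "operation_name TEXT, totalLoad INTEGER, totalMessages INTEGER)"
)
# Hypothetical summary rows, for illustration only.
conn.executemany(
    "INSERT INTO CATEGORY VALUES (?, ?, ?, ?, ?)",
    [
        ("h1", "OrderService", "placeOrder", 9000, 30),
        ("h1", "OrderService", "cancelOrder", 1000, 10),
        ("h2", "StockService", "getQuote", 2500, 50),
    ],
)

# Use case 1: total data load per host.
host_load = conn.execute(
    "SELECT HOST, SUM(TOTALLOAD) AS TOT_MESSAGE_LOAD "
    "FROM CATEGORY GROUP BY HOST ORDER BY HOST"
).fetchall()

# Use case 2: number of messages per service.
service_count = conn.execute(
    "SELECT HOST, SERVICE_NAME, SUM(TOTALMESSAGES) AS TOT_MESSAGE_COUNT "
    "FROM CATEGORY GROUP BY HOST, SERVICE_NAME ORDER BY HOST, SERVICE_NAME"
).fetchall()

# Use case 3: average load per operation (integer division in SQLite).
avg_load = conn.execute(
    "SELECT HOST, SERVICE_NAME, OPERATION_NAME, "
    "TOTALLOAD/TOTALMESSAGES AS AVG_MESSAGE_LOAD "
    "FROM CATEGORY ORDER BY AVG_MESSAGE_LOAD DESC"
).fetchall()

print(host_load)
print(service_count)
print(avg_load)
```

Because both columns are integers, the average in the third query is truncated in SQLite; if fractional averages matter, one of the operands can be cast to a decimal type before dividing.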
Now start following the steps in the gadget generation tool. Steps are only shown for the first use case.
JDBC URL : jdbc:h2:repository/database/samples/BAM_STATS_DB;AUTO_SERVER=TRUE
Driver Class Name : org.h2.Driver
User Name : wso2carbon
Password : wso2carbon
Enter the SQL statement given for the first use case and preview the result.
Select Bar Graph, choose the axes appropriately, and give them suitable labels.
Enter a suitable title and filename and generate the gadget XML location. Refresh Rate refers to the rate at which the gadget is refreshed.
Then copy and paste it as a new gadget into the dashboard.
Finally, the gadget will be shown in the dashboard.
Maninda Edirisooriya, Software Engineer, WSO2.