2013/06/26
26 Jun, 2013

Monitor Your Key Performance Indicators using WSO2 Business Activity Monitor

  • Kasun Gunathilake
  • Senior Software Engineer - WSO2

Applies To

WSO2 BAM Version 2.0.0 and above

Table of Content

  1. Introduction
  2. BAM architecture
  3. Use case
  • Collect information
  • BAM data receiver
  • Cassandra explorer
  • Data Analysis
  • Writing a hive script for analyzing captured data
  • Visualization
  • Introduction

    Key performance indicators are essential for your organization to measure the performance or success in terms of progress of its goals. Also It is crucial for you to choose the right KPIs with better understanding of what is important to the organization to measure and improve the business productivity of it. WSO2 Business activity monitor has an extremely flexible framework that allows to model your own key performance indicators to suit your monitoring requirement. In this article I am going to explain how you can build your own KPI using WSO2 Business Activity Monitor.  

    BAM architecture

    Following I have shown the BAM architecture which help us to achieve this greater flexibility.

    BAM architecture

    Data that needs to be monitored goes through the above modules and the data flow is as follows.

    1. The system that you are going to monitor have the BAM data agents and these agents will capture the required data and send it to BAM. In addition to the data agents users can send data to BAM through the REST API.
    2. Data receiver in BAM receive this data and store them in Cassandra data store.
    3. Analyzer engine will run the analytics written in hive language according to the monitoring requirement, the summarize data can be persist into RDBMS data store.
    4. In the presentation layer summarized data will fetch from RDBMS and show it in the dashboard/reports.

    Use case

    I am going to illustrate the capability of defining KPIs by using an use-case. Let's think You have hosted a news website and You need to monitor the traffic comes to your website and analyze it to understand how you can enhance your website and marketing strategies to increase the profit.

    KPIs for this use case

    1. No of unique visitors per day
    2. How many visitors does it take for your website to achieve its goals? Your profit will proportional to the no of visitors to your website. If this shows some growth day by day, which means you are making a progress.

    3. Most popular news category
    4. This category will depends on several factors (time, incidents, gossip, events etc.). Hence analyzing your traffic and identifying the current in demand news categories are essential for you to promote your contents more in those sources and increase the profits.

    5. Number of peoples from different locations
    6. This will helpful for identifying the regions that have low no of visitors and take some actions to improve the traffic from these areas.

    Now I am going to illustrate step by step on how to use BAM to define above KPIs.

    Collect information

    Website traffic information can be captured and published into BAM using following ways.

    1. Using Java API
    2. Using client generated by thrift IDL
    3. Using REST API

    BAM data-agent (Java API)

    BAM provides high performance, low latency, load-balancing (between multiple receivers ), non-blocking and multi-threaded API for sending large amount of business events over various transport (TCP,Http) using Apache Thrift. This API has been provided as a Java SDK and you can use it easily for publishing captured data from your Java based system to BAM for analysis. In addition to that these data-agents are compatible with WSO2 CEP that can be used for real time analysis.

    Custom data-agent

    Following I have written a simple asynchronous data-agent for publishing web traffic information to BAM. This will help you to understand the agent API usage. For more information about writing a data publisher for BAM, you can refer this article [1]

    import org.apache.log4j.Logger;
    import org.wso2.carbon.databridge.agent.thrift.Agent;
    import org.wso2.carbon.databridge.agent.thrift.AsyncDataPublisher;
    import org.wso2.carbon.databridge.agent.thrift.conf.AgentConfiguration;
    import org.wso2.carbon.databridge.agent.thrift.exception.AgentException;
    import org.wso2.carbon.databridge.commons.Event;
    
    public class DataAgent {
        private static Logger logger = Logger.getLogger(DataAgent.class);
        public static final String ONLINE_NEWS_STATS_DATA_STREAM = "online_news_stats";
        public static final String VERSION = "1.0.0";
    
        public static final String DATA =
                        "Kasun,Colombo,Sri Lanka vs Australia 3rd ODI,Sports,1366433587\n" +
                        "Amal,Kaluthara,Businessman killed in Expressway accident,Accidents,1366533595\n" +
                        "Kamal,Colombo,Navy intelligence investigating boat escape,Military,1366633695\n" +
                        "Kalum,Mathara,No leadership change - JVP,Politics,1366433520\n" +
                        "Nuwan,Galle,Marginal improvement at this year’s O/L exam results,Education,1366533987\n"+
                        "Sampath,Mathara,Marginal improvement at this year’s O/L exam results,Education,1366433987\n"+
                        "Chamath,Mathara,Attempts to replace Minister Sirisena: UNP,Politics,1366634587\n" +
                        "Prabath,Colombo,WikiLeaks; LTTE could have threatened Karunanidhi,War,1366535587\n" +
                        "Amila,Kandy,Private bus owners to launch one-day strike,Transport,1366732587\n" +
                        "Budhdhika,Colombo,ICC introduces new 'No ball' playing condition,Sports,1366731587\n" +
                        "Rangana,Nuwara Eliya,Annual horse racing event in Nuwara Eliya,Sports,1366534787\n" +
                        "Gamini,Colombo,O/L 2012 results released,Education,1366434987\n" +
                        "Tharindu,Colombo,O/L best results,Education,1366334087\n" +
                        "Janaka,Kaluthara,O/L best results,Education,1366839787\n" +
                        "Anuranga,Jaffna,Sri Lanka vs Australia 3rd ODI,Sports,1366336737\n" +
                        "Kasun,Colombo,Marginal improvement at this year’s O/L exam results,Education,1366338788\n"+
                        "Ranga,Kandy,Private bus owners to launch one-day strike,Transport,1366832727\n" +
                        "Denis,Kaluthara,ICC introduces new 'No ball' playing condition,Sports,1366534117\n" +
                        "Harsha,Galle,No leadership change - JVP,Politics,1366533333\n";
    
    
        public static void main(String[] args) {
            AgentConfiguration agentConfiguration = new AgentConfiguration();
            System.setProperty("javax.net.ssl.trustStore", "/opt/wso2bam-2.2.0/repository/resources/security/client-truststore.jks");
            System.setProperty("javax.net.ssl.trustStorePassword", "wso2carbon");
            Agent agent = new Agent(agentConfiguration);
    
            //Using Asynchronous data publisher
            AsyncDataPublisher asyncDataPublisher = new AsyncDataPublisher("tcp://127.0.0.1:7611", "admin", "admin", agent);
            String streamDefinition = "{" +
                    "  'name':'" + ONLINE_NEWS_STATS_DATA_STREAM + "'," +
                    "  'version':'" + VERSION + "'," +
                    "  'nickName': 'News stats'," +
                    "  'description': 'Stats of the news web site'," +
                    "  'metaData':[" +
                    "          {'name':'publisherIP','type':'STRING'}" +
                    "  ]," +
                    "  'payloadData':[" +
                    "          {'name':'userName','type':'STRING'}," +
                    "          {'name':'region','type':'STRING'}," +
                    "          {'name':'news','type':'STRING'}," +
                    "          {'name':'tag','type':'STRING'}," +
                    "          {'name':'timestamp','type':'LONG'}" +
                    "  ]" +
                    "}";
            asyncDataPublisher.addStreamDefinition(streamDefinition, ONLINE_NEWS_STATS_DATA_STREAM, VERSION);
            publishEvents(asyncDataPublisher);
    
        }
    
        private static void publishEvents(AsyncDataPublisher dataPublisher) {
            String[] dataRow = DATA.split("\n");
            for (String row : dataRow) {
                String[] data = row.split(",");
                Object[] payload = new Object[]{data[0], data[1], data[2],
                        data[3], Long.parseLong(data[4])};
                Event event = eventObject(null, new Object[]{"10.100.3.173"}, payload);
                try {
                    dataPublisher.publish(ONLINE_NEWS_STATS_DATA_STREAM, VERSION, event);
                } catch (AgentException e) {
                    logger.error("Failed to publish event", e);
                }
            }
        }
    
        private static Event eventObject(Object[] correlationData, Object[] metaData, Object[] payLoadData) {
            Event event = new Event();
            event.setCorrelationData(correlationData);
            event.setMetaData(metaData);
            event.setPayloadData(payLoadData);
            return event;
        }
    }
    

    Non Java Data-agent

    Thrift IDL can be used to generate thrift clients from different languages to publish data. In this case asynchronous data publishing implementations will be to developed by the developers the way similar to how it has been implemented in Java API.

    Languages are supported by thrift

    1. C++
    2. C#
    3. Cocoa
    4. D
    5. Delphi
    6. Erlang
    7. Haskell
    8. Java
    9. OCaml
    10. Perl
    11. PHP
    12. Python
    13. Ruby
    14. Smalltalk

    REST API

    You can capture the website traffic information from your system and use BAM REST API for publishing those data to BAM via Http transport.

    BAM data receiver

    Data send from above data agents will receive by the data receiver which use thrift and internal optimization techniques to achieve very high throughput. Those received data will directly write to Cassandra which is high performance and scalable big data storage. It persists events into a column family with name equal to the stream name. Also BAM data receiver can be used share the events with WSO2 CEP for real time analysis.

    Cassandra explorer

    It provides graphical user interface for viewing the data in a Cassandra column family.

    Now let's see the data published from our sample client.

    Go to Home → Manage → Cassandra Explorer → Connect to Cluster

    Type the connection details as below.

    Cassandra Explorer Connect

    You can see all the keyspaces and column families are listed in Cassandra explorer ui.

    KeySpaces UI

    Then go to online_news_stats in the EVENT_KS.

    Column Family

    You can view the data by going to the column family.

    Rows

    After successfully publish the data we can do the analysis.

    Data Analysis

    You can run data analytics on captured data by using the BAM analytics framework which is powered by Apache Hadoop for scaling out data processing operations on a large number of data processing nodes, in order to handle large data volumes. WSO2 BAM provides an easy way to do the map/reduce operation on Hadoop by using Apache Hive that provides a simple way to write query and managing large datasets residing in distributed storage using a SQL-like language called HiveQL.

    KPIs

    Following I have list down the KPIs that we are going to analyze.

    1. No of unique visitors per day
    2. Most popular news category
    3. No of peoples from different regions

    Writing a hive script for analyzing captured data

    This is the Hive script to analyze the above KPIs.

    CREATE EXTERNAL TABLE IF NOT EXISTS OnlineNewsStats (key STRING, userName STRING,region STRING, 
    news STRING, tag STRING, payload_timestamp BIGINT) STORED BY 
    'org.apache.hadoop.hive.cassandra.CassandraStorageHandler' WITH SERDEPROPERTIES ( "cassandra.host" = "127.0.0.1", 
    "cassandra.port" = "9160","cassandra.ks.name" = "EVENT_KS", 
    "cassandra.ks.username" = "admin","cassandra.ks.password" = "admin", 
    "cassandra.cf.name" = "online_news_stats", 
    "cassandra.columns.mapping" = ":key,payload_userName,payload_region,payload_news,payload_tag,payload_timestamp" ); 
    
    CREATE EXTERNAL TABLE IF NOT EXISTS UniqueVisitorsPerDay(count INT, day STRING) 
    STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES ( 
    'wso2.carbon.datasource.name'='WSO2BAM_DATASOURCE', 
    'hive.jdbc.update.on.duplicate' = 'true', 
    'hive.jdbc.primary.key.fields' = 'day', 
    'hive.jdbc.table.create.query' = 'CREATE TABLE UNIQUE_VISITORS_PER_DAY ( count INT, day VARCHAR(100))' ); 
    
    insert overwrite table UniqueVisitorsPerDay select count ( distinct username), from_unixtime(payload_timestamp,'yyyy-MM-dd') as day from OnlineNewsStats group by from_unixtime(payload_timestamp,'yyyy-MM-dd' ); 
    
    CREATE EXTERNAL TABLE IF NOT EXISTS PopularNewsCategory(count INT, category STRING) 
    STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES ( 
    'wso2.carbon.datasource.name'='WSO2BAM_DATASOURCE', 
    'hive.jdbc.update.on.duplicate' = 'true', 
    'hive.jdbc.primary.key.fields' = 'category', 
    'hive.jdbc.table.create.query' = 'CREATE TABLE POPULAR_NEWS_CATEGORY ( count INT, category VARCHAR(100))' ); 
    
    insert overwrite table PopularNewsCategory select count(tag), tag from OnlineNewsStats group by tag; 
    
    CREATE EXTERNAL TABLE IF NOT EXISTS PeoplesFromDifferentLocations(count INT, region STRING) 
    STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler' TBLPROPERTIES ( 
    'wso2.carbon.datasource.name'='WSO2BAM_DATASOURCE', 
    'hive.jdbc.update.on.duplicate' = 'true', 
    'hive.jdbc.primary.key.fields' = 'region', 
    'hive.jdbc.table.create.query' = 'CREATE TABLE PEOPLES_FROM_DIFFERENT_LOCATIONS ( count INT, region VARCHAR(100))' ); 
    
    insert overwrite table PeoplesFromDifferentLocations select count(region),region from OnlineNewsStats group by region;
    

    RDBMS data storage can be configured from master-datasources.xml in WSO2BAM/repository/conf/datasources/ directory. You need to change the following section

    <datasource> 
               <name>WSO2BAM_DATASOURCE</name> 
               <description>The datasource used for analyzer data</description> 
               <definition type="RDBMS"> 
                   <configuration> 
                       <url>jdbc:h2:repository/database/samples/BAM_STATS_DB;AUTO_SERVER=TRUE</url> 
                       <username>wso2carbon</username> 
                       <password>wso2carbon</password> 
                       <driverClassName>org.h2.Driver</driverClassName> 
                       <maxActive>50</maxActive> 
                       <maxWait>60000</maxWait> 
                       <testOnBorrow>true</testOnBorrow> 
                       <validationQuery>SELECT 1</validationQuery> 
                       <validationInterval>30000</validationInterval> 
                   </configuration> 
               </definition> 
           </datasource>
    

    From BAM 2.3.0 onwards you can use Cassandra datasource in hive script and it can be configured from master-datasources.xml.

    Once you run the above script hive read data from the Cassandra storage and summarize it, then the summarized data will persist into RDBMS storage to visualize via dashboard.

    Visualization

    After you analyze and summarize data according to your KPIs, you need to visualize your kpis. Since those summarized data in the RDBMS storage you can plug third party visualization tools for the visualization. Here you can find an article about using different reporting frameworks with WSO2 Business Activity Monitor [2]

    Also you can use WSO2 Jaggery framework[3] or WSO2 BAM gadget generation tool to visualize your KPIs.

    Here I have used gadget generation tool for generating and deploying gadgets in dashboard. You can find information on how to use it from here [4]

    Dashboard

    Dashboard

    Number of Visitors Per Day

    Number of Visitors Per Day

    Popular News Category

    Popular News Category

    Visitors Distribution

    Visitors Distribution

    References

    1. https://wso2.org/library/articles/2012/07/creating-custom-agents-publish-events-bamcep
    2. https://wso2.org/library/articles/2012/09/using-different-reporting-frameworks-wso2-business-activity-monitor
    3. https://jaggeryjs.org/
    4. https://docs.wso2.org/wiki/display/BAM220/Gadget+Generation+Tool

    Author

    Kasun Weranga Gunathilake, Senior Software Engineer, WSO2 Inc.

     

    About Author

    • Kasun Gunathilake
    • Senior Software Engineer
    • WSO2 Inc