2013/08/19

How to Use WSO2 API Manager and Business Activity Monitor Together to Analyze API Usage Stats

  • Sanjeewa Malalgoda
  • Director - Engineering | Architect at WSO2

Applies to

WSO2 API Manager Version 1.3.0 and above
WSO2 BAM Version 2.1.0 and above

Table of contents

  1. Introduction
  2. Architecture
  3. How to add custom publisher
  4. How to configure BAM2 and API Manager
  5. How to add custom hive script to summarize data
  6. Add custom dashboard using gadget tool
  7. Conclusion

Introduction

WSO2 API Manager is a complete solution for publishing APIs, creating and managing a developer community and for scalable routing of API traffic. It leverages proven, production-ready, integration, security and governance components from the WSO2 Enterprise Service Bus, WSO2 Identity Server, and WSO2 Governance Registry. In addition, as it is also powered by the WSO2 Business Activity Monitor, the WSO2 API Manager is ready for massively scalable deployment immediately. In this post, we briefly describe how we can monitor API usage and generate customized reports and views.

Architecture

At the API Manager gateway, we collect all the data needed to track API invocations. The data publisher agent in the API gateway then publishes this data to the configured server. Both the data publisher and the target server are configurable. All API usage-related configuration lives in the usage tracking section of the api-manager.xml file, available under the /repository/conf folder.

<APIUsageTracking> <APIManagerDist>

First, let's see how usage tracking works. All API traffic passes through the API gateway, where we have deployed a listener that observes every API invocation. Whenever an API is invoked, we collect the necessary information from the message and create a usage data object. Generally, this data has the following (JSON) format.

{
  'name':'org_wso2_apimgt_statistics_request',
  'version':'1.0.0',
  'nickName': 'API Manager Request Data',
  'description': 'Request Data',
  'metaData':[
    {'name':'clientType','type':'STRING'}
  ],
  'payloadData':[
    {'name':'consumerKey','type':'STRING'},
    {'name':'context','type':'STRING'},
    {'name':'api_version','type':'STRING'},
    {'name':'api','type':'STRING'},
    {'name':'resource','type':'STRING'},
    {'name':'method','type':'STRING'},
    {'name':'version','type':'STRING'},
    {'name':'request','type':'INT'},
    {'name':'requestTime','type':'LONG'},
    {'name':'userId','type':'STRING'},
    {'name':'hostName','type':'STRING'}
  ]
}
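To make the stream definition more concrete, here is a minimal Python sketch that checks whether a request event carries every payload field with a matching type. The field names and types are taken from the definition above; the validate_event helper, the type mapping, and all sample values are hypothetical and are not part of API Manager or BAM.

```python
# Payload fields and types copied from the stream definition above.
PAYLOAD_FIELDS = [
    ("consumerKey", "STRING"),
    ("context", "STRING"),
    ("api_version", "STRING"),
    ("api", "STRING"),
    ("resource", "STRING"),
    ("method", "STRING"),
    ("version", "STRING"),
    ("request", "INT"),
    ("requestTime", "LONG"),
    ("userId", "STRING"),
    ("hostName", "STRING"),
]

# Assumed mapping from stream types to Python types (INT and LONG both map to int).
PYTHON_TYPES = {"STRING": str, "INT": int, "LONG": int}


def validate_event(event):
    """Return a list of problems; an empty list means the event matches the definition."""
    problems = []
    for name, stream_type in PAYLOAD_FIELDS:
        if name not in event:
            problems.append("missing field: " + name)
        elif not isinstance(event[name], PYTHON_TYPES[stream_type]):
            problems.append("wrong type for field: " + name)
    return problems


# Hypothetical sample event; every value here is made up for illustration.
sample_event = {
    "consumerKey": "abc123",
    "context": "/weather",
    "api_version": "WeatherAPI:v1.0.0",
    "api": "WeatherAPI",
    "resource": "/current",
    "method": "GET",
    "version": "1.0.0",
    "request": 1,
    "requestTime": 1376900000000,
    "userId": "alice",
    "hostName": "gateway1.example.com",
}

print(validate_event(sample_event))
```

A check like this can be useful when writing a custom publisher, since an event that does not match the stream definition is unlikely to be summarized correctly later on.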

Once we generate this data, we publish it to BAM2. A Cassandra server runs inside the BAM2 server, so when the BAM2 event receiver receives a usage data message, it writes it directly to the Cassandra store. We used this approach because it is fast and scalable. To consume the usage data, we read it from the Cassandra data store and summarize it according to our requirements. For this, we use Hive. API Manager ships with a toolbox that contains all the necessary data stream definitions and Hive queries, and this API Manager analytics toolbox must be deployed to the BAM2 servers. The Hive queries then write the summarized data to a relational database, so we can use it for any purpose. Now let's see how we can use this data.

Add custom publisher

To configure the data publisher, edit the following section of the api-manager.xml file, available under the /repository/conf folder. You can plug in your own custom class if necessary. With this approach, you can publish API usage stats to any other billing server or data collector, which makes it a good fit if you need to integrate API Manager with an external billing system. For further information, refer to article [3] in the References section; it describes how to add a data publisher to CEP, and the same can be done for API Manager.

<PublisherClass>org.wso2.carbon.apimgt.usage.publisher.APIMgtUsageDataBridgeDataPublisher</PublisherClass>

How to configure BAM2 and API Manager

First, we need to create a database named am_stats_db to store the summarized data from BAM (you can use any other name as well). You don't need to create any tables inside the database; the analyzer scripts create them when needed. Configure the data source definition in the master-datasources.xml file of both API Manager and BAM2 as shown below. BAM needs this data source to store the summarized data (the analyzer scripts do this), and API Manager uses the same data source to read the aggregated data and present it.

<datasource>
    <name>WSO2AM_STATS_DB</name>
    <description>The datasource used for getting statistics to API Manager</description>
    <jndiConfig>
        <name>jdbc/WSO2AM_STATS_DB</name>
    </jndiConfig>
    <definition type="RDBMS">
        <configuration>
            <url>jdbc:mysql://localhost:3306/am_stats_db</url>
            <username>root</username>
            <password>root</password>
            <driverClassName>com.mysql.jdbc.Driver</driverClassName>
            <maxActive>50</maxActive>
            <maxWait>60000</maxWait>
            <testOnBorrow>true</testOnBorrow>
            <validationQuery>SELECT 1</validationQuery>
            <validationInterval>30000</validationInterval>
        </configuration>
    </definition>
</datasource>

To enable API statistics collection, you need to configure the following properties in the api-manager.xml file of API Manager. In addition, you need to point to the created data source.

<Enabled>true</Enabled>
<DataSourceName>jdbc/WSO2AM_STATS_DB</DataSourceName>
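The DataSourceName value has to match the JNDI name of the data source defined in master-datasources.xml. As a rough sanity check, the following Python sketch compares the two values. The XML strings are trimmed-down copies of the fragments shown in this article, and the jndi_names_match helper is a hypothetical name used only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the data source definition from master-datasources.xml.
DATASOURCE_XML = """
<datasource>
    <name>WSO2AM_STATS_DB</name>
    <jndiConfig>
        <name>jdbc/WSO2AM_STATS_DB</name>
    </jndiConfig>
</datasource>
"""

# Trimmed copy of the usage tracking section from api-manager.xml.
USAGE_TRACKING_XML = """
<APIUsageTracking>
    <Enabled>true</Enabled>
    <DataSourceName>jdbc/WSO2AM_STATS_DB</DataSourceName>
</APIUsageTracking>
"""


def jndi_names_match(datasource_xml, usage_tracking_xml):
    """Return True when DataSourceName refers to the JNDI name of the data source."""
    datasource = ET.fromstring(datasource_xml)
    tracking = ET.fromstring(usage_tracking_xml)
    jndi_name = datasource.findtext("jndiConfig/name")
    referenced = tracking.findtext("DataSourceName")
    if jndi_name is None or referenced is None:
        return False
    return jndi_name.strip() == referenced.strip()


print(jndi_names_match(DATASOURCE_XML, USAGE_TRACKING_XML))
```

In a real setup you would read the two fragments from the actual configuration files instead of inline strings.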

If you are running all servers on the same machine, change the port offset parameter to 1 by editing repository/conf/carbon.xml on the BAM node. This step is required because API Manager publishes data to port 7712. For more details, go to the usage tracking section of the api-manager.xml file and change it as required.
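The effect of the port offset can be sketched in a few lines of Python. The default receiver ports used here (7611 for Thrift and 7711 for secure Thrift) are an assumption about a default BAM installation; only port 7712 is named in this article, and effective_ports is a hypothetical helper.

```python
# Assumed BAM defaults: Thrift receiver on 7611, secure Thrift receiver on 7711.
DEFAULT_THRIFT_PORT = 7611
DEFAULT_SECURE_THRIFT_PORT = 7711


def effective_ports(offset):
    """Return the receiver ports after the carbon.xml port offset is applied."""
    return {
        "thrift": DEFAULT_THRIFT_PORT + offset,
        "secure_thrift": DEFAULT_SECURE_THRIFT_PORT + offset,
    }


# With an offset of 1, the secure receiver ends up on port 7712.
print(effective_ports(1))
```

Under these assumptions, an offset of 1 is what moves the secure receiver to port 7712, the port mentioned above.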

Copy API_Manager_Analytics.tbox (available under wso2am-1.4.0/statistics) to repository/deployment/server/bam-toolbox (create the bam-toolbox directory if it doesn't already exist). Copy the MySQL JDBC driver JAR file into the INSTALL_HOME/repository/components/lib folder of every API Manager and BAM server (this step is needed because we use a MySQL database as the usage database).

Once you have completed the above steps, you can view usage statistics for your API invocations, APIs, and applications.

How to add custom hive script to summarize data

Sometimes we need to summarize the raw usage data in a customized way or in a particular format. In such a situation, we add a separate Hive script to the default toolbox. Let's see how we can add a custom Hive script to the BAM toolbox. For this example, we summarize API usage per month. A toolbox in BAM is an installable archive with a .tbox extension. It contains the artifacts that model a complete use case. Each of these artifacts is optional, and only the ones available in the toolbox will be deployed.

The artifacts are as follows:

Stream definitions. Stream definitions are descriptions of streams of data to be sent to WSO2 BAM in order to perform analysis.

Analytics. Analytics include the Hive scripts to be deployed in WSO2 BAM.

Dashboard components. Dashboard components contain the gadget XML files, Jaggery files, etc.

More on toolboxes https://docs.wso2.org/wiki/display/BAM200/Introduction+to+BAM+Toolbox

Creating custom toolbox https://docs.wso2.org/wiki/display/BAM200/Creating+a+Custom+Toolbox

For this example, we do not create dashboard (UI) components or stream definitions. All we need to do is add analytics scripts that summarize the data in the desired format. The BAM toolbox has the following structure.

The script below summarizes API usage for each API and user combination on a monthly basis. It reads the already summarized API request data and aggregates it per month for each API and user.

First, we need to create a custom Hive script that generates the monthly request summary. If you are familiar with SQL, you will easily understand the following set of queries. The first statement maps the existing API request summary table, the second creates the new monthly summary table if it does not exist, and the third writes the summarized data to the new table. Note that year and month are part of the primary key of the new table, so that rows for different months of the same API and user do not overwrite each other.

CREATE EXTERNAL TABLE IF NOT EXISTS APIRequestSummaryDataReader (
    api STRING, api_version STRING, version STRING, month STRING, year STRING,
    consumerKey STRING, userId STRING, context STRING, max_request_time BIGINT,
    total_request_count INT, time STRING, hostName STRING)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
    'wso2.carbon.datasource.name' = 'WSO2AM_STATS_DB',
    'mapred.jdbc.input.table.name' = 'API_REQUEST_SUMMARY' );

CREATE EXTERNAL TABLE IF NOT EXISTS APIRequestSummaryMonthlyDataTemp (
    month STRING, year STRING, api STRING, api_version STRING, context STRING,
    userId STRING, total_request_count INT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
    'wso2.carbon.datasource.name' = 'WSO2AM_STATS_DB',
    'hive.jdbc.update.on.duplicate' = 'true',
    'hive.jdbc.primary.key.fields' = 'api,api_version,context,userId,year,month',
    'hive.jdbc.table.create.query' = 'CREATE TABLE API_REQUEST_MONTHLY_SUMMARY_TEMP ( api VARCHAR(100), context VARCHAR(100), api_version VARCHAR(100), userId VARCHAR(100), total_request_count INT, year VARCHAR(100), month VARCHAR(100), PRIMARY KEY(api,api_version,userId,context,year,month))' );

INSERT OVERWRITE TABLE APIRequestSummaryMonthlyDataTemp
SELECT month, year, api, api_version, context, userId, sum(total_request_count)
FROM APIRequestSummaryDataReader
GROUP BY api, api_version, context, userId, month, year;
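To see what the aggregation step produces, the following Python sketch runs an equivalent GROUP BY query against an in-memory SQLite table. This is only an illustration: SQLite stands in for both Hive and the MySQL stats database, the table contains just the columns the query needs, and every row is made-up sample data. The ORDER BY clause is added here only to make the printed output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced stand-in for API_REQUEST_SUMMARY with only the columns the query uses.
conn.execute(
    "CREATE TABLE API_REQUEST_SUMMARY ("
    "api TEXT, api_version TEXT, context TEXT, userId TEXT, "
    "month TEXT, year TEXT, total_request_count INTEGER)"
)

# Hypothetical sample rows; all values are made up for illustration.
sample_rows = [
    ("WeatherAPI", "WeatherAPI:v1.0.0", "/weather", "alice", "7", "2013", 10),
    ("WeatherAPI", "WeatherAPI:v1.0.0", "/weather", "alice", "7", "2013", 5),
    ("WeatherAPI", "WeatherAPI:v1.0.0", "/weather", "alice", "8", "2013", 3),
    ("WeatherAPI", "WeatherAPI:v1.0.0", "/weather", "bob", "8", "2013", 4),
]
conn.executemany(
    "INSERT INTO API_REQUEST_SUMMARY VALUES (?, ?, ?, ?, ?, ?, ?)", sample_rows
)

# Same grouping as the Hive insert statement: one row per API, user, and month.
monthly_summary = conn.execute(
    "SELECT month, year, api, api_version, context, userId, SUM(total_request_count) "
    "FROM API_REQUEST_SUMMARY "
    "GROUP BY api, api_version, context, userId, month, year "
    "ORDER BY year, month, userId"
).fetchall()

for row in monthly_summary:
    print(row)
```

With this sample data, the two July rows for alice collapse into a single row with a count of 15, while her August row and the August row for bob stay separate.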

Add custom dashboard using gadget tool

WSO2 API Manager uses BAM2 to store and summarize all API usage-related data. BAM2 also has report generation and data visualization capabilities. The gadget generation tool is embedded in the BAM2 server by default. It is a step-by-step wizard that lets you generate gadgets for your dashboard in a few simple steps. These steps are as follows:

  1. Enter JDBC data source details
  2. Enter SQL to fetch data
  3. Pick UI element for gadget
  4. Enter gadget details
  5. Add gadget to dashboard

To use this tool, click on the "Gadget Gen Tool" located in the left menu. To explain these steps, let's create a simple gadget from some summarized data of the KPI definition sample.

Enter JDBC data source details

Enter the necessary details for your data source. If you are using a data source other than H2 (MySQL, Oracle, MSSQL, etc.), the relevant driver needs to be placed in $BAM_HOME/repository/components/lib and the server should be restarted. For this example, we use the same stats database.

The details entered here are as follows:

JDBC URL : jdbc:mysql://datacenter1:3306/am_stats_db

Driver Class Name : com.mysql.jdbc.Driver

Username : root

Password : root

To check whether the details are correct, you can click on "Validate Connection". You will get a success message if the connection is successful, and a failure message if the connection cannot be made. Now, click "Next".

Enter SQL to fetch data

In this step, you need to enter the SQL that will fetch the required data for your gadget. You can click the "Preview SQL Results" button to preview the results returned by the SQL query.

Here, the SQL statement used is: select * from brandSummary. The image below shows the results returned from this query. Now, click "Next". For the different dashboards, we will use the following SQL queries.

Get API usage (request count) per user.

SELECT sum(total_request_count), userId FROM API_REQUEST_SUMMARY GROUP BY userId;

Get API usage (request count) per API.

SELECT sum(total_request_count), api FROM API_REQUEST_SUMMARY GROUP BY api;

Get API response time per API/version.

SELECT serviceTime, api_version FROM API_RESPONSE_SUMMARY;

Get API usage (request count) per data center.

SELECT hostName AS data_center_domain, sum(total_request_count) FROM API_REQUEST_SUMMARY GROUP BY hostName;
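If you want to try the request count queries before pointing the gadget tool at the real stats database, the following Python sketch runs them against an in-memory SQLite table. It is only a rough preview: SQLite stands in for MySQL, the table has just the columns these queries touch, the rows are made-up sample data, and an ORDER BY clause is added to each query so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced stand-in for API_REQUEST_SUMMARY; the real table has more columns.
conn.execute(
    "CREATE TABLE API_REQUEST_SUMMARY ("
    "api TEXT, userId TEXT, hostName TEXT, total_request_count INTEGER)"
)

# Hypothetical sample rows; all values are made up for illustration.
conn.executemany(
    "INSERT INTO API_REQUEST_SUMMARY VALUES (?, ?, ?, ?)",
    [
        ("WeatherAPI", "alice", "dc1.example.com", 10),
        ("WeatherAPI", "bob", "dc2.example.com", 4),
        ("StockAPI", "alice", "dc1.example.com", 6),
    ],
)

# Request count per user.
per_user = conn.execute(
    "SELECT sum(total_request_count), userId FROM API_REQUEST_SUMMARY "
    "GROUP BY userId ORDER BY userId"
).fetchall()

# Request count per API.
per_api = conn.execute(
    "SELECT sum(total_request_count), api FROM API_REQUEST_SUMMARY "
    "GROUP BY api ORDER BY api"
).fetchall()

# Request count per data center (gateway host name).
per_data_center = conn.execute(
    "SELECT hostName AS data_center_domain, sum(total_request_count) "
    "FROM API_REQUEST_SUMMARY GROUP BY hostName ORDER BY hostName"
).fetchall()

print(per_user)
print(per_api)
print(per_data_center)
```

Each result row has two columns, which map naturally to the X-axis and Y-axis columns of a bar graph gadget.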

Pick UI element for gadget

This step allows you to pick the UI element for visualizing your data. Bar graphs and tables are supported for this release. All options you pick are instantly reflected in the preview area.

Bar graph - The bar graph gives you some configuration options to customize the view of the generated gadget. These are:

  • Chart Title - Title of the chart
  • Y-Axis Label - Title of the Y axis
  • Y-Axis Column - Column for the Y axis returned from the results
  • X-Axis Label - Title of the X axis
  • X-Axis Column - Column for the X axis returned from the results

The options you pick are reflected in the preview area as highlighted by the colors in the screen shot below.

Table - The table gives you one configuration option:

  • Table Title - The title of the table

Let's create a bar graph with options shown in the bar graph image for this sample. Click "Next".

Enter gadget details

We are at the final stage of gadget generation. We need to enter the following details:

Gadget Title : Each gadget will have a title that describes the gadget's functionality. Enter the generated gadget's title here.

Gadget Filename : The file name of the gadget as it will be stored in the system. In this case, it will be stored in the embedded registry.

Refresh rate : The gadget can have a refresh rate, where it will automatically refresh and pull the latest contents out of the data source.

Now, click "Generate!".

Add gadget to dashboard

Now the gadget has been generated. Copy the location of the gadget displayed on the screen. Next, click "View Portal" in the left-hand side menu, and then click on add gadget on the left-hand side of the portal page. Expand the "Enter Gadget Location" section on the right-hand side and paste the gadget location. Then, click on "Add Gadget". The gadget will now appear in the dashboard as shown in the screen below. Repeat the same process if you need to add more graphs to the dashboard. Once you go to the gadget UI, you will see the following dashboard with the graphs we created.

Conclusion

WSO2 API Manager and BAM2 are compatible with each other, and we can deploy them together in production environments. WSO2 BAM offers SQL-like flexibility for writing analysis algorithms via Apache Hive, along with tools for creating customized dashboards with zero code. In addition, it provides a high-performance, low-latency API for receiving large volumes of business events over various transports, including Apache Thrift. In WSO2 API Manager, API usage is published to this pluggable analytics framework. We can use this combination to view usage metrics by user, API, and other parameters.

References

[1] https://en.wikipedia.org/wiki/Multitenancy


[2] Data publisher source code svn

[3] Setting up multi receiver data agent

Author

Sanjeewa Malalgoda, Senior Software Engineer, WSO2 Inc.

 
