How Distributed Logging Works in WSO2 Stratos

  • By Amani Soysa
  • 9 Nov, 2012

Applies To

WSO2 Carbon 4.x.x

Contents

Introduction

WSO2 Stratos is a distributed clustered set up where we have several applications such as ESB Servers, Application Servers, Identity Servers, Governance Servers, Data Services Sever, etc.  deployed together to work with each other to serve as a Platform as a Service (PaaS). Each of these servers are deployed in a clustered environment, where there will be more than one node for a given server and depending on the need, there will be new nodes spawned dynamically inside this cluster. All these servers are fronted through an Elastic Load Balancer. Depending on the request the load balancer will send the requests to a selected node in a round robin fashion.

What do you do when an error occurs in a deployment like above; where there are 13 different types of servers running in production, and each of these servers are clustered and load balanced across 50+ servers? It would be a nightmare for system administrators to log-in to each server and grepping for the logs to identify the exact caus of the error. This is why distributed application deployments need to keep centralized application logs. These centralized logs should also be kept in a high scalable data storage, and ordered manner with easy access. So that the users (administrators and developers) can easily accesses logs, whenever something unexpected happens, with the least amount of filtering to pinpoint the exact cause of the issue.
 
When designing a logging  system such as above, there are several things you need to consider.
  1. Capturing the right information inside the LogEvent – You have to make sure all the information you need in order to monitor your logs are aggregated in the LogEvent. For example in a cloud deployment setup you have to make sure that not only the basic log details (logger, date, log level) are enough to  point a critical issue. You further need tenant information (user/domain), Host information (to identify which node is sending what), Name of the server (from which server you are getting the log) etc. This information is very critical when it comes to analyzing and monitoring logs in an efficient way.
  2. Send logs to a centralized system in a nonblocking asynchronous manner so that monitoring will not affect the performance of the applications.
  3. High availability and Scalability
  4. Security – Stratos can be deployed and hosted in public clouds. Its important to make sure the logging system is highly secured.
  5. How to display system/application logs in an efficient way with filtering options along with log rotation.
These are the 5 main aspects when designing the distributed logging architecture. Since WSO2 Stratos support multi-tenancy we made sure that logs can be separated by tenants, services, and applications. 
 

MT-Logging with WSO2 BAM 2.0

WSO2 BAM 2.0 provides a rich set of tools for aggregation, analyzing and presentation for large scale data sets and any monitoring scenario can be easily modelled according to the BAM architecture. We selected WSO2 BAM as the backbone of our logging architecture mainly because it provides high performance with non intrusiveness along with high scalability and security. Since those are the crucial factors essential for a distributed logging system WSO2 BAM became the ideal candidate for MT-Logging architecture.
 

 

Publishing Logs to BAM 

We implemented a Log4JAppender to send LogEvents to BAM. There we used BAM Data agents get Log Data across to BAM. BAM data agents send data using thrift protocol which gives us high performance message throughput as well as it is non-blocking and asynchronous. When publishing Log events to BAM, we make sure the Data Stream is created per tenant, per server, per date. When the data stream is initialized there will be an unique column family created per tenant, per server, per date and the logs will be stored in that column family in a predefine keyspace in a Cassandra cluster.

The Data stream defines the set of information which needs to be stored for a particular LogEvent and can be modelled into a Data Model.

 

Data Model which is used for Log Event

{'name':'log. tenantId. applicationName.date','version':'1.0.0', 'nickName':'Logs', 'description':'Logging Event',
'metaData':[{'name':'clientType','type':'STRING'} ], 
'payloadData':[
   {'name':'tenantID','type':'STRING'},
   {'name':'serverName','type':'STRING'},
   {'name':'appName','type':'STRING'},
   {'name':'logTime','type':'LONG'},
   {'name':'priority','type':'STRING'},
   {'name':'message','type':'STRING'},
   {'name':'logger','type':'STRING'},
   {'name':'ip','type':'STRING'},
   {'name':'instance','type':'STRING'},
   {'name':'stacktrace','type':'STRING'}
 ] } 


We extend org.apache.log4j.PatternLayout a in order to capture tenant information, server information and node information and wrap it with log4j LogEvent

Log Rotation and Archiving

Once we send the log events to BAM the logs will be saved in a Cassandra cluster. WSO2 BAM provides a rich set of tools to create analytic and schedule task. Therefore, we used these Hadoop tasks to rotate logs daily and archive them and store it in a secure environment. In order to do that we use a Hive query which will run daily as a cron job. It will read Cassandra data store, retrieve all the column families per tenant per application and archive them in to gzip format.


The Hive Query which is used to rotate logs daily
set logs_column_family = %s;
set file_path= %s;
drop table LogStats;
set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;

CREATE EXTERNAL TABLE IF NOT EXISTS LogStats (key STRING,
payload_tenantID STRING,payload_serverName STRING,
payload_appName STRING,payload_message STRING,
payload_stacktrace STRING,
payload_logger STRING,
payload_priority STRING,payload_logTime BIGINT) 
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler' 
WITH SERDEPROPERTIES ( "cassandra.host" = %s,
"cassandra.port" = %s,"cassandra.ks.name" = %s,
"cassandra.ks.username" = %s,"cassandra.ks.password" = %s,
"cassandra.cf.name" = ${hiveconf:logs_column_family},
"cassandra.columns.mapping" = 
":key,payload_tenantID,
payload_serverName,payload_appName,payload_message,
payload_stacktrace,payload_logger,payload_priority,
payload_logTime" );
INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_path}' 
select 
concat('TID[',payload_tenantID, ']',
'[',payload_serverName,']',
concat('LogTime[',
(from_unixtime(cast(payload_logTime/1000 as BIGINT),'yyyy-MM-dd HH:mm:ss.SSS' )),']\n') as LogTime
concat(payload_priority,'','{',payload_logger,'}','-',payload_message,'',payload_stacktrace)
from LogStats
ORDER BY LogTime

Once we have archived the logs, we will send these archived logs to a secured Apache server. This will get accessed by the log viewer to display and download archived log files.

 

Advantages of sending Logs to WSO2 BAM

  1. Asynchronous and None-Blocking Data publishing
  2. Receives and Stores Log Events Cassandra Cluster which is a highly scalable and Big Data Repository
  3. Rich tool set for analytics
  4. Can be shared with WSO2 CEP for real time Log Event analysis.
  5. Can provide Logging tool boxes and dashboards for system administrators using WSO2 BAM 
  6. High Performance and non-intrusiveness

Monitoring and Analyzing System Logs

  • Using the Log Viewer
    Both application and system logs can be displayed using the management console of a given product. Simply log-in to Management console and under monitor there are two links 1. System logs, which has system logs of the running server 2) Application Logs, which has application level logs (this can be services/web applications) for a selected application. This makes it easy for users to filter logs by the application they develop and monitor logs up to application level.
  • Dashboards and Reports
    System administrators can log-in to WSO2 BAM and create their own dashboards and reports, so they can monitor their logs according to their Key performance Indicators. For example if they want to monitor the number of fatal errors that occur per given month for a given node.
  • SMS Alerts and Emails
    Not just dashboards and Reports, but combining WSO2 BAM with WSO2 CEP you can get real time alerts like trigger emails and text messages (SMS) so that System administrators can instantly get to know when your system is going through an unexpected behaviour.

View System Logs Using the Log Viewer - Current Log

View Logs Using the Log Viewer - Archived Logs

View Application Logs Using the Log Viewer - Current Log

Summary

In summary, a proper  logging framework is very important in a distributed environment in order to find issues effectively. And also its very important to have a proper mechanism to monitor your applications through logging. WSO2 Stratos provide a rich set of tools for distributed logging through WSO2 BAM and it further allows you to monitor/analyze your logs effectively. This rich set of monitoring capabilities can be in built into your deployment using WSO2 Stratos Distributed Logging system, where you don’t have to worry about always going to the system administrator for logs, whenever something goes wrong in your application.

 

Author

Amani Soysa, Senior Software Engineer , WSO2 Inc