How Distributed Logging Works in WSO2 Stratos

- Amani Soysa
- Associate Technical Lead - WSO2
Applies To
WSO2 Carbon 4.x.x
Contents
- Introduction
- MT-Logging with WSO2 BAM 2.0
- Publishing Logs to BAM
- Log Rotation and Archiving
- Advantages of sending Logs to WSO2 BAM
- Monitoring and Analyzing System Logs
- Summary
Introduction
WSO2 Stratos is a distributed, clustered setup in which several applications, such as ESB Servers, Application Servers, Identity Servers, Governance Servers, and Data Services Servers, are deployed together and work with each other to serve as a Platform as a Service (PaaS). Each of these servers is deployed in a clustered environment, where there is more than one node for a given server and, depending on the need, new nodes are spawned dynamically inside the cluster. All these servers are fronted by an Elastic Load Balancer. Depending on the request, the load balancer sends it to a selected node in a round-robin fashion.
A logging system for such a deployment has to address the following:
- Capturing the right information inside the LogEvent – Make sure all the information you need to monitor your logs is aggregated in the LogEvent. For example, in a cloud deployment the basic log details (logger, date, log level) are not enough to pinpoint a critical issue. You also need tenant information (user/domain), host information (to identify which node is sending what), the name of the server (to identify which server the log comes from), etc. This information is critical for analyzing and monitoring logs efficiently.
- Sending logs to a centralized system in a non-blocking, asynchronous manner, so that monitoring does not affect the performance of the applications.
- High availability and scalability
- Security – Stratos can be deployed and hosted in public clouds. It's important to make sure the logging system is highly secured.
- Displaying system/application logs efficiently, with filtering options and log rotation.
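The first two requirements can be illustrated with a minimal sketch: an event that carries tenant, server, and host context next to the basic log details, and a publisher that hands events to a background thread so the application never waits on the network. This is not the Stratos implementation; the names LogEvent, AsyncLogPublisher, and the send callable are hypothetical.

```python
import queue
import socket
import threading
import time
from dataclasses import dataclass, field


@dataclass
class LogEvent:
    # Basic log details
    logger: str
    priority: str
    message: str
    # Context needed in a multi-tenant cluster (field names are illustrative)
    tenant_id: str
    server_name: str
    host: str = field(default_factory=socket.gethostname)
    log_time_ms: int = field(default_factory=lambda: int(time.time() * 1000))


class AsyncLogPublisher:
    """Hypothetical publisher: callers enqueue and return immediately."""

    def __init__(self, send, max_pending=10000):
        self._send = send  # callable that ships one event to the central store
        self._pending = queue.Queue(maxsize=max_pending)
        self.failed = 0
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def publish(self, event):
        """Never blocks the application thread; drops the event if the queue is full."""
        try:
            self._pending.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            event = self._pending.get()
            try:
                if event is None:  # sentinel: stop the worker
                    return
                self._send(event)
            except Exception:
                self.failed += 1  # a failed send must not kill the worker
            finally:
                self._pending.task_done()

    def close(self):
        """Flush queued events and stop the worker."""
        self._pending.put(None)
        self._pending.join()
        self._worker.join()
```

Dropping events when the queue is full is one possible trade-off: it protects application latency at the cost of losing some log lines under sustained overload.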
MT-Logging with WSO2 BAM 2.0

Publishing Logs to BAM
We implemented a Log4JAppender to send LogEvents to BAM, using BAM data agents to get the log data across. BAM data agents send data using the Thrift protocol, which gives high message throughput and is non-blocking and asynchronous. When publishing log events to BAM, we make sure a data stream is created per tenant, per server, per date. When the data stream is initialized, a unique column family is created per tenant, per server, per date, and the logs are stored in that column family in a predefined keyspace in a Cassandra cluster.
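The per-tenant, per-server, per-date partitioning can be sketched as follows. This is an illustration only: the exact date format and the StreamRegistry helper are assumptions, not taken from the Stratos code base.

```python
from datetime import datetime, timezone


def stream_name(tenant_id, server_name, log_time_ms):
    """Build a 'log.<tenant>.<server>.<date>' name (date format is an assumption)."""
    day = datetime.fromtimestamp(log_time_ms // 1000, tz=timezone.utc)
    return "log.{}.{}.{}".format(tenant_id, server_name, day.strftime("%Y_%m_%d"))


class StreamRegistry:
    """Hypothetical helper that defines each stream only once."""

    def __init__(self, define_stream):
        self._define = define_stream  # callable that creates the stream remotely
        self._known = set()

    def ensure(self, name):
        if name not in self._known:
            self._define(name)
            self._known.add(name)
        return name
```

All events from the same tenant and server on the same day map to one name, so they end up in the same column family; the first event of a new day triggers a new stream definition.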
Data Model which is used for Log Event
{
  'name':'log.tenantId.applicationName.date',
  'version':'1.0.0',
  'nickName':'Logs',
  'description':'Logging Event',
  'metaData':[
    {'name':'clientType','type':'STRING'}
  ],
  'payloadData':[
    {'name':'tenantID','type':'STRING'},
    {'name':'serverName','type':'STRING'},
    {'name':'appName','type':'STRING'},
    {'name':'logTime','type':'LONG'},
    {'name':'priority','type':'STRING'},
    {'name':'message','type':'STRING'},
    {'name':'logger','type':'STRING'},
    {'name':'ip','type':'STRING'},
    {'name':'instance','type':'STRING'},
    {'name':'stacktrace','type':'STRING'}
  ]
}
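To make the data model concrete, here is a small sketch that turns a log record into a payload matching the stream definition above. It assumes the publisher expects the values in the order the attributes are declared; PAYLOAD_FIELDS and to_payload are hypothetical names.

```python
# Attribute order and types mirror the 'payloadData' section of the definition.
PAYLOAD_FIELDS = [
    ("tenantID", str), ("serverName", str), ("appName", str), ("logTime", int),
    ("priority", str), ("message", str), ("logger", str), ("ip", str),
    ("instance", str), ("stacktrace", str),
]


def to_payload(record):
    """Return the record's values in definition order, checking declared types."""
    payload = []
    for name, expected in PAYLOAD_FIELDS:
        if name not in record:
            raise KeyError("missing payload attribute: " + name)
        value = record[name]
        # bool is a subclass of int in Python, so reject it for LONG explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError("attribute {} must be {}".format(name, expected.__name__))
        payload.append(value)
    return payload
```

Validating before publishing keeps malformed events out of the column family, where they would be harder to spot later.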
We extend org.apache.log4j.PatternLayout in order to capture tenant information, server information, and node information, and wrap it with the log4j LogEvent.
Log Rotation and Archiving
Once we send the log events to BAM, the logs are saved in a Cassandra cluster. WSO2 BAM provides a rich set of tools to create analytics and schedule tasks. We therefore use these Hadoop tasks to rotate the logs daily, archive them, and store them in a secure environment. To do that, we use a Hive query that runs daily as a cron job. It reads the Cassandra data store, retrieves all the column families per tenant per application, and archives them in gzip format.
-- The %s placeholders are substituted before the script is submitted.
set logs_column_family = %s;
set file_path= %s;
drop table LogStats;
-- Compress the query output with gzip.
set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
-- Map the Cassandra column family that holds the logs to a Hive table.
CREATE EXTERNAL TABLE IF NOT EXISTS LogStats
  (key STRING, payload_tenantID STRING, payload_serverName STRING,
   payload_appName STRING, payload_message STRING, payload_stacktrace STRING,
   payload_logger STRING, payload_priority STRING, payload_logTime BIGINT)
  STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
  WITH SERDEPROPERTIES (
    "cassandra.host" = %s,
    "cassandra.port" = %s,
    "cassandra.ks.name" = %s,
    "cassandra.ks.username" = %s,
    "cassandra.ks.password" = %s,
    "cassandra.cf.name" = ${hiveconf:logs_column_family},
    "cassandra.columns.mapping" = ":key,payload_tenantID, payload_serverName,payload_appName,payload_message, payload_stacktrace,payload_logger,payload_priority, payload_logTime" );
-- Write the formatted log lines, ordered by time, to the archive directory.
INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_path}'
  select
    concat('TID[', payload_tenantID, ']', '[', payload_serverName, ']'),
    concat('LogTime[', from_unixtime(cast(payload_logTime/1000 as BIGINT), 'yyyy-MM-dd HH:mm:ss.SSS'), ']\n') as LogTime,
    concat(payload_priority, '', '{', payload_logger, '}', '-', payload_message, '', payload_stacktrace)
  from LogStats
  ORDER BY LogTime;
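For readers less familiar with Hive, the effect of the query can be approximated in a few lines of Python: format each stored event as one line, order the lines by log time, and write them to a gzip file. This is only a sketch of the idea; the separators, the UTC time zone, and the function names format_line and archive are assumptions.

```python
import gzip
from datetime import datetime, timezone


def format_line(row):
    """Render one stored event as a single archive line (layout is illustrative)."""
    ts = datetime.fromtimestamp(row["logTime"] // 1000, tz=timezone.utc)
    stamp = ts.strftime("%Y-%m-%d %H:%M:%S.") + "{:03d}".format(row["logTime"] % 1000)
    line = "TID[{}][{}] LogTime[{}] {} {{{}}} - {} {}".format(
        row["tenantID"], row["serverName"], stamp,
        row["priority"], row["logger"], row["message"], row["stacktrace"])
    return line.rstrip()  # no trailing space when there is no stack trace


def archive(rows, path):
    """Write all rows, ordered by log time, to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for row in sorted(rows, key=lambda r: r["logTime"]):
            out.write(format_line(row) + "\n")
```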
Advantages of sending Logs to WSO2 BAM
- Asynchronous and non-blocking data publishing
- Receives and stores log events in a Cassandra cluster, which is a highly scalable Big Data repository
- Rich tool set for analytics
- Can be shared with WSO2 CEP for real-time log event analysis
- Can provide Logging tool boxes and dashboards for system administrators using WSO2 BAM
- High Performance and non-intrusiveness
Monitoring and Analyzing System Logs
- Using the Log Viewer – Both application and system logs can be displayed using the management console of a given product. Simply log in to the management console; under Monitor there are two links: 1) System Logs, which shows the system logs of the running server, and 2) Application Logs, which shows application-level logs (for services or web applications) for a selected application. This makes it easy for users to filter logs by the application they develop and to monitor logs down to the application level.
- Dashboards and Reports – System administrators can log in to WSO2 BAM and create their own dashboards and reports, so they can monitor their logs according to their key performance indicators. For example, they can monitor the number of fatal errors that occur on a given node in a given month.
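The fatal-errors example boils down to a simple aggregation over the stored events. The sketch below shows the idea in Python rather than as a BAM analytics script; it assumes the 'instance' attribute identifies the node and that months are bucketed in UTC.

```python
from collections import Counter
from datetime import datetime, timezone


def fatal_errors_per_month(events):
    """Count FATAL events per (node, 'YYYY-MM') pair."""
    counts = Counter()
    for event in events:
        if event["priority"] != "FATAL":
            continue
        when = datetime.fromtimestamp(event["logTime"] // 1000, tz=timezone.utc)
        counts[(event["instance"], when.strftime("%Y-%m"))] += 1
    return counts
```

A dashboard would then plot these counts per node over time.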
View System Logs Using the Log Viewer - Current Log
View Logs Using the Log Viewer - Archived Logs
View Application Logs Using the Log Viewer - Current Log
Summary
In summary, a proper logging framework is very important in a distributed environment in order to find issues effectively. It is also very important to have a proper mechanism to monitor your applications through logging. WSO2 Stratos provides a rich set of tools for distributed logging through WSO2 BAM, and it further allows you to monitor and analyze your logs effectively. These monitoring capabilities can be built into your deployment using the WSO2 Stratos distributed logging system, so you don’t have to go to the system administrator for logs whenever something goes wrong in your application.
Author
Amani Soysa, Senior Software Engineer, WSO2 Inc