Tag Archives: Analytics

UNRWA and Capgemini: Creating a Refugee Centric Data Model for Better Insights

The United Nations Relief and Works Agency for Palestine Refugees (UNRWA) has over 5 million registered refugees requiring education, healthcare, social safety assistance, and other services. UNRWA operates across five fields – Lebanon, Jordan, Syria, the West Bank, and the Gaza Strip – where it currently runs 692 schools serving over 500,000 students, along with hundreds of primary health facilities.

In order to automate several processes across the region, the team based in Gaza had already developed the Education Management Information System (EMIS) consisting of three modules (students, staff and premises) and reporting tools. EMIS captures information and manages the educational progress of half a million students, by integrating data from registration, health, facility management and human resources systems that are already in existence.

Yet, given the numbers and scale of its operations, a central data model that has the capacity to integrate data from several entities was the need of the hour to support its regional operations and EMIS. To transform their information management system, UNRWA and Capgemini used WSO2 technology to create a model which mirrors UNRWA’s organizational ethos – placing the refugees at the heart of all their operations.

“The technology is there, but it’s really about the people,” says Francesco Lacoboni, Managing Consultant at Capgemini. Accordingly, the main drivers of the new UNRWA Enterprise Architecture are built upon the strategic principles of people, information, collaboration, and security. People influence how the information is created, managed, and consumed. The platform is an information-centric one – rather than managing documents, it manages open data and content. Its shared approach design aims to improve collaboration, reduce costs, maintain standards, and ensure consistency across the board. Security and privacy features for data protection round off the principles of this platform.

Before the new model was introduced, information that streamed through the system was physically replicated via the transaction log. For ease and efficiency, UNRWA and Capgemini decided to provide a common set of APIs to all developers, not only to fulfill the needs of a specific application, but also to create a framework for future use of this semantic concept. Every entity has a dedicated API that can be used to navigate the knowledge, eliminating the need to design new APIs. The resulting Common Data Model (CDM) was created using OWL (Web Ontology Language), and its architecture and governance were completed using WSO2’s integration and API management platforms.

For Luca Baldini, Chief of Information Management Services at UNRWA, it was the first time such an approach was used and now that it has been rolled out, he praises its benefits: “The new model has been very productive, as it created a common language between IT specialists and our business representatives. We can use different kinds of technology for data retrieval and distribution.” Francesco believes one of the main benefits of the new model is that it helps increase the transparency of UNRWA’s operations. Now that the new model is successfully in practice, analytics is the next frontier and they hope to leverage WSO2’s analytics capabilities to meet their requirements. Spurred by the possibilities of analytics, plans are in the pipeline to use this data model along with unstructured data provided from the field to improve operations and add further value.

You can watch Luca’s and Francesco’s presentation at WSO2Con USA 2017 to hear more about their project.

Learn more about WSO2’s integration, API management and analytics capabilities if you would like to use them in your enterprise.

Perfecting the Coffee Shop Experience With Real-time Data Analysis

Picture a coffee shop.

The person who runs this shop (let’s call her Sam) operates an online coffee ordering service. Sam intends to differentiate her value offering by providing a more personalized customer experience.

Offering customers their favorite coffee as they walk into the store, rewarding loyal customers with a free drink on special occasions – these are some of the things on her mind.

Further, this value creation is not limited to her customers but extends to business operations, such as real-time monitoring and management of inventory. Sam wants:

  • A reward system where points are calculated based on order value. Once a reward tier point value is reached, the customer is notified in real time about an entitlement to a free drink
  • Inventory levels that are updated in real time on order placement, with an automated notification sent to suppliers as predicted re-ordering levels are reached
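Sketched in code (the tier and re-order values are invented for illustration; this is not part of any WSO2 product), the two requirements look roughly like this:

```python
# Hypothetical sketch of Sam's two rules: reward points per order value,
# and real-time inventory decrement with a re-order alert.

REWARD_TIER = 100        # points needed for a free drink (assumed value)
REORDER_LEVEL = 50       # stock level that triggers a supplier alert (assumed)

def place_order(customer, drink, qty, price, points, stock, alerts):
    """Process one order event, updating points and stock in place."""
    # Reward rule: points accrue with order value; crossing the tier
    # earns a free-drink notification and resets the counter.
    points[customer] = points.get(customer, 0) + qty * price
    if points[customer] >= REWARD_TIER:
        alerts.append(("free_drink", customer))
        points[customer] -= REWARD_TIER
    # Inventory rule: decrement stock on order placement; notify the
    # supplier when the re-order level is reached.
    stock[drink] -= qty
    if stock[drink] <= REORDER_LEVEL:
        alerts.append(("reorder", drink))

points, stock, alerts = {}, {"Doppio": 60}, []
place_order("alice", "Doppio", 2, 60, points, stock, alerts)
# alice crosses the tier and earns a free drink; stock falls to 58
```

The real solution performs exactly this kind of bookkeeping, but as streaming queries over API events rather than in application code.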

Overview of the solution

Understanding the customer is the first step in providing a personalized experience. To do this, one must collect intelligence. In today’s digital business, customers pass through many ‘touchpoints’, leaving a digital trail. For example, many would search ‘health benefits of coffee’, and some would publish a review on their favorite coffee type.

Application Programming Interfaces (APIs) come into play here. In a business context, APIs are a way for businesses to expose their services externally, so that consumers, using an app or some other technological interface, can subscribe to and access these services.

For example, Sam can have an “Order API”  that provides a way for consumers to order coffee from her shop using their mobile app.

What we now need is a simple way to create and publish said API, a central place for consumers to find and subscribe to it, and proper security and access control mechanisms.

Data leaving through the API needs to be collected, stored and analyzed to identify patterns. For example, Sam would like to know what the most used combination of ‘coffee and flavors’ is, at which point of the day, by which type of users – which would be helpful for targeted promotion campaigns. For this, we need to understand the data that comes through. 

In basic terms, the system requirements for developing such a solution are to:

  • Design an API for end user access
  • Publish end user attributes (API data) for analytics
  • Process API data in real-time
  • Communicate outcomes

The solution requires integrating API management with real-time event processing, so that API end user attributes can be published to a streaming analytics engine. There are many offerings in the market that provide these capabilities separately; however, integrating them has its own challenges.

WSO2 offers a completely integrated, 100% open source enterprise platform that enables this kind of use case – on-premise, in the cloud, and on mobile devices.

We offer both an API management and streaming analytics product, architected around the same underlying platform, which enables seamless integration between these offerings.

WSO2 API Manager is a fully open source solution for managing all aspects of APIs including creating, publishing, and exposing APIs to users in a secure and scalable manner. It is a production-ready API management solution that has the capability of managing all stages of the API lifecycle in a massively scalable production environment.

WSO2 CEP is one of the fastest open source solutions available today, finding event patterns in real time within milliseconds. It utilizes a high-performance stream processing engine which facilitates real-time event detection, correlation, and notification of alerts, combined with rich visualization tools to help build monitoring dashboards.

WSO2 MSF4J is a lightweight framework that offers a fast and easy programming model and an end-to-end microservices architecture to ensure agile delivery and flexible deployment of complex, service-oriented applications.

Building an API for end user access

Let’s examine how we can build this with what we’ve listed above.

WSO2 API Manager includes architectural components, the API Gateway, API Publisher and API Store (Developer Portal), Key Manager, Traffic Manager and API Analytics. The API Publisher provides the primary capability to create and publish an API. The developer portal provides a way for subscribers to access the API.

API data is published to the streaming analytics engine through a default message flow. The solution we have in mind requires changing this default flow to capture and publish custom user data.

This is implemented as a custom ‘data publishing mediator’ (see mediation extensions).

In a nutshell, message mediation is the in-flow processing of messages, which can be modified, transformed, routed, or subjected to other logic. Mediators are the components that implement this logic; linked together, they create a sequence, or flow, for messages. With API Manager tooling support, a custom flow is designed using a class mediator to decode, capture, and publish end user attributes.

The custom sequence extracts the decoded end user attributes passed via JWT headers. The class mediator acts as a data agent that publishes API data to WSO2 CEP. The parameters passed to the class mediator include the connection details to CEP and the published event stream.
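To illustrate what the sequence works with: a JWT assertion is three base64url-encoded segments joined by dots, and the end user claim sits in the middle segment. A hedged sketch of the decoding step (the claim URI is WSO2’s usual one, but treat it as an assumption; the real logic is the Synapse sequence in the appendix):

```python
import base64
import json

def end_user_from_jwt(jwt_assertion):
    """Extract the end user claim from an X-JWT-Assertion header value.

    The claims live in the middle base64url segment of the token.
    Padding characters are often stripped in transit, so restore them
    before decoding.
    """
    payload = jwt_assertion.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    # Assumed claim URI for the end user attribute:
    return claims.get("http://wso2.org/claims/enduser")
```

The class mediator then publishes the extracted attribute, together with the order details, as an event to WSO2 CEP.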

Real-time processing of API Data

To capture API data for real-time processing, the same stream definition is created on the CEP side and an event receiver is mapped to the stream. WSO2 provides a comprehensive set of extensions; predictive analytics capabilities are added via the WSO2 ML extension.

Coffee reordering

The mechanics of reordering coffee based on real-time analysis are as follows:

An event table represents the inventory details (drink name, ordered quantity, available quantity). The API data stream is joined with the event table, and the available quantity in stock is reduced by the order quantity as events are received. When the re-order quantity level is reached, an email notification is published.
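In plain code, the join-and-update behaves roughly as follows (a sketch only – the actual logic is the Siddhi query in the appendix; the inventory figures are invented):

```python
# Sketch of the reorder check the Siddhi query performs: each order
# event is joined with the inventory row for that drink, stock is
# decremented, and a notification fires when the predicted reorder
# quantity reaches the available quantity. Column names mirror the
# drinkEventTable in the appendix.

inventory = {"Doppio": {"qtyAvl": 500.0, "qtyOrder": 100.0, "ROF": 2.0}}
notifications = []

def on_order(drink, order_qty):
    row = inventory[drink]                  # join on drinkName
    row["qtyAvl"] -= order_qty              # update drinkEventTable
    qty_predict = order_qty * row["ROF"]    # (ROF * order quantity)
    if qty_predict >= row["qtyAvl"]:
        notifications.append(("reorder-email", drink))

on_order("Doppio", 100.0)   # predicted 200 < 400 available: no alert
on_order("Doppio", 250.0)   # predicted 500 >= 150 available: email sent
```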

Real-time rewards

Similar to the approach above, the API data is joined with an event table; here the event table represents the end user and the reward points accumulated per order. The reward points are proportional to the order size and are added with each new order placed. A reward limit threshold is defined, and when a new order reaches the limit, a notification is sent to the end user offering a free drink.

Communicating outcomes

To communicate the outcome of real-time event processing, WSO2 CEP provides the capability to generate alerts via SMS, email, user interfaces, etc. through event publishers. An email notification can alert management when re-order levels are reached, while an SMS can notify the customer of the free drink offer.

Meanwhile, the backend service for order processing is developed as a Java microservice using WSO2 MSF4J, which processes the order and responds with the order id and cost.

Why Open Source?

As a small business, Sam’s resources are limited. Her best strategy for implementing the solution is open source, which offers lower startup costs and effort compared to the high licensing fees and complications involved with commercial vendors.

Being open source also allows Sam to download, learn, and evaluate the product without a high investment, thus minimizing her business risks. Depending on the results of her evaluation, she could go forward or ‘throw away’.

Growing in a competitive business environment requires companies to differentiate. For small-scale businesses it is even more of a challenge to implement such solutions due to resource limitations. The seamlessly integrated capability of the open source WSO2 platform gives businesses a low-risk, cost-effective way to build and deliver real-time business value to their clients.

The code for this use case

Listed below is what you need to recreate this discussion as a demo:



Download the following products and set the port offsets to run both servers on the same machine. WSO2 APIM runs on the default offset (0) while the WSO2 CEP offset is 4.

  • WSO2 API Manager 2.0.0
  • WSO2 Complex Event Processor 4.2.0
  • WSO2 MSF4J (WSO2 MicroServices Framework for Java)
  • WSO2 App Cloud

For simplification purposes, the inventory details are stored as tables in a MySQL database.

Execute the MySQL database script db_script.mysql to create the ‘Inventory’ database and the ‘Rewards’ and ‘Orders’ tables.


  1. Execute the ‘kopi-service’ Java microservice:
    1. From <WSO2_MSFJ_HOME>/kopi-service/target, run java -jar kopi-service-0.1.jar

Alternatively, the Java microservice can be deployed in the WSO2 App Cloud.

WSO2 CEP Setup

  1. Set up the email configuration for the output event publisher
  2. Copy the JDBC driver JAR file for your database to <CEP_HOME>/repository/components/lib
  3. Start the server
  4. Configure a data source named “CEP-DS”, selecting MySQL as the RDBMS and setting the database to the ‘Inventory’ database created earlier
  5. This data source is referenced when defining event tables in the Siddhi queries
  6. Deploy the “Streaming-CApp” CApp; a correct deployment should visualize an event flow as depicted

WSO2 API Manager Setup

  1. Configure WSO2 API Manager to pass end user attributes as a JWT token.
  2. Copy the custom data publisher implementation (org.wso2.api.publish-1.0-SNAPSHOT.jar) to $API_MGR_HOME/repository/components/lib
  3. Start the server.
  4. Log in to the API Publisher.
  5. Create and publish an API with the following details:
    1. Context – <context>
    2. Version – <version>
    3. API definition
      1. GET – /order/{orderId}
      2. POST – /order
    4. Set the HTTP endpoint: http://<server-ip:port>/WSO2KopiOutletPlatform/services/w_s_o2_kopi_outlet_service
    5. Change the default API call request flow by enabling message mediation and uploading the file datapublisher.xml as the ‘In Custom Sequence’.
  6. Log in to the API Store and subscribe to the created API.
  7. Invoke the API with an order exceeding the available quantity: { "Order": { "drinkName": "Doppio", "additions": "cream", "orderQuantity": 2000 } }


Predicting re-order levels

The re-order quantity is initially calculated from a re-order factor (ROF) and the order quantity, using the formula ROF * order quantity. Siddhi provides a machine learning extension for predictive analytics, so the re-order quantity can instead be predicted using a machine learning model.

The re-order data points calculated previously (with the formula) can be used as a data set to train a model with WSO2 Machine Learner. A predicted re-order quantity is then calculated using the Linear Regression algorithm, with the re-order factor (ROF) and coffee type as the features.
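As a rough illustration of the regression step (pure-Python least squares standing in for the WSO2 Machine Learner model, with invented data points and a single coffee type):

```python
# Fit reorder_qty ≈ a * ROF + b by ordinary least squares, standing in
# for the Linear Regression model trained in WSO2 Machine Learner.
# Training pairs come from the (ROF * order quantity) formula.

def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical historical data points for one coffee type
# (formula output with an order quantity of 100):
rof = [1.0, 1.5, 2.0, 2.5]
reorder = [100.0, 150.0, 200.0, 250.0]

a, b = fit_line(rof, reorder)
predicted = a * 3.0 + b   # predicted re-order quantity for ROF = 3.0
```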

The Siddhi query for predicting the re-order quantity is commented out under ‘Predict reorder quantity using Machine Learning extensions’. It can be executed by replacing the query under ‘Calculating reorder quantity’.

Appendix: code

Custom Sequence

<?xml version="1.0" encoding="UTF-8"?>
<sequence name="publish-endUser" trace="disable" xmlns="http://ws.apache.org/ns/synapse">
  <log level="full"/>
  <property expression="get-property('$axis2:HTTP_METHOD')" name="VERB"
    scope="default" type="STRING" xmlns:ns="http://org.apache.synapse/xsd"/>
  <property expression="get-property('transport','X-JWT-Assertion')"
    name="authheader" scope="default" type="STRING" xmlns:ns="http://org.apache.synapse/xsd"/>
  <log level="custom">
    <property expression="base64Decode(get-property('authheader'))"
      name="LOG_AUTHHEADER" xmlns:ns="http://org.apache.synapse/xsd"/>
  </log>
  <property expression="base64Decode(get-property('authheader'))"
    name="decode_auth" scope="default" type="STRING" xmlns:ns="http://org.apache.synapse/xsd"/>
  <script description="" language="js"><![CDATA[var jsonStr= mc.getProperty('decode_auth');
var val= new Array();
var decoded= new Array();
decoded= val[1].split("enduser\"\:");
var temp_str= new Array();
]]></script>
  <property expression="get-property('end_user')" name="endUser"
    scope="default" type="STRING"/>
  <log level="custom">
    <property expression="get-property('endUser')" name="Log_Enduser"/>
  </log>
  <class name="org.wso2.api.publish.PublishMediate">
    <property name="dasPort" value="7619"/>
    <property name="dasUsername" value="admin"/>
    <property name="dasPassword" value="admin"/>
    <property name="dasHost" value="localhost"/>
    <property name="streamName" value="Data_Stream:1.0.0"/>
  </class>
</sequence>

Siddhi Query

/* Enter a unique ExecutionPlan */


/* Enter a unique description for ExecutionPlan */

-- @Plan:description('ExecutionPlan')

/* define streams/tables and write queries here … */


define stream APIStream (drinkName string, additions string, orderQuantity double, endUser string);


define stream allOrderstream (drinkName string, qtyAvl double, qtyPredict double);


define stream predictStream (drinkName string, qtyPredict double);


define stream orderStream (drinkName string, orderQty double, qtyAvl double, qtyOrder double, ROF double); 


define stream reOrderStream (drinkName string, qtyAvl double, qtyPredict double);


define stream outOrderStream (drinkName string, qtyOrder double, qtyReorder double, ROF double);


define stream ULPointStream (subScriber string, points double);


define stream totPointStream (subScriber string, totPoints double);


define stream FreeOrderStream (subScriber string, points double); 

@from(eventtable='rdbms', datasource.name='CEP-DS', table.name='orders')

define table drinkEventTable(drinkName string, qtyAvl double, qtyOrder double, ROF double);

@from(eventtable='rdbms', datasource.name='CEP-DS', table.name='rewards')

define table pointEventTable(subscriber string, points double);

from APIStream#window.length(0) as t join drinkEventTable as d

on t.drinkName==d.drinkName

select t.drinkName as drinkName, t.orderQuantity as orderQty, d.qtyAvl as qtyAvl,d.qtyOrder as qtyOrder, d.ROF as ROF

insert into orderStream; 

/* ——Drink Reordering————- */

/* —–Calculating reorder quantity———– */ 

from orderStream#window.length(0) as p join drinkEventTable as o

on o.drinkName==p.drinkName

select o.drinkName,o.qtyAvl,(p.orderQty* p.ROF) as qtyPredict

insert into allOrderstream;

/* Predict reorder quantity using Machine Learning extensions */


from orderStream

select drinkName,ROF

insert into ROF_Incoming;

from ROF_Incoming#ml:predict('registry://_system/governance/ml/Reorder.Model','double',drinkName,ROF)

select drinkName, qtyReorder as qtyPredict

insert into predictStream;

from predictStream#window.length(0) as p join drinkEventTable as o

on o.drinkName==p.drinkName

select o.drinkName,o.qtyAvl, p.qtyPredict

insert into allOrderstream;



partition with (drinkName of allOrderstream)

begin @info(name = 'query3')

from allOrderstream[qtyPredict>=qtyAvl]

select drinkName,qtyAvl,qtyPredict

insert into #tempStream2;

from e2=#tempStream2

select e2.drinkName, e2.qtyAvl,e2.qtyPredict

insert into reOrderStream;

end;

from orderStream[(qtyAvl-orderQty)>=0]#window.length(0) as t join drinkEventTable as d

on t.drinkName==d.drinkName

select t.drinkName as drinkName,(d.qtyAvl – t.orderQty) as qtyAvl

update drinkEventTable

on drinkName==drinkEventTable.drinkName; 

/*——————————————– */

 /*—– Offer free drink ——-*/

from APIStream

select endUser as subScriber ,orderQuantity as points

insert into ULPointStream;

 from ULPointStream as u join pointEventTable as p

on u.subScriber == p.subscriber

select u.subScriber as subscriber ,(u.points+p.points) as points

update pointEventTable

on subscriber==pointEventTable.subscriber;

from ULPointStream[not(pointEventTable.subscriber==subScriber in pointEventTable)]

select subScriber as subscriber,points

insert into pointEventTable;

from ULPointStream as u join pointEventTable as p

on u.subScriber == p.subscriber

select u.subScriber as subScriber,p.points as totPoints

insert into totPointStream;

 partition with (subScriber of totPointStream)

begin @info(name = 'query4')

from totPointStream[totPoints>=100]

select *

insert into #tempStream;

 from e1= #tempStream

select subScriber, totPoints as points

insert into FreeOrderStream

end ;


Better Transport for a better London: How We Won TfL’s Data in Motion Hackathon

Transport for London (TfL) is a fascinating organization. The iconic red circle is practically part and parcel of everyday life in London, with the TfL network carrying some 1.3 billion passengers a year.

As part of their mandate, TfL is constantly on the lookout for ways to better manage traffic, train capacity, and maintenance, and even to account for air quality during commutes. These are some very interesting challenges, so when TfL, Amazon Web Services, and Geovation hosted a public hackathon, we at WSO2 decided to come up with our own answers to some of these problems.

Framing the problem

TfL’s Chief Technical Architect, Gordon Watson, catches up with the WSO2 team. Photo by TfL.

TfL pushes out a lot of data regarding the many factors that affect public transport within Greater London; a lot of this is easily accessible via the TfL Unified API from https://api.tfl.gov.uk/. In addition to volumes of historical data, TfL also controls a network of SCOOT traffic sensors deployed across London. Given a two-day timeframe, we narrowed our focus down to three main areas:

  1. To use historical data regarding the number of passengers at stations to predict how many people would be on a selected train or inside a selected station
  2. To use Google Maps and combine that with sensor data from TfL sensors across the city to pick the best routes from point A to B, while predicting traffic, five to ten minutes into the future, so that commuters could pick the best routes
  3. To pair air quality data from any given region and suggest safer walking and cycling routes for the denizens of Greater London

Using WSO2 Complex Event Processor (which holds our Siddhi CEP engine) with Apache Spark and Lucene (courtesy of WSO2 Data Analytics Server), we were able to use TfL’s data to build a demo app that provided a solution for these three scenarios.


For starters, here’s how we addressed the first problem. With data analysis, it’s not just possible to estimate how many people are inside a station; we can break this down to understand traffic from entrance to a platform, from a platform to the exit, and between platforms. This makes it possible to predict incoming and outgoing crowd numbers. The map-based user interface that you see above allows us to represent this analysis.

The second solution makes use of the sensor network we spoke of earlier. Here’s how TfL sees traffic.


The red dots are junctions; yellow dots are sensors; dashed lines indicate traffic flow. The redder the dashed lines are, the denser the traffic at that area. We can overlay the map with reported incidents and ongoing roadworks, as seen in the screenshot below:

Once this picture is complete, we have the data needed to account for road and traffic conditions while finding optimal routes.

This is what Google suggests:


We can push the data we have to WSO2 CEP, which runs streaming queries to perform flow, traffic, and density analytics. Random Forest classification enables us to use this data to build a machine learning model for predicting traffic – a model which, even with relatively little data, was 88% accurate in our tests. Combining all of this gives us a richer traffic analysis picture altogether.


For the third problem – the question of presenting safer walking and cycling routes using air quality – our app pulled air pollution data from TfL’s Unified API.

This helps us to map walking routes; since we know where the bike stations are, it also lets us map safer cycling routes. It also allows us to push weather forecasts and air quality updates to commuters.

A better understanding of London traffic

In each scenario, we were also able to pinpoint ways of expanding on, or improving what we hacked together. What this essentially means is that we can better understand traffic inside train stations, both for TfL and for commuters. We can use image processing and WiFi connections to better gauge the number of people inside each compartment; we can show occupancy numbers in real-time across screens in each station, and on apps, and assist passengers with finding the best platform to catch a less crowded compartment.

We can even feed Oyster Card tap data into WSO2 Data Analytics Server, apply machine learning to build a predictive model, and use WSO2 CEP to predict source to destination travel times. Depending on screen real estate, both air quality and noise level measures could be integrated to keep commuters better informed of their travelling conditions.

How can we improve on traffic prediction? By examining historical data, making a traffic prediction, and then comparing that with actual traffic levels, we could potentially detect traffic incidents that our sensors might have missed. We could also add location-based alerts pushed out to commuters – and congestion warnings and time-to-target countdowns on public buses.
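The missed-incident idea can be sketched as a simple residual check – compare predicted against observed flow and flag large gaps (sensor names and the threshold are invented for illustration):

```python
# Flag possible unreported incidents where observed traffic flow
# deviates sharply from the model's prediction. The threshold is an
# assumed value, expressed as a fraction of the predicted flow.

THRESHOLD = 0.3

def possible_incidents(predicted, observed):
    """Return sensor ids whose observed flow deviates from prediction."""
    flagged = []
    for sensor, pred in predicted.items():
        gap = abs(observed[sensor] - pred)
        if gap > THRESHOLD * pred:
            flagged.append(sensor)
    return flagged

pred = {"junction-12": 100.0, "junction-47": 80.0}
obs = {"junction-12": 55.0, "junction-47": 85.0}
# junction-12 is 45% below prediction: a candidate missed incident
```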

We have to say that there were a number of other companies hacking away on excellent solutions of their own; it was rather gratifying to be picked as the winners of the hackathon. For more information, and to learn about the solutions that we competed against, please read TfL’s blog post on the hackathon.

WSO2 API Manager 2.0.0: Better Statistics


Any API management tool requires the ability to view statistics – accurately and with details essential to the operation of APIs. WSO2 API Manager 2.0.0 introduces powerful analytics features along these lines by using the capabilities of WSO2 Data Analytics Server (WSO2 DAS), bringing all API, application and subscription related statistics to one location.


New additions

In order to help you make sense of these analytics, we’ve added the following graphs:

1. Published APIs Over Time

This graph can be used to check API creation rates. A user can check APIs based on the API provider or can view statistics related to all API providers.


2. API Latency

The API Latency breakdown graph can be used to check time consumption for each level in the message flow. It shows total time taken to handle a request and provides the time each component spends on that request – for example, a user can check the amount of time spent on the authentication process and so on. This graph is useful for identifying delays in the API.

3. API Usage Across Geo Locations

This graph provides statistics based on the location of the API request. It uses the X-Forwarded-For header to identify the location of the user.
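For illustration, extracting the client address from that header might look like this (a sketch only, not API Manager’s actual implementation; the fallback header name is an assumption):

```python
def client_ip(headers):
    """Pick the originating client IP for geo lookup.

    X-Forwarded-For holds a comma-separated chain: the client first,
    then each proxy that forwarded the request. The left-most entry is
    the one to geolocate. Falls back to a direct peer address header
    (name assumed here) when X-Forwarded-For is absent.
    """
    xff = headers.get("X-Forwarded-For", "")
    if xff:
        return xff.split(",")[0].strip()
    return headers.get("Remote-Addr", "")

ip = client_ip({"X-Forwarded-For": "203.0.113.9, 10.0.0.2, 10.0.0.3"})
# geolocating 203.0.113.9 gives the end user's location
```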


4. API Usage Across User Agent

This graph provides information related to the access mechanism of the API, and helps provide insight into the kind of devices and platforms used to access the API – mobiles, computers and so on.


We’ve also added an Applications Created Over Time graph under the Applications section. As with the API creation graph, this provides statistics related to application creation. Users can drill down the data based on the registered user and the API in question.


In addition to these come the subscription graphs: Developer Signups Over Time and Subscriptions Created Over Time, both of  which are self-explanatory.



How These Statistics Work

WSO2 Analytics Server for API Manager is essentially a fully functional WSO2 DAS server.


In a nutshell, WSO2 API Manager 2.0.0 sends the following event streams to API Manager Analytics:







The APIM Analytics Server processes these events and writes the summarized data to a shared database. WSO2 API Manager then queries statistics from this database to display in the API Manager Publisher and Store.


One significant change in WSO2 API Manager 2.0.0 statistics is the removal of the graphical user interface (GUI) for configuring statistics. In WSO2 API Manager 1.10, a GUI was provided to handle this DAS-related configuration. We’ve moved this functionality back into the api-manager.xml file.

We’ve also taken out the DAS REST client – with 2.0.0, the default and only implementation is APIUsageStatisticsRdbmsClientImpl, which uses an RDBMS client to get data from the database.

Since this is essentially a DAS server, you can use features such as gadgets and realtime analytics (using Siddhi) for better or more complex visualizations of the events sent from WSO2 API Manager.

Eurecat: using iBeacons, WSO2 and IoT for a better shopping experience

Eurecat, based in Catalonia, Spain, is in the business of providing technology. A multinational team of researchers and technologists spread their efforts into technology services, consulting and R&D across sectors ranging from Agriculture to Textiles to the Aerospace industry. By default, this requires them to work in the space of Big Data, cloud services, mobile and the Internet of Things.

One of their projects involved iBeacons in a store. In addition to transmitting messages, these low-energy, cross-platform Bluetooth Low Energy (BLE) sensors can detect the distance between themselves and a potential user – and transmit this information as ‘frames’. Using this functionality, a customer walking outside the store can be detected and contacted via an automated message.


Upon arriving at the entrance to the store, the customer would be detected by beacons at the front of the shop (near) and at the back of the shop (far). This event itself would be a trigger for the system – perhaps a notification for a store clerk to attend to the customer who just walked in. The possibilities aren’t limited to these use cases: with the combination of different positions and detection patterns, many other events can be triggered or messages pushed.
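A sketch of how such near/far detection patterns could map to triggers (zone boundaries and event names are invented for illustration):

```python
# Classify beacon sightings into proximity zones and raise a store
# event when a recognized pattern is observed: front beacon near plus
# back beacon far means the customer is at the entrance. Zone
# boundaries are assumed values in metres.

def zone(distance_m):
    if distance_m < 2.0:
        return "near"
    if distance_m < 10.0:
        return "far"
    return "out_of_range"

def store_event(front_distance, back_distance):
    """Map a pair of beacon readings to a high-level event, if any."""
    front, back = zone(front_distance), zone(back_distance)
    if front == "near" and back == "far":
        return "customer_entered"     # e.g. notify a store clerk
    if front == "far" and back == "out_of_range":
        return "customer_outside"     # e.g. send the automated message
    return None

event = store_event(1.2, 6.5)   # front beacon near, back beacon far
```

In the deployed system this pattern matching is what the complex event processing layer performs over the stream of beacon frames.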

To implement this, Eurecat architected the system as follows.


The process is set in motion by the iBeacon, which keeps broadcasting frames. These are picked up by the smartphone, which contacts the business services. Complex event processing occurs here to sort through all these low-level events in real time. The bus then funnels this data to where it needs to go – notification services, third parties, interfaces, and databases.

The WSO2 Complex Event Processor (CEP) and the WSO2 Enterprise Service Bus (ESB) fit in readily, with the ESB collecting the events and passing them on to the processing layer.


Jordi Roda of Eurecat, speaking at WSO2Con EU 2015, detailed why they chose WSO2: the real-time processing capabilities of CEP, the array of protocols and data formats it can handle, and the Siddhi language, which enabled them to easily construct the queries that sift through the events. They selected the ESB, said Jordi, for the performance, security, and connectivity it offered.

At the time of speaking, Eurecat had improvements pending: data analytics, a WiFi-based location service, and better security and scalability.


At WSO2, we’re delighted to be a part of Eurecat’s success – and if your project leads you along similar paths, we’d like to hear from you. Contact us. If you’d like to try us out before you talk to us, our products are 100% free and open source – click here to explore the WSO2 Enterprise Service Bus or here to visit the WSO2 Complex Event Processor.

Meet the WSO2 Enterprise Service Bus 5.0

The WSO2 Enterprise Service Bus is renowned as a lightweight, 100% open source ESB; it helps everything from big, digital-first companies (like eBay) to transnational governing bodies (like the Eurasian Union) achieve the kind of enterprise integration scenarios they need to work. We’ve now released version 5.0 of the WSO2 ESB – and here’s why you should take a look. 

As part of our new product strategy, the WSO2 ESB 5.0 release is composed of three key components – the runtime, tooling, and analytics. We’ve also added several new, major features:

  • Mediation Debugger for error detection in mediation flows
  • Data mapper to convert and transform data
  • Mediation statistics and tracing using ESB Analytics component
  • Support for JMS 2.0 and WebSocket transports

Interested? Let’s explore in more detail.

WSO2 ESB Tooling 5.0

The new WSO2 Developer Studio-based ESB tooling plug-in is built to work on top of Eclipse Mars2, the widely used IDE. It offers developers improved usability and brings some new capabilities to the table:

Mediation debugging

At the operational level, a series of mediators called sequences determine the behavior of messages. Users were previously unable to verify whether these mediators – including their inputs, processing, and outputs – were operating as intended. The newly introduced Mediation Debugger allows users to debug the message mediation flow proactively, identifying errors and rectifying them in the development environment itself (figure 1).

Figure 1: Debugging mediation flow

Data conversion and transformation


WSO2 Data Mapper is one of the features most requested by our customers. Built as a mediator that can be included in the mediation flow, it can convert the data format of a message between JSON, XML, and CSV (figure 2). It also comes with an operations palette that supports a number of operations to help users perform different functions in data mapping (figure 3). For instance, arithmetic operations let users add, subtract, multiply, divide, or perform other calculations on data as part of its transformation. Users can easily drag and drop operators into the mapping diagram and edit on the go.

The main categories of operations are listed below.

  • Common
  • Arithmetic
  • Conditional
  • Boolean
  • Type conversion
  • String

We’ve made it more appealing to developers by introducing a graphical mapping experience that also generates the execution files required to run on the Data Mapper Engine.

Figure 2: Data mapping

What’s special about WSO2 Data Mapper is that it can function independently of other WSO2 products or components.

Figure 3: Adding operations in data transformation
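Conceptually, a mapping combines field renames with operations applied during transformation. Here is a toy Python sketch of that idea – an illustration only, not the Data Mapper engine, and all field names below are hypothetical:

```python
def apply_mapping(record, mapping):
    """Produce an output record by applying a per-field transform to the input."""
    return {out_field: fn(record) for out_field, fn in mapping.items()}

# Hypothetical mapping: rename fields and apply an arithmetic operation
# (add a 10% service charge to the cost) while transforming.
mapping = {
    "customer": lambda r: r["customerID"],
    "order":    lambda r: r["OrderID"],
    "total":    lambda r: round(r["cost"] * 1.10, 2),
}

result = apply_mapping({"customerID": "1", "OrderID": "33slsa2s", "cost": 400}, mapping)
print(result)  # {'customer': '1', 'order': '33slsa2s', 'total': 440.0}
```

In the real Data Mapper, the equivalent of `mapping` is drawn graphically and compiled into execution files for the Data Mapper Engine.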

WSO2 ESB Analytics 5.0

The analytics component is a new addition to the ESB that lets users publish information on ESB processes. Previously, users could connect WSO2 DAS and define their own ESB dashboards. We’ve now made the experience better by providing an analytics dashboard out of the box, with pre-defined statistics for ESB artifacts such as proxy services, APIs, sequences, endpoints, and inbound endpoints. Users can choose which artifacts’ statistics and/or tracing should be published by enabling or disabling them on the runtime instance.

The dashboard displays a summarized view of the requests processed, their success/failure rates, throughput, message count over a specified time period, and top artifacts by request count (figures 4, 4a, 4b, and 4c respectively). For example, users can identify the top sequences, endpoints, or APIs by request count in separate gadgets.

Figure 4: Statistics dashboard

Figure 4a: Overall transactions per second graph


Figure 4b: Message count graph


Figure 4c: Top endpoints by request count graph

The dashboard provides many features – from configurable timelines and detailed statistics by artifact type to visual additions such as theming and the customizability to portray your brand and explain your statistics (figure 5).


Figure 5: Dashboard with configurable timelines, theming buttons and navigation pane

Tracing mediation flows

Monitoring statistics is important, but the ability to drill down into anomalies encountered in ESB processes is essential for DevOps teams, especially in production environments. Hence, we’ve introduced message tracing capabilities that provide more visibility into message mediation flows, so developers can identify problems in live environments.

For a particular artifact type, such as an API or proxy service, operational teams can drill down into the overall message count, latency, and processing details of each message. Consider a scenario where messages pass through a REST API: the message gadget (figure 6) helps DevOps teams proactively identify which messages failed, as shown below.

Figure 6: Messages and their status

We’ve also made it possible to scrutinize the message flow further (figure 7) and dive into a mediator to view its properties in relation to the payload, transport properties, and context properties (figure 8), both before and after the mediator. This gives operational teams a mechanism to verify whether message flows operate as intended and to fix any errors.

Figure 7: Message flow detailed view

Figure 8: Mediator properties

These features are also useful in testing environments to evaluate changes made to artifacts (sequences, APIs, endpoints, etc.) or message mediation flows, as a feedback mechanism prior to moving into production.

Support for JMS 2.0 and WebSocket

Last but not least, we’ve also added two transports – JMS 2.0 and WebSocket – to be able to support more integration use cases.

With JMS 2.0, the WSO2 ESB supports new messaging features – shared topic subscriptions, retrieving the JMSX delivery count, and specifying message delivery delays. A typical scenario is using the JMS endpoint as a shared topic listener, so that multiple subscribers can connect to the same shared subscription and share the workload. We’ve also introduced WebSocket as a transport and inbound protocol to support bidirectional message mediation, which is especially advantageous for near-real-time communication in IoT, web, and mobile scenarios.
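The load-sharing behaviour of a shared topic subscription is easy to see in a plain-Python simulation (this is a conceptual sketch, not the JMS 2.0 API): with a classic topic every subscriber receives every message, while consumers on a shared subscription split the stream between them.

```python
from itertools import cycle

messages = ["m1", "m2", "m3", "m4"]

# Classic topic: every subscriber gets a full copy of the stream.
classic = {sub: list(messages) for sub in ("A", "B")}

# Shared subscription: the broker delivers each message to exactly one of the
# consumers sharing the subscription (round-robin here for clarity; a real
# broker makes no guarantee about which consumer receives a given message).
shared = {"A": [], "B": []}
for consumer, msg in zip(cycle(shared), messages):
    shared[consumer].append(msg)

print(classic["A"])  # ['m1', 'm2', 'm3', 'm4'] -- and so does classic["B"]
print(shared)        # {'A': ['m1', 'm3'], 'B': ['m2', 'm4']}
```

In JMS 2.0 itself, the same effect comes from several consumers created against one shared subscription name, which is what lets the JMS endpoint spread the workload.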

We’ve improved the product quite a bit in this release, and we’d love to hear what you think about the all-new WSO2 ESB 5.0.

Go ahead and try it yourself – it’s free to download and use!

Incremental Analytics with WSO2 Data Analytics Server

The duration of the batch process is critical in production environments. Incremental processing – the simple concept of processing only what needs to be processed – is one way to deliver a major boost to the efficiency of this process.

Consider a product that does not support incremental processing: say, an analytics script that summarizes data every day. The first time the summarization script runs, it processes the whole dataset and summarizes the data.

The next day, when the process runs again, the script has to read through the entire dataset just to reach the unprocessed portion. It not only processes today’s data: it wastes time reprocessing yesterday’s. As time goes on, the script ends up churning through weeks and months of data just to get a day’s worth of insight.

With incremental processing, the batch job only processes the data partition that needs processing, not the whole dataset (which has already been processed); this improves efficiency drastically. The new script would only process the last day’s worth of data, which removes the overhead of processing already-processed data again.

Think of how much this can improve performance for summarizations that run over timescales from minutes all the way up to years.
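The contrast can be sketched in a few lines of Python (a toy illustration with made-up order data, not DAS code):

```python
from datetime import datetime

# Toy event log: (timestamp, order cost) pairs accumulated over days.
events = [
    (datetime(2016, 5, 26, 12, 0), 400),
    (datetime(2016, 5, 27, 2, 0), 600),
    (datetime(2016, 5, 27, 15, 32), 700),
]

def summarize_full(events):
    # Naive batch job: re-reads and re-sums the entire dataset on every run.
    return sum(cost for _, cost in events)

def summarize_incremental(events, last_processed):
    # Incremental job: reads only events newer than the last processed one,
    # and returns the new checkpoint alongside the summary.
    new_events = [(ts, cost) for ts, cost in events if ts > last_processed]
    delta = sum(cost for _, cost in new_events)
    return delta, max((ts for ts, _ in new_events), default=last_processed)

# The first run processes everything; later runs touch only the new partition.
delta, checkpoint = summarize_incremental(events, datetime(2016, 5, 27, 0, 0))
print(delta)  # 1300 -- only the two events from 27th May are read
```

The full scan always costs the size of the whole history, while the incremental run costs only the size of the unprocessed partition.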

Using incremental analytics with the new DAS

Incremental analytics uses the timestamps of the events sent when retrieving the data for processing. So, first, when defining streams for incremental analytics, you need to add an extra field to the event payload as _timestamp LONG to facilitate this.

When sending the events, you can either add the timestamp to the _timestamp attribute or set it for each event at event creation.

In the Spark script you use when defining the table, you need to add extra parameters to the table definition for it to support incremental analytics.

If you do not provide these parameters, the table will be treated as a typical analytics table, and each query that reads from it will get the whole table.

Here’s an example:

create temporary table orders using CarbonAnalytics options (tableName "ORDERS", schema "customerID STRING, phoneType STRING, OrderID STRING, cost DOUBLE, _timestamp LONG -i", incrementalParams "orders, DAY");

When you are done with the summarization, you need to commit the status, indicating that the data was read successfully. This is done via INCREMENTAL_TABLE_COMMIT orders;
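This commit step is what makes the process safe: if the summarization fails midway, nothing is committed, and the same partition is read again on the next run. A Python sketch of the pattern (illustrative only – in DAS the commit is the INCREMENTAL_TABLE_COMMIT statement itself):

```python
checkpoint = {"orders": 0}  # last committed event timestamp, keyed by table ID

def run_batch(events, summarize):
    """Process events newer than the checkpoint; commit only on success."""
    pending = [e for e in events if e["ts"] > checkpoint["orders"]]
    summarize(pending)  # may raise if the summarization fails
    if pending:
        checkpoint["orders"] = max(e["ts"] for e in pending)  # the "commit"

events = [{"ts": 1, "cost": 400}, {"ts": 2, "cost": 600}]

def failing(batch):
    raise RuntimeError("summarization failed")

try:
    run_batch(events, failing)
except RuntimeError:
    pass
print(checkpoint["orders"])  # 0 -- nothing committed; data will be re-read

run_batch(events, lambda batch: None)
print(checkpoint["orders"])  # 2 -- committed after a successful run
```

Because the checkpoint only advances after a successful run, a failed batch is simply retried from the same point.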


incrementalParams takes two required parameters and one optional parameter:

incrementalParams uniqueID, timePeriod, #previousTimePeriods


uniqueID – Required

This is the unique ID of the incremental analytics definition. When committing the change, you need to use this ID in the incremental table commit command, as shown above.


timePeriod – Required

The duration of the time period that you are processing. If you are summarizing per DAY (the specified timePeriod in this case), DAS can process the timestamp of each event and derive the DAY it belongs to.

Consider the situation with the following list of received events. The requirement is to get the total number of orders placed per day.

| Customer ID | Phone Type | Order ID   | Cost | _timestamp              |
|-------------|------------|------------|------|-------------------------|
| 1           | Nexus 5x   | 33slsa2s   | 400  | 26th May 2016 12:00:01  |
| 12          | Galaxy S7  | kskds221   | 600  | 27th May 2016 02:00:02  |
| 43          | iPhone 6s  | sadl3122   | 700  | 27th May 2016 15:32:04  |
| 2           | Moto X     | sdda221s   | 350  | 27th May 2016 16:22:10  |
| 32          | LG G5      | lka2s24dkQ | 550  | 27th May 2016 19:42:42  |

And the last processed event is:

| 43 | iPhone 6s | sadl3122 | 700 | 27th May 2016 15:32:04 |

The summarized table for 27th May 2016 therefore already covers the two events processed so far that day (at 02:00:02 and 15:32:04), while the two later events for that day (at 16:22:10 and 19:42:42) are still unprocessed.

This is where the timePeriod parameter is used. For the last processed event, DAS calculates the “time period” it belongs to and pulls the data from the beginning of that time period onwards.

In this case, the last processed event

| 43 | iPhone 6s | sadl3122 | 700 | 27th May 2016 15:32:04 |

would trigger DAS to pull data from 27th May 2016 00:00:00 onwards.

#previousTimePeriods – Optional (int)

Specifying this value allows DAS to pull data from earlier time periods as well. For example, if you set this parameter to 30, it would fetch 30 additional periods’ worth of data.

In the above example, it would pull data from 27th April 2016 00:00:00 onwards.
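The period arithmetic described above can be sketched in Python (illustrative only – DAS performs this internally):

```python
from datetime import datetime, timedelta

def pull_from(last_processed, period="DAY", previous_periods=0):
    """Start of the time period the last processed event belongs to,
    optionally shifted back by #previousTimePeriods extra periods."""
    if period != "DAY":
        raise NotImplementedError("this sketch only handles DAY periods")
    day_start = last_processed.replace(hour=0, minute=0, second=0, microsecond=0)
    return day_start - timedelta(days=previous_periods)

last = datetime(2016, 5, 27, 15, 32, 4)  # last processed event's timestamp
print(pull_from(last))                       # 2016-05-27 00:00:00
print(pull_from(last, previous_periods=30))  # 2016-04-27 00:00:00
```

The same floor-and-offset logic applies for other timePeriod values; only the size of the period changes.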

That’s incremental analytics, which we’re introducing in version 3.1.0 of DAS. For more information, drop by wso2.com/products/data-analytics-server/

Transform Your Enterprise IT: Integrate and Automate

Most enterprises deal with a variety of common IT problems for which they find quick fixes. One such example is the need to maintain five different usernames and passwords to log in to five different systems. Another typical example is the closing of a sales deal – the sales department concludes the deal, ensures the goods are delivered, and updates the sales records; however, when the finance department reconciles invoices against sales at the end of the quarter, there may be mismatches because the invoicing process was missed.


To address these issues, most enterprises use a combination of basic IT and collaboration software to manage day-to-day requirements. Over time, these requirements change, prompting a gradual shift in the enterprise’s IT landscape too. This can result in a situation where different teams within the organization find their own ways to carry out tasks and meet their IT requirements – with packaged software, by building their own, or by subscribing to more SaaS-type offerings.

While this might temporarily fix specific problems, it poses long-term challenges, as such measures are often not pre-planned and do not follow a particular IT roadmap. The real negative effects of individual teams working in silos are only felt when the company starts to grow and the use of various systems increases as well. Eventually, the use of several systems that don’t talk to each other will cause operational issues and even hurt motivation among employees.

The recurrent problems with these multiple systems working in silos include extensive manual effort, errors, blame, rework, frustration, complaints, and the need to manage multiple passwords. These in turn result in inefficiencies.

To address these challenges, the enterprise needs an easy-to-implement, cost-effective solution. There’s no guarantee, though, that a plug-and-play system exists, or one that could be customized to meet the enterprise’s exact requirements. The enterprise would then seek a unique, bespoke solution, which means either changing the way they work with existing software or rethinking the software itself.

The most viable option would be to integrate the systems (which, of course, have proven to be efficient to meet a specific requirement) used by different functions and then explore some sort of automation that will provide relief to employees.

WSO2’s highly-acclaimed open-source middleware platform has the capabilities that enable seamless integration of IT applications, thus streamlining day-to-day business activities of a given enterprise. This in turn will boost efficiency and integration across business functions and teams and improve overall productivity as well.

For instance, WSO2 Identity Server (WSO2 IS) can define a single identity for a user in an organization, enabling them to log in to multiple systems, on-cloud or on-premise, with a single username and password.

The enterprise benefits too, as WSO2 IS offers provisioning capabilities that allow your IT team to register and auto-provision new employees across multiple systems, as well as easily de-provision them when they leave the organization.

WSO2 Enterprise Service Bus can meet your integration challenges with its capability to connect various systems that speak different languages. It also comes with a defined set of connectors to further support integration of systems, whether in the cloud or on-premise.

Once all of your systems are integrated, you can leverage WSO2 Data Analytics Server (WSO2 DAS) to pull reports from different functions within your organization and automatically collate the data into the information required to make business decisions. WSO2 DAS has built-in dashboard capabilities that automatically create and publish dashboards in real time.

Moreover, all WSO2’s products are 100% open source, which gives enterprises the freedom of choice and empowers the business with limitless possibilities to expand.

Learn more about WSO2’s comprehensive and open platform for your connected enterprise.

For more details on how to establish friendly enterprise IT and get more love from your team, watch this talk by WSO2’s VP Operations, Shevan Goonetilleke.