Tag Archives: Analytics

Eurecat: using iBeacons, WSO2 and IoT for a better shopping experience

Eurecat, based in Catalonia, Spain, is in the business of providing technology. A multinational team of researchers and technologists spread their efforts across technology services, consulting and R&D in sectors ranging from agriculture to textiles to aerospace. Naturally, this requires them to work in the space of big data, cloud services, mobile and the Internet of Things.

One of their projects happened to involve iBeacons in a store. In addition to transmitting messages, these low-energy, cross-platform Bluetooth Low Energy (BLE) sensors can detect the distance between themselves and a potential user – and transmit this information as ‘frames’. Using this functionality, a customer walking outside the store can be detected and contacted via an automated message.


Upon arriving at the entrance to the store, the customer would be detected by beacons at the front of the shop (near) and at the back of the shop (far). This event itself would be a trigger for the system – perhaps a notification for a store clerk to attend to the customer who just walked in. The possibilities aren’t limited to these use cases: with the combination of different positions and detection patterns, many other events can be triggered or messages pushed.

To implement this, Eurecat architected the system as follows.


The process is set in motion by the iBeacon, which keeps broadcasting frames. These are picked up by the smartphone, which contacts the business services. Complex event processing occurs here to sort through all these low-level events in real time. The bus then funnels this data to where it needs to go – notification services, third parties, interfaces and databases.

The WSO2 Complex Event Processor (CEP) and the WSO2 Enterprise Service Bus (ESB) fit in readily, with the ESB collecting the events and passing them on to the processing layer.


Jordi Roda of Eurecat, speaking at WSO2Con EU 2015, detailed why they chose to go with WSO2: the real-time processing capabilities of CEP, the array of protocols and data formats it can handle, and the Siddhi language, which enabled them to easily construct the queries that would sift through the events. The ESB, said Jordi, was selected for the performance, security and connectivity it offered.
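To give a sense of what such queries look like, here is a minimal Siddhi sketch of our own – the stream and attribute names are hypothetical, not Eurecat’s actual implementation – for a rule that flags a customer detected close to the entrance beacon:

define stream BeaconFrameStream (beaconId string, customerId string, distance double);

from BeaconFrameStream[beaconId == 'entrance' and distance < 2.0]
select customerId, beaconId, distance
insert into CustomerNearEntranceStream;

Downstream queries – or the ESB – can then consume CustomerNearEntranceStream to push an automated message to the customer or alert a store clerk.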

At the time of speaking, Eurecat had improvements pending: data analytics, a WiFi-based location service, and better security and scalability.


At WSO2, we’re delighted to be a part of Eurecat’s success – and if your project leads you along similar paths, we’d like to hear from you. Contact us. If you’d like to try us out before you talk to us, our products are 100% free and open source – click here to explore the WSO2 Enterprise Service Bus or here to visit the WSO2 Complex Event Processor.

Capgemini, WSO2 and the new UN ecosystem

Ibrahim Khalili is a system integration analyst at Capgemini, a multinational that’s one of the world’s foremost providers of management consulting, technology and outsourcing services. Headquartered in Paris, Capgemini has been in business since 1967 and now makes over €11 billion in revenue.


Capgemini and WSO2 have a history of working together. One of Capgemini’s recent projects was for the United Nations – to build a new reference architecture for UN agencies to function across a connected technology platform. Khalili, speaking at WSO2Con Asia 2016 in Colombo, Sri Lanka, outlined the three major goals of the new platform.

Whatever they designed had to allow beneficiaries, donors, citizens and the UN’s increasingly mobile workforce to access the functionality and information of agencies “anywhere, anytime, on any device”; it had to handle information, people and devices in a much smarter and more cost-efficient way than the UN was doing already. It also had to break out the data and bring the UN’s agencies into the world of an API ecosystem.

To put this into finer context, we’re talking about a system that can handle assets, finances, information and humans across a diverse array of agencies – including the nitty-gritty of fundraising, running initiatives, and reporting that is key to most UN operations. What they required was what Khalili calls a “platform-enabled agency” – more or less a complete update to operational infrastructure, with APIs exposing services, information, and functionality across the board.

Their solution starts with an integration layer that connects to all legacy systems, providing a view of all the data that can be managed. On top of that goes the process layer, which contains the functionality, and on top an API layer exposing the platform’s services and data.


Once the logical framework was done, Capgemini started filling it in. At the very bottom go IaaS services like VMware. On top of that comes an ERP universe of sorts – functionality from SugarCRM, Talend, WSO2 Application Server, WSO2 Complex Event Processor, and others, connected by the WSO2 ESB. WSO2 Enterprise Mobility Manager, WSO2 API Manager, and WSO2 User Engagement Server face outwards, allowing this functionality to be used. WSO2 Identity Server wraps around the entire platform, handling identity and authentication.

 

That gives Capgemini – and the UN – not only a cleaner, layered architecture, but one that brings in better scalability as well as a DevOps approach. But above all, the chief advantage, says Khalili, is that it’s also open source. With WSO2 products, Capgemini has complete freedom to customize, take apart or rebuild whatever’s required to make a better platform. There’s no stopping innovation.

Capgemini’s not the only one who can leverage our technology. All WSO2 products are free and open source.

Go to http://wso2.com/products/ to download and use any part of our middleware platform. For more information on Capgemini’s solution for the UN, watch Ibrahim Khalili’s full presentation at WSO2Con here.  

 

Incremental Analytics with WSO2 Data Analytics Server

The duration of the batch process is critical in production environments. Incremental processing – the simple concept of processing only what needs to be processed – is one way of introducing a major boost to the efficiency of this process.

Consider a product that does not support incremental processing: say, an analytics script that summarizes data every day. The first time the summarization script runs, it processes the whole dataset and produces the summary.

The next day, when the script runs again, it has to process the whole dataset just to reach the unprocessed data. It not only processes today’s data: it wastes time reprocessing yesterday’s as well. As time goes on, the script ends up processing weeks and months of data just to get a day’s worth of insight.

With incremental processing, the batch job only processes the data partition that needs to be processed, not the whole dataset (which has already been processed); this improves efficiency drastically. The new script would only process the last day’s worth of data, avoiding the overhead of processing already-processed data again.

Think of how much this can improve the performance of summarizations running over periods from minutes all the way to years.

Using incremental analytics with the new DAS

Incremental analytics uses the timestamps of events when retrieving the data for processing. So, when defining streams for incremental analytics, you need to add an extra field to the event payload as _timestamp LONG to facilitate this.

When sending events, you can either set the _timestamp attribute explicitly or let it be set for each event at creation time.

In the Spark script, you need to add extra parameters to the table definition for it to support incremental analytics.

If you do not provide these parameters, the table is treated as a typical analytics table, and each query that reads from it will scan the whole table.

Here’s an example:

create temporary table orders using CarbonAnalytics options (tableName "ORDERS", schema "customerID STRING, phoneType STRING, orderID STRING, cost DOUBLE, _timestamp LONG -i", incrementalParams "orders, DAY");

When you are done with the summarization, you need to commit the status, indicating that the data was read successfully. This is done via:

INCREMENTAL_TABLE_COMMIT orders;
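Putting it together, a complete daily summarization script might look like the sketch below. The summary table, its columns, the from_unixtime conversion from the millisecond _timestamp to a day string, and the use of the primaryKeys option (so each day’s summary row is upserted rather than duplicated) are illustrative assumptions on our part, not something the feature prescribes:

create temporary table orders using CarbonAnalytics options (tableName "ORDERS", schema "customerID STRING, phoneType STRING, orderID STRING, cost DOUBLE, _timestamp LONG -i", incrementalParams "orders, DAY");

create temporary table orderSummary using CarbonAnalytics options (tableName "ORDER_SUMMARY", schema "orderDay STRING, totalOrders LONG, totalCost DOUBLE", primaryKeys "orderDay");

insert into table orderSummary select from_unixtime(cast(_timestamp / 1000 as bigint), 'yyyy-MM-dd') as orderDay, count(*) as totalOrders, sum(cost) as totalCost from orders group by from_unixtime(cast(_timestamp / 1000 as bigint), 'yyyy-MM-dd');

INCREMENTAL_TABLE_COMMIT orders;

Since the orders table is defined with incrementalParams, the insert-select only reads the partitions that still need processing, and the final commit records that the read was successful so the next run starts from there.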

Parameters

incrementalParams has two required parameters and an optional parameter.

incrementalParams "uniqueID, timePeriod, #previousTimePeriods"

uniqueID : REQUIRED

This is the unique ID of the incremental analytics definition. When committing the change, you need to use this ID in the incremental table commit command as shown above.

timePeriod : REQUIRED (DAY/MONTH/YEAR)

The duration of the time period that you are processing. If you are summarizing per DAY (the timePeriod specified in this case), DAS can process the timestamps of the events and work out the day each event belongs to.

Consider the situation with the following list of received events. The requirement is to get the total number of orders placed per day.

Customer ID | Phone Type | Order ID   | Cost | _timestamp
1           | Nexus 5x   | 33slsa2s   | 400  | 26th May 2016 12:00:01
12          | Galaxy S7  | kskds221   | 600  | 27th May 2016 02:00:02
43          | iPhone 6s  | sadl3122   | 700  | 27th May 2016 15:32:04
2           | Moto X     | sdda221s   | 350  | 27th May 2016 16:22:10
32          | LG G5      | lka2s24dkQ | 550  | 27th May 2016 19:42:42

And the last processed event is:

43 | iPhone 6s | sadl3122 | 700 | 27th May 2016 15:32:04

Assume that when the script last ran, the summarized table for the day 27th May 2016 covered the two events that had arrived for that day by then. Since that run, two more events have arrived for that particular day.

This is where the timePeriod parameter is used. For the last processed event, DAS calculates the “time period” it belongs to and pulls the data from the beginning of that time period onwards.

In this case, the last processed event

43 | iPhone 6s | sadl3122 | 700 | 27th May 2016 15:32:04

would trigger DAS to pull data from 27th May 2016 00:00:00 onwards, so the full day is recomputed correctly.

#previousTimePeriods – Optional (int)

Specifying this value allows DAS to pull data from previous time periods as well. For example, if you set this parameter to 30, it would fetch the previous 30 time periods’ worth of data in addition to the current one.

As per the above example, it would pull from 27th April 2016 00:00:00 onwards.
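In the orders example above, this would be just one more entry in the option string of the table definition – a sketch:

incrementalParams "orders, DAY, 30"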

That’s incremental analytics, which we’re bringing to DAS in version 3.1.0. For more information, drop by wso2.com/products/data-analytics-server/

Solving the DEBS 2016 Grand Challenge using WSO2 CEP

The ACM DEBS Grand Challenge is a yearly competition where the participants implement an event-based solution to solve a real world high-volume streaming data problem.

This year’s grand challenge involves developing a solution to two real-world problems by analyzing a social-network graph that evolves over time. The data for the DEBS 2016 Grand Challenge has been generated using the Linked Data Benchmark Council (LDBC) social network data generator. The ranking of the solutions is carried out by measuring their performance using two metrics: (1) throughput and (2) average latency.

WSO2 has been submitting solutions to the grand challenge since 2013, and our previous solutions have consistently been ranked among the top submissions. This year, too, we submitted a solution using WSO2 CEP/Siddhi, and based on its performance it has again been selected as one of the best solutions. As a result, we’ve been invited to submit a full paper to the DEBS 2016 conference, to be held from June 20 to 24.

In this blog I’ll present some details of the DEBS queries, a brief overview of our solution, and some performance results.

Query 1

As pointed out earlier, DEBS 2016 involves developing an event-based solution for two real-world use cases of an event processing application.

The first problem (query) deals with identifying the posts that currently trigger the most activity in a social network. This query accepts two input streams, namely posts and comments.

Think of a Facebook post with comments. Our goal is to compute the top three active posts, where the score of a post is the sum of its own score and the scores of its related comments. The initial score of a post is 10, and it decreases by 1 every 24 hours. Similarly, the initial score of a comment is also 10 and decreases in the same manner. For example, a comment created 30 hours ago currently carries a score of 9.

Note that the score of a post or comment cannot go below zero; a post whose total score is greater than zero is defined as an active post.

Query 2

The second problem deals with identifying large communities that are currently involved in a topic.

This query accepts three input streams: 1) comments, 2) likes, and 3) friendships.

The aim is to find the k comments with the largest range, where the comments were created more than d seconds ago. Range here is defined as the size of the largest connected component in the graph formed by the persons who have liked that comment and know each other.

The friendship stream plays an important role in this query, as it establishes the friendships between the users in the system. The following figures show the friendship graph when the system has received 10 and 100 friendship events respectively.


Figure 1: Friendship Graph (Number of Events = 10)


Figure 2: Friendship Graph (Number of Events = 100)

Further analysis indicates that the degree distribution of the friendship graph is long-tailed (see Figure 3). This means that a very small number of users have a large number of friends, while a large number of users have only a few.


Figure 3: Degree Distribution of Friendship Graph

Solution Overview

We implemented the solution using WSO2 CEP, as an extension to Siddhi. The solution is multi-threaded: it processes the two queries in parallel.

Each query is processed as a pipeline consisting of three phases: 1) data loading, 2) event ordering and 3) processing. Each phase is handled by one or more threads. In the data loading phase, the data streams are loaded from files (i.e. disk) and placed in separate buffers. Each event stream has its own buffer, implemented as a blocking queue.

The purpose of the event-ordering phase is to order events by timestamp before sending them to the event processor. (Events within a single buffer are already ordered by timestamp; the ordering done in this phase ensures that the merged event stream sent to the event processor is also ordered by timestamp.) The core calculation modules of the queries are implemented in the processing thread.

Performance results

The solution was tested on a four core/8GB virtual machine running Ubuntu Server 15.10. As discussed earlier, the two performance metrics used for evaluating the system are the throughput and the mean latency. The performance evaluation has been carried out using two data sets of different sizes (see here and here).

The throughput and mean latency of query 1 for the small data set are 96,004 events/second and 6.11 ms respectively. For the large data set, the throughput and mean latency of query 1 are 71,127 events/second and 13 ms.

The throughput and mean latency of query 2 for the small data set are 215,642 events/second and 0.38 ms respectively. For the large data set, the throughput and mean latency of query 2 are 327,549 events/second and 0.73 ms.

A detailed description of the queries and the specific optimization techniques we used can be found in our paper, Continuous Analytics on Graph Data Streams using WSO2 Complex Event Processor, to be presented at DEBS 2016: the 10th ACM International Conference on Distributed and Event-Based Systems, June 2016.

Connected Health – Reinventing Healthcare with Technology

Demand for more personalized and convenient services from healthcare providers has steadily increased during the past decade. Growing populations, longer life expectancy, and advances in technology are a few key contributors to this uptick in demand. These demands have resulted in a global eHealth market that is projected to reach $308 billion by 2022, as predicted by Grand View Research Inc.

The essence of a connected healthcare business is to deliver an efficient, effective service to its users by connecting disparate systems, devices, and stakeholders. It aims to automate most tasks and eliminate human error, trigger intelligent events for the hospitals and other stakeholders, and provide medical information via a range of devices at various locations. By becoming a connected ecosystem, hospitals have the opportunity to reduce costs, increase revenue, as well as offer a high-quality service to patients.

The success of a connected healthcare business, though, depends on how the enterprise addresses key challenges via comprehensive solutions.

In the white paper “Connected Health Reference Architecture”, Nuwan Bandara, a solutions architect at WSO2, discusses the significance of creating a connected healthcare system and explains how a middleware platform can be used to address each challenge faced during implementation.


One of the key challenges he highlights is the ability to deliver aggregated information without latency issues between sources. To overcome this, you need a centralized system that enables smooth integration of devices, services, and workflows. The use of multiple devices that take various measurements in different formats makes this harder than in other connected ecosystems; however, it can be addressed by consolidating the gathered data and making it easily accessible to various services and applications from different locations.

Given that all this data is private information, it is vital to have foolproof security measures in place to restrict access to authorized personnel only, Nuwan notes.

Furthermore, it is important for hospitals to be geared to manage high capacities during crisis situations. If the system cannot cope with high loads at such times, it will crash and disrupt all workflows. Hospitals can overcome this by equipping their systems with elastic scaling.

To learn more about the Connected Health reference architecture, download and read the white paper here.

Enabling Microservice Architecture with Middleware

Microservices are rapidly gaining popularity among today’s enterprise architects looking to ensure continuous, agile delivery and flexible deployments. However, many mistake microservice architecture (MSA) for a completely new architectural pattern. What most don’t realize is that it’s an evolution of service-oriented architecture (SOA): an iterative architectural approach and development methodology for complex, service-oriented applications.


Asanka Abeysinghe, the vice president of solutions architecture at WSO2, recently wrote a white paper, which explores how you can efficiently implement MSA in a service-oriented system.

Here are some insights from the white paper.

When implementing MSA, you need to create sets of services for each business unit in order to build applications that benefit their specific users. When doing so, you need to consider the scope of the service rather than its actual size. You need to meet rapidly changing business requirements by decentralizing governance, and your infrastructure should be automated so that you can quickly spin up new instances based on runtime features. These are just a few of the many features of MSA, some of which are shared by SOA.

MSA combines the best practices of SOA and links them with modern application delivery tooling (Docker and Kubernetes) and automation technology (Puppet and Chef).

In MSA, how you scope a service matters more than its size. The inner architecture of an MSA addresses the implementation architecture of the microservices themselves. But to enable flexible and scalable development and deployment of microservices, you first need to focus on the outer architecture, which addresses the platform’s capabilities.

Enterprise middleware plays a key role in both the inner and outer architecture of MSA. Your middleware needs to offer high performance and support various service standards. It has to be lean, using minimal resources in your infrastructure, and be DevOps-friendly. It should allow your system to be highly scalable and available, with an iterative, pluggable architecture. It should also include a comprehensive data analytics solution to support design for failure.

This may seem like a multitude of functionality and requirements that are just impossible to meet. But with WSO2’s complete middleware stack, which includes the WSO2 Microservices Framework for Java and WSO2 integration, API management, security and analytics platforms, you can easily build an efficient MSA for your enterprise.

MSA is no doubt the way forward. But you need to incorporate its useful features into your existing architecture without losing applications and key SOA principles that are already there. By using the correct middleware capabilities, enterprises can fully leverage the advantages provided by an MSA to enable ease of implementation and speed of time to market.

For more details download Asanka’s whitepaper here.

Everything you need to know about architecture patterns: a quick reference for Solution Architects  

The success of a solutions architect depends on the approach taken from the beginning. The role can be challenging, with the need to carefully balance the organization’s business as well as technical requirements. That’s why we had a dedicated track on architecture patterns at WSO2Con Asia 2016, held earlier this year in Colombo, Sri Lanka, to help SAs understand today’s best practices and how they can deliver value more quickly. If you missed out, here’s a recap of the patterns we discussed, with links to recordings of each talk.

Iterative Architecture: Your Path to On-Time Delivery

Agility is key for enterprises to optimize business functions, introduce new business capabilities, and explore new markets. Thus, enterprise software systems should support both evolutionary as well as revolutionary changes that will impact core business functions.


WSO2’s VP – Solutions Architecture, Asanka Abeysinghe, discussed the advantages of adopting an iterative approach when introducing architectural changes to support business and technical requirements. He demonstrated this with real-world examples of architectures successfully implemented in iterations.

Breaking Down Silos with Service Oriented Architecture

Service-oriented architecture (SOA) has outgrown the notion of system silos through its use of standard protocols and specifications at integration points, which allows systems to communicate with each other far more flexibly. Nadeesha Gamage, associate lead – solutions engineering at WSO2, explained the drawbacks of a siloed architecture and how they can be avoided by moving to SOA, thereby enabling greater agility. He also discussed how SOA can be broken down further into a finer-grained microservice architecture and, as a result, how an enterprise can benefit from using the WSO2 suite of products.

Event Driven Architecture: Managing Business Dynamics for Adaptive Enterprise

SOA implements a synchronous request-response model to connect remote processes in distributed systems; this creates an inherent rigidity and additional dependencies when applied to modelling business processes and workflows. In contrast, event-driven architecture (EDA) is based on an asynchronous message-driven communication model to propagate information throughout an enterprise, thus supporting a more natural alignment with an enterprise’s operational model and processes/workflows. In this session, Dassana Wijesekera, solutions architect at WSO2, analyzed key business challenges that encourage the use of EDA and discussed a pragmatic approach to designing and implementing an EDA using the WSO2 integration framework.

Moulding Your Enterprise with Resource-Oriented Architecture

An enterprise environment is typically heterogeneous, often spanning organizational boundaries. Building such systems requires tools that promote intrinsic interoperability and ease of integration across boundaries, using technology that promotes simplicity and is easy to handle. Resource-oriented architecture (ROA) supports this by focusing on entities and interactions for effective enterprise integration. Shiroshika Kulatilake, solutions architect at WSO2, explained the idea behind having an ROA in your organization, both externally and internally, and also talked about how WSO2 technology can help you build your enterprise system in a resource-oriented manner.

Building Web Apps Using Web-Oriented Architecture

Web-oriented architecture (WOA) – SOA + WWW + REST – takes you several steps further by filling in the blanks of SOA and helping you build a complete, end-to-end web application. In addition to APIs, WOA identifies user interfaces and application states as first-class components of an architecture. Most of what we build today is actually WOA, though the abbreviation might not be that popular.


Lead Solutions Engineer at WSO2, Dakshitha Ratnayake, discussed the changes to WOA over the years, today’s trends, and how you can leverage WOA to build web apps. 

Reinforcing Your Enterprise With Security Architectures

WSO2’s VP – Engineering, Selvaratnam Uthaiyashankar, presented an informative session on leveraging the extensive feature set and extensible nature of the WSO2 platform to provide a robust security architecture for your enterprise. He also shared some of WSO2’s experiences building security architectures with customers, and the commonly used security architecture patterns extracted from that work.

Understanding Microservice Architecture

Today many organizations are leveraging microservice architecture (MSA), which is becoming increasingly popular because of its many potential advantages. MSA itself is divided into two areas – inner and outer architecture – which require separate attention. Moreover, MSA requires a certain level of developer and DevOps experience. Sagara Gunathunge, architect at WSO2, presented an awareness session on MSA and discussed WSO2’s strategic initiatives at both the platform level and the WSO2 MSF4J framework level.

Deployment Patterns and Capacity Planning

Identifying the right deployment architecture is key to the smooth operation of a production system. The next step is to determine the size of the deployment by working out the number of servers/VMs/containers necessary to support the minimum, average and possible maximum load the system is expected to handle. In this talk, Chathura Kulasinghe, solutions engineer at WSO2, focused on how you can take a fact-based approach to determining the size of your deployment.

Pattern-Driven Enterprise Architecture: Applying Patterns in Your Architecture

It’s no secret that architectural patterns help you build beautiful enterprise architecture. High-level patterns such as SOA, ROA, EDA, MSA and WOA provide many best practices for enterprise architects who are looking to evolve their existing enterprise architecture or for those creating newer enterprise architecture strategies. Mifan Careem, director – solutions architecture at WSO2, analyzed the good, the bad and the ugly (if any) of the various architectural patterns in his talk. He discussed practical examples of the patterns in practice and also went on to build a solution architecture from scratch using WSO2 components with the help of patterns. 

Still interested in meeting the experts and discussing these topics and more? Sign up now for WSO2Con EU, which will be held in London from June 7 to 9. Be sure to grab the early bird offer before May 8.

 

WSO2Con Insights: Why West Interactive built an app-based cloud platform with WSO2

West Corporation is a spider in a web. Andrew Bird, Senior Vice President at West, speaking at WSO2Con USA 2015, described it as a $2.5-billion giant situated right at the heart of America’s telecommunications. Close to a third of the world’s conference calls run through the West network. To give you some perspective, Google+ and Cisco run calls on West networks – as does the 911 call system.


According to Andrew (who runs product management, development and innovation there) depending on where you are in America, 60% of the time, any call you place would go through the West network.

However, networks aren’t all that West does. West has a division called West Interactive Services, which builds IVR systems for customers that need complex customer interaction networks. Here’s what Andrew had to say about how West Interactive used integrated, modular WSO2 middleware to drastically speed up service delivery and enhance these systems – for both their customers and themselves.

The challenge: customer interaction

IVR systems involve providing customer interaction platforms, application design services and multi-channel communication systems, and the work often goes beyond building solutions for Fortune 100 companies. The services involved are often complex – context identification, notifications, chat, call, data collection, routing, message delivery, provisioning, identity – plus the ability to communicate across web, IVR, mobile and social platforms.

To demonstrate this work, Andrew played a demo in which a customer dials into a call center from an iPhone. The automated system on the other end recognizes the customer, recognizes the fact that he is on a mobile device, and addresses him by name. It then proceeds to interact with him via text and speech – all of this without needing an app.

Context is key here: Andrew Bird – and West – believe that customers should not have to repeatedly tell systems who they are. They should not have to waste time identifying themselves, their devices and the context in which they’re calling. Systems should be able to figure out that Mr Smith is calling from such-and-such a location, and probably for such-and-such a reason. West’s systems are designed to understand this kind of context, and they’re very good at it.

The solution: a middleware platform for West

But of course, building is not enough: scaling these kinds of systems is the challenge.

At some point, West apparently realized that while they were the best at scale, running “a couple of complex event processing engines, a couple of business rules managing engines, a couple of databases” – was neither sustainable nor particularly supportable. For one customer, for instance, they were managing 43 APIs, all of which were completely different. They needed everything on common standards, able to work with each other instead of in little silos of their own.

West’s solution was to build a cloud-enabled middleware platform that sits between West’s proprietary services and the applications running across different channels. West’s managed services are exposed through the platform via APIs.

This is where WSO2 came in. The WSO2 ESB serves as the SOA backbone, providing mediation and transformation between West’s different applications; WSO2 Governance Registry provides run-time SOA governance, and WSO2 Analytics platform monitors SOA metrics.

Other, more specific functionality is provided by the likes of WSO2 Complex Event Processor, Application Server, Data Services Server and Machine Learner. The multi-channel access services  – those that face the world – rely on WSO2 Identity Server and WSO2 API Manager, providing a way to expose APIs to internal or external applications that may integrate with the platform.

Context is everything

For West to rely on WSO2 for the backbone of their middleware platform is, for us, an indicator of the amount of faith they have in our products. West, after all, is a company that supports some of the biggest organizations in the world. They cannot afford to fail.

But perhaps the best statement was Andrew’s recollection of how much their customers trust WSO2. “I was once meeting with a customer, talking about our vision,” he says, “and they were like ‘so what are you using for an ESB?’ I said, “WSO2”. No more questions. Done. They were using the same thing as well. I needed something like that – something where if I go talk to a customer who I’m trying to take care of, I don’t need to spend my time justifying myself.”

If you’re interested in knowing more, check out Andrew’s complete keynote talk at WSO2Con USA here. For more details on the deployment, read our case study on West Interactive here.

 

Big Data and Politics: How the Internet sees the US Election

Nothing is a hotter topic than the US Election, especially if you’re a statistician at heart. Legions of us have been mesmerised by the idea of predicting who gets to be the most powerful President on the planet.

This year, however, it’s far more fun to kick back and watch the Internet collectively explode over each and every one of the candidates in the limelight. What with Clinton’s emailgate, Bernie’s economics, Ted Cruz’s household issues and Donald Trump’s existence …

WSO2 is a technology company. We looked around and realized that we had the tools to observe this theater on an unprecedented scale. We’d like you to join us.

Which is why we present to you the WSO2 Election Monitor.

At its heart, the Election Monitor is the WSO2 Enterprise Service Bus (ESB), Data Analytics Server (DAS) and Complex Event Processor (CEP). The ESB scans Twitter, pulling conversations about the US Election every second. DAS and CEP go to work on these tweets.

The first thing we’ve done is build a (real-time) counter of the number of unique Twitter accounts talking about each camp. In a 24-hour time window, as of the time of writing, the Republicans seem to be dominating the Twittersphere.
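A query of this sort is a natural fit for Siddhi. Here’s a rough sketch of our own illustration – the stream names are hypothetical, and it assumes a distinct-count aggregator is available; this is not the actual monitor’s code:

define stream TweetStream (accountId string, camp string);

from TweetStream#window.time(24 hours)
select camp, distinctCount(accountId) as uniqueAccounts
group by camp
insert into CampCountStream;

Each update to CampCountStream would refresh a counter like the one on the page.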


That’s a huge margin, isn’t it? Let’s find out why as we go along.


This is, firstly, a live feed of what we’re getting from Twitter. The gray columns are the interesting ones: they display the most popular recent tweets – recent being not more than 12 hours ago. Donald Trump often dominates both columns. Occasionally, Bernie seems to break through. As of the time of writing, in the “Popular from candidates” column, Donald Trump has three tweets, one of them about a reporter touching him. The others are one tweet from Clinton (“Enough is enough”) and one from Bernie talking about deficits.


This is consistent with what we’ve seen so far; ever since the site went live, Trump’s snazzy one-liners have consistently gotten more retweets and favourites than Bernie’s and Clinton’s policy-centric tweets. It would appear that one tweep from the Republican party is more popular than every other candidate put together… are we really surprised that more people are talking about the Republicans than the Democrats?

But what about their followers? Using candidates’ hashtags, we can peek into the conversation by sifting through tweets and finding the most-used terms in that space.

Trump’s people are talking about the border. No surprise there. They’re also talking about New York. That corresponds with the fact that Hillary Clinton just took aim at Trump in a N.Y. ad. It shows a white Trump supporter sucker-punching an African American protester.  

Clinton? The email scandal hasn’t let go of her. There’s talk of war, probably because Clinton tweeted about defeating ISIS recently. There is a LOT of discussion regarding an upcoming debate with Bernie.

Bernie’s community, too, is talking about the debate. There are few other clues in his wordcloud at the moment.

Ted Cruz’s community is talking about his wife. That’s because he’s mired in a bit of controversy right now: the family man is being dodgy about questions regarding his marriage. There are a lot of questions about his principles.

There’s one man missing from this: John Kasich. As of the time of writing, he has 143 delegates. Cruz has 463. Trump has 736. They all need to hit 1,237 for the nomination.

As remote as Kasich’s chances look in the polls, he barely exists on Twitter. For now, we must exclude him.


Step three of the site is the community graph – or, as we call it, the attention graph. Here we map out the most popular accounts talking about the US election. The larger an account’s bubble is, the more popular it is.

What do we see? Donald Trump has gathered more attention to himself than any other tweep – and not by a small margin. Dan Scavino comes in at a distant second. Everyone else is minuscule, like little asteroids orbiting Planet Trump. And yet even those tiny accounts get over 2,000 likes and retweets. These are the people who are essentially driving opinion on Twitter.

The fourth and final part is how the media’s opinion of a candidate changes over time. By analyzing news articles published online, we can determine shifts as campaigns unfold.

Consider how attitudes have changed towards Hillary. Here’s her standing on the 15th of March:


Here’s her standing on the 17th:


Opinion has swung her way. Examine the titles of the news articles on those days. On the 15th of March: “Was Hillary Clinton Bribed For Her Iraq War Vote?” and “The Cure to Hillary Clinton’s Problem With Millennials? Donald Trump.” Not that good.

On the 17th? “How Hillary Clinton Triumphed on Tuesday” and “Hillary Clinton Becomes Kween of Broad City”. Hot on the heels of a victory comes better press.

It’s fascinating to see how the American media react to candidates as they take on world events. Opinion on Trump, for example, hit rock bottom over his views on China and implications that supporters could go haywire.

Our collection of insights has just gotten started, of course. As the election unfolds, all of this will keep running. While we can’t say the Internet is going to predict who wins, we think it’s a pretty interesting gauge of what the people and the press of America are thinking.

Drop by https://wso2.com/election2016/. The project has been deprecated, but we’ve preserved a snapshot of the data so you can see what it was like.

WSO2Con Insights: Experian Uses WSO2 to Uncover Credit Intelligence

Some call Experian a credit score checking service, but that would perhaps be an injustice: this company, which now counts some 17,000 people among its employees, is the credit information company. So deeply ingrained are they that in certain countries, it’s common to be told “Go talk to Experian” when you have a problem with your credit. Nor does it stop there. Experian’s products have long since expanded beyond credit and into everything from financial education to digital user analytics: it’s now a business with revenues in the billions of dollars.

Experian has a very interesting set of needs. Day in and day out, customers arrive at Experian looking not only for credit reports, but for financial advice. Experian, analyzing their spending patterns and the ripple effects of those, is in a position to tell customers what to buy, which cards to keep, how to handle their bank accounts and loans, and a myriad other details. In his talk at WSO2Con EU 2015, Rafael Garcia-Navarro, Head of Analytics at Experian, explained how shifting from high-volume, low-speed batch data processing to small-volume, high-speed data execution helped them get their big data into shape.

The problem of real-time

Given the nature of what they do, Experian needs a lot of intelligence and data analysis power. In the world of credit intelligence, everything is linked – from where a user votes to the loans they’ve taken to the smartphone plan that he or she is on. In the past, they would process vast amounts of data offline and use that to make analyses.

To this, Experian added a requirement: real-time operation – defined by them as systems that could take data from marketing channels, process it, and react with the required information within the average human reaction time of 200 milliseconds.

More specifically, they needed systems that detect patterns at very high speeds, passing data in such a way as to enable the full machinery to deliver complete results in under 200 milliseconds.

This is where the WSO2 Complex Event Processor comes into the picture. Experian were working with some serious names in data analytics – like Google – and they began using the WSO2 CEP to analyze the customer data in real-time.

The first step is taking log files from digital platforms at the user level – cookies, if you will – to develop batch prediction models that help them decide what to promote to different users. The next step was to move beyond purely historical data. Experian developed a Java application that simulates Google data; this data streams into WSO2 CEP.

“What happens there is Siddhi is running the queries to identify the events that are relevant for further analysis, and driving that into a Java-based platform,” said Garcia-Navarro. “We take the latest events that we’ve identified from the streaming application, and we take those events to re-run the score with the latest information that is available to users, re-optimizing that with MarketSwitch.”

The system would constantly re-examine their data, updating it and fine-tuning it with the latest information, and drive the final, optimised decision back for execution on the marketing platforms. The challenge? In order to keep the whole system’s operation under 200 milliseconds, this particular sub-system had to do all of this at a mere 50 milliseconds. That’s a staggeringly small amount of time.

After a pause, he added, “This 50 milliseconds has now been brought down to between 3 and 5 milliseconds.”

From code to credit

WSO2’s involvement began, ironically, not in the field of marketing analytics, but with analyzing credit risk. Experian had a product (now called PowerCurve) traditionally built for mainframes in the credit risk space; it allowed credit risk analysts to design business rules visually. They wanted to use this along with MarketSwitch to examine a user’s propensity to buy something.

After the initial QuickStart program, Experian’s internal integrator – they have a team set aside for this – took it to the rest of the company. Even within Experian’s ocean of established technology stacks and software, the WSO2 CEP made a splash big enough to become a critical product. The first implementation connected to WSO2 CEP through the WSO2 ESB; later iterations connected directly to the Siddhi processing engine.

Experian likes the way WSO2 has worked for them on this.

“We explored all the typical suspects,” said Garcia-Navarro. “The CEP world is well known, and CEP for high-frequency trading had been in use for years. We explored all those commercial providers, but we chose WSO2 for three key reasons:
The first is because it’s open source. We believe that whenever possible we need to start embracing open source much more widely in business.
The second one is the depth of knowledge of the support provided. WSO2 takes a lot of pride in their support model; they claim – rightly – that they don’t have pre-sales engineers, but engineers who work on the product providing the support needed for clients. And when you start working with them you see the depth of skills and expertise that they have. That’s a big plus for us.
The final one is the depth of offerings. CEP we’ve built the prototype for and implemented in house in our data centers and infrastructure. We’re starting to look into many aspects – the next one we’re looking into is the ESB, but not the only one.”

Right now, Experian is pushing Complex Event Processor to the limits. Because of the nature of their business, they’re heavily interested in the next steps that we take with CEP and some of the new things we’re working on in the data and analytics space.

For more information on Experian’s work with WSO2, view Rafael’s presentation at WSO2Con EU 2015.