By Paul Fremantle
Co-Founder & CTO, WSO2
IoT is an umbrella term that includes multiple different categories:
And many more
The result is that no single architecture will suit all these areas and the requirements each area brings. However, a modular scalable architecture that supports adding or subtracting capabilities, as well as supporting many requirements across a wide variety of these use cases is inherently useful and valuable. It provides a starting point for architects looking to create IoT solutions as well as a strong basis for further development. This paper proposes such a reference architecture. The reference architecture must cover multiple aspects including the cloud or server-side architecture that allows us to monitor, manage, interact with and process the data from the IoT devices; the networking model to communicate with the devices; and the agents and code on the devices themselves, as well as the requirements on what sort of device can support this reference architecture.
The reference architecture we propose is inherently vendor neutral and not specific to a set of technologies, although it is highly influenced by best-of-breed open-source projects and technology. In addition, we provide a mapping of this reference architecture to WSO2’s open source products and projects and we have demonstrated an implementation of this reference architecture on the WSO2 platform.
We also explore areas where this reference architecture can be extended further, as well as areas where we expect to see further work.
The Internet of Things, or IoT, refers to the set of devices and systems that interconnect real-world sensors and actuators to the Internet. This includes many different systems, including
The number and variety of devices that are collecting data is growing incredibly rapidly. A study by Cisco [1] estimates that the number of Internet-connected devices overtook the human population in 2010, and that there will be 50 billion Internet-connected devices by 2020.
There are of course two key aspects to the IoT: the devices themselves and the server-side architecture that supports them. In fact there is often a third category as well: in many cases a low-power gateway that performs aggregation, event processing, bridging, etc. sits between the device and the wider Internet.
In both cases, the devices probably have intermittent connections based on factors such as GPRS connectivity, battery discharging, radio interference, or simply being switched off.
There are effectively three classes of devices:
The communication between devices and the Internet or to a gateway includes many different models:
Figure 1 below illustrates the two major modes of connectivity.
Figure 01
This section has provided a short overview of IoT devices and systems. It is not designed to be comprehensive or even extensive, but simply to provide enough background to support the discussion of requirements and capabilities below. There are many further resources available, which are too numerous to list. However, we can point readers to an academic survey [2].
There are several reasons why a reference architecture for IoT is a good thing:
Our aim is to provide an architecture that supports integration between systems and devices.
In the next section, we will dig into these requirements deeper and outline the specific requirements we are looking for in a range of categories.
There are some specific requirements for IoT that are unique to IoT devices and the environments that support them, e.g. many requirements emerge from the limited form factors and power available to IoT devices. Other requirements come from the way in which IoT devices are manufactured and used. The approaches are much more like traditional consumer product designs than existing Internet approaches. Of course there are also a number of existing best practices for the server side and for Internet connectivity that need to be remembered and factored in.
We can summarize the overall requirements into some key categories:
Existing protocols, such as HTTP, have a very important place for many devices. Even an 8-bit controller can create simple GET and POST requests and HTTP provides an important unified (and uniform) connectivity. However, the overhead of HTTP and some other traditional Internet protocols can be an issue for two main reasons. Firstly, the memory size of the program can be an issue on small devices. However, the bigger issue is the power requirements. In order to meet these requirements, we need a simple, small and binary protocol. We will look at this in more detail below. We also require the ability to cross firewalls.
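To make the overhead argument concrete, the following minimal sketch compares the size of one sensor reading sent as a small HTTP POST with the same reading packed into a fixed binary layout. The reading fields, the host name iot.example.com, the /readings path and the binary layout are all assumptions made for illustration; they are not part of the reference architecture.

```python
import json
import struct

# Hypothetical sensor reading: device id, Unix timestamp, temperature in 0.01 degrees C.
device_id, timestamp, temp_centi = 42, 1_400_000_000, 2150

# Text-based: a minimal HTTP/1.1 POST with a JSON body (host and path are made up).
body = json.dumps({"id": device_id, "ts": timestamp, "t": temp_centi / 100})
http_request = (
    "POST /readings HTTP/1.1\r\n"
    "Host: iot.example.com\r\n"
    "Content-Type: application/json\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
).encode("ascii")

# Binary: unsigned 16-bit id, unsigned 32-bit timestamp, signed 16-bit temperature,
# packed big-endian with no padding (an assumed layout, 8 bytes in total).
binary_payload = struct.pack(">HIh", device_id, timestamp, temp_centi)

print(len(http_request), "bytes as HTTP")
print(len(binary_payload), "bytes as packed binary")
```

The binary form carries the same three values in 8 bytes, while the HTTP request spends most of its bytes on headers. A real binary protocol adds its own framing, so the gap in practice is smaller than this comparison suggests, but the direction is the same.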
In addition, there are devices that connect directly and those that connect via gateways. The devices that connect via a gateway potentially require two protocols: one to connect to the gateway, and then another from the gateway to the cloud.
Finally, there is obviously a requirement for our architecture to support transport and protocol bridging, e.g. we may wish to offer a binary protocol to the device, but allow an HTTP-based API to control the device that we expose to third parties.
While many IoT devices are not actively managed, this is not necessarily ideal. We have seen active management of PCs, mobile phones, and other devices become increasingly important, and the same trajectory is both likely and desirable for IoT devices. What are the requirements for IoT device management? The following list covers some widely desirable requirements:
The list is not exhaustive, and conversely covers aspects that may not be required or possible for certain devices.
A few IoT devices have some form of UI, but in general IoT devices are focused on offering one or more sensors, one or more actuators, or a combination of both. The requirements of the system are that we can collect data from very large numbers of devices, store it, analyze it, and then act upon it.
The reference architecture is designed to manage very large numbers of devices. If these devices are creating constant streams of data, then this creates a significant amount of data. The requirement is for a highly scalable storage system, which can handle diverse data and high volumes.
The action may happen in near real time, so there is a strong requirement for real-time analytics. In addition, the device needs to be able to analyze and act on data. In some cases this will be simple, embedded logic. On more powerful devices we can also utilize more powerful engines for event processing and action.
Any server-side architecture would ideally be highly scalable, and be able to support millions of devices all constantly sending, receiving, and acting on data. However, many "high-scalability architectures" have come with an equally high price in hardware, software, and complexity. An important requirement for this architecture is to support scaling from a small deployment to a very large number of devices. Elastic scalability and the ability to deploy in a cloud infrastructure are essential. The ability to scale the server side out on small, cheap servers is an important requirement to make this an affordable architecture for small deployments as well as large ones.
Security is one of the most important aspects for IoT. IoT devices are often collecting highly personal data, and by their nature are bringing the real world onto the Internet (and vice versa). This brings three categories of risks:
The first category includes simple things such as locking down open ports on devices (like the Internet-attached fridge that had an unsecured SMTP server and was being used to send spam).
The second category includes issues specifically related to IoT hardware, e.g. the device may have its secure information read. For example, many IoT devices are too small to support proper asymmetric encryption. Another specific example is the ability for someone to attack the hardware in order to understand its security: university security researchers famously reverse-engineered and broke the Mifare Classic RFID card solution [3]. These sorts of reverse-engineering attacks are an issue compared with pure web solutions, where there is often no available code to attack (i.e. a completely server-side implementation).
Two very important specific issues for IoT security are identity and access management. Identity is an area where poor practices are often implemented. For example, the use of clear-text or Base64-encoded user IDs and passwords with devices and machine-to-machine (M2M) communication is a common mistake. Ideally these should be replaced with managed tokens such as those provided by OAuth/OAuth2 [4].
Another common issue is to hard-code access management rules into either client- or server-side code. A much more flexible and powerful approach is to utilize models such as "Attribute Based Access Control" and "Policy Based Access Control". The best known of these approaches is that provided by the XACML standard [5]. Such approaches remove access control decisions from hard-coded logic and externalize them into policies, which enables the following:
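The reason Base64-encoded credentials count as a poor practice is that Base64 is an encoding, not encryption: anyone who observes the value can reverse it without a key. A minimal sketch, using made-up credentials and the standard HTTP Basic authentication header format:

```python
import base64

# Hypothetical credentials, for illustration only.
user, password = "device-17", "s3cret"

# What a device sends when it uses HTTP Basic authentication.
header_value = "Basic " + base64.b64encode(f"{user}:{password}".encode()).decode()

# What anyone who observes the header can recover, with no key required.
scheme, encoded = header_value.split(" ", 1)
recovered_user, recovered_password = base64.b64decode(encoded).decode().split(":", 1)

print(header_value)
print(recovered_user, recovered_password)
```

A managed token limits the damage in the same situation, because it can be scoped, expired and revoked without changing the underlying credentials.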
Our security requirements therefore should support
This concludes the set of requirements that we have identified for the reference architecture. Of course, any given architecture may add further requirements. Some of those may already be met by the architecture, and some may require further components to be added. However, our design is for a modular architecture that supports extensions, which copes with this demand.
In the next section we introduce the architecture and approach.
The reference architecture consists of a set of components. Layers can be realized by means of specific technologies, and we will discuss options for realizing each component. There are also some cross-cutting/vertical layers such as access/identity management.
Figure 02: Reference architecture for IoT
The layers are
The cross-cutting layers are
The bottom layer of the architecture is the device layer. Devices can be of various types, but in order to be considered IoT devices they must have some form of communication that attaches, either directly or indirectly, to the Internet. Examples of direct connections are
There are many more such examples of each type.
Each device typically needs an identity. The identity may be one of the following:
For the reference architecture we recommend that every device has a UUID (preferably an unchangeable ID provided by the core hardware) as well as an OAuth2 Refresh and Bearer token stored in EEPROM.
The specification is based on HTTP; however, (as we will discuss in the communications section) the reference architecture also supports these flows over MQTT.
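As an illustration of the recommendation above, the following sketch shows one way a device-side identity record could be modeled. The class name, field names and token values are hypothetical, and the refresh exchange itself is deliberately left out; only the Authorization header format follows standard OAuth2 bearer token usage.

```python
import uuid
from dataclasses import dataclass


@dataclass
class DeviceIdentity:
    """Credentials a device would persist, for example in EEPROM (hypothetical model)."""
    device_id: uuid.UUID   # ideally derived from an unchangeable hardware ID
    refresh_token: str     # longer-lived; used to obtain new bearer tokens
    bearer_token: str      # presented with each request

    def authorization_header(self) -> dict:
        # Standard OAuth2 bearer token usage over HTTP.
        return {"Authorization": f"Bearer {self.bearer_token}"}

    def rotate(self, new_bearer_token: str) -> None:
        # Called after a successful refresh; the refresh call is out of scope here.
        self.bearer_token = new_bearer_token


identity = DeviceIdentity(
    device_id=uuid.UUID("12345678-1234-5678-1234-567812345678"),
    refresh_token="hypothetical-refresh-token",
    bearer_token="hypothetical-bearer-token",
)
print(identity.authorization_header())
```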
The communication layer supports the connectivity of the devices. There are multiple potential protocols for communication between the devices and the cloud. The three best-known protocols are
Let's take a quick look at each of these protocols in turn.
HTTP is well known, and there are many libraries that support it. Because it is a simple text-based protocol, many small devices such as 8-bit controllers can only partially support the protocol – for example enough code to POST or GET a resource. The larger 32-bit based devices can utilize full HTTP client libraries that properly implement the whole protocol.
There are several protocols optimized for IoT use. The two best known are MQTT [6] and CoAP [7]. MQTT was invented in 1999 to solve issues in embedded systems and SCADA. It has been through some iterations and the current version (3.1.1) is undergoing standardization in the OASIS MQTT Technical Committee [8]. MQTT is a publish-subscribe messaging system based on a broker model. The protocol has a very small overhead (as little as 2 bytes per message), and was designed to support lossy and intermittently connected networks. MQTT was designed to flow over TCP. In addition there is an associated specification designed for ZigBee-style networks called MQTT-SN (Sensor Networks).
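The 2-byte overhead comes from the MQTT fixed header: one byte for the packet type and flags, followed by a variable-length "remaining length" field that fits in a single byte for packets under 128 bytes. The following is a minimal sketch of that encoding for a QoS 0 PUBLISH packet, written from the MQTT 3.1.1 packet layout; the function names are our own, and it is an illustration rather than a complete client.

```python
def encode_remaining_length(n: int) -> bytes:
    """Variable-length integer used in the MQTT fixed header (7 data bits per byte)."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no DUP and no RETAIN flag."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic itself.
    # QoS 0 packets carry no packet identifier.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    remaining = variable_header + payload
    fixed_header = bytes([0x30]) + encode_remaining_length(len(remaining))
    return fixed_header + remaining


# A keep-alive ping consists of the 2-byte fixed header alone.
PINGREQ = bytes([0xC0, 0x00])

packet = encode_publish_qos0("t", b"21.5")
print(packet.hex(" "))
```

Here the topic "t" and the 4-byte payload produce a 9-byte packet, of which 2 bytes are the fixed header. Short topic names matter on constrained links, because the topic is repeated in every PUBLISH.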
CoAP is a protocol from the IETF that is designed to provide a RESTful application protocol modeled on HTTP semantics, but with a much smaller footprint and a binary rather than a text-based approach. CoAP is a more traditional client-server approach rather than a brokered approach. CoAP is designed to be used over UDP.
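For comparison, a CoAP request starts with a fixed 4-byte binary header (version, type, token length, code and message ID), followed by compactly encoded options. The sketch below builds a confirmable GET for a single path segment. It is written from the CoAP message format and only handles the simplest case (no token, one Uri-Path option of at most 12 bytes); the function name is our own.

```python
import struct


def encode_coap_get(path_segment: str, message_id: int) -> bytes:
    """Encode a confirmable CoAP GET with one Uri-Path option and no token."""
    segment = path_segment.encode("utf-8")
    if len(segment) > 12:
        raise ValueError("this sketch only handles option lengths 0-12")
    version, msg_type, token_length = 1, 0, 0  # type 0 = Confirmable
    first_byte = (version << 6) | (msg_type << 4) | token_length
    code_get = 0x01  # code 0.01 = GET
    header = struct.pack(">BBH", first_byte, code_get, message_id)
    uri_path_option = 11  # option number for Uri-Path
    # First option: delta from zero in the high nibble, length in the low nibble.
    option_header = bytes([(uri_path_option << 4) | len(segment)])
    return header + option_header + segment


request = encode_coap_get("temp", 0x1234)
print(request.hex(" "), len(request), "bytes")
```

The whole request for /temp fits in 9 bytes, which illustrates the "much smaller footprint" compared with the equivalent HTTP GET.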
For the reference architecture we have opted to select MQTT as the preferred device communication protocol, with HTTP as an alternative option.
The reasons to select MQTT and not CoAP at this stage are
However, both protocols have specific strengths (and weaknesses) and so there will be some situations where CoAP may be preferable and could be swapped in.
In order to support MQTT we need to have an MQTT broker in the architecture as well as device libraries. We will discuss this with regard to security and scalability later.
One important aspect with IoT devices is not just for the device to send data to the cloud/server, but also the reverse. This is one of the benefits of the MQTT specification: because it is a brokered model, clients open an outbound connection to the broker, whether the device is acting as a publisher or as a subscriber. This usually avoids firewall problems because this approach works even behind firewalls or via NAT.
In the case where the main communication is based on HTTP, the traditional approach for sending data to the device would be to use HTTP polling. This is very inefficient and costly, both in terms of network traffic and power requirements. The modern replacement for this is the WebSocket protocol [9], which allows an HTTP connection to be upgraded into a full two-way connection. This then acts as a socket channel (similar to a pure TCP channel) between the server and client. Once that has been established, it is up to the system to choose an ongoing protocol to tunnel over the connection.
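The brokered model can be sketched in a few lines: both the device and the server register with the broker, and the broker routes each message to every subscriber whose topic filter matches. The in-memory broker below stands in for a real MQTT broker and involves no networking; the topic names are hypothetical, and the matching covers only the + (single level) and # (multi-level) wildcards in simplified form.

```python
from typing import Callable, List, Tuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic matching with + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # matches this level and everything below it
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class InMemoryBroker:
    """Stand-in for an MQTT broker; every client reaches it by connecting outbound."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Callable[[str, bytes], None]]] = []

    def subscribe(self, topic_filter: str, callback: Callable[[str, bytes], None]) -> None:
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic: str, payload: bytes) -> None:
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)


broker = InMemoryBroker()
device_inbox, server_inbox = [], []

# The device subscribes for its own commands; the server listens to all telemetry.
broker.subscribe("devices/d1/commands", lambda t, p: device_inbox.append((t, p)))
broker.subscribe("devices/+/telemetry", lambda t, p: server_inbox.append((t, p)))

broker.publish("devices/d1/telemetry", b"21.5")   # device to server
broker.publish("devices/d1/commands", b"led=on")  # server to device

print(device_inbox, server_inbox)
```

Neither side ever accepts an inbound connection from the other, which is why the same pattern works for a device behind a firewall or NAT.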
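The upgrade step can be illustrated with the one piece of computation the WebSocket handshake requires: the server proves it understood the upgrade request by hashing the client's Sec-WebSocket-Key together with a fixed GUID defined by the protocol and returning the result as Sec-WebSocket-Accept. A minimal sketch follows; the function name is our own, and the sample key is the one used in the WebSocket specification.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

After the server replies with status 101 and this header, the connection stops carrying HTTP and both sides exchange WebSocket frames over the same TCP connection.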
For the reference architecture we once again recommend using MQTT as a protocol with WebSockets. In some cases, MQTT over WebSockets will be the only protocol. This is because it is even more firewall-friendly than the base MQTT specification as well as supporting pure browser/JavaScript clients using the same protocol.
Note that while there is some support for WebSockets on small controllers, such as Arduino, the combination of network code, HTTP and WebSockets would utilize most of the available code space on a typical Arduino 8-bit device. Therefore, we only recommend the use of WebSockets on the larger 32-bit devices.
An important layer of the architecture is the layer that aggregates and brokers communications. This is an important layer for three reasons:
The aggregation/bus layer provides these capabilities as well as adapting into legacy protocols. The bus layer may also provide some simple correlation and mapping from different correlation models (e.g. mapping a device ID into an owner’s ID or vice-versa).
Finally the aggregation/bus layer needs to perform two key security roles. It must be able to act as an OAuth2 Resource Server (validating Bearer Tokens and associated resource access scopes). It must also be able to act as a policy enforcement point (PEP) for policy-based access. In this model, the bus makes requests to the identity and access management layer to validate access requests. The identity and access management layer acts as a policy decision point (PDP) in this process. The bus layer then implements the results of these calls to the PDP to either allow or disallow resource access.
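The division of labour between the enforcement point and the decision point can be sketched as follows. The class names, token values, scope names and the policy representation (a set of permitted subject/action/resource triples) are all hypothetical simplifications; a real deployment would validate tokens against the OAuth2 authorization server and evaluate externalized policies such as XACML.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class TokenInfo:
    subject: str
    scopes: FrozenSet[str]


class PolicyDecisionPoint:
    """Stand-in for the identity and access management layer (the PDP)."""

    def __init__(self, tokens: Dict[str, TokenInfo], permitted: Set[Tuple[str, str, str]]):
        self._tokens = tokens        # bearer token -> token information
        self._permitted = permitted  # permitted (subject, action, resource) triples

    def introspect(self, bearer_token: str) -> Optional[TokenInfo]:
        return self._tokens.get(bearer_token)

    def decide(self, subject: str, action: str, resource: str) -> bool:
        return (subject, action, resource) in self._permitted


class BusEnforcementPoint:
    """The bus acting as OAuth2 resource server and policy enforcement point (PEP)."""

    def __init__(self, pdp: PolicyDecisionPoint):
        self._pdp = pdp

    def authorize(self, bearer_token: str, required_scope: str, action: str, resource: str) -> bool:
        info = self._pdp.introspect(bearer_token)
        if info is None or required_scope not in info.scopes:
            return False  # unknown token, or token lacks the required scope
        # The bus holds no access rules itself; it enforces the PDP's decision.
        return self._pdp.decide(info.subject, action, resource)


pdp = PolicyDecisionPoint(
    tokens={"token-abc": TokenInfo("alice", frozenset({"device:control"}))},
    permitted={("alice", "switch_on", "device/d1")},
)
bus = BusEnforcementPoint(pdp)
print(bus.authorize("token-abc", "device:control", "switch_on", "device/d1"))
print(bus.authorize("token-abc", "device:control", "switch_on", "device/d2"))
```

Because the rules live only in the decision point, they can be changed without redeploying the bus.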
This layer takes the events from the bus and provides the ability to process and act upon these events. A core capability here is the requirement to store the data into a database. This may happen in three forms. The traditional model here would be to write a server-side application, e.g. this could be a JAX-RS application backed by a database. However, there are many approaches where we can support more agile approaches. The first of these is to use a big data analytics platform. This is a cloud-scalable platform that supports technologies such as Apache Hadoop to provide highly scalable MapReduce analytics on the data coming from the devices. The second approach is to support complex event processing to initiate near real-time activities and actions based on data from the devices and from the rest of the system.
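To give a flavour of what event processing means here, the following sketch raises an alert when the average of the most recent readings exceeds a threshold. It is a deliberately tiny stand-in for a complex event processing engine; the class name, window size, threshold and sample readings are assumptions chosen for illustration.

```python
from collections import deque


class SlidingAverageAlert:
    """Fires when the mean of the last `window_size` events exceeds `threshold`."""

    def __init__(self, window_size: int, threshold: float):
        self._window = deque(maxlen=window_size)  # old events drop out automatically
        self._threshold = threshold

    def on_event(self, value: float) -> bool:
        self._window.append(value)
        window_full = len(self._window) == self._window.maxlen
        return window_full and sum(self._window) / len(self._window) > self._threshold


detector = SlidingAverageAlert(window_size=3, threshold=30.0)
readings = [25.0, 28.0, 31.0, 33.0, 35.0]
alerts = [detector.on_event(value) for value in readings]
print(alerts)  # [False, False, False, True, True]
```

The decision is made as each event arrives rather than by querying stored data afterwards, which is the property that distinguishes this approach from batch analytics.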
Our recommended approach in this space is to use the following approaches:
The reference architecture needs to provide a way for these devices to communicate outside of the device-oriented system. This includes three main approaches. Firstly, we need the ability to create web-based front-ends and portals that interact with devices and with the event-processing layer. Secondly, we need the ability to create dashboards that offer views into analytics and event processing. Finally, we need to be able to interact with systems outside this network using machine-to-machine communications (APIs). These APIs need to be managed and controlled and this happens in an API management system.
The recommended approach to building the web front end is to utilize a modular front-end architecture, such as a portal, which allows simple, fast composition of useful UIs. Of course the architecture also supports existing Web server-side technology, such as Java Servlets/JSP, PHP, Python, Ruby, etc. Our recommended approach is based on the Java framework and the most popular Java-based web server, Apache Tomcat.
The dashboard is a re-usable system focused on creating graphs and other visualizations of data coming from the devices and the event processing layer.
The API management layer provides three main functions:
Device management (DM) is handled by two components. A server-side system (the device manager) communicates with devices via various protocols and provides both individual and bulk control of devices. It also remotely manages software and applications deployed on the device. It can lock and/or wipe the device if necessary. The device manager works in conjunction with the device management agents. There are multiple different agents for different platforms and device types.
The device manager also needs to maintain the list of device identities and map these into owners. It must also work with the identity and access management layer to manage access controls over devices (e.g. who else can manage the device apart from the owner, how much control does the owner have vs. the administrator, etc.)
There are three levels of device: non-managed, semi-managed and fully managed (NM, SM, FM).
Fully managed devices are those that run a full DM agent. A full DM agent supports:
Non-managed devices can communicate with the rest of the network, but have no agent involved. These may include 8-bit devices where the constraints are too small to support the agent. The device manager may still maintain information on the availability and location of the device if this is available.
Semi-managed devices are those that implement some parts of the DM (e.g. feature control, but not software management).
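The three levels can be expressed as a simple classification over the capabilities that a device's agent reports. In the sketch below, the capability names are assumptions drawn loosely from the device manager functions described above (software management, feature control, lock and wipe); they are not a definitive list of what a full DM agent supports.

```python
from enum import Enum
from typing import Iterable


class ManagementLevel(Enum):
    NON_MANAGED = "NM"
    SEMI_MANAGED = "SM"
    FULLY_MANAGED = "FM"


# Assumed capability names, for illustration only.
FULL_AGENT_CAPABILITIES = frozenset({"software_management", "feature_control", "lock_wipe"})


def classify(agent_capabilities: Iterable[str]) -> ManagementLevel:
    """Map the capabilities reported by a device's agent to a management level."""
    supported = set(agent_capabilities) & FULL_AGENT_CAPABILITIES
    if not supported:
        return ManagementLevel.NON_MANAGED   # no agent, e.g. a small 8-bit device
    if supported == FULL_AGENT_CAPABILITIES:
        return ManagementLevel.FULLY_MANAGED
    return ManagementLevel.SEMI_MANAGED      # some, but not all, DM features


print(classify([]).value)
print(classify(["feature_control"]).value)
print(classify(FULL_AGENT_CAPABILITIES).value)
```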
The final layer is the identity and access management layer. This layer needs to provide the following services:
The identity layer may of course have other identity and access management requirements that are specific to a given instantiation of the reference architecture.
In this section we have outlined the major components of the reference architecture as well as specific decisions we have taken around technologies. These decisions are motivated by the specific requirements of IoT architectures as well as best practices for building agile, evolvable, scalable Internet architectures. Of course there are other options, but this reference architecture utilizes proven approaches that are known to be successful in real-life IoT projects we have worked on.
A reference architecture is useful as-is. However, it is even more useful if there is a real instantiation. In this section we provide a mapping into products and capabilities of the WSO2 platform to show how the reference architecture can be implemented.
The WSO2 platform is a completely modular, open-source enterprise platform that provides all the capabilities needed for the server-side of this architecture. In addition, we also provide some reference components for the device layer – it is an intractable problem to provide components for all possible devices, but we do provide either sample code and/or supported code for certain popular device types.
An important aspect of the WSO2 platform is that it is inherently multi-tenant. This means that it can support multiple organizations on a single deployment with isolation between organizations (tenants). This is a key capability for deploying this reference architecture as a software-as-a-service (SaaS) offering. It is also used by some organizations on-premise to support different divisions or departments within a group.
The WSO2 platform supports deployment on three different targets:
The WSO2 platform is based on a technology called WSO2 Carbon, which is in turn based on OSGi. Each product in the platform shares the same kernel based on Carbon. In addition, each product is made from features that are composed to provide the required functionality. Features can be added and subtracted as needed. All the products work together using standard interoperable protocols, such as HTTP, MQTT and AMQP [10]. All the WSO2 products are available under the Apache Software License v2.0, which is a business-friendly, non-viral open source license [11]. Figure 3 shows the IoT reference architecture layered with the corresponding WSO2 product capabilities.
Figure 03: IoT reference architecture mapped to WSO2 platform components
We support any device. We have a reference device management capability on any Linux-based or Android-based device, which can be ported to other 32-bit platforms.
WSO2 can also help with MQTT client code for many device platforms ranging from Arduino to Android.
We provide two core products that implement this layer:
WSO2 offers a complete platform for data analytics with WSO2 Data Analytics Server (WSO2 DAS, available in Q4 2015), an industry first that combines the ability to analyze the same data at rest and in motion with predictive analysis. WSO2 DAS replaces WSO2 Business Activity Monitor and WSO2 Complex Event Processor. WSO2’s analytics platform offers the flexibility to scale to millions of events per second, whether running on-premises or in the cloud.
Our mapping provides the capabilities of this layer with the following products:
WSO2 Enterprise Mobility Management (EMM) provides
WSO2 Identity Server supports this aspect, and provides the following capabilities:
The WSO2 platform is the only modular, open source platform to provide all these capabilities (and more). As such it is the ideal basis for creating and deploying this IoT reference architecture.
One further aspect that is highly worth considering is the use of a platform-as-a-service (PaaS). WSO2 provides the WSO2 Private PaaS product that’s based on the Apache Stratos project. This provides a managed, elastically scalable, HA deployment of the products mentioned above and also manages tenancy, self-service subscription, and many other aspects. It also supports managing many other useful server-side capabilities including PHP, MySQL, MongoDB and others. We have not shown the PaaS layer on the IoT reference architecture as some deployments may not need this capability.
In this paper we have outlined the following:
The IoT space is evolving rapidly and we expect this paper – and the associated technologies – to evolve in sync. However, despite the emerging nature of this space, this paper and the reference architecture are based on real-world projects that we have deployed with customers to support IoT capabilities. As such, we have great confidence this is a useful, deployable, and effective reference architecture. WSO2 is due to release WSO2 IoT Server in 2016 featuring platform support for IoT.
[10] AMQP is an enterprise messaging protocol that supports pub/sub as well as queuing. It provides considerably higher qualities of service than MQTT including transactions. See https://amqp.org
[11] https://www.apache.org/licenses/LICENSE-2.0.html