2008/09/23

Mashup, a new and exciting aspect of Web 2.0

  • Rohitha Fernando
  • Senior Software Engineer - WSO2

Introduction

The term mashup has its roots in music, where people combine the vocal and instrumental tracks from two or more songs to create a new song. Innovative disc jockeys also used this technique to combine an old song with a new track that has the same beat, creating a wholly different dance experience.

A software mashup is comprised of two or more content components from different sources but presented to the user as a single, seamless "experience," namely a new application. Highly customized within mobile or Web portals, mashups strive to provide rich functionality and content tailored to specific, narrowcast audiences.

There are many types of mashups - consumer mashups, data mashups, and business mashups. Some of the most common early use cases involve maps, but there are also video mashups, photo mashups, search and shopping mashups, and news mashups. The distinction is somewhat arbitrary because the underlying technology is the same regardless of the use case.

The most common mashup is the consumer mashup. Consumer mashups bring together different forms of media from multiple sources in a single graphical interface, and are aimed at the general public, that is, consumers. One example is the news mashup: many blogs and news organizations use RSS feeds to distribute (or syndicate) news headlines and story summaries, which makes it possible for mashup developers to create personalized newspapers that cover many interests. Other mashups combine keywords from news stories with maps and photo-sharing sites.

Underlying process in brief

Content used in mashups is typically sourced from a third party via a public interface or API (Web services). Other methods of sourcing content for mashups include Web feeds (e.g. RSS or Atom) and screen scraping.

RSS is a family of XML-based syndication formats. Syndication means that a Web site that wants to distribute content creates an RSS document and registers the document with an RSS publisher. An RSS-enabled client can then check the publisher's feed for new content and react to it in an appropriate manner. Atom is a newer but similar syndication protocol. It seeks to maintain better metadata than RSS, provides better and more rigorous documentation, and incorporates the notion of constructs for common data representation. Both RSS and Atom are well suited to mashups that aggregate event-based or update-driven content, such as news and weblog aggregators.
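
As a rough illustration of feed aggregation, the following Python sketch parses two RSS 2.0 documents and merges the headlines that mention a keyword. The feed contents, the example.com URLs, and the function names read_items and aggregate are made up for this example; a real mashup would fetch the feeds over HTTP and would probably rely on a dedicated feed library.

```python
import xml.etree.ElementTree as ET

# Two made-up RSS 2.0 documents standing in for feeds fetched over HTTP.
FEED_A = """<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>Mashups go mainstream</title><link>http://news.example.com/1</link></item>
  <item><title>Maps API updated</title><link>http://news.example.com/2</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item><title>Building a photo mashup</title><link>http://blog.example.com/a</link></item>
</channel></rss>"""


def read_items(rss_text):
    """Return (source, headline, link) tuples for every <item> in one feed."""
    channel = ET.fromstring(rss_text).find("channel")
    source = channel.findtext("title")
    return [
        (source, item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


def aggregate(feeds, keyword):
    """Merge several feeds and keep only headlines that mention the keyword."""
    merged = [entry for feed in feeds for entry in read_items(feed)]
    return [entry for entry in merged if keyword.lower() in entry[1].lower()]


headlines = aggregate([FEED_A, FEED_B], "mashup")
for source, title, link in headlines:
    print(f"{source}: {title} ({link})")
```

The keyword filter is only a stand-in for whatever personalization rule a news mashup applies; the point is that both feeds share one item structure, so a single parser serves every source.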

Scraping is the process of using software tools to parse and analyze content that was originally written for human consumption, in order to extract semantic data structures that represent that information and can be used and manipulated programmatically. Mashups use screen scraping for data acquisition, especially when pulling data from the public sector. One mashup project that scrapes data is XMLTV, a collection of tools that aggregate TV listings from all over the world.
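
To make the idea concrete, here is a minimal screen-scraping sketch in Python that pulls structured records out of an HTML page written for human readers. The page markup, the "time" and "show" class names, and the ListingScraper class are assumptions invented for this example; they do not reflect how XMLTV or any real listings site works.

```python
from html.parser import HTMLParser

# A made-up listings page; a real scraper would download this HTML.
PAGE = """
<html><body>
<h1>Tonight on TV</h1>
<table id="listings">
  <tr><td class="time">19:00</td><td class="show">Evening News</td></tr>
  <tr><td class="time">20:30</td><td class="show">Nature Hour</td></tr>
</table>
</body></html>
"""


class ListingScraper(HTMLParser):
    """Collects {time, show} records from <td class="time|show"> cells."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished records
        self._row = None     # record being built for the current <tr>
        self._field = None   # name of the cell we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            css_class = dict(attrs).get("class")
            if css_class in ("time", "show"):
                self._field = css_class

    def handle_data(self, data):
        # Only text inside a recognized cell is kept.
        if self._field is not None:
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = None


scraper = ListingScraper()
scraper.feed(PAGE)
listings = scraper.rows
print(listings)
```

Because the scraper depends on the page's presentation markup, any redesign of the page can break it, which is why scraping is usually a fallback for sources that expose no API.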

In a nutshell, mashups use APIs provided by different content sites to aggregate and reuse content in new ways. Users can draw on the data feeds and application programming interfaces (APIs) provided by established sites such as Google, Yahoo, Microsoft, Amazon, eBay and others, which are created specifically to encourage mashups.

Mashup explained

  • Classification - Using newer "Web 2.0" techniques.
  • Philosophy/Approach - Uses APIs provided by different content sites to aggregate and reuse the contents in another way.
  • Content dependencies - Can operate on pure XML content and also on presentation-oriented content (e.g., HTML).
  • Location dependencies - Content aggregation can take place either on the server or on the client.
  • Aggregation style - Individual content may be combined in any manner, resulting in arbitrarily structured hybrid content.
  • Event model - Create, read, update and delete operations are based on REST architectural principles, but no formal API exists.
  • Relevant standards - Base standard is XML Data Interchange with REST semantics. RSS and Atom are commonly used.

Mashup is a Web application

A mashup is a Web application that combines data from more than one source into a single integrated tool.

  • Mashups use a browser as the client-side solution component that provides a user interface (UI) to a given application.
  • Mashups are centrally managed from a server in the enterprise, and are deployed to the user’s browser when a URL is used to access a Web resource on the server (for example, an HTML, JSP, Java Servlet, ASP, CGI, or PHP resource).
  • A mashup's business logic can be executed on enterprise systems, and data can be retrieved from enterprise databases, public data sources, or external services.
  • The code that defines and makes up the UI and information extracted from internal or external business systems are rendered within the browser, providing users with the means to view and act on the data.
  • The content sent to the browser can include JavaScript and scripting libraries that execute in the browser’s runtime to allow custom logic to be run locally in the browser. In many cases, the JavaScript and scripting libraries are used for simple field validation or the implementation of complex UI controls to provide an interactive and rich user interface.

Mashup Architecture

A mashup architecture comprises three different components: the API/content providers, the mashup site, and the client's Web browser.

  • The API/content providers - To facilitate data retrieval, providers often expose their content through Web protocols such as REST, Web services, and RSS/Atom. However, many interesting potential data sources do not conveniently expose APIs. Mashups that extract content from sites like Wikipedia, TV Guide, and virtually all government and public domain Web sites do so by a technique known as screen scraping. In this context, screen scraping is the process by which a tool extracts information from the content provider by parsing the provider's Web pages, which were originally intended for human consumption.
  • The mashup site - The place where the mashup is hosted. The mashup logic resides here, but it is not necessarily executed here. Mashups can be implemented as traditional Web applications using server-side dynamic content generation technologies like Java servlets, CGI, PHP or ASP. Alternatively, mashed content can be generated directly within the client's browser through client-side scripting such as JavaScript or applets. This client-side logic is often a combination of code embedded directly in the mashup's Web pages and scripting API libraries or applets referenced by those pages. Mashups using this approach can be termed rich internet applications (RIAs), meaning that they are oriented towards the interactive user experience. Rich internet applications are one hallmark of what is now being termed "Web 2.0", the next generation of services available on the World Wide Web. Mashups often use a combination of server-side and client-side logic to achieve their data aggregation. One reason is that many mashup applications use data supplied directly by their user base, making one of the data sets local. Another is that performing complex queries on multiple-sourced data requires computation that would be infeasible to perform within the client's Web browser.
  • The client's Web browser - The place where the application is rendered graphically and where user interaction takes place. As described above, mashups often use client-side logic to assemble and compose the mashed content.

Mashups use SOAP and REST as Web protocols

SOAP and REST are platform-independent protocols for communicating with remote services. They can therefore be used to interact with remote services without knowledge of their underlying platform implementation. The functionality of a service is conveyed by the description of the messages that it requests and responds with.

SOAP is a key technology in the Web services construct. The name originally stood for Simple Object Access Protocol, but it has been re-termed Services-Oriented Access Protocol because its focus has shifted from object-based systems towards the interoperability of message exchange. The SOAP specification has two key components: the first is the use of an XML message format for encoding, and the second is the message structure, which consists of a header and a body. The header exchanges contextual information that is not specific to the body, such as authentication information. The message body encapsulates the application-specific payload. WSDL documents describe SOAP APIs for Web services: what operations a service exposes, the format of the messages that it accepts (using XML Schema), and how to address it. SOAP messages are typically conveyed over HTTP, although other transports such as JMS or e-mail are also possible.
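
The header-and-body structure can be sketched with Python's standard XML library. The envelope namespace below is the SOAP 1.1 one; the application namespace, the AuthToken, GetQuote and Symbol element names, and both helper functions are hypothetical and belong to no real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"                    # hypothetical service namespace

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("app", APP_NS)


def build_envelope(token, symbol):
    """Build a SOAP message: contextual data in the header, payload in the body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}AuthToken").text = token
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def read_symbol(envelope_xml):
    """Extract the application payload from the body, ignoring the header."""
    namespaces = {"soap": SOAP_NS, "app": APP_NS}
    root = ET.fromstring(envelope_xml)
    return root.findtext("soap:Body/app:GetQuote/app:Symbol", namespaces=namespaces)


message = build_envelope("secret-token", "WSO2")
print(message)
print(read_symbol(message))
```

In practice the message shape would be derived from the service's WSDL rather than written by hand, and the envelope would be sent in an HTTP POST.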

REST is a technique of Web-based communication using just HTTP and XML. Unlike SOAP services, where each service defines its own set of operations, REST fundamentally supports only a few operations (POST, GET, PUT and DELETE) that are applicable to all pieces of information. The emphasis in REST is on the pieces of information themselves, called resources.
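
The uniform set of operations can be illustrated with a small in-memory sketch in Python. The ResourceStore class, its handle method, and the "/photos" paths are assumptions made for this example; the status codes follow common HTTP conventions, and no network code is involved.

```python
class ResourceStore:
    """Applies the same four HTTP verbs to any resource, identified by its path."""

    def __init__(self):
        self._resources = {}   # path -> representation
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return (status_code, payload) for a single request."""
        if method == "POST":      # create a new resource under a collection
            uri = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[uri] = body
            return 201, uri
        if method == "GET":       # read
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":       # update, or create at a known path
            created = path not in self._resources
            self._resources[path] = body
            return (201 if created else 200), body
        if method == "DELETE":    # delete
            if path in self._resources:
                del self._resources[path]
                return 204, None
            return 404, None
        return 405, None          # any other verb is not allowed


store = ResourceStore()
status, uri = store.handle("POST", "/photos", {"title": "Beach"})
print(status, uri)
print(store.handle("GET", uri))
print(store.handle("PUT", uri, {"title": "Sunset"}))
print(store.handle("DELETE", uri))
print(store.handle("GET", uri))
```

Nothing in the dispatcher is specific to photos: the same verbs work for any path, which is the sense in which REST puts the emphasis on resources rather than on operations.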

Mashups add value to the Semantic Web

The Semantic Web is the vision that the existing Web can be augmented to supplement the content designed for humans with equivalent machine-readable information. In the Semantic Web, the term information is different from data; data becomes information when it conveys meaning. Its goal is to create a Web infrastructure that augments data with metadata to give it meaning, thus making it suitable for automation, integration, reasoning and re-use. The Resource Description Framework (RDF) serves this purpose by providing methodologies to establish structures that describe data. XML in itself is not sufficient for this, and RDF Schema adds to RDF's ability to encode concepts in a machine-readable way. RDF constructs relationships between data objects through subject-predicate-object triples ("subject S has relationship R with object O"). The combination of data model and graph of relationships allows for the creation of ontologies, which are hierarchical structures of knowledge that can be searched. For example, you might define a model "guests" as a subclass of "hotel-type" with the constraint that it is more "comfortable" than other "hotel-type" entries, and create two instances of it: one populated with data about all the guests, another with data about comfortable hotels. By mashing these separate model instances, you could reason that high-end guests might go to 5-star hotels but not 2-star hotels.
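
A toy version of subject-predicate-object triples and class-hierarchy reasoning might look like the following Python sketch. The hotel and guest names, the predicates type, subClassOf and staysAt, and the helper functions are all invented for illustration; real RDF data uses URIs and would normally be handled by an RDF library and an inference engine.

```python
# Each fact is one (subject, predicate, object) triple.
TRIPLES = {
    ("FiveStarHotel", "subClassOf", "Hotel"),
    ("TwoStarHotel", "subClassOf", "Hotel"),
    ("GrandPalace", "type", "FiveStarHotel"),
    ("BudgetInn", "type", "TwoStarHotel"),
    ("alice", "staysAt", "GrandPalace"),
    ("bob", "staysAt", "BudgetInn"),
}


def objects(subject, predicate):
    """All objects O such that (subject, predicate, O) is a known triple."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def types_of(subject):
    """Every class the subject belongs to, following subClassOf links upward."""
    found = set()
    pending = list(objects(subject, "type"))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(objects(cls, "subClassOf"))
    return found


def guests_of(hotel_class):
    """Guests staying at any hotel that belongs to the given class."""
    return sorted(
        s for s, p, o in TRIPLES
        if p == "staysAt" and hotel_class in types_of(o)
    )


print(types_of("GrandPalace"))
print(guests_of("FiveStarHotel"))
print(guests_of("Hotel"))
```

No triple states that GrandPalace is a Hotel; that fact is derived from the hierarchy, which is the kind of reasoning an ontology makes possible.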

Nowadays, RDF data is quickly finding adoption in a variety of domains, including social networking applications, where you could find a Friend of a Friend, and syndication, such as the RSS feeds explained above.

Mashups in an enterprise environment

Mashups can play a big role in the enterprise environment. Next generation providers of Web services are powering this mashup revolution, making a wealth of new business applications possible.

Mashups create new services for consumers and open up many new possibilities. By using a site's Application Programming Interface (API), end users can create mashup sites more easily, regardless of their technical skill level. With so many APIs available, developers can build applications that are cheaper, more reusable and more maintainable. "Web 2.0" is all about the personalization of information, and because every mashup can offer new features built on existing Web sites, mashups can grow quickly in that context. It is easy to imagine creating a personalized information service when you can add features through mashups. Many people are convinced of the power of mashups and view them as the future of the Web.

Mashups are a form of technology integration: they can be adapted to join together many technologies and can use many types of implementation languages. Corporate developers are now beginning to combine Web services from a range of made-to-order vendor solutions to deliver business users their very own enterprise mashups. This new way of coding is marching hand in hand with the service-oriented computing revolution, SOA, in which applications and systems share services with each other in an open and integrated way.

Perhaps in the real world, software developers with this new user-centric focus will take to asking each other, "How are your users' mashup experiences, and what are the value additions to them?"

Some mashups can be as simple as mixing together some JavaScript code with XML to create a new and innovative Web-based service. Larger mashups, which are the main focus of their respective Web sites, take advantage of services such as Google Maps together with a database, using street addresses to link the two and project information onto the map.

More complex mashups approach composite applications (those that are made up of many services), an advanced SOA concept. For instance, you could mash up a customer database with marketing metrics, then mash up the results even further with sales forecast processes. You own and maintain some of the information and services; some are accessible over the Internet.
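
The chained example above can be sketched in a few lines of Python. The customer records, the per-region response percentages, the base order value, and the forecast rule are all invented numbers and names for illustration; in a real composite application each data set would come from a separate service.

```python
# Source 1: a customer database that you own (made-up records).
CUSTOMERS = [
    {"id": 1, "name": "Acme Ltd", "region": "north"},
    {"id": 2, "name": "Globex", "region": "south"},
]

# Source 2: marketing metrics, here a campaign response percentage per region.
RESPONSE_PCT = {"north": 20, "south": 5}

# Input to the forecast step: an assumed base order value per customer.
BASE_ORDER_VALUE = 1000


def with_metrics(customers, metrics):
    """First mashup: attach the marketing metric for each customer's region."""
    return [{**c, "response_pct": metrics.get(c["region"], 0)} for c in customers]


def with_forecast(rows, base_value):
    """Second mashup: derive a simple sales forecast from the combined rows."""
    return [
        {**r, "forecast": base_value * (100 + r["response_pct"]) // 100}
        for r in rows
    ]


report = with_forecast(with_metrics(CUSTOMERS, RESPONSE_PCT), BASE_ORDER_VALUE)
for row in report:
    print(row["name"], row["response_pct"], row["forecast"])
```

Each step consumes the output of the previous one, so a further service can be added to the chain without changing the earlier steps.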

WSO2 Mashup Server combines the best of Web 2.0 and SOA

WSO2 Mashup Server combines the simplicity and richness of mashups with the reusability, security, reliability and governance required for a service-oriented architecture (SOA). In doing so, it allows enterprises to empower their Web developers, business analysts, and other power users to develop valuable situational applications. These mashups can be published, shared, rated, tagged, commented upon, and searched to maximize their value to the enterprise.

Create, deploy and consume Web services mashups: WSO2 Mashup Server is an open source mashup server for easy Web service composition and aggregation using JavaScript. Retrieve information from Web services, pages, feeds and databases, manipulate it using JavaScript, and expose the result as a Web service, page, feed or Google gadget, while also interacting with the user via e-mail and instant messages. Secure services easily; tag, rate, and comment on mashups; and share them with other Mashup Servers.

WSO2 Mashup Server 1.5.1 has now been released with many new features, including integrated Data Services support and OpenID support. It also offers a dashboard, similar to iGoogle, powered by Apache Shindig.

WSO2 Mashup Server Features

  • Support for consuming and deploying services using dynamic scripting languages.
  • Trivial deployment and redeployment.
  • Automatic and UI-based generation of Web services artifacts (e.g. WSDL, schema, policy).
  • A set of gateways into a variety of information sources, including SOAP and POX/REST Web services, as well as plain old Web pages.
  • Human-consumable results through a variety of user interfaces including Web pages, portals, e-mail, Instant Messenger service, Short Message Service (SMS), etc.

WSO2 Mashup Server's Community Portal

WSO2 Mashup Server's Community Portal (https://mooshup.com) supports sharing and hosting of mashups developed using WSO2 Mashup Server. You can upload your mashups to this portal using the sharing functionality available in the WSO2 Mashup Server Admin Console. Mashup developers around the world can then easily download your mashups to their own mashup servers.

Author

Rohitha Fernando, Senior Software Engineer - User Interfaces, WSO2 Inc.

 
