
Binary with ADB

By Chatra Nakkawita
  • 10 Aug, 2006

Service Oriented Architecture (SOA) developers are often unfamiliar with the techniques for sending binary content along with SOAP messages. This article by Ajith Ranabahu fills in the missing pieces and shows how the Axis2 Databinding (ADB) framework supports binary content.


Introduction

Sending binary content in SOAP messages is one of those interesting issues that has been on the discussion table for a while. The binary transmission techniques have seen quite a few improvements lately. Let's take a closer look at these techniques.

How do you send Binary with SOAP Anyway?

A problem arises when a SOAP message needs to contain an arbitrary array of bytes. The most important part of that problem is serialization. Since SOAP is based on XML, and XML is primarily a text-based format, the arbitrary bytes need to be converted into serializable characters. This can be done in two ways:

  1. Hex Conversion

    Place the bits in a sequence and pick blocks of four bits. Each four-bit block corresponds to a hexadecimal digit, so the relevant hexadecimal character is used in place of the four bits. This doubles the size of the binary content!

  2. base64 Conversion

    Place the bits in a sequence and pick blocks of six bits. Each six-bit block corresponds to a number in the range 0 to 63 (inclusive). A set of 64 characters is picked, namely A to Z (uppercase), a to z (lowercase), 0 to 9 (digits) and two other characters, '+' and '/'. The relevant character is used in place of the six bits. This results in a 33.3% increase in the size of the binary content. Also note that, unlike hex, base64 needs 'padding' in some cases: when the number of bits is not a multiple of 6, that is, when the number of bytes is not a multiple of 3. The Resources section contains a base64 reference that explains exactly how the padding is added.
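The size difference between the two conversions is easy to check. The following is a minimal sketch, assuming Java 8 or later for java.util.Base64. The class name BinaryTextEncodings and its helper methods are made up for this illustration and are not part of Axis2.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class BinaryTextEncodings {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    /** Hex: every byte (two 4-bit blocks) becomes two characters. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(HEX_DIGITS[(b >> 4) & 0x0F]); // high four bits
            sb.append(HEX_DIGITS[b & 0x0F]);        // low four bits
        }
        return sb.toString();
    }

    /** base64: every three bytes (four 6-bit blocks) become four characters. */
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] threeBytes = "Man".getBytes(StandardCharsets.US_ASCII);
        byte[] twoBytes = "Ma".getBytes(StandardCharsets.US_ASCII);

        System.out.println(toHex(threeBytes));    // six characters for three bytes
        System.out.println(toBase64(threeBytes)); // four characters, no padding needed
        System.out.println(toBase64(twoBytes));   // four characters, last one is '=' padding
    }
}
```

For the three bytes of "Man" the hex form takes six characters while the base64 form takes four. The two-byte input is not a multiple of three bytes, so its base64 form ends with a '=' padding character.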

Clearly, base64 conversion yields a smaller result and has been favored over the hex technique. However, even base64 incurs a conversion overhead and an increase in size. Optimizers have always tried to attack this point, and there are two alternative ways to optimize binary transmission over HTTP (which happens to be the most widely used transport):

  1. Add the Content as a MIME Attachment

    MIME is one of the proven binary transmission techniques. The SOAP message contains href attributes that carry the content ID (CID) of the relevant MIME part. The disadvantage of this technique is that the binary content is never part of the SOAP message's infoset. It is treated as an attachment not only during the wire transfer but also at the API level, requiring special treatment and effectively alienating it from the infoset!

  2. MTOM'ize the Message

    This is a newer technique that brings the best of both worlds. When a message is MTOMized, it is optimized only at the wire level and not at the API level. In short, the API can handle the content as a usual infoset item, but the wire transfer is optimized! To learn more about MTOM, have a look at the Resources section.

In the next section we will look at the optimizing capabilities of the Apache Axis2 framework and show how the optimization support of the ADB framework has been effectively utilized.

Axis2, ADB and MTOM

Even when transmission optimizations are available, they are of no real use if the SOAP framework cannot turn them into a performance gain. Axis2 is built with first-class support for MTOM, and the Axis2 code generator handles binary data in a fully optimized way.

Take the following WSDL file fragment, which has a base64Binary node in its input message. (The full WSDL can be found in the Resources section.)

...
<types>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://schemas.xmlsoap.org/wsdl/"
                elementFormDefault="qualified">
        <xsd:element name="image">
            <xsd:complexType>
                <xsd:sequence>
                    <xsd:element name="meta-info" type="xsd:string"/>
                    <xsd:element name="image-attachment"
                                 type="xsd:base64Binary"/>
                </xsd:sequence>
            </xsd:complexType>
        </xsd:element>
    </xsd:schema>
</types>
<message name="imageUploadRequest">
    <part name="image" element="image"/>
</message>

...

<portType name="ImageHandlerPortType">
    <operation name="uploadImage">
        <input name="image" message="imageUploadRequest"/>
        ...
    </operation>
</portType>

When this file is used for code generation (with ADB), the result is an ADB bean that has a javax.activation.DataHandler field. This is the field for the binary content. The following code fragment is taken from the generated Image class. The full source code for this article is available in the Resources section.

...
/** field for ImageAttachment */
protected javax.activation.DataHandler localImageAttachment;

/**
 * Auto generated getter method
 * @return javax.activation.DataHandler
 */
public javax.activation.DataHandler getImageAttachment() {
    return localImageAttachment;
}

/**
 * Auto generated setter method
 * @param param ImageAttachment
 */
public void setImageAttachment(javax.activation.DataHandler param) {
    this.localImageAttachment = param;
}
...

However, the most interesting piece of code is inside the parse method of the Factory class. The following fragments are taken from the parse method generated for the Image class.

...
boolean isReaderMTOMAware = false;

try {
    isReaderMTOMAware = java.lang.Boolean.TRUE.equals(
            reader.getProperty(org.apache.axiom.om.OMConstants.IS_DATA_HANDLERS_AWARE));
} catch (java.lang.IllegalArgumentException e) {
    isReaderMTOMAware = false;
}
...
if (reader.isStartElement() && new javax.xml.namespace.QName(
        "http://schemas.xmlsoap.org/wsdl/", "image-attachment").equals(reader.getName())) {
    if (isReaderMTOMAware && java.lang.Boolean.TRUE.equals(
            reader.getProperty(org.apache.axiom.om.OMConstants.IS_BINARY))) {
        // MTOM aware reader - get the datahandler directly and put it in the object
        object.setImageAttachment((javax.activation.DataHandler)
                reader.getProperty(org.apache.axiom.om.OMConstants.DATA_HANDLER));
    } else {
        // Do the usual conversion
        java.lang.String content = reader.getElementText();
        object.setImageAttachment(
                org.apache.axis2.databinding.utils.ConverterUtil.convertToBase64Binary(content));
    }
    reader.next();
}
...

The rationale behind this rather odd looking bit of code is as follows:

The AXIOM representation, which is optimized for handling MTOMized messages, generates the usual OMText object for optimized base64 blocks but keeps the content as a DataHandler, read straight from the raw MIME part. If the text/string is required, there is no choice other than converting the content into a base64 string. However, if the data handler is needed, there is no conversion overhead and everything works nicely and efficiently.
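The idea of keeping the raw content and paying for the base64 conversion only on demand can be sketched as follows. This is not AXIOM's OMText implementation: LazyBinaryText is a hypothetical class, and a plain byte array stands in for the DataHandler so that the sketch runs on a bare JDK (Java 8 or later is assumed for java.util.Base64).

```java
import java.util.Base64;

// Hypothetical stand-in for a text node that holds optimized binary content.
class LazyBinaryText {

    private final byte[] rawContent; // bytes as read from the MIME part
    private String base64Text;       // computed only if someone asks for text
    private int conversions;         // counts how often the encoding actually ran

    LazyBinaryText(byte[] rawContent) {
        this.rawContent = rawContent.clone();
    }

    /** Binary-aware callers take this path: no encoding, no size increase. */
    byte[] getBinary() {
        return rawContent.clone();
    }

    /** Text-only callers force the base64 conversion, which then runs once. */
    String getText() {
        if (base64Text == null) {
            base64Text = Base64.getEncoder().encodeToString(rawContent);
            conversions++;
        }
        return base64Text;
    }

    int conversionCount() {
        return conversions;
    }

    public static void main(String[] args) {
        LazyBinaryText node = new LazyBinaryText(new byte[] {1, 2, 3});
        node.getBinary();
        System.out.println("conversions after getBinary(): " + node.conversionCount());
        node.getText();
        System.out.println("conversions after getText(): " + node.conversionCount());
    }
}
```

A caller that only ever asks for the binary form never triggers the encoding, which is the behavior the data binding layer wants to take advantage of.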

For this seemingly straightforward scenario to work, there is one barrier! The world of OM is exposed to the world of data binding through the XMLStreamReader interface. Hence the information passed between these two worlds must take the form specified by XMLStreamReader, which in the case of base64Binary happens to be text!

Axis2 works around this seemingly impossible problem by using the reader's getProperty method.

  1. In XMLStreamReader implementations provided by AXIOM, the getProperty() method returns Boolean.TRUE for the key org.apache.axiom.om.OMConstants.IS_DATA_HANDLERS_AWARE ("IsDatahandlersAwareParsing").
  2. During a START_ELEMENT event, the getProperty() method returns Boolean.TRUE for the key org.apache.axiom.om.OMConstants.IS_BINARY ("Axiom.IsBinary") if the content of that node is optimized binary.
  3. When the condition above is satisfied, the reader's getProperty() method returns the data handler object for the key org.apache.axiom.om.OMConstants.DATA_HANDLER ("Axiom.DataHandler").

With this technique, AXIOM provides a parser that is capable of handling binary content in the optimal way, and ADB generates code that makes use of that optimization. If the binary content was received as a MIME part, ADB makes sure that there are no unwanted conversions in the middle.
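The consumer side of this property handshake can be tried out with the StAX parser that ships with the JDK. The following is only a sketch under stated assumptions: the key strings are the values quoted in the list above, the class BinaryAwareParsing and its methods are hypothetical, and the DataHandler branch is deliberately left out because javax.activation is not bundled with current JDKs. A plain JDK reader is not expected to answer Boolean.TRUE for the AXIOM keys, so the sketch exercises the usual base64 path.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class, for illustration only.
class BinaryAwareParsing {

    // Key values as quoted in the article for org.apache.axiom.om.OMConstants.
    static final String IS_DATA_HANDLERS_AWARE = "IsDatahandlersAwareParsing";
    static final String IS_BINARY = "Axiom.IsBinary";

    /** True only if the reader answers Boolean.TRUE for the given key. */
    static boolean isTrue(XMLStreamReader reader, String key) {
        try {
            return Boolean.TRUE.equals(reader.getProperty(key));
        } catch (IllegalArgumentException e) {
            return false; // a reader may reject keys it does not know
        }
    }

    /** Reads the binary content of the element the reader is positioned on. */
    static byte[] readBinary(XMLStreamReader reader) throws XMLStreamException {
        if (isTrue(reader, IS_DATA_HANDLERS_AWARE) && isTrue(reader, IS_BINARY)) {
            // Optimized path: an AXIOM reader would hand over a DataHandler here.
            throw new UnsupportedOperationException(
                    "DataHandler path is not covered by this sketch");
        }
        // Usual path: the content arrives as base64 text and must be decoded.
        return Base64.getDecoder().decode(reader.getElementText().trim());
    }

    /** Parses a single-element document and returns its decoded content. */
    static byte[] parse(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (!reader.isStartElement()) {
                reader.next(); // skip ahead to the first START_ELEMENT event
            }
            return readBinary(reader);
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        byte[] content = parse("<image-attachment>SGVsbG8=</image-attachment>");
        System.out.println(new String(content, StandardCharsets.US_ASCII));
    }
}
```

The same two checks appear in the generated parse method shown earlier. The only difference is that the generated code has AXIOM on the other side of the reader and can therefore take the DataHandler directly.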

Sending MTOMized Messages

Readers who run the sample code and look at the wire message will notice that it is not optimized at all! This is in fact correct, since nothing in the WSDL indicates that the wire message needs to be optimized. The optimization can be forced by using the XMIME extension in the schema, and the following WSDL schema fragment shows how this can be done.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:xmime="http://www.w3.org/2004/06/xmlmime"
            targetNamespace="http://schemas.xmlsoap.org/wsdl/"
            elementFormDefault="qualified">
    <xsd:import namespace='http://www.w3.org/2004/06/xmlmime' />
    <xsd:element name="image">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="meta-info" type="xsd:string"/>
                <xsd:element name="image-attachment">
                    <xsd:complexType>
                        <xsd:simpleContent>
                            <xsd:extension base="xsd:base64Binary">
                                <xsd:attribute ref="xmime:contentType"/>
                            </xsd:extension>
                        </xsd:simpleContent>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Unfortunately, even though this fragment shows the correct way to force optimization, ADB does not yet process simpleContent extensions and hence will complain when it encounters the above schema.

One possible way to optimize the content is to force the Axis2 runtime to optimize the base64 blocks that it encounters. This can be done by setting a property on the options object.

 
ImageHandlerServiceStub imageHandlerServiceStub = new ImageHandlerServiceStub();
Options options = imageHandlerServiceStub._getServiceClient().getOptions();
options.setProperty(org.apache.axis2.Constants.Configuration.ENABLE_MTOM,
Boolean.TRUE);

Conclusion

Although handling binary content is tricky with SOAP, Apache Axis2 provides very good support for it. The Axis2 Databinding framework also includes features that help it fully utilize the Axis2 binary handling capabilities.

Resources

  1. Details on Base64 encoding
  2. Image Uploader Service WSDL
  3. XMIME enabled Image Uploader Service WSDL
  4. Full Code sample

Author

Ajith Ranabahu, Senior Software Engineer, WSO2 Inc. ajith AT WSO2 DOT com