E4X Quick Start Guide


First came JavaScript, introduced by Netscape and bearing no real relation, beyond marketing, to the Java programming language. Then came JScript, Microsoft's JavaScript implementation. Then came ECMAScript, the standardized version of the language, which unified the flavors (at least the basics) and gave implementations consistent behavior and licensing terms. Soon there were many ECMAScript implementations, from browsers such as Firefox, Internet Explorer, Opera, and Safari through to Adobe's Flash. ECMAScript, the heir to a long series of unfortunate and confusing names, serves as the basis for much of the dynamic content on the Web.

But ECMAScript, which I'll still informally call JavaScript for the remainder of this article, doesn't play terribly well with the primary markup languages of the Web -- XML and HTML. These languages are generally exposed to script through a set of objects called the Document Object Model (DOM), a notoriously difficult way to navigate and shape an XML structure.

Enter E4X, officially "ECMAScript for XML", a standard extension to JavaScript that makes XML (and therefore XHTML) a first-class datatype within the language. E4X support is still not universal -- it's not supported in Internet Explorer yet, but it is supported in Firefox and in implementations based on the open-source Rhino JavaScript engine, including the WSO2 Mashup Server.

This quick start guide should not only give you the basics of E4X, but also point out some of the tricky cases that an intermediate user will likely encounter. Knowing where these traps are will lead you quickly to an enjoyable and productive use of E4X. A working knowledge of XML and JavaScript is required. For those with a working knowledge of XPath, we highlight differences in assumptions to make the transition smoother.


Literal XML

E4X introduces a new type, "XML", which holds an XML element. You can create literal XML values by writing XML directly.

var order = <order id="i1000423">
    <part id="p343-3456" quantity="2"/>
</order> ;

Trick: You can't use an arbitrary XML element as an XML literal value. If you try you might get an error. Watch out for the following circumstances:

If you're having problems, you might try parsing the XML as a string, using the XML constructor:

var order = new XML ('<order id="i1000423"><part id="p343-3456" quantity="2"/></order>');

But if you do, watch out for backslashes in the XML, because JavaScript treats them as the start of an escape sequence inside a string literal. You will need to replace every occurrence of "\" with "\\".
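To illustrate the backslash pitfall, here is a minimal sketch in plain JavaScript. The path value and the variable names are made up for illustration. The XML constructor call is guarded by a feature check (an assumption of this sketch, not part of E4X itself) so that the snippet is harmless in engines without E4X support.

```javascript
// Hypothetical XML text containing backslashes: <path>C:\temp\new</path>

// Wrong: "\t" and "\n" are consumed by JavaScript as tab and newline escapes,
// so no backslash survives in the resulting string.
var unescaped = '<path>C:\temp\new</path>';

// Right: each backslash in the XML is written as "\\" in the string literal.
var escaped = '<path>C:\\temp\\new</path>';

// Only attempt the parse where the E4X XML constructor exists (e.g. Rhino).
if (typeof XML === 'function') {
    var path = new XML(escaped);
}
```

The same doubling applies to strings passed to the XMLList constructor.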

E4X also introduces the "XMLList" type, which holds a list of XML structures. This is obviously useful in queries (for instance, asking for all the children of an element produces an XMLList). But E4X provides a literal notation for an XMLList as well: wrap the list of elements in "<>" and "</>".

var parts = <>
    <part id="p343-3456" quantity="2"/>
    <part id="p343-2110" quantity="1"/>
</> ;

XML lists can also be parsed directly from concatenated strings representing XML elements.

var parts = new XMLList('<part id="p343-3456" quantity="2"/><part id="p343-2110" quantity="1"/>');

In E4X, the line between an XML value and an XMLList value with a single item is intentionally blurry, so in practice there is little difference between the two.


Once you have an XML object, you can access its name and namespace, children, attributes, and text content using the familiar JavaScript dot operator.

var c = <customer number="1721">
    <name>
        <first>John</first>
        <last>Smith</last>
    </name>
    <phone type="mobile">888-555-1212</phone>
    <phone type="office">888-555-2121</phone>
</customer> ;
var name = c.name.first + " " + c.name.last;
var num = c.@number;
var firstphone = c.phone[0];

The XML literal can contain comments and processing instructions as well as elements, attributes, and text.

A child element can be accessed by name as a property of the parent. If there is more than one child with that name, an XMLList is returned, which can be further qualified by an index or other qualifier.

Trick: Know your context. Notice that one writes "c.name" to access the "name" child of customer. One doesn't write "customer.name" as an XPath user might be tempted to ("customer/name"). The name of the variable is separate from the name of the XML element value. I've adopted the convention of naming the variable the same as the XML element to avoid this. That is, in the above example, I'd typically name the variable "customer" rather than "c". In XPath terms the context of the "." query is the element itself, not a document node that contains the element as a child.

A child attribute can be accessed by name using the "@" prefix, or with the attribute() method.

Trick: Escape reserved words when they appear as XML element names. JavaScript has a fair number of reserved words that also appear fairly commonly as XML element names. In these cases dot notation cannot be used, and an escaped syntax must be employed.

Trick: Escape names containing "." or "-" when they appear as XML element or attribute names. Dot and hyphen already have meaning within JavaScript, so when referring to an XML element or attribute whose name contains one of those characters, an escaping mechanism must be employed.

var transcript = <transcript>
    <class>History 101</class>
    <abstract>Introduction to World History</abstract>
    <public can-register="yes"/>
</transcript>;

The following expressions are invalid E4X because of reserved words or invalid name characters.

var c = transcript.class;
var a = transcript.abstract;
var r = transcript.public.@can-register;

Instead use the following synonymous syntax:

var c = transcript["class"];
var a = transcript["abstract"];
var r = transcript["public"].attribute("can-register");

For your reference, here is a list of reserved words in JavaScript and E4X:

abstract    boolean    break       byte        case         catch
char        class      const       continue    debugger     default
delete      do         double      each        else         enum
export      extends    final       finally     float        for
function    goto       if          implements  import       in
instanceof  int        interface   long        namespace    native
new         package    private     protected   public       return
short       static     super       switch      synchronized this
throw       throws     transient   try         typeof       var
void        volatile   while       with        xml
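As a rough aid for applying the two escaping tricks, the following sketch checks whether a name can safely follow the dot operator. The helper name needsBracketNotation and its ASCII-only identifier test are assumptions made for illustration; the word list is copied from the table above. Names containing non-ASCII letters are conservatively reported as needing the bracket form.

```javascript
// Reserved words copied from the list above.
var words = ('abstract boolean break byte case catch char class const ' +
    'continue debugger default delete do double each else enum export ' +
    'extends final finally float for function goto if implements import ' +
    'in instanceof int interface long namespace native new package ' +
    'private protected public return short static super switch ' +
    'synchronized this throw throws transient try typeof var void ' +
    'volatile while with xml').split(' ');

// Build a lookup table so each check is a single property test.
var RESERVED = {};
for (var i = 0; i < words.length; i++) {
    RESERVED[words[i]] = true;
}

// Hypothetical helper: returns true when the name is a reserved word or is
// not a plain ASCII identifier (for example, it contains "." or "-").
function needsBracketNotation(name) {
    var isPlainIdentifier = /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);
    return !isPlainIdentifier || RESERVED.hasOwnProperty(name);
}
```

With the transcript example above, needsBracketNotation("class") and needsBracketNotation("can-register") return true, while needsBracketNotation("phone") returns false.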

The text value of an element with a primitive value (no element children) or of an attribute can be obtained explicitly as a string by the toString() method, but in most circumstances the toString() method will be called automatically when a string value is needed. For a complex element (one with children), toString() returns the XML syntax representation of the element. To obtain the XML syntax representation of a node explicitly (including elements with primitive values), use the toXMLString() method.

c.name.toString()
'<name><first>John</first><last>Smith</last></name>'
c.name.toXMLString()
'<name><first>John</first><last>Smith</last></name>'
c.@number.toString()
'1721'
c.@number.toXMLString()
'1721'
c.phone[0].toString()
'888-555-1212'
c.phone[0].toXMLString()
'<phone type="mobile">888-555-1212</phone>'

One useful construct in XPath is the predicate notation (e.g. "customer/phone[@type='mobile']") to filter a node list. E4X has a similar construct:

c.phone.(@type == "mobile")

In this notation the XMLList of elements matching c.phone is filtered to those whose type attribute has the value "mobile". If no elements match, the result is an empty XMLList, which is best detected by testing ".length() == 0".

Trick: Children hide variables of the same name. If there is a choice within the filtering expression between a child element name or a variable, the child element name will take precedence, sometimes with unexpected results, as shown below:

var customer = <customer>
    <phone type="mobile">888-555-1212</phone>
    <phone type="office">888-555-2121</phone>
    <preferred>mobile</preferred>
</customer> ;
var preferred = "office";
var x = customer.(phone[0].@type == preferred);

In this example, one might expect that x would be empty, as the type attribute of the first phone element is not equal to the value of the "preferred" variable. But instead, the value of the "preferred" element is used.

The rule to remember is that within a filter expression, names are assumed to be relative to the context XML. Only if no such name is found are variables checked.

The following summary provides some XPath equivalents for common XML navigation operations:

| XPath | Meaning | E4X Equivalent |
|-------|---------|----------------|
| element/* | Select all children of element | element.* |
| element/@* | Select all attributes of element | element.@* |
| element//descendant | Select all descendants (children, grandchildren, etc.) of element | element..descendant |
| .. or parent::element | Select the parent of element | element.parent() |
| element/foo:bar (given xmlns:foo="...") | Select the foo:bar child of element, where foo is the prefix of a declared namespace | var foo = new Namespace(...); element.foo::bar |
| name(element) | Return the full name (including prefix, if any) of element | element.name() |
| local-name(element) | Return the local name of element | element.localName() |
| namespace-uri(element) | Return the namespace URI (if any) of element | element.namespace() |
| element/namespace::* | Return the collection of namespaces as an Array of Namespace objects (E4X) or a node-set of namespace nodes (XPath) | element.inScopeNamespaces() |
| element/processing-instruction(name) | Return the processing instruction children of element with the specified name (if omitted, all are returned) | element.processingInstructions(name) |
| string(element) | Return the concatenated text nodes of this element and all its descendants | stringValue(element); using the helper function below |

stringValue.visible = false;
function stringValue(node) {
    var value = "";
    if (node.hasSimpleContent()) {
        value = node.toString();
    } else {
        for each (var c in node.children()) {
            value += stringValue(c);
        }
    }
    return value;
}


Constructing XML

Besides the literal use of XML as a value, or parsing XML text into an XML value, E4X provides a templating mechanism to construct complex XML structures from variable and expression values. Within text content, curly braces can be used to insert a value into the XML:

var nextId = 1234;
var first = "John";
var last = "Smith";
var c = <customer number={nextId++}>
          <name>
            <first>{first}</first>
            <last>{last}</last>
          </name>
        </customer> ;

Attribute values can be computed by replacing the whole attribute value (including the quotes!) with an expression in curly braces, as in the number attribute above. Curly braces inside a quoted attribute value are treated literally.

Although it's rarely needed, element and attribute names can be evaluated too:

var phonetype = "mobile";
var identifiertype = "id";
var c = <{phonetype} {identifiertype}={nextId++}>888-555-2112</{phonetype}>;

XML lists can be created by using the addition operator on individual XML elements:

var employees = <employee name="Joe"/> +
                <employee name="Arun"/> + 
                <employee name="Betty"/>; 


Modifying XML

The value of an element or attribute can be changed by assigning a new value to it:

c.@number = 1235;
c.phone.(@type == 'mobile')[0] = "650-555-1234"; // set the content of the first phone whose type is "mobile"

Deleting an XML element or attribute from a structure is accomplished with the delete operator:

delete c.phone.(@type == 'office')[0]; // remove the first phone whose type is "office"

If one wishes to add a child, one can use the += operator to insert a new element at a particular location:

c.phone += <phone type='home'>650-555-1414</phone>; // append new phone child.
c.phone[0] += <phone type='home'>650-555-1414</phone>; // insert new phone child after the first phone child.

E4X also provides many DOM-like capabilities, often synonyms for the functionality already discussed. The XML object provides the following methods; those marked with a * are also available on XMLList objects.

| Methods of the XML object | Meaning |
|---------------------------|---------|
| addNamespace(namespace) | Adds the namespace to the in-scope namespaces of the element. |
| appendChild(child) | Adds child as a new child of the element, after all other children. |
| attribute(attributeName) * | Returns the attribute with the requested name. |
| attributes() * | Returns the attributes of this element. |
| child(propertyName) * | Returns the child element with the given propertyName, or if propertyName is an integer, returns the child in that position. |
| childIndex() | Returns the index of this child among its siblings. |
| children() * | Returns all the children of this object. |
| comments() * | Returns all the comments that are children of this XML object. |
| contains(value) * | Compares this element with the value, primarily provided to enable an XML element and an XMLList to be used interchangeably. |
| copy() * | Returns a deep copy of the element. The parent property of the copy will be set to null. |
| descendants([name]) * | Returns the descendant elements (children, grandchildren, etc.). If a name is provided, only elements with that name are returned. |
| elements([name]) * | Returns the child elements. If a name is provided, only elements with that name are returned. |
| hasComplexContent() * | Returns true for elements with child elements, otherwise false. |
| hasSimpleContent() * | Returns true for attributes, text nodes, or elements without child elements, otherwise false. |
| inScopeNamespaces() | Returns an array of Namespace objects representing the namespaces in scope for this object. |
| insertChildAfter(child1, child2) | Inserts child2 immediately after child1 in the XML object's children list. |
| insertChildBefore(child1, child2) | Inserts child2 immediately prior to child1 in the XML object's children list. |
| length() * | Returns 1 for XML objects (allowing an XML object to be treated like an XMLList with a single item). |
| localName() | Returns the local name of this object. |
| name() | Returns the qualified name of this object. |
| namespace([prefix]) | Returns the namespace associated with this object, or if a prefix is specified, an in-scope namespace with that prefix. |
| namespaceDeclarations() | Returns an array of Namespace objects representing the namespace declarations associated with this object. |
| nodeKind() | Returns a string representing the kind of object this is (e.g. "element"). |
| normalize() * | Merges adjacent text nodes and eliminates empty ones. |
| parent() * | Returns the parent of this object. For an XMLList object, this returns undefined unless all the items of the list have the same parent. |
| processingInstructions([name]) * | Returns a list of all processing instructions that are children of this element. If a name is provided, only processing instructions matching this name are returned. |
| prependChild(value) | Adds a new child to an element, prior to all other children. |
| removeNamespace(namespace) | Removes a namespace from the in-scope namespaces of the element. |
| replace(propertyName, value) | Replaces a child with a new one. |
| setChildren(value) | Replaces the children of the object with the value (typically an XMLList). |
| setLocalName(name) | Sets the local name of the XML object to the requested value. |
| setName(name) | Sets the name of the XML object to the requested value (possibly qualified). |
| setNamespace(ns) | Sets the namespace of the XML object to the requested value. |
| text() * | Returns the concatenation of all text node children. |
| toString() * | For elements without element children, returns the values of the text node children. For elements with element children, returns the same as toXMLString(). For other kinds of objects, returns the value of the object. |
| toXMLString() * | Serializes this XML object as parseable XML. |
| valueOf() * | Returns this XML object. |


Trick: E4X uses methods where DOM uses properties. DOM users will be used to accessing information about a node (like localName) as properties of the XML Node object. In E4X, child elements are exposed as properties, so similar accessors are modeled as methods instead.


Iteration

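The for statement is extended with a "for each..in" form that makes it easier to traverse the values of an object's properties (especially the children of an XML object, or the items in an XMLList).

var firstCustomerDate = new Date();
for each (var customer in c.customer) {
    var since = new Date(customer.since);
    if (since < firstCustomerDate) {
        firstCustomerDate = since;
    }
}

The two loop forms differ in what they assign to the loop variable: for..in supplies property names (for an XMLList, the index of each item), while for each..in supplies the property values (the items themselves).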
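As a hedged illustration of why the choice of loop form matters, the sketch below uses an ordinary array as a stand-in for an XMLList (an assumption made so the code runs without E4X support; the date strings are made up). A plain for..in loop visits property names, so the loop variable receives the index rather than the item, and the item has to be looked up explicitly. A for each..in loop hands over the items themselves.

```javascript
// Stand-in for an XMLList: an ordinary array with made-up values.
var items = ['2001-05-17', '1999-11-02'];

var keys = [];
var values = [];
for (var key in items) {
    keys.push(key);          // for..in yields property names: "0", then "1"
    values.push(items[key]); // the value must be looked up by that name
}
```

After the loop, keys holds the strings "0" and "1", and values holds the two date strings.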


Further Reading

This information is extracted from the ECMA-357 specification [1]. Further information about E4X, including the details of Namespace and QName objects, can be found in the E4X specification. Contribute your ideas for improving this document to mashup-user@wso2.org.

[1] ECMA-357.pdf