Co-occurrence constraints and Conditional Type Assignment, with XML Schema 1.1
May 29, 2018
1. Introduction
This article discusses two features of the XML Schema 1.1 language in detail: co-occurrence constraints and Conditional Type Assignment. It assumes that the reader knows XML and XML Namespaces and has at least a basic knowledge of XML Schema (most likely version 1.0); the very basics of these technologies are not covered here. XML Schema 1.0 Second Edition has been a W3C standard since October 2004 and is widely implemented in products and libraries. XML Schema 1.1 became a W3C standard in April 2012, and implementations are already available in a few products and libraries. This article uses Xerces-J to test the examples presented; other compliant processors should exhibit similar behavior.
Some forms of co-occurrence constraints were already available in XML Schema 1.0, and they are discussed in this article too. Conditional Type Assignment is a completely new facility introduced in XML Schema 1.1.
The XML Schema 1.1 language is backward compatible with XML Schema 1.0, which means that XML Schema 1.0 validations run fine with an XML Schema 1.1 processor. An XML Schema document describes the structure and data types for a certain class of XML documents (for example, a schema document can describe what a purchase order should look like as an XML document). To do this, the schema document uses notions like element and attribute declarations, and complex and simple type definitions. An XML Schema document is also a well-formed XML document, so it can be processed just like any other XML document. But XML Schema documents are special in the sense that they have elements and attributes from the XML namespace "http://www.w3.org/2001/
2. XML Schema example
Before going deeper into the main topics of this article, I would like to present a simple example consisting of an XML instance document, its corresponding schema, and the validation of the instance document against that schema. This sets the technical context for the rest of the article.
Consider the following example, where an XML document holds data about a person and is validated with an XML Schema.
XML Schema document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="fName" type="xs:string"/>
        <xs:element name="mName" type="xs:string" minOccurs="0"/>
        <xs:element name="lName" type="xs:string"/>
        <xs:element name="gender">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="M"/>
              <xs:enumeration value="F"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name="dob" type="xs:date" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
The following XML instance document is valid according to the above schema document:
<person>
  <fName>Mukul</fName>
  <lName>Gandhi</lName>
  <gender>M</gender>
</person>
(the XML instance document omits the elements "mName" and "dob", which are declared optional in the XML schema).
Following a well-known convention, XML documents that are validated with a schema are referred to as XML instance documents in this article.
3. Why use co-occurrence constraints and conditional type assignments in XML validations?
The nature of data (in particular XML data) is such that co-occurrence constraints and conditional type assignment are among the natural constraints that can exist in XML documents. Users of the XML Schema language have been asking for these features for a long time, and they are now available in XML Schema 1.1. Let's look at the benefits of these features in the paragraphs below.
Co-occurrence constraints: Different elements and attributes in an XML document often need to relate to each other through certain conditions. Some examples are:
- On an XML element, attribute "min" must be less than attribute "max".
- The sum of data in a sequence of elements must meet a certain condition (like it must be equal, less than or greater than some value, etc).
- A specific element's value must relate in a certain way to a specific attribute's value.
We can imagine many other such constraints, which may collectively be termed as "co-occurrence constraints".
Conditional type assignment: This is a specific kind of constraint that solves the following problem when modeling XML data with XML schemas: some properties of an element (mainly the presence or absence of its attributes, or their values) may require a particular type (simple or complex) to be assigned to the element.
These aspects will become clearer in the following sections of this article.
4. Co-occurrence constraints
The XML Schema 1.1 specification defines schema co-occurrence constraints as follows:
"constraints which make the presence of an attribute or element, or the values allowable for it, depend on the value or presence of other attributes or elements".
XML Schema 1.0 provided certain kinds of co-occurrence constraints through the following elements in the schema document: unique, key and keyref (these constructs are known as Identity Constraints, or IDC, in the XML Schema language). These elements are specified on element declarations in a schema document to establish co-occurrence constraints. The unique, key and keyref constraints are available in XML Schema 1.1 as well.
At a high level, both "key" and "unique" constraints require that certain values in the XML document be distinct (i.e. all different).
Following is an example of using the "key" element in an XML Schema document.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="students">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="student" maxOccurs="unbounded" type="Student"/>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="id_key">
      <xs:selector xpath="student"/>
      <xs:field xpath="id"/>
    </xs:key>
  </xs:element>
  <xs:complexType name="Student">
    <xs:sequence>
      <xs:element name="id" type="xs:integer" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
Interestingly, the "id" element within the "student" element is specified as optional in the above schema document. I've chosen this to illustrate the differences between the "key" and "unique" IDC elements.
Following is one valid XML document according to the above schema document:
<?xml version="1.0"?> <students> <student> <id>1</id> </student> <student> <id>2</id> </student> <student> <id>3</id> </student> <student> <id>4</id> </student> <student> <id>5</id> </student> </students>
Following are two invalid XML documents for the schema presented in this example:
<?xml version="1.0"?> <students> <student> <id>1</id> </student> <student> <id>2</id> </student> <student> <id>2</id> </student> <student> <id>4</id> </student> <student> <id>5</id> </student> </students>
(in this example, a particular "id" value occurs more than once, which violates the "key" constraint)
and,
<?xml version="1.0"?> <students> <student> <id>1</id> </student> <student> <id>2</id> </student> <student> <id>3</id> </student> <student> </student> <student> <id>5</id> </student> </students>
(in this example, although the "id" element is declared optional in the schema, it must be present on every "student" because the "key" constraint selects it)
If, in the XSD document of this example, we specified a "unique" element instead of "key" as follows:
<xs:unique name="id_unique">
  <xs:selector xpath="student"/>
  <xs:field xpath="id"/>
</xs:unique>
and everything else in the XSD document remained the same, then the following two XML documents would be reported as valid by such an XSD document:
<?xml version="1.0"?> <students> <student> <id>1</id> </student> <student> <id>2</id> </student> <student> </student> <student> <id>4</id> </student> <student> <id>5</id> </student> </students>
and,
<?xml version="1.0"?> <students> <student> <id>1</id> </student> <student> <id>2</id> </student> <student> </student> <student> </student> <student> <id>5</id> </student> </students>
The "unique" element in an XSD document accepts that the XML item pointed to by xs:field may be absent. The "key" element does not allow this (all XML items pointed to by xs:field must exist and must have valid values).
I won't explain the "keyref" element in this article; readers who are interested can read about it elsewhere.
4.1 XML Schema 1.1 co-occurrence constraints
In this section, I explain the use of the following XML Schema 1.1 constructs: <xs:assert> and <xs:assertion>. The <xs:assert> construct provides co-occurrence constraints. <xs:assertion> looks similar to <xs:assert>, but it is specified as a facet in the definitions of XSD simple types. Although <xs:assertion> is not a co-occurrence construct, I explain it in this section as well, since <xs:assert> and <xs:assertion> are syntactically very similar.
I copy a few definitions from the XML Schema 1.1 specification below to illustrate where in an XSD document the <xs:assert> element can occur (look for "assert" and "assertion" in the content models).
XML Representation Summary: complexType Element Information Item
<complexType
abstract = boolean : false
block = (#all | List of (extension | restriction))
final = (#all | List of (extension | restriction))
id = ID
mixed = boolean
name = NCName
defaultAttributesApply = boolean : true
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (simpleContent | complexContent | (openContent?, (group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)))
</complexType>
XML Representation Summary: simpleContent Element Information Item et al.
<simpleContent
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (restriction | extension))
</simpleContent>
<restriction
base = QName
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (simpleType?, (minExclusive | minInclusive | maxExclusive | maxInclusive | totalDigits | fractionDigits | length | minLength | maxLength | enumeration | whiteSpace | pattern | assertion | {any with namespace: ##other})*)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</restriction>
<extension
base = QName
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</extension>
XML Representation Summary: complexContent Element Information Item et al.
<complexContent
id = ID
mixed = boolean
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (restriction | extension))
</complexContent>
<restriction
base = QName
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, openContent?, (group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*)
</restriction>
<extension
base = QName
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, openContent?, ((group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?), assert*))
</extension>
As can be seen from the above definitions, <xs:assert> can be written in complex type definitions and can occur zero or more times. The asserts are written at the end of complex type definitions.
The element <xs:assert> is itself defined as follows in the XML Schema 1.1 specification:
XML Representation Summary: assert Element Information Item
<assert
id = ID
test = an XPath expression
xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local))
{any attributes with non-schema namespace . . .}>
Content: (annotation?)
</assert>
Quoting from XML Schema 1.1 specification, "An assertion is a predicate associated with a type, which is checked for each instance of the type. If an element or attribute information item fails to satisfy an assertion associated with a given type, then that information item is not locally valid with respect to that type."
What this essentially means is that if assertions are specified on a complex type, then every XML instance fragment validated by that complex type is also validated against the rules expressed by the assertions. An assertion, when evaluated on an XML instance document, results in a boolean 'true' or 'false' outcome. If the evaluation of any assertion results in 'false', then the XML instance fragment concerned, and the XML instance document as a whole, is reported as invalid by the XML Schema 1.1 processor. Assertions are specified as XPath 2.0 expressions, which are evaluated on a strongly typed XML tree (the types being XML Schema types) rooted at the element that the complex type is validating.
Let us look at a few XML Schema use cases where assertions are useful. The following examples, with the relevant explanations, illustrate this.
Assertion example 1:
Consider the following XSD 1.1 schema:
(the complex type of this XSD document is borrowed from XML Schema 1.1 specification)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:attribute name="min" type="xs:int"/>
      <xs:attribute name="max" type="xs:int"/>
      <xs:assert test="@min le @max"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
(the assertion in this example requires that the value of the "min" attribute be less than or equal to that of the "max" attribute for the XML document to be considered valid)
Following is a valid XML document, when validated by the above schema:
<?xml version="1.0"?> <data min="5" max="10"/>
The following XML document, in contrast, is invalid when validated by the schema in this example:
<?xml version="1.0"?> <data min="5" max="2"/>
The XML Schema 1.1 specification allows zero or more <xs:assert> elements at a particular point. Consider the following XSD document, which is a slight variation of the XSD document specified above:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="data">
    <xs:complexType>
      <xs:attribute name="min" type="xs:int"/>
      <xs:attribute name="max" type="xs:int"/>
      <xs:assert test="@min le @max"/>
      <xs:assert test="@min gt 2"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
In this example we have two <xs:assert> elements, both of which need to return the boolean value 'true' for the validation to pass. The second <xs:assert> specifies that the "min" attribute must be greater than 2.
Following is a valid XML document for the above schema:
<?xml version="1.0"?> <data min="5" max="10"/>
The following XML document, however, is reported as invalid by the schema specified in this example:
<?xml version="1.0"?> <data min="1" max="10"/>
Interestingly, the two asserts specified in this example can be combined into the following single <xs:assert>:
<xs:assert test="(@min le @max) and (@min gt 2)"/>
(please note the "and" condition in the assert's XPath expression)
Note that any assert can use the full schema-aware XPath 2.0 language (see the References section for a link to the XPath 2.0 language specification).
Assertion example 2:
Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="IDVals">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:assert test="count(id) eq count(distinct-values(id))"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
The assertion in the above XSD document specifies that all "id" element values must be distinct.
Following is a valid XML document as per schema above:
<?xml version="1.0"?> <IDVals> <id>1</id> <id>2</id> <id>3</id> <id>4</id> <id>5</id> </IDVals>
Whereas, following is an invalid XML document when validated by the schema document provided in this example:
<?xml version="1.0"?> <IDVals> <id>1</id> <id>2</id> <id>3</id> <id>4</id> <id>2</id> </IDVals>
(the value 2 has occurred more than once)
It can be argued that the assertion in this example specifies an IDC-like constraint (as the "key" or "unique" XSD elements would). It is really up to the user's taste whether to use IDC constraints or assertions for a requirement like the one in this example.
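For comparison, the same distinctness requirement could be sketched with an identity constraint instead of an assertion. This is only an illustration of the IDC alternative mentioned above, not a listing from the original example; the constraint name "id_unique" is an arbitrary choice, and the field XPath "." selects the value of each "id" element itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="IDVals">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <!-- Assumed name; plays the role of the assert in the schema above -->
    <xs:unique name="id_unique">
      <xs:selector xpath="id"/>
      <xs:field xpath="."/>
    </xs:unique>
  </xs:element>
</xs:schema>
```

Since this variant uses only XML Schema 1.0 constructs, it should also work with a 1.0 processor, which may matter when XSD 1.1 support is not available.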
Assertion example 3:
Let us look at another XSD example, which checks sort order using an <xs:assert> expression. Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="IDVals">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:assert test="every $val in (for $x in 1 to count(id)-1 return (id[$x] le id[$x+1])) satisfies ($val eq true())"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
The <xs:assert> element in this XSD document requires that the values of the "id" elements occur in ascending sorted order. In this example, consecutive equal values are considered sorted.
Let's try to understand the XPath 2.0 expression used in this example. A "for" expression is embedded in an "every" expression. The "for" expression returns a sequence of boolean values, one per pair of adjacent "id" elements ('true' means that the earlier value is less than or equal to the value that follows it). The "every" expression requires that each item in the sequence returned by the "for" expression be 'true'. The XPath expression shown is a naive implementation of the sorted-order check.
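As a hedged alternative, the same check can probably be written more directly by letting the "every" expression range over the positions itself, which avoids building the intermediate sequence of booleans. This is a sketch of an equivalent assert, not the form used in the original example:

```xml
<!-- Drop-in replacement for the assert in the schema above:
     each "id" must be less than or equal to the one that follows it -->
<xs:assert test="every $x in 1 to count(id) - 1 satisfies id[$x] le id[$x + 1]"/>
```

When there is only one "id" element, the range "1 to 0" is empty and "every" is true for an empty sequence, so a single value counts as sorted. Replacing "le" with "lt" would require strictly ascending values.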
Assertion example 4:
Assertions also provide new string processing capabilities for complex types with "mixed" content models. Following is one example of this. Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="X">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="x" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:assert test="not(contains(string-join(text(), ''), 'prohibited'))"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
(note mixed="true" on complexType, and the <xs:assert> element)
The assertion specifies that the word "prohibited" must not occur in the text nodes that are children of the element node "X".
Following are two valid XML documents for the above schema document:
<?xml version="1.0"?> <X> <x>a</x> <x>b</x> <x>c</x> <x>d</x> <x>e</x> </X>
and,
<?xml version="1.0"?> <X> <x>a</x> <x>b</x> <x>c</x> f <x>d</x> <x>e</x> </X>
Whereas, following is one invalid XML document when validated by the XSD document provided in this example:
<?xml version="1.0"?> <X> <x>a</x> <x>b</x> <x>c</x> prohibited <x>d</x> <x>e</x> </X>
Assertion example 5:
Assertions also allow arbitrary string processing when using the <xs:any> wildcard schema element (the <xs:any> wildcard in the schema allows any element to occur at its point in the XML instance document). Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="X">
    <xs:complexType>
      <xs:sequence>
        <xs:any processContents="skip"/>
      </xs:sequence>
      <xs:assert test="not(temp)"/>
      <xs:assert test="string-length(string(*[1])) lt 10"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
The first assertion specifies that the child element of element "X" must not be named "temp". The second assertion specifies that the string value of the element that is the child of "X" must have a maximum length of 9.
Following is one valid XML document for the XSD schema shown above:
<?xml version="1.0"?> <X> <x>abcde</x> </X>
Whereas following two XML documents would be invalid, when validated by the same XSD schema:
<?xml version="1.0"?> <X> <temp>abcde</temp> </X>
(element "temp" is not allowed by the <xs:assert>)
and,
<?xml version="1.0"?> <X> <x>abcdefghij</x> </X>
(the string "abcdefghij" exceeds the length required by <xs:assert>)
Assertion example 6:
In this example, we'll see how specifying <xs:assert> works in an XSD complex type derivation. Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="X">
    <xs:complexType>
      <xs:complexContent>
        <xs:extension base="xSpec">
          <xs:attribute name="id" type="xs:integer"/>
          <xs:assert test="@id mod 2 = 0"/>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="xSpec">
    <xs:sequence>
      <xs:element name="x" type="xs:string"/>
    </xs:sequence>
    <xs:assert test="contains(x, 'hello')"/>
  </xs:complexType>
</xs:schema>
Following is one valid XML document, when validated by the above XSD schema:
<?xml version="1.0"?> <X id="2"> <x>abcdefhello</x> </X>
(the value of attribute "id" is an even integer, and the value of element "x" contains the word "hello"; the assertions of both the base type and the derived type must hold)
Following are two invalid XML documents, when validated by the XSD document specified in this example:
<?xml version="1.0"?> <X id="2"> <x>abcdef</x> </X>
(the value of element "x" doesn't contain the word "hello")
and,
<?xml version="1.0"?> <X id="1"> <x>abcdefhello</x> </X>
(the value of the "id" attribute is not even)
Other useful XSD 1.1 schemas can be written with <xs:assert> by using XPath 2.0 functions like "avg", "max", "min" and "sum" (in fact, the whole XPath 2.0 functions and operators library can be used in <xs:assert> XPath expressions).
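As a minimal sketch of such an aggregate constraint, the following schema requires that the sum of a sequence of element values equal an attribute value. The element and attribute names ("order", "amount", "total") are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical vocabulary: an order whose amounts must add up to @total -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="amount" type="xs:decimal" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="total" type="xs:decimal" use="required"/>
      <xs:assert test="sum(amount) eq @total"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, an instance such as <order total="60"> with "amount" values 10, 20.5 and 29.5 should be valid, while changing the "total" attribute to 50 should make it invalid.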
With Xerces-J, XML comments and processing instructions (PIs) are by default not available in the XPath Data Model (XDM) trees that <xs:assert> expressions can access. Setting a specific validation feature to boolean 'true' through the API makes comments and PIs available in the XDM trees during assertion evaluation. With this feature enabled, <xs:assert> expressions can check for the presence or absence of comments and PIs and do string processing on them.
4.2 Simple Type facet <xs:assertion>
The XML Schema 1.1 language has introduced a new facet for simple types, named <xs:assertion>. Although <xs:assertion> doesn't provide co-occurrence constraints, I'm describing it here because it is quite similar to <xs:assert> at the syntactic level (<xs:assert> and <xs:assertion> both use XPath 2.0 expressions as a predicate language).
Following is an excerpt from the XML Schema 1.1 specification showing the grammar of simple types:
XML Representation Summary: simpleType Element Information Item et al.
<simpleType
final = (#all | List of (list | union | restriction | extension))
id = ID
name = NCName
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (restriction | list | union))
</simpleType>
<restriction
base = QName
id = ID
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (simpleType?, (minExclusive | minInclusive | maxExclusive | maxInclusive | totalDigits | fractionDigits | length | minLength | maxLength | enumeration | whiteSpace | pattern | assertion | explicitTimezone | {any with namespace: ##other})*))
</restriction>
<list
id = ID
itemType = QName
{any attributes with non-schema namespace . . .}>
Content: (annotation?, simpleType?)
</list>
<union
id = ID
memberTypes = List of QName
{any attributes with non-schema namespace . . .}>
Content: (annotation?, simpleType*)
</union>
The element <xs:assertion> is itself defined as follows in the XML Schema 1.1 specification:
XML Representation Summary: assertion Element Information Item
<assertion
id = ID
test = an XPath expression
xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local))
{any attributes with non-schema namespace . . .}>
Content: (annotation?)
</assertion>
Let's see a fairly simple example of using the <xs:assertion> facet in an XSD schema. Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="IDVals">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="ID" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="ID">
    <xs:restriction base="xs:nonNegativeInteger">
      <xs:assertion test="$value mod 2 = 0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
Following is one valid XML document, when validated by the above XSD document:
<?xml version="1.0"?> <IDVals> <id>2</id> <id>4</id> <id>6</id> <id>8</id> <id>10</id> </IDVals>
(the value of the "id" element must be an even integer, as required by the schema)
Whereas following is one invalid XML document, when validated by the XSD document specified in this example:
<?xml version="1.0"?> <IDVals> <id>1</id> <id>2</id> <id>3</id> <id>4</id> <id>5</id> </IDVals>
(the "id" values 1, 3 and 5 are not valid according to the XSD schema)
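Because <xs:assertion> is a facet, it can sit next to the ordinary facets of a restriction. The following is a minimal sketch of that combination; the type name "EvenPercent" is hypothetical and not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical simple type: even integers in the range 0..100 -->
  <xs:simpleType name="EvenPercent">
    <xs:restriction base="xs:nonNegativeInteger">
      <xs:maxInclusive value="100"/>
      <xs:assertion test="$value mod 2 = 0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this sketch, a value such as 42 should satisfy both facets, 7 should fail the assertion, and 102 should fail the maxInclusive facet even though it is even.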
5. Conditional Type Assignment
The XML Schema 1.1 specification also refers to this XSD construct as "Type Alternatives". In this article, the feature is also referred to as CTA. The "Type Alternative" XSD feature is also a form of co-occurrence constraint (as we shall see in this section), but I have chosen to describe it in a section of its own. Assertions are the generic co-occurrence constraint feature, while CTAs provide one particular kind of co-occurrence constraint.
We can specify zero or more <xs:alternative> XSD elements as children of the <xs:element> construct.
The following XSD grammar fragments illustrate the syntax of the <xs:alternative> construct:
XML Representation Summary: element Element Information Item
<element
abstract = boolean : false
block = (#all | List of (extension | restriction | substitution))
default = string
final = (#all | List of (extension | restriction))
fixed = string
form = (qualified | unqualified)
id = ID
maxOccurs = (nonNegativeInteger | unbounded) : 1
minOccurs = nonNegativeInteger : 1
name = NCName
nillable = boolean : false
ref = QName
substitutionGroup = List of QName
targetNamespace = anyURI
type = QName
{any attributes with non-schema namespace . . .}>
Content: (annotation?, ((simpleType | complexType)?, alternative*, (unique | key | keyref)*))
</element>
<alternative
id = ID
test = an XPath expression
type = QName
xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local))
{any attributes with non-schema namespace . . .}>
Content: (annotation?, (simpleType | complexType)?)
</alternative>
Let's look at a simple example illustrating the functionality of the <xs:alternative> element. Consider the following XSD document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Addresses">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="address" minOccurs="2" maxOccurs="2">
          <xs:alternative test="@format ='US'" type="USAddress"/>
          <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="USAddress">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="state" type="xs:string"/>
      <xs:element name="zip" type="xs:positiveInteger"/>
    </xs:sequence>
    <xs:attribute name="format" type="xs:string" fixed="US"/>
  </xs:complexType>
  <xs:complexType name="CanadaAddress">
    <xs:sequence>
      <xs:element name="civicAddress" type="xs:string"/>
      <xs:element name="municipality" type="xs:string"/>
      <xs:element name="province" type="xs:string"/>
      <xs:element name="postalCode" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="format" type="xs:string" fixed="Canada"/>
  </xs:complexType>
</xs:schema>
According to the XSD document above, the "address" element has the type "USAddress" in an XML instance document if its "format" attribute has the value "US". If the "format" attribute has the value "Canada", the "address" element has the type "CanadaAddress". That is, depending on the value of an attribute in the XML instance document, we can assign different XSD types to an element.
Following is one valid XML document, according to the XSD document presented above:
<?xml version="1.0"?>
<Addresses>
  <address format="US">
    <street>123 Main Street</street>
    <city>Lansing</city>
    <state>Michigan</state>
    <zip>48864</zip>
  </address>
  <address format="Canada">
    <civicAddress>10-123 1/2 Main ST SE</civicAddress>
    <municipality>Montreal</municipality>
    <province>QC</province>
    <postalCode>H3Z 2Y7</postalCode>
  </address>
</Addresses>
Let's also look at the following variations on the above XSD validation:
1) In the XML instance document, we give the "format" attribute of the "address" element a value other than "US" or "Canada".
In this case, the validation still passes. Let's discuss why. The <xs:element> declaration with <xs:alternative> elements in it is implicitly equivalent to the following:
<xs:element name="address" minOccurs="2" maxOccurs="2" type="xs:anyType">
  <xs:alternative test="@format ='US'" type="USAddress"/>
  <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
</xs:element>
That is, if the "format" attribute has neither the value "US" nor "Canada", then the "address" element has the type "xs:anyType".
Also consider the following syntax:
<xs:element name="E1" type="T3">
  <xs:alternative test="..." type="T1"/>
  <xs:alternative test="..." type="T2"/>
</xs:element>
The XML instance element "E1" can have the types T1, T2 or T3. Types T1 and T2 must derive from type T3 (the special type xs:error is the one exception). This constraint is required by the XML Schema 1.1 specification.
2) We change the element declaration in the XSD document to the following:
<xs:element name="address" minOccurs="2" maxOccurs="2">
  <xs:alternative test="@format ='US'" type="USAddress"/>
  <xs:alternative test="@format ='Canada'" type="CanadaAddress"/>
  <xs:alternative type="xs:error"/>
</xs:element>
Now if, in the XML instance document, the value of the "format" attribute is anything other than "US" or "Canada", the type xs:error is assigned to the element "address". xs:error is an XSD simple type, and any element or attribute that is assigned the type xs:error is invalid. Therefore, in this case the XSD validation fails.
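The two variations above can be combined in one small sketch: a declared type that the alternative types derive from (the T3 role from variation 1), plus a final xs:error alternative that rejects everything else (variation 2). All names here ("shape", "Shape", "Circle", "Square", "kind", "radius", "side") are hypothetical and serve only to illustrate the derivation requirement.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Declared type "Shape" plays the role of T3 -->
  <xs:element name="shape" type="Shape">
    <xs:alternative test="@kind = 'circle'" type="Circle"/>
    <xs:alternative test="@kind = 'square'" type="Square"/>
    <!-- No test attribute: applies when no earlier alternative matched -->
    <xs:alternative type="xs:error"/>
  </xs:element>

  <xs:complexType name="Shape">
    <xs:attribute name="kind" type="xs:string"/>
  </xs:complexType>

  <!-- Both alternative types derive from Shape by extension -->
  <xs:complexType name="Circle">
    <xs:complexContent>
      <xs:extension base="Shape">
        <xs:attribute name="radius" type="xs:decimal" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="Square">
    <xs:complexContent>
      <xs:extension base="Shape">
        <xs:attribute name="side" type="xs:decimal" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Under this sketch, <shape kind="circle" radius="2.5"/> should be valid, <shape kind="circle"/> should fail because the required "radius" attribute is missing, and <shape kind="triangle"/> should fail because it is assigned the type xs:error.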
"test" attribute on <xs:alternative>:
The value of the "test" attribute on the <xs:alternative> element is an XPath expression that evaluates to either boolean 'true' or 'false'. The context item for the "test" evaluation is the element node on which the <xs:alternative> constructs are specified. The XML Schema 1.1 language says that we can write these "test" expressions using a CTA-specific XPath language (a subset specified in the XSD 1.1 specification) or using the full XPath 2.0 language. Xerces-J uses the CTA-specific XPath language by default, but it can use the full XPath 2.0 language if a specific feature is set on the validation API during XSD validation. Other XSD 1.1 validators can choose to implement this in their own ways.
6. Summary
I hope this article has helped readers understand how the co-occurrence constraint and type alternative features of the XSD 1.1 language work.
While discussing assertions and conditional type assignment, I haven't covered the XML namespace related features of the assertion and CTA constructs. They are simple to understand if we know the fundamentals of XML namespaces, and I leave it to interested readers to explore them.
7. References
- W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures: http://www.w3.org/TR/xmlschema11-1/
- W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes: http://www.w3.org/TR/xmlschema11-2/
- XML Path Language (XPath) 2.0: http://www.w3.org/TR/xpath20/
- XQuery 1.0 and XPath 2.0 Functions and Operators: http://www.w3.org/TR/xquery-operators/
- Xerces-J XML tools library: http://xerces.apache.org/xerces2-j/