XML Schemas support built-in data types and user-defined data types, which are used to validate the text content of elements and attributes.
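XML Schemas divide user-defined data types into two broad categories: simple types and complex types.
This topic also considers extending data definitions and content types (mixed content, empty content and anyType).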
The following table lists the main built-in data types supported by XML Schema:
Built-in XML Schema Data Types

Type | Description |
---|---|
string | Any sequence of characters |
normalizedString | A whitespace-normalized string in which every tab, carriage return and line feed character is replaced by a space |
token | A string with no tabs, carriage returns or line feeds, no leading or trailing spaces, and no sequence of two or more spaces |
byte | An integer from -128 to 127 |
unsignedByte | An integer from 0 to 255 |
base64Binary | Base64-encoded binary information |
hexBinary | Hexadecimal-encoded binary information |
integer | A numeric value representing a whole number |
positiveInteger | An integer whose value is greater than 0 |
negativeInteger | An integer whose value is less than 0 |
nonNegativeInteger | An integer whose value is 0 or greater |
nonPositiveInteger | An integer whose value is 0 or less |
int | An integer from -2 147 483 648 to 2 147 483 647 |
unsignedInt | An integer from 0 to 4 294 967 295 |
long | An integer from -9 223 372 036 854 775 808 to 9 223 372 036 854 775 807 |
unsignedLong | An integer from 0 to 18 446 744 073 709 551 615 |
short | An integer from -32 768 to 32 767 |
unsignedShort | An integer from 0 to 65 535 |
decimal | A numeric value that may or may not contain a fractional part |
float | A 32-bit floating-point number, e.g. 1E4, 1267.43233E12, 12.78e-2, 12 |
double | A 64-bit floating-point number, e.g. 1E4, 1267.43233E12, 12.78e-2, 12 |
boolean | A logical value: true, false, 1 or 0 |
time | A time of day that recurs every day, in the format hh:mm:ss, e.g. 12:30:00; an optional time zone may follow, e.g. 12:30:00Z for UTC (Coordinated Universal Time) |
date | A date value in the format YYYY-MM-DD |
dateTime | A combined date and time value in the format YYYY-MM-DDThh:mm:ss, e.g. 2005-09-01T12:30:00 (the date and time are separated by the letter T) |
duration | Length of a time interval in the format PnYnMnDTnHnMnS, e.g. P1Y1M1DT1H1M1S = 1 Year + 1 Month + 1 Day + 1 Hour + 1 Minute + 1 Second (the letter T separates the date part from the time part) |
gMonth | A Gregorian (calendar) month, the month (MM) part of a date, written --MM |
gYear | A Gregorian (calendar) year, the year (YYYY) part of a date |
gYearMonth | A Gregorian year and month, the year-month (YYYY-MM) part of a date |
gDay | A Gregorian day, the day (DD) part of a date, written ---DD |
gMonthDay | A specific day of a month, the month-day (MM-DD) part of a date, written --MM-DD |
Name | A string that follows the naming rules for well-formed elements and attributes |
QName | A namespace-qualified name of the form prefix:localName, e.g. if the namespace is declared as xmlns:html="http://www.w3.org/1999/xhtml", the qualified name html:p resolves to {http://www.w3.org/1999/xhtml}p |
NCName | A "non-colonized" name, i.e. a Name that contains no colon, such as the part of a qualified name to the right of the namespace prefix and colon, e.g. in html:p the NCName is p (the prefix html is itself also an NCName) |
anyURI | Represents a URI (Uniform Resource Identifier) and can contain a URL or URN |
language | A language code as defined in RFC 1766, e.g. en-US (RFC 1766, ISO 639 Language Codes) (accessed September 2005) |
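As an illustration of how built-in types are referenced, the following sketch declares a few elements and notes a value that would be valid for each. The element names (`quantity`, `inStock`, `ordered`, `warranty`) and the sample values are hypothetical and chosen only for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element names, used only for illustration -->

  <!-- e.g. <quantity>200</quantity> (0 to 255) -->
  <xsd:element name="quantity" type="xsd:unsignedByte"/>

  <!-- e.g. <inStock>true</inStock> (true, false, 1 or 0) -->
  <xsd:element name="inStock" type="xsd:boolean"/>

  <!-- e.g. <ordered>2005-09-01T12:30:00</ordered> (note the T separator) -->
  <xsd:element name="ordered" type="xsd:dateTime"/>

  <!-- e.g. <warranty>P1Y6M</warranty> (1 year and 6 months) -->
  <xsd:element name="warranty" type="xsd:duration"/>

</xsd:schema>
```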
A simpleType declaration follows the format:
```
<!-- Basic declaration -->
<declarationPrefix:simpleType id=ID final=(#all | List of (list | union | restriction)) name=NCName>

<!-- Declaration of a restriction -->
<declarationPrefix:restriction base=QName>

<!-- Declaration of a list -->
<declarationPrefix:list itemType=QName>

<!-- Declaration of a union -->
<declarationPrefix:union memberTypes=List of QName>
```
For more details on further options see www.xml.dvint.com/docs/SchemaDataTypesQR-1.pdf (accessed September 2005)
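To make the list and union forms concrete, here is a minimal sketch. In XML Schema a list names its item type with the `itemType` attribute and a union names its member types with the `memberTypes` attribute. The type names `integerList` and `sizeOrCode` are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A list type: whitespace-separated integers, e.g. "1 2 3" -->
  <xsd:simpleType name="integerList">
    <xsd:list itemType="xsd:integer"/>
  </xsd:simpleType>

  <!-- A union type: a value is valid if it matches any one member type -->
  <xsd:simpleType name="sizeOrCode">
    <xsd:union memberTypes="xsd:positiveInteger xsd:token"/>
  </xsd:simpleType>

</xsd:schema>
```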
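Although the simpleType declaration format may seem confusing, it follows a simple structure: a name for the new type, a base type that it is derived from, and a set of constraining facets that restrict the base type: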
Constraining Facets in XML Schema

Facet | Description |
---|---|
length | Exact number of characters or, for lists, exact number of list items |
minLength | Minimum number of characters or, for lists, minimum number of list items |
maxLength | Maximum number of characters or, for lists, maximum number of list items |
pattern | Defines a pattern (a regular expression) that acceptable values must match |
enumeration | Restricts the allowed values to a set of specified values |
whiteSpace | Sets how line feeds, tabs, spaces and carriage returns are treated when the document is parsed (preserve, replace or collapse) |
maxInclusive | Maximum value including the number specified |
minInclusive | Minimum value including the number specified |
maxExclusive | Maximum value excluding the number specified |
minExclusive | Minimum value excluding the number specified |
totalDigits | Maximum total number of digits allowed in a decimal number, counting both the integer and fractional parts (must be a positive number) |
fractionDigits | Maximum number of digits allowed in the fractional part of a decimal number (must be a non-negative number) |
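The following sketch shows how several of these facets might be combined inside a restriction. The type names (`weekendDay`, `amount`, `countryCode`) and the chosen limits are hypothetical and serve only to illustrate the syntax.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- enumeration: only the listed values are allowed -->
  <xsd:simpleType name="weekendDay">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Saturday"/>
      <xsd:enumeration value="Sunday"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- totalDigits / fractionDigits: at most 6 digits in total,
       at most 2 of them after the decimal point, e.g. 1234.56 -->
  <xsd:simpleType name="amount">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="6"/>
      <xsd:fractionDigits value="2"/>
      <xsd:minInclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- whiteSpace / length: surrounding whitespace is collapsed,
       then exactly 2 characters are required, e.g. "GB" -->
  <xsd:simpleType name="countryCode">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="collapse"/>
      <xsd:length value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```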
The structure is often best understood in an example:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The XML declaration above must be the first line of the document -->

<!-- XML Schema Declaration -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simple Type Declaration -->
  <xsd:simpleType name="my_day_of_month">
    <!-- Declaration of Restriction Base and Conditions -->
    <xsd:restriction base="xsd:positiveInteger">
      <!-- constraining facets go here -->
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Another Simple Type Declaration -->
  <xsd:simpleType name="numeric_postal_code">
    <!-- Declaration of Restriction Base and Conditions -->
    <xsd:restriction base="xsd:integer">
      <!-- constraining facets go here -->
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```
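The restrictions in the example above name a base type but list no conditions. A minimal sketch of how they might be completed follows. The limits used here (a day of the month no greater than 31, a postal code of exactly five digits) are assumptions made for illustration and are not taken from the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Assumption: valid days of the month are 1 to 31 -->
  <xsd:simpleType name="my_day_of_month">
    <xsd:restriction base="xsd:positiveInteger">
      <xsd:maxInclusive value="31"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Assumption: postal codes are written as exactly five digits -->
  <xsd:simpleType name="numeric_postal_code">
    <xsd:restriction base="xsd:integer">
      <xsd:pattern value="[0-9]{5}"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

With these facets, 31 would be accepted as a `my_day_of_month` value and 32 rejected. The pattern facet is checked against the text as written, so 01234 would be accepted as a `numeric_postal_code` and 1234 rejected.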
The ability to define complex data types in XML is one of its most powerful features.
A complex type element declaration is used to describe data collections and choices.
A complex type definition is most easily understood as a set of data variables (elements and attributes) together with a content model, which is the construct that defines how the elements are combined.
```xml
<?xml version="1.0" encoding="UTF-8"?>

<!-- XML Schema Declaration -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Complex Type Declaration -->
  <xsd:complexType name="contactDetails">
    <!-- Specifying the Content Model -->
    <xsd:sequence>
      <!-- Listing the Elements -->
      <xsd:element name="firstName" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
    <!-- Listing the Attributes (declared after the content model) -->
    <xsd:attribute name="title" type="xsd:string" use="optional"/>
  </xsd:complexType>

</xsd:schema>
```
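Elements can be combined using the following content models: sequence (the elements appear in the order listed), choice (exactly one of the listed elements appears) and all (each listed element appears at most once, in any order).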
The number of times each element may occur is specified using the minOccurs and maxOccurs attributes. minOccurs can be any non-negative integer, and maxOccurs can be any non-negative integer or "unbounded"; both default to 1. Specifying minOccurs="0" makes the element optional. The Occurrence Constraints are further discussed in XML Schema.
Attributes are declared as usual, with an additional use constraint that states whether the attribute is required or optional.
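The occurrence and use constraints described above can be sketched as follows, using the sequence and choice content models. The type name `contactMethods` and its element and attribute names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:complexType name="contactMethods">
    <xsd:sequence>
      <!-- Required, exactly once (minOccurs and maxOccurs default to 1) -->
      <xsd:element name="lastName" type="xsd:string"/>
      <!-- Optional -->
      <xsd:element name="middleName" type="xsd:string" minOccurs="0"/>
      <!-- One to three entries, each either a phone or an email -->
      <xsd:choice minOccurs="1" maxOccurs="3">
        <xsd:element name="phone" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
      </xsd:choice>
    </xsd:sequence>
    <!-- Must be present on every instance -->
    <xsd:attribute name="id" type="xsd:positiveInteger" use="required"/>
    <!-- May be omitted -->
    <xsd:attribute name="title" type="xsd:string" use="optional"/>
  </xsd:complexType>

</xsd:schema>
```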
Both simpleType and complexType definitions can be constrained using a restriction.
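For a complex type, a restriction repeats the content model of the base type with tighter constraints. The sketch below assumes the contactDetails type from the earlier example and derives a hypothetical type, `singleNameContact`, that allows only one firstName.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Base type, repeated here so the sketch is self-contained -->
  <xsd:complexType name="contactDetails">
    <xsd:sequence>
      <xsd:element name="firstName" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="title" type="xsd:string" use="optional"/>
  </xsd:complexType>

  <!-- Hypothetical restricted type: firstName may now occur only once -->
  <xsd:complexType name="singleNameContact">
    <xsd:complexContent>
      <xsd:restriction base="contactDetails">
        <xsd:sequence>
          <xsd:element name="firstName" type="xsd:string" minOccurs="1" maxOccurs="1"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
```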
Data definitions in XML Schema can be extended; the new elements are appended after those inherited from the base definition. The following example extends the contactDetails definition above.
```xml
<!-- Specifying the New Data Set Definition Name -->
<xsd:complexType name="extendedContactDetails">
  <xsd:complexContent>
    <!-- Specifying Data Definition being Extended -->
    <xsd:extension base="contactDetails">
      <!-- Specifying the Content Model -->
      <xsd:sequence>
        <!-- Listing the New Elements -->
        <xsd:element name="address" type="xsd:string"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
```
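As a rough illustration of the result, the instance below assumes that an element named `contact` (a hypothetical name) has been declared with the type extendedContactDetails. The inherited firstName element and title attribute come first, followed by the new address element. The sample values are invented.

```xml
<!-- Assumed declaration (hypothetical element name):
     <xsd:element name="contact" type="extendedContactDetails"/> -->
<contact title="Dr">
  <firstName>John</firstName>
  <address>1 Example Street</address>
</contact>
```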
XML data definitions can reflect the context in which they will be used by specifying a content type: mixed content (text interleaved with child elements), empty content (attributes only) or any content (anyType).
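```xml
<!-- Data Type Declaration: Order Confirmation, note mixed declaration -->
<!-- A mixed type can only extend a base type that is also mixed, so
     extendedContactDetails and contactDetails need mixed="true" as well -->
<xsd:complexType name="orderConfirmation" mixed="true">
  <xsd:complexContent>
    <xsd:extension base="extendedContactDetails">
      <!-- Specifying the Content Model -->
      <xsd:sequence>
        <!-- Listing the New Elements -->
        <xsd:element name="orderId" type="xsd:positiveInteger"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>

<!-- Using the Order Confirmation Data Type -->
<orderConfirmation>
  Dear <firstName>John Smith</firstName>
  <!-- ... -->
</orderConfirmation>
```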
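For comparison, here is a smaller self-contained sketch of mixed content that does not rely on type extension. The element name `letter` and the sample text are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- mixed="true" allows text between the child elements -->
  <xsd:element name="letter">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element name="firstName" type="xsd:string"/>
        <xsd:element name="orderId" type="xsd:positiveInteger"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

An instance that should validate against this sketch:

```xml
<letter>Dear <firstName>John</firstName>, your order <orderId>1032</orderId> has shipped.</letter>
```

The child elements must still appear in the declared order; only the surrounding text is unconstrained.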
```xml
<!-- Shorthand for an Empty Complex Data Type -->
<xsd:complexType name="internationalPrice">
  <!-- List Attributes Only -->
  <xsd:attribute name="currency" type="xsd:string"/>
</xsd:complexType>
```
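An element of this type carries information only in its attributes. The sketch below assumes an element named `price` (a hypothetical name) declared with the internationalPrice type.

```xml
<!-- Assumed declaration (hypothetical element name):
     <xsd:element name="price" type="internationalPrice"/> -->

<!-- Valid: no content, only the attribute -->
<price currency="EUR"/>
```

Under that assumption, `<price currency="EUR">10</price>` would be rejected because the type allows no text or child elements.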
```xml
<!-- Declared anyType Data Type -->
<xsd:element name="anything" type="xsd:anyType"/>

<!-- No Type Declaration - default anyType (alternative to the declaration above) -->
<xsd:element name="anything"/>
```
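Because anyType places no constraints on content, an element declared this way accepts text, child elements, attributes, or a mixture of them. The two instances below are hypothetical illustrations; each is a separate example, not one document.

```xml
<!-- Plain text content -->
<anything>Some free text</anything>
```

```xml
<!-- Attributes, child elements and text mixed together -->
<anything id="42">
  Note for <firstName>John</firstName>
</anything>
```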