K2 BLACKPEARL PRODUCT DOCUMENTATION: USER GUIDE
XML Schema

K2 Concepts - XML Schema

An XML Schema describes the structure and content of document types in XML. The real power of XML schemas, however, is that they enable organizations to define and share a common vocabulary on which to base their documents and scripts.

XML Schemas also help to validate the structure and content of entered data automatically. Because about 60% of all code in most business applications is concerned with data validation, this is a great saving.

This topic highlights the structure of an XML Schema, gives some idea of how the basic model can be extended and lists some best practices.

The Structure of an XML Schema

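The structure of the XML Schema will be discussed in stages and finally a complete Schema will be shown. Remember that a schema defines the structure and content of a document type.

This topic covers:

  • Schema Declaration
  • Schema Management
  • Type and Element Declarations
  • Attribute Declaration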

Some of these items have been covered in XML Basics and will only be highlighted here. More detail on the data types is provided in XML Data Types.

Schema Declaration

An XML Schema starts with an XML (document) declaration, but more important is the schema declaration, which defines namespaces, sets various data transparency parameters, and so on. Essentially, it declares the environment in which the schema will be effective.

Format

<schema 
  id=ID
  xmlns:xsd=URI
  ...
  xmlns:my_xsd=myURI
  targetNamespace=anyURI
  attributeFormDefault=('qualified'|'unqualified')
  elementFormDefault=('qualified'|'unqualified')
  xml:lang=language>
...
</schema>

Schema is the root element of the schema definition
The id attribute uniquely identifies this schema
The namespaces used in this schema are listed, each bound to a prefix
The target namespace is set - the elements, attributes and types declared in this schema belong to this namespace
The transparency (hidden / exposed) of locally declared attribute names is set - whether they must be qualified (showing the namespace information) in instance documents or not. Defaults to unqualified
The transparency (hidden / exposed) of locally declared element names is set - whether they must be qualified (showing the namespace information) in instance documents or not. Defaults to unqualified
The language of the document content is set (usually an ISO 639 language code)

Example

<xsd:schema id="camera" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                    xmlns:nikon="http://www.nikon.com"
                    xmlns:olympus="http://www.olympus.com"
                    xmlns:pentax="http://www.pentax.com"
                    xmlns="http://www.camera.org"
           targetNamespace="http://www.camera.org"
           attributeFormDefault="unqualified"
           elementFormDefault="unqualified"
           xml:lang="en">
  ...
</xsd:schema>
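
The effect of the element form setting is easiest to see next to an instance document. The following is a minimal sketch, not part of the K2 product: the camera and body element names, the cam prefix and the sample value are assumptions chosen for illustration. Global elements are always qualified; the setting only affects locally declared elements such as body.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema: element names are illustrative only -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://www.camera.org"
            targetNamespace="http://www.camera.org"
            elementFormDefault="unqualified">
  <!-- camera is a global element: always namespace-qualified in instances -->
  <xsd:element name="camera">
    <xsd:complexType>
      <xsd:sequence>
        <!-- body is a local element: elementFormDefault applies to it -->
        <xsd:element name="body" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

<!-- Matching instance when elementFormDefault is "unqualified":
       <cam:camera xmlns:cam="http://www.camera.org">
         <body>F3</body>
       </cam:camera>

     Matching instance when elementFormDefault is "qualified":
       <cam:camera xmlns:cam="http://www.camera.org">
         <cam:body>F3</cam:body>
       </cam:camera>
-->
```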

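 Best Practice

There are three approaches to choosing the default namespace of a schema:

  1. Make the XML Schema namespace (http://www.w3.org/2001/XMLSchema) the default namespace
  2. Make the target namespace the default namespace
  3. Have no default namespace and qualify both with a prefix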
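
As a rough sketch of how these choices change the way components are written, the schema below follows the first approach; the comment at the end notes how the same references would look under the other two. The isoSpeed type, the speed element and the cam prefix are hypothetical names used only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Approach 1: the XML Schema namespace is the default, so schema
     keywords and built-in types need no prefix, while components of
     the target namespace are referenced with the cam prefix -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cam="http://www.camera.org"
        targetNamespace="http://www.camera.org">
  <simpleType name="isoSpeed">
    <restriction base="positiveInteger"/>
  </simpleType>
  <element name="speed" type="cam:isoSpeed"/>
</schema>

<!-- Approach 2 (target namespace is the default):
       <xsd:element name="speed" type="isoSpeed"/>
     with built-in types written as xsd:positiveInteger.

     Approach 3 (no default namespace):
       <xsd:element name="speed" type="cam:isoSpeed"/>
     with both namespaces bound to explicit prefixes.
-->
```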

Schema Management

Any project typically has more than one schema. Schema files can be integrated in three ways:

  1. Include - gives the integrating schema access to the components defined in the included schema. The effect is the same as if the component declarations/definitions had been typed directly into the integrating schema. If the components come from a schema with no namespace they take on the namespace of the integrating schema
  2. Import - the integrating schema reuses the components in the supporting schemas by importing them. Import is primarily used to integrate schemas with different namespaces
  3. Redefine - the integrating schema reuses the components from the supporting schema in a way that allows their definitions to be modified (extended or restricted - See XML Data Types)

Format

<include id=ID schemaLocation=anyURI />

<import id=ID namespace=anyURI
        schemaLocation=anyURI />

<redefine id=ID schemaLocation=anyURI />

Include - includes external schemas with the same Namespace

Import - imports external schemas maintaining different Namespaces

Redefine - includes an external schema with the possibility of modification

Example

<xsd:include schemaLocation="nikon.xsd"/>
<xsd:redefine schemaLocation="olympus.xsd" />
<xsd:import namespace="http://www.pentax.com" schemaLocation="pentax.xsd"/>
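
Unlike include and import, a redefine element usually carries a body that holds the modified definitions. The following sketch assumes that olympus.xsd has the same target namespace as the integrating schema (or none) and defines a string-based simpleType named modelName; both the type name and the length limit are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://www.camera.org"
            targetNamespace="http://www.camera.org">
  <!-- Assumption: olympus.xsd defines a simpleType called modelName -->
  <xsd:redefine schemaLocation="olympus.xsd">
    <xsd:simpleType name="modelName">
      <!-- A redefined simple type must be a restriction of itself -->
      <xsd:restriction base="modelName">
        <xsd:maxLength value="20"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:redefine>

  <!-- From here on, modelName refers to the restricted definition -->
  <xsd:element name="model" type="modelName"/>
</xsd:schema>
```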

 Best Practice

  1. Heterogeneous Namespace Design: each schema has a different namespace
    Key characteristics: Uses <import />; maintains the integrity of element names; does not allow the components to be modified
  2. Homogeneous Namespace Design: each schema has the same target namespace
    Key characteristics: Uses <include /> or <redefine />; does not provide any visual indication of the origin or lineage of components and runs the risk of name collisions; allows components to be modified
  3. Chameleon Namespace Design / No Namespace Design: the integrating schema has a target namespace, but the supporting schemas have no target namespace
    Key characteristics: Strictly uses <include /> or <redefine /> but if proxy schemas are employed uses <import /> to integrate the proxy schemas and so avoids name collisions; components can still be customized; part of the lineage of the components remains hidden 

Type and Element Declarations

Type and Element declarations form the heart of any schema. The basic format and options for type and element declarations are covered here.

Defining elements is discussed in XML Basics; Simple and Complex Type Declarations are also covered in XML Data Types

Simple Type Declaration

 Format

<!--Basic declaration-->

<declarationPrefix:simpleType
    id=ID 
    final=(#all|(list|union|restriction))
    name=NCName>
  Content:(annotation?, (restriction|list|union))
</declarationPrefix:simpleType>

<!--Declaration of a restriction-->

<declarationPrefix:restriction base=QName>
  Content:(annotation?, (simpleType?,
            (minExclusive|minInclusive
             |maxExclusive|maxInclusive
             |totalDigits|fractionDigits
             |length|minLength|maxLength
             |enumeration|whiteSpace|pattern)*))
</declarationPrefix:restriction>

<!--Declaration of a list-->

<declarationPrefix:list itemType=QName>
  Content:(annotation?, simpleType?)
</declarationPrefix:list>

<!--Declaration of a union-->

<declarationPrefix:union memberTypes=List of QName>
  Content:(annotation?, simpleType*)
</declarationPrefix:union>

The basic simpleType declaration contains three attributes:

  • id - provides a unique identifier for the element
  • final - prevents the specified kinds of derivation from this simpleType:
    • #all - no derivation of any kind is allowed
    • restriction - blocks derivation by restriction (only a part of the defined data type)
    • list - blocks derivation by list (a whitespace-separated sequence of values)
    • union - blocks derivation by union (a combination of one or more data types)
  • name - the name of the data type this element is defining

The data type being defined is built on the basic data types supported by XML (see XML Data Types).

The basic data types are combined using the restriction, list or union operators.

The constraint parameters for the restriction operation are highlighted in XML Data Types.


Examples

<!--simpleType Restriction-->

<!--Bounded Numbers-->

<xsd:simpleType name="theAnswer">

   <xsd:restriction base="xsd:integer">

      <xsd:minInclusive value="42" />

      <xsd:maxInclusive value="42" />

   </xsd:restriction>

</xsd:simpleType>

<!--String Length-->

<xsd:simpleType name="licensePlate">

   <xsd:restriction base="xsd:string">

      <xsd:minLength value="1" />

      <xsd:maxLength value="9" />

   </xsd:restriction>

</xsd:simpleType>

<!--Enumeration-->

<xsd:simpleType name="title">

   <xsd:restriction base="xsd:string">

      <xsd:enumeration value="Dr" />

      <xsd:enumeration value="Mr" />

      <xsd:enumeration value="Mrs" />

      <xsd:enumeration value="Ms" />

      <xsd:enumeration value="Prof" />

   </xsd:restriction>

</xsd:simpleType>

<!--Digital Numbers-->

<xsd:simpleType name="currency" >

   <xsd:restriction base="xsd:decimal">

      <xsd:fractionDigits value="2" />

   </xsd:restriction>

</xsd:simpleType>

<!--Pattern-->

<xsd:simpleType name="socialSecurityNumber" >

   <xsd:restriction base="xsd:string">

      <xsd:pattern value="[0-9]{3}-[0-9]{2}-[0-9]{4}" />

   </xsd:restriction>

</xsd:simpleType>

<!--Whitespace-->

<xsd:simpleType name="token" >

   <xsd:restriction base="xsd:normalizedString">

      <xsd:whiteSpace value="collapse" />

   </xsd:restriction>

</xsd:simpleType>

<!--simpleType List-->

<xsd:simpleType name="ages" >

   <xsd:list itemType="xsd:positiveInteger">

   </xsd:list>

</xsd:simpleType>

<!--simpleType Union-->

<!--Define the base simpleType data sets-->

<xsd:simpleType name="catBreeds">

   <xsd:restriction base="xsd:string">

      <xsd:enumeration value="Abyssinian" />

      <xsd:enumeration value="Siamese" />

      <xsd:enumeration value="Himalayan" />

      <xsd:enumeration value="Persian" />

   </xsd:restriction>

</xsd:simpleType>

<xsd:simpleType name="dogBreeds">

   <xsd:restriction base="xsd:string">

      <xsd:enumeration value="Labrador" />

      <xsd:enumeration value="Spaniel" />

      <xsd:enumeration value="Terrier" />

      <xsd:enumeration value="Poodle" />

   </xsd:restriction>

</xsd:simpleType>

<!--Combine the data sets-->

<xsd:simpleType name="pet">

   <xsd:union memberTypes="target:catBreeds
        target:dogBreeds" />

</xsd:simpleType>

Complex Type Declaration

Format

<!--Basic declaration-->

<declarationPrefix:complexType
    id=ID 
    abstract=boolean
    final=(#all|(extension|restriction))
    mixed=boolean
    name=NCName>
  Content:(annotation?, (simpleContent|complexContent|((group|all|choice|sequence)?,((attribute|attributeGroup)*,anyAttribute?))))
</declarationPrefix:complexType>

<!--Simple Content-->

<declarationPrefix:simpleContent id=ID>
  Content:(annotation?,(restriction|extension))
</declarationPrefix:simpleContent>

<!--Restriction-->

<declarationPrefix:restriction id=ID base=QName>
  Content:(annotation?, (simpleType?,
            (minExclusive|minInclusive
             |maxExclusive|maxInclusive
             |totalDigits|fractionDigits
             |length|minLength|maxLength
             |enumeration|whiteSpace|pattern)*)?,
            ((attribute|attributeGroup)*,
             anyAttribute?))
</declarationPrefix:restriction>

<!--Extension-->

<declarationPrefix:extension id=ID base=QName>
  Content:(annotation?,((attribute|attributeGroup)*,anyAttribute?))
</declarationPrefix:extension>

<!--Complex Content-->

<declarationPrefix:complexContent id=ID mixed=boolean>
  Content:(annotation?,(restriction|extension))
</declarationPrefix:complexContent>

<!--Restriction-->

<declarationPrefix:restriction id=ID base=QName>
  Content:(annotation?,(group|all|choice|sequence)?,((attribute|attributeGroup)*,anyAttribute?))
</declarationPrefix:restriction>

<!--Extension-->

<declarationPrefix:extension id=ID base=QName>
  Content:(annotation?,(group|all|choice|sequence)?,((attribute|attributeGroup)*,anyAttribute?))
</declarationPrefix:extension>

complexType data types can contain both element content and attributes.

The complexType declaration contains five attributes:

  • id - provides a unique identifier for the element
  • abstract - when true, the data type cannot be used directly to validate an element; a type derived from it must be used instead
  • final - prevents the specified kinds of derivation from this complexType:
    • #all - no derivation of any kind is allowed
    • extension - blocks derivation by extension (additional elements or attributes defined to supplement the existing data type)
    • restriction - blocks derivation by restriction (only a part of the defined data type)
  • mixed - specifies whether the content of the data type contains both elements and text or not
  • name - the name of the data type this element is defining

The content of the data type definition can be:

  • simple content - meaning it contains no tagged data, but only text data (attributes are still allowed)
  • complex content - meaning it contains tagged (element) data, and also text data when mixed is true

In addition, the content of the definition can be specified by combining elements using:

  • all - creates an unordered group of elements
  • group - creates a group of elements which can be referenced within the schema or other schemas
  • choice - defines a group of mutually exclusive elements
  • sequence - creates an ordered group of elements

The base definition of the complexType can then be further refined by defining:

  • a restriction - constrains the definition of a simpleType, simpleContent or complexContent element, using the XML constraining facets (see XML Data Types)
  • an extension - enlarges a simpleType or complexType data definition

Finally, the complexType content contains attributes, which can be listed individually or grouped together in an attributeGroup.

Example

<!--complexType Definition Example-->

<xsd:element name="Catalog">

<!--ComplexType Declaration-->


  <xsd:complexType>
    <xsd:sequence>
       <xsd:element name="Person">
         <xsd:complexType>
           <xsd:sequence>
             <xsd:element name="Name" type="xsd:string"/>
           </xsd:sequence>
           <xsd:attribute name="id" type="xsd:ID" use="required"/>
         </xsd:complexType>
       </xsd:element>
       <xsd:element name="Book">
         <xsd:complexType>
           <xsd:sequence>
             <xsd:element name="Title" type="xsd:string"/>
             <xsd:element name="Author">
               <xsd:complexType>
                 <xsd:attribute name="idref" type="xsd:IDREF" use="required"/>
               </xsd:complexType>
             </xsd:element>
           </xsd:sequence>
         </xsd:complexType>
       </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
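
The simpleContent and complexContent forms described earlier can be sketched in the same way. The example below is an illustration only: priceType, personType, authorType and their members are hypothetical names, and the schema has no target namespace so that type references need no prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- simpleContent: text-only content (a decimal) plus an attribute -->
  <xsd:complexType name="priceType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string" use="required"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Base type that is extended below -->
  <xsd:complexType name="personType">
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID" use="required"/>
  </xsd:complexType>

  <!-- complexContent: personType extended with one more element;
       the new element follows the inherited Name element -->
  <xsd:complexType name="authorType">
    <xsd:complexContent>
      <xsd:extension base="personType">
        <xsd:sequence>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:element name="Price" type="priceType"/>
  <xsd:element name="Author" type="authorType"/>
</xsd:schema>
```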


Element Declaration

Format

<xsd:element id = ID
    abstract = boolean
    default = string
    final = (#all | List of (extension|restriction))
    fixed = string
    form = (qualified|unqualified)
    maxOccurs = (nonNegativeInteger|unbounded)
    minOccurs = nonNegativeInteger
    name = NCName
    nillable = boolean
    ref = QName
    substitutionGroup = QName
    type = QName >
Content: (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))
</xsd:element>

The element declaration contains the following attributes:

  • id - provides a unique identifier for the element
  • abstract - when true, the element cannot appear in an instance document; a member of its substitution group must be used instead
  • default - specifies a default value for the element
  • final - restricts which elements may name this element as the head of their substitution group, based on how their types are derived: #all, extension or restriction
  • fixed - if the element is present in an instance document, its value must always match the specified string (defined constant)
  • form - overrides what is specified in elementFormDefault
  • cardinality attributes maxOccurs (maximum number of times) and minOccurs (minimum number of times the element can occur); both default to 1
  • name - name of the element
  • nillable - if true, the element may carry xsi:nil="true" in an instance document to indicate a nil value
  • ref - references a globally defined element
  • substitutionGroup - names the global element for which this element can be substituted
  • type - the built-in or user-defined data type of the element


The content of the element can be of a simpleType or complexType. Additionally, identity constraints can be specified for the element using unique, key or keyref.

Example

<xsd:element name = "Customer">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name = "FirstName" type = "xsd:string" />
      <xsd:element name = "MiddleInitial" type = "xsd:string" />
      <xsd:element name = "LastName" type = "xsd:string" />
    </xsd:sequence>
    <xsd:attribute name = "customerID" type = "xsd:string" />
  </xsd:complexType>
</xsd:element>
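
Several of the attributes listed above do not appear in the Customer example. The following variation is a sketch of how they might be combined; the Email, Country and BirthDate elements and the default value are hypothetical additions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Global element, reusable elsewhere through ref -->
  <xsd:element name="Email" type="xsd:string"/>

  <xsd:element name="Customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="FirstName" type="xsd:string"/>
        <!-- minOccurs="0" makes the element optional -->
        <xsd:element name="MiddleInitial" type="xsd:string" minOccurs="0"/>
        <xsd:element name="LastName" type="xsd:string"/>
        <!-- default applies when the element is present but empty -->
        <xsd:element name="Country" type="xsd:string" default="Unknown"/>
        <!-- nillable allows <BirthDate xsi:nil="true"/> in an instance -->
        <xsd:element name="BirthDate" type="xsd:date" nillable="true"/>
        <!-- ref reuses the global Email element, zero or more times -->
        <xsd:element ref="Email" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="customerID" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```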

Best Practice

There are numerous preferences in arranging the content of the data definitions in a schema. An important issue to consider is whether elements and types are declared globally or locally.

For more details, see the xml-dev group best practice recommendations on global versus local declarations. All best practices have been adapted from recommendations put forward by the xml-dev group; see http://www.xfront.com/BestPracticesHomepage.html (accessed September 2005) for more details.
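
The difference between global and local declarations can be sketched as follows. This is an illustration rather than a recommendation from the xml-dev material, and all element and type names in it are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Local style: Title is declared inside Book and cannot be
       referenced from anywhere else in the schema -->
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Global style: the type and the Issue element are direct children
       of the schema element, so other declarations can reuse them -->
  <xsd:complexType name="magazineType">
    <xsd:sequence>
      <xsd:element ref="Issue"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="Issue" type="xsd:positiveInteger"/>
  <xsd:element name="Magazine" type="magazineType"/>
</xsd:schema>
```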

Attribute Declaration

Format

<xsd:attribute id = ID
     default = string
     fixed = string
     form = (qualified|unqualified)
     name = NCName
     ref = QName
     type = QName
     use = (optional|prohibited|required)>
Content: (annotation?, (simpleType?))
</xsd:attribute>

The attribute declaration is very similar to the element declaration and contains the following attributes:

  • id - provides a unique identifier for the attribute
  • default - specifies a default value for the attribute
  • fixed - if the attribute is present in an instance document, its value must always match the specified string (defined constant)
  • form - overrides what is specified in attributeFormDefault
  • name - name of the attribute
  • ref - references a globally defined attribute
  • type - the built-in or user-defined simple data type of the attribute
  • use - specifies whether the attribute is optional, prohibited or required; defaults to optional


Example

<xsd:attribute name="version" type="xsd:string" fixed="1.0"/>
<xsd:attribute name="name" type="xsd:string" use="required"/>
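
Attributes can also carry an inline simpleType or be collected in an attributeGroup. The sketch below shows both in context; the auditAttributes group, the Document element and the status values are hypothetical names used only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Reusable group of attributes -->
  <xsd:attributeGroup name="auditAttributes">
    <xsd:attribute name="createdBy" type="xsd:string" use="required"/>
    <!-- Inline simpleType instead of a type attribute; the default
         applies when the attribute is omitted -->
    <xsd:attribute name="status" default="draft">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="draft"/>
          <xsd:enumeration value="final"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
  </xsd:attributeGroup>

  <xsd:element name="Document">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Body" type="xsd:string"/>
      </xsd:sequence>
      <!-- Attribute declarations follow the content model -->
      <xsd:attribute name="version" type="xsd:string" fixed="1.0"/>
      <xsd:attributeGroup ref="auditAttributes"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```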

For more details on XML schema structures, see www.xml.dvint.com/docs/SchemaStructuresQR-2.pdf

 

 


K2 blackpearl Help 4.6.11 (4.12060.1731.0)