K2 BLACKPEARL PRODUCT DOCUMENTATION: USER GUIDE
XML Basics

K2 Concepts - XML Basics

XML (Extensible Markup Language) is used to describe documents and data in a standardized, text-based format. It provides a powerful and robust framework for data transfer.

The strength of XML lies in the flexible hierarchy of the data structures it provides. The rules of XML form a simple set that focuses on standardizing the way in which data is organized, without limiting the content in any way. A simple analogy is a language with a strict grammar, but where the words are made up as required by its speakers.

Specifically it is important to understand:

Fig.1. Hierarchy for a Book Reference

Tags, Elements, Attributes and Text

The basic XML structures are best understood in the context of an example information set. Consider a bibliography containing two book references:

  1. Iacone, SJ. Write to the Point. Career Press. 2003 (ISBN: 1-56414-639-1)
  2. Benz, B; Durant, JR. XML Programming Bible. Wiley Publishing, Inc. 2003 (ISBN: 0-7645-3829-2)

Each record consists of an author listing (Last Name, Initials), the title of the book, the publisher and the year published. Additionally, the ISBN uniquely identifies each book. It is then possible to construct a data hierarchy for a Book Reference Record, as shown in Fig.1.

The associated XML definition is shown in the following code sample.

<!--XML Definition for a Bibliography-->
<bibliography>
  <!--First Book Reference Start-->
  <reference id="1" isbn="1-56414-639-1">
    <!--Author Listing Start-->
    <authors>
      <author>
        <last_name>Iacone</last_name>
        <initials>SJ</initials>
      </author>
    </authors>
    <!--Author Listing End-->
    <title>Write to the Point</title>
    <publisher>Career Press</publisher>
    <publish_date>2003</publish_date>
  </reference>
  <!--First Book End-->
  <!--Second Book Reference Start-->
  <reference id="2" isbn="0-7645-3829-2">
    <!--Author Listing Start-->
    <authors>
      <author>
        <last_name>Benz</last_name>
        <initials>B</initials>
      </author>
      <author>
        <last_name>Durant</last_name>
        <initials>JR</initials>
      </author>
    </authors>
    <!--Author Listing End-->
    <title>XML Programming Bible</title>
    <publisher>Wiley Publishing, Inc</publisher>
    <publish_date>2003</publish_date>
  </reference>
  <!--Second Book End-->
</bibliography>
<!--Bibliography End-->

Well-formed XML

XML has a core set of formatting requirements that a document must meet in order to be considered well-formed. This is the grammar of the language referred to above.

General formatting rules for XML are:
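As a minimal sketch, the commonly cited well-formedness rules can be seen together in one small document. The element and attribute names below are hypothetical and chosen only for illustration; each comment notes the rule that the following lines exercise.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: all names are illustrative only -->
<!-- Rule: exactly one root element encloses all other elements -->
<library>
  <!-- Rule: every start tag has a matching end tag -->
  <shelf>
    <!-- Rule: elements nest properly; inner elements close before outer ones -->
    <book><title>Write to the Point</title></book>
  </shelf>
  <!-- Rule: names are case-sensitive, so Shelf and shelf are different elements -->
  <Shelf>
    <!-- Rule: attribute values are always enclosed in quotes -->
    <book year="2003"/>
  </Shelf>
</library>
```

Breaking any one of these rules, for example closing `<Shelf>` with `</shelf>`, means a conforming parser will reject the whole document rather than try to repair it.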

Elements

The following are the rules for elements, according to the XML standard:

Colons should only be used when a namespace has been defined. See XML Namespaces below for more detail.
XML provides a shortcut for empty elements: <empty_element></empty_element> can be written as <empty_element/>.
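
The empty-element shortcut can be sketched as follows. The element names are hypothetical and do not come from the bibliography sample; the point is only that both notations are equivalent to a parser.

```xml
<!-- Hypothetical fragment: element names are illustrative only -->
<reference>
  <!-- An empty element written with separate start and end tags -->
  <notes></notes>
  <!-- An empty element written with the shorthand form -->
  <translator/>
  <!-- An empty element can still carry attributes -->
  <cover type="paperback"/>
</reference>
```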

Attributes

Attribute values can contain apostrophes as long as they are delimited by double quotes.
e.g. source="Roget's Thesaurus" is valid
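
A short sketch of the quoting options for attribute values follows. The element and attribute names are hypothetical; the `&apos;` and `&quot;` entity references are predefined by XML itself.

```xml
<!-- Hypothetical fragment: element and attribute names are illustrative only -->
<sources>
  <!-- Apostrophe inside a value delimited by double quotes -->
  <entry source="Roget's Thesaurus"/>
  <!-- Double quotes inside a value delimited by single quotes -->
  <entry note='Filed under "reference"'/>
  <!-- Entity references work whichever delimiter is used -->
  <entry source='Roget&apos;s Thesaurus' note="Filed under &quot;reference&quot;"/>
</sources>
```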

Text

Text usually represents the actual data associated with an element. The only considerations for text center on two things: whitespace (spaces, tabs, etc.), which makes the document more readable, and troublesome characters (such as &, <, >, " and '), which may confuse an XML parser.
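
One common way to handle the troublesome characters is to write them as predefined entity references, or to wrap a whole run of text in a CDATA section. The fragment below is a sketch only; the element names and the text values are hypothetical.

```xml
<!-- Hypothetical fragment: element names and values are illustrative only -->
<reference>
  <!-- The ampersand is written as an entity reference in element text -->
  <publisher>Smith &amp; Jones</publisher>
  <!-- Less-than and greater-than signs are escaped the same way -->
  <summary>Covers values where x &lt; 10 &amp; y &gt; 5</summary>
  <!-- A CDATA section passes markup-like text through without escaping -->
  <sample><![CDATA[<title>Write to the Point</title> & more]]></sample>
</reference>
```

A parser returns the unescaped characters to the application, so the publisher text above is read back as "Smith & Jones".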

Comments

Comments make it easier to understand the XML document and must adhere strictly to the format:
<!--comment-->

  • Exactly two dashes (-) open and close the comment; a double dash (--) is not allowed inside the comment text
  • Script tags (<script></script>) are not treated as comments in XML and are displayed

In the example above, the comments include <!--XML Definition for a Bibliography--> and <!--First Book Reference Start-->.

XML Declaration

The XML Declaration identifies the document as an XML document and, although not required, provides important information to any program trying to interpret the XML file. When present, it must be the first thing in the document.

The XML Declaration takes the following form:
<?xml version="1.0" encoding="UTF-16" standalone="yes" ?>

UTF stands for Unicode Transformation Format. UTF-8 encodes the character set using 8-bit units, and UTF-16 encodes it using 16-bit units. More detail on the available Unicode formats can be found at www.unicode.org
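
As a brief sketch, the declaration can also be written with fewer attributes. When a declaration is present, `version` is required, while `encoding` and `standalone` are optional and, if used, appear in that order. The root element below simply reuses the bibliography name for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the declaration above is the first thing in the document -->
<bibliography/>
```

The declared encoding should match the encoding in which the file is actually saved; otherwise a parser may report an error or misread characters.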

XML Namespaces

Namespaces differentiate elements and attributes defined in different documents or related to different data sets. They help to ensure the uniqueness of element and attribute names, which is important when sharing information between different applications or even publicly.

Additionally, they help to identify information groups and types within the current document.

The bibliography example above is extended to include a basic namespace declaration:

<?xml version="1.0" encoding="UTF-16" standalone="no"?>
<!--The line above is the XML Declaration; it must come first in the document-->
<!--Definition of the Root Element including a Namespace Declaration-->
<bibliography xmlns:bib="http://www.k2workflow.com/bibliography">
  <!--First Book Reference Start-->
  <bib:reference bib:id="1" bib:isbn="1-56414-639-1">
    <!--Author Listing Start-->
    <authors>
      <author>
        <last_name>Iacone</last_name>
        <initials>SJ</initials>
      </author>
    </authors>
    <!--Author Listing End-->
    <bib:title>Write to the Point</bib:title>
    <bib:publisher>Career Press</bib:publisher>
    <bib:publish_date>2003</bib:publish_date>
  </bib:reference>
  <!--First Book End-->

...

The following changes in the XML code sample are important:

  1. The reserved declaration prefix - xmlns
  2. The namespace prefix - bib - which marks each element or attribute that carries it as belonging to the namespace
  3. The Uniform Resource Identifier (URI), which can be either a URL (Uniform Resource Locator) or a URN (Uniform Resource Name) - http://www.k2workflow.com/bibliography
Typically URLs are used as the URI to uniquely identify namespaces. It is, however, not required that the URL links to an actual file.
There are public namespace URIs, including:
- xmlns:html="http://www.w3.org/1999/xhtml"
- xmlns:xs="http://www.w3.org/2001/XMLSchema"
- xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
To cancel the default namespace for an element, include an empty namespace declaration, e.g. on the name element in the following code snippet:
<p xmlns="http://www.w3.org/1999/xhtml">I met <name xmlns="">John David</name> on holiday</p>
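The difference between a default namespace and a prefixed namespace can be sketched in a single fragment. The layout below is an assumption made for illustration and is not taken from a K2 schema; it reuses the bibliography namespace URI from the sample above and the public XHTML namespace URI.

```xml
<!-- Sketch only: the structure is illustrative, not a K2 schema -->
<bibliography xmlns="http://www.k2workflow.com/bibliography"
              xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- Unprefixed elements fall into the default namespace -->
  <reference id="1">
    <title>Write to the Point</title>
    <!-- Prefixed elements belong to the namespace bound to that prefix -->
    <html:p>I met <name xmlns="">John David</name> on holiday</html:p>
  </reference>
</bibliography>
```

Here `reference` and `title` belong to the bibliography namespace, `html:p` belongs to the XHTML namespace, and `name` belongs to no namespace because its empty declaration cancels the default. Unprefixed attributes such as `id` are not affected by the default namespace.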
This is not a comprehensive introduction to XML, but rather an orientation to XML as used in K2.

K2 blackpearl Help 4.6.11 (4.12060.1731.0)