
Network Programming

This website demonstrates using wikis as a teaching and learning tool.

The course instructor is happy to share the teaching materials here with those who find them readable.

XML Basics and Overview

A Network Programming Lecture by Steven Choy

Overview: What is XML? - Well-formed XML document - Valid XML document - XML DTD and XML Schema


Extensible Markup Language

  • An overview of XML
    • stands for EXtensible Markup Language
    • is designed to describe data and to focus on what data is
    • XML tags are not predefined. You must define your own tags.
    • XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
  • A look at an XML file
  • An XML file with more data
<?xml version="1.0" standalone="yes" ?>
<customers>
  <customer>
    <customerno>1</customerno>
    <first>Peter</first>
    <last>Chan</last>
    <telephone>12345678</telephone>
  </customer>
  <customer>
    <customerno>2</customerno>
    <first>David</first>
    <last>Lau</last>
    <telephone>87654321</telephone>
  </customer>
</customers>
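
To make the structure of this file concrete, here is a minimal sketch of reading it with Python's standard xml.etree.ElementTree module. The choice of Python and the helper name read_customers are assumptions made for illustration; they are not part of the lecture.

```python
import xml.etree.ElementTree as ET

CUSTOMERS_XML = """<?xml version="1.0" standalone="yes" ?>
<customers>
  <customer>
    <customerno>1</customerno>
    <first>Peter</first>
    <last>Chan</last>
    <telephone>12345678</telephone>
  </customer>
  <customer>
    <customerno>2</customerno>
    <first>David</first>
    <last>Lau</last>
    <telephone>87654321</telephone>
  </customer>
</customers>
"""


def read_customers(xml_text):
    """Return one dict per <customer>, keyed by the child tag names."""
    root = ET.fromstring(xml_text)  # root is the <customers> element
    return [
        {field.tag: field.text for field in customer}
        for customer in root.findall("customer")
    ]


for record in read_customers(CUSTOMERS_XML):
    print(record["customerno"], record["first"], record["last"], record["telephone"])
```

Note that every value comes back as text: nothing in the file itself says that customerno is a number.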

Well-formed XML document

  • Has exactly one root element
  • Every start tag has a matching end tag
  • Elements can't be nested improperly (overlap)
  • Attribute values are enclosed in single or double quotation marks
  • Attribute names are unique within each element
  • Element content and attribute values can't contain an unescaped < or &
  • Comments and processing instructions can't be inside tags
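
The rules above can be tried out with any XML parser, since a conforming parser rejects a document that breaks one of them. The following is a small sketch using Python's standard library; the helper is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per rule in the list above (hypothetical test strings).
SAMPLES = {
    "well-formed": "<a><b>text</b></a>",
    "two root elements": "<a/><b/>",
    "missing end tag": "<a><b></a>",
    "overlapping elements": "<a><b></a></b>",
    "unquoted attribute value": "<a id=1/>",
    "duplicate attribute name": '<a id="1" id="2"/>',
    "unescaped ampersand": "<a>Tom & Jerry</a>",
    "comment inside a tag": "<a <!-- note --> />",
}

for label, text in SAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")

# escape() replaces &, < and > with entity references,
# so arbitrary text can be placed safely in element content.
safe = "<a>" + escape("Tom & Jerry <3") + "</a>"
print(safe, is_well_formed(safe))
```

This checks well-formedness only. Whether the document is also valid is a separate question that needs a DTD or a schema.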

Valid XML document

  • Must be well-formed, and
  • Must satisfy the constraints/grammars specified in either of the following
    • Document type definition (DTD)
      • Itself not in XML
      • Can be internal or external to the XML document
    • XML schema (XSD file)
      • Itself in XML
      • External to the XML document

XML DTD and XML Schema

  • An XML DTD defines the legal elements of an XML document.
    • The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements.
  • XML Schema is an XML based alternative to DTD.
    • An XML Schema describes the structure of an XML document.
    • The XML Schema language is also referred to as XML Schema Definition (XSD).

More XML Example

XML with internal DTD

<?xml version="1.0" standalone="yes"?>

<!DOCTYPE customer [
  <!ELEMENT customer (first, last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>

<customer>
  <first>Peter</first>
  <last>Chan</last>
</customer>
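
The declarations in an internal DTD can also be read by a program. The sketch below assumes Python and uses the ElementDeclHandler callback of the standard xml.parsers.expat module to collect the declared content models, then compares the root element's children with the declared sequence. The expat parser does not validate, so the comparison is hand-written and only covers a flat sequence such as (first, last); the function names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE customer [
  <!ELEMENT customer (first, last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<customer>
  <first>Peter</first>
  <last>Chan</last>
</customer>
"""


def declared_children(xml_text):
    """Map each declared element name to the child names in its content model."""
    declared = {}

    def on_element_decl(name, model):
        # model is (type, quantifier, name, children);
        # each entry in children has the same four-part shape.
        declared[name] = [child[2] for child in model[3]]

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return declared


def root_matches_declaration(xml_text):
    """Toy check: do the root's children appear exactly in the declared order?"""
    declared = declared_children(xml_text)
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == declared.get(root.tag)


print(declared_children(DOC)["customer"])
print(root_matches_declaration(DOC))
```

Swapping <first> and <last> in the document keeps it well-formed, but the toy check then fails, which is the kind of error a validating parser reports.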

XML with external DTD

<?xml version="1.0" standalone="no"?>

<!DOCTYPE customer SYSTEM "customer_dtd2.dtd">

<customer>
  <first>Peter</first>
  <last>Chan</last>
</customer>
  • External file customer_dtd2.dtd
<!ELEMENT customer (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>

XML with XSD

<?xml version="1.0"?>
<customer
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="customer_xsd1.xsd">
  <first>Peter</first>
  <last>Chan</last>
</customer>
  • External file customer_xsd1.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:string"/>
        <xs:element name="last" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
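
Because an XSD file is itself XML, it can be read with an ordinary XML parser. The following sketch, assuming Python and its standard xml.etree.ElementTree module, pulls the expected child names out of the xs:sequence above and compares them with the document. It is a toy check for this one schema shape, not a schema validator, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace prefix in ElementTree notation

XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:string"/>
        <xs:element name="last" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

DOC = """<?xml version="1.0"?>
<customer
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:noNamespaceSchemaLocation="customer_xsd1.xsd">
  <first>Peter</first>
  <last>Chan</last>
</customer>
"""


def expected_children(xsd_text, element_name):
    """Return the child names listed in the xs:sequence of a top-level element."""
    schema = ET.fromstring(xsd_text)
    for element in schema.findall(f"{XS}element"):
        if element.get("name") == element_name:
            sequence = element.find(f"{XS}complexType/{XS}sequence")
            if sequence is None:
                return []
            return [child.get("name") for child in sequence.findall(f"{XS}element")]
    return None  # the schema does not declare this element


def follows_sequence(xml_text, xsd_text):
    """Toy check: the root's children match the declared sequence exactly."""
    root = ET.fromstring(xml_text)
    expected = expected_children(xsd_text, root.tag)
    return expected is not None and [child.tag for child in root] == expected


print(expected_children(XSD, "customer"))
print(follows_sequence(DOC, XSD))
```

A real validator also checks data types, occurrence counts, and attributes, which this sketch ignores.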

More about XSD

  • An XML Schema defines
    • elements that can appear in a document
    • attributes that can appear in a document
    • which elements are child elements
    • the order of child elements
    • the number of child elements
    • whether an element is empty or can include text
    • data types for elements and attributes
    • default and fixed values for elements and attributes
  • XML Schemas (or XSD) are more popular than DTD because they
    • are extensible to future additions
    • are richer and more powerful than DTDs
    • are written in XML
    • support data types
    • support namespaces
  • To learn more, the tutorial by W3Schools is a good start

XML Schema or DTD?

  • What is the difference between XML Schema and DTD? What are the limitations of a DTD?
A Document Type Definition (DTD) defines the valid syntax of a class of XML documents: it lists the legal building blocks of a document and how they may be combined. DTDs come from SGML, where they were the standard way to define markup languages.
An XML Schema describes the possible content of a document in a rigorous and formal way. The XML Schema language (often called XSD) describes both the structure and the data content of an XML document.
The limitations of a DTD: it is not written in XML syntax, and it offers only limited support for data types and namespaces. Apart from the special EMPTY and ANY content models, a DTD lets an element consist of one of three things: (1) a text string; (2) a text string mixed with child elements; (3) a set of child elements. It cannot state, for example, that the text must be a number or a date.
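
The data type limitation can be illustrated with a short sketch. Under a DTD declaration such as <!ELEMENT customerno (#PCDATA)>, any text is acceptable, whereas a schema could declare the element with a numeric type. The Python code below is a hand-written stand-in for such a type check, under the assumption that customerno should be a positive integer; it is not how a schema processor is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a check against the XSD type xs:positiveInteger:
# an optional plus sign, then digits with at least one non-zero digit.
POSITIVE_INTEGER = re.compile(r"\+?0*[1-9][0-9]*")


def is_positive_integer(text):
    """Return True if the text looks like a positive integer."""
    return text is not None and POSITIVE_INTEGER.fullmatch(text.strip()) is not None


def bad_customer_numbers(xml_text):
    """Return the customerno values that are not positive integers."""
    root = ET.fromstring(xml_text)
    return [
        element.text
        for element in root.iter("customerno")
        if not is_positive_integer(element.text)
    ]


SAMPLE = """<customers>
  <customer><customerno>1</customerno></customer>
  <customer><customerno>two</customerno></customer>
</customers>"""

print(bad_customer_numbers(SAMPLE))
```

Both customerno values satisfy (#PCDATA), so a DTD accepts them; only a typed check flags the second one.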

How to validate an XML document

  • XMLStarlet Command Line XML Toolkit
  • XML DOM Validation - The W3C XML specification states that a program must not continue normal processing of an XML document once it finds a well-formedness error. The reason is that XML software should be easy to write, and that all XML documents should be compatible.
  • XML Schema Validator - This service lets you validate XML documents such as XHTML against the appropriate schemas. It performs a more accurate validation than the W3C validator.
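
The stop-on-error behaviour described for the DOM can be observed directly. This is a minimal sketch, assuming Python's standard xml.dom.minidom module; the helper name parse_or_report is illustrative. The parser either returns a complete DOM tree or raises an error with the position of the first problem, and never hands back a partial tree.

```python
from xml.dom import minidom
from xml.parsers import expat


def parse_or_report(xml_text):
    """Return (document, None) on success or (None, message) on the first error."""
    try:
        return minidom.parseString(xml_text), None
    except expat.ExpatError as error:
        reason = expat.ErrorString(error.code)
        return None, f"line {error.lineno}, column {error.offset}: {reason}"


# The <first> element is never closed, so parsing stops at </customer>.
document, problem = parse_or_report("<customer><first>Peter</customer>")
print(document, problem)

document, problem = parse_or_report("<customer><first>Peter</first></customer>")
print(document.documentElement.tagName, problem)
```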

Do you really know what XML is?

  • Let's check it out

Other similar names related to XML

  • XSL (EXtensible Stylesheet Language)
  • Other names you often see: SOAP, WSDL, RDF, RSS, XML-RPC, SVG

Extra Materials for Probing Further

Learn more about XML

XML Editors

A complete cross-platform XML editor providing tools for XML authoring, XML conversion, XML Schema, DTD, Relax NG and Schematron development, XPath, XSLT and XQuery debugging, and SOAP and WSDL testing
Allows editing of large, complex, modular XML documents, and makes it easy to master XML vocabularies such as DocBook or DITA.
The "visual" part comes from the fact that Vex hides the raw XML tags from the user, providing instead a word-processor-like interface. Because of this, Vex is best suited for "document-style" XML documents such as XHTML and DocBook rather than "data-style" XML documents.
XMLSpy - XML editor for modeling, editing, transforming, and debugging XML technologies
It is a free, Windows-based XML editor and development environment for XML, DTD, and XSLT documents
XML Copy Editor is a fast, free, validating XML editor. It has both Windows and Linux versions.

Thanks for Reading


If you find any problem in this lecture note, please feel free to tell Steven via steven@findaway.hk.

Page last modified on January 21, 2010, at 09:13 AM