XML, the eXtensible Markup Language

XML, the eXtensible Markup Language, is a standard for structuring documents and data. The standard is a Recommendation of the World Wide Web Consortium (W3C). XML 1.0 has been a W3C Recommendation since February 1998, although many of the specifications built on top of it continue to evolve. Many people expect that XML and the technologies based on it will replace the way pages with dynamic content are currently constructed. Many text processors and web browsers already support XML.

XML is essentially a simplified form of the Standard Generalized Markup Language (SGML), a documentation standard developed in the 1980s. The problem with SGML is that it is too complex for the web. An effort was therefore made to reduce the size of the language and make it suitable for the Internet. XML is the result of that effort.

XML is a meta language. Unlike HTML, for instance, XML lets you create your own tags. In principle these tags have no meaning of their own. They get a meaning at the point where an application processes the information in your XML document. You can declare your tags in a document type definition (DTD) and describe their presentation in a stylesheet. XML itself has no list of 'correct' or 'incorrect' tag names, because you define all the tags yourself; a tag can only be wrong with respect to a DTD that you choose to validate against.

It is important to understand that XML only holds data. How the data is treated is determined by the application that processes it. XML completely separates the data from the layout. Many XML applications support CSS (Cascading Style Sheets), but there is a specification better suited to layout issues: XSL, the eXtensible Stylesheet Language. XSL is intended to give an XML document a consistent layout on any machine and any platform. Note that the specifications around XML and XSL are still being extended. The basis of these technologies is expected to stay the same, though.
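The separation of data and layout can be illustrated with a small sketch. It assumes Python and its standard xml.etree.ElementTree module, which this page does not itself mention; the sample document and the two function names are made up for illustration. The same XML data is presented in two different ways, and the XML itself does not change.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element and attribute names are invented.
DOCUMENT = '<price currency="dollar">1.49</price>'

def as_plain_text(xml_text):
    """Present the price as a plain sentence."""
    price = ET.fromstring(xml_text)
    return f"Price: {price.text} ({price.get('currency')})"

def as_html(xml_text):
    """Present the same data as an HTML fragment."""
    price = ET.fromstring(xml_text)
    return f'<span class="price">{price.text} {price.get("currency")}</span>'

print(as_plain_text(DOCUMENT))
print(as_html(DOCUMENT))
```

Each function plays the role of a different application: the layout lives in the code that processes the document, not in the document.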

An XML document consists of one or more elements. An element has the form <element> text within element </element>. Each element has an opening tag and a closing tag. The closing tag has the same name as the opening tag, preceded by a slash (/). Some elements are empty. For instance: <anelement></anelement>. In that situation the opening and closing tag may be combined into a single tag: <anelement />. Note that the space between the element name and the slash is optional, so <anelement/> is allowed as well. The following notations are therefore equivalent to each other:

  • <picture></picture>
  • <picture  />
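As a rough check of this equivalence, the sketch below parses several spellings of an empty element. It assumes Python's standard xml.etree.ElementTree parser, which is not a tool this page prescribes, and the helper name describe is made up. The form without a space before the slash is included as well.

```python
import xml.etree.ElementTree as ET

# Three spellings of the same empty element.
forms = ["<picture></picture>", "<picture />", "<picture/>"]

def describe(xml_text):
    """Return the element name, its text content and its number of children."""
    element = ET.fromstring(xml_text)
    return (element.tag, element.text, len(element))

results = [describe(form) for form in forms]

# All spellings should produce the same description.
print(all(result == results[0] for result in results))
```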

An element can have one or more attributes, just like HTML tags. Attributes are used to store additional information of the element. For instance:

<picture source="photo.jpg" quality="high"  />

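Within XML it is not allowed to leave out the quotes around the value of an attribute. Either double quotes (") or single quotes (') may be used, as long as the opening and closing quote match. Thus an attribute always has the form: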

attribute = "value"
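The quoting rule can be tried out with a short sketch, assuming Python's standard xml.etree.ElementTree parser. The helper name is_well_formed is hypothetical and only reports whether the parser accepts the text.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

double_quoted = '<picture source="photo.jpg" />'
single_quoted = "<picture source='photo.jpg' />"
spaced_equals = '<picture source = "photo.jpg" />'
unquoted = "<picture source=photo.jpg />"

for sample in (double_quoted, single_quoted, spaced_equals, unquoted):
    print(sample, is_well_formed(sample))
```

Only the unquoted variant should be rejected; the spaces around the equals sign are harmless.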

Elements that are not empty can have attributes as well:

<price currency="dollar">1.49</price>
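To show how an application might read such an element, here is a minimal sketch that assumes Python's standard xml.etree.ElementTree module. It reads the element name, the attribute and the text content of the price example above.

```python
import xml.etree.ElementTree as ET

price = ET.fromstring('<price currency="dollar">1.49</price>')

print(price.tag)              # the element name
print(price.attrib)           # all attributes as a dictionary
print(price.get("currency"))  # a single attribute value
print(float(price.text))      # the text content, converted to a number
```

Note that XML stores the amount as text; converting it to a number is a decision of the processing application.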

An element can have one or more children:

<booklist>
    <book title="My First Book"  />
    <book title="My Second Book"  />
</booklist>
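A processing application can walk over the children of an element. The sketch below assumes Python's standard xml.etree.ElementTree module; the function name book_titles is made up for this example.

```python
import xml.etree.ElementTree as ET

BOOKLIST = """<booklist>
    <book title="My First Book" />
    <book title="My Second Book" />
</booklist>"""

def book_titles(xml_text):
    """Collect the title attribute of every book child of the root element."""
    root = ET.fromstring(xml_text)
    return [book.get("title") for book in root.findall("book")]

print(book_titles(BOOKLIST))
```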

All elements have to be nested properly. So it is not allowed to have something similar to this:

<bold><underline> wrong example </bold></underline>

This should be corrected to something like:

<bold><underline> nested properly </underline></bold>
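A parser refuses a document whose elements overlap. The following sketch assumes Python's standard xml.etree.ElementTree parser; the helper name nesting_error is hypothetical. It returns the parser's error message for the wrong example and nothing for the corrected one.

```python
import xml.etree.ElementTree as ET

def nesting_error(xml_text):
    """Return the parser's error message, or None if the text parses cleanly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None

wrong = "<bold><underline> wrong example </bold></underline>"
right = "<bold><underline> nested properly </underline></bold>"

print(nesting_error(wrong))
print(nesting_error(right))
```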

An XML document normally starts with the XML declaration, which looks similar to this:

<?xml version="1.0" standalone="yes"?>

The XML declaration is optional in XML 1.0, but when it is present it has to be at the very start of the document: the first character of the first line has to be the less-than (<) character. Note the question marks (?) after the < character and before the > character. The version setting tells which version of the XML specification is used for this document. The standalone setting can have the values "yes" or "no". The value "yes" means that the document is a stand-alone document: no markup declarations outside the document, such as an external DTD (Document Type Definition), affect its content. There is no default DTD. The value "no" means that the document may depend on external declarations. Although the declaration looks like a tag, it is not an element, which is why it has no closing tag.
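The placement rule for the declaration can be checked with a small sketch, assuming Python's standard xml.etree.ElementTree parser. The helper name root_tag and the three sample strings are made up. The second sample shows that this parser also accepts a document without a declaration.

```python
import xml.etree.ElementTree as ET

with_declaration = '<?xml version="1.0" standalone="yes"?>\n<booklist />'
without_declaration = "<booklist />"
# Here a blank line precedes the declaration, so it is no longer at the start.
declaration_not_first = '\n<?xml version="1.0" standalone="yes"?>\n<booklist />'

def root_tag(xml_text):
    """Return the name of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError:
        return None

for sample in (with_declaration, without_declaration, declaration_not_first):
    print(root_tag(sample))
```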

The very basics of XML are covered above. To write proper XML documents these rules definitely have to be applied. There is much more to XML, but that is not discussed on this page. Below some sites with more information regarding XML are listed. There is much information about this subject available, but note that the technologies around XML are still being developed and that some information may be out of date. The basics of XML should stay the same, though.

Relevant links

Contact the author: Send an electronic mail to: pajtroon@dds.nl.
Peter's ICQ Number is: #3900785.

This page: Copyright © 2002 - 2005 Peter A. J. Troon

Note: This page is part of the Peter Troon Site.