Tuesday, February 5, 2008

Introduction to XML

The Extensible Markup Language (XML) is a general-purpose markup language.

It is classified as an extensible language because it allows its users to define their own elements. Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet.

It is used both to encode documents and to serialize data. In the latter context, it is comparable with other text-based serialization languages such as JSON and YAML.
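
To make the comparison with JSON concrete, here is a minimal sketch that serializes the same small record both ways using Python's standard library. The record, its field names, and the `book` root element are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used only for illustration.
record = {"title": "Example Title", "year": "2008"}

# JSON serialization of the record.
as_json = json.dumps(record)

# XML serialization: one child element per key under a hypothetical <book> root.
book = ET.Element("book")
for key, value in record.items():
    ET.SubElement(book, key).text = value
as_xml = ET.tostring(book, encoding="unicode")

print(as_json)
print(as_xml)
```

Both outputs are plain text carrying the same data; the XML form wraps each value in a named element instead of a key/value pair.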

Well-formed and valid XML documents

There are two levels of correctness of an XML document:

Well-formed. A well-formed document conforms to all of XML's syntax rules. For example, if a start-tag appears without a corresponding end-tag, the document is not well-formed. A document that is not well-formed is not considered to be XML; a conforming parser must report the error and stop normal processing.

In particular, a well-formed XML document satisfies the following conditions:

1. XML documents must have exactly one root element
2. XML elements must have a closing tag (or use the self-closing empty-element form)
3. XML tags are case sensitive
4. XML elements must be properly nested
5. XML attribute values must be quoted
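
The five conditions above can be tried out with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module, which rejects input that is not well-formed by raising `ParseError`. The helper name `is_well_formed` and the sample strings are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One well-formed sample, then one sample breaking each rule in the list.
samples = {
    "well-formed": '<note lang="en"><to>Reader</to></note>',
    "two root elements": "<a></a><b></b>",
    "missing closing tag": "<note><to>Reader</note>",
    "case mismatch": "<Note></note>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note lang=en></note>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample is accepted; each of the others violates one of the numbered rules and is rejected by the parser.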

Valid. A valid document is well-formed and additionally conforms to rules that constrain its structure and content. These rules are declared in a DTD or an XML schema, which may be user-defined or a published standard. For example, if a document contains an element that is not declared, then it is not valid; a validating parser must report the error.
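
Python's standard library does not include a validating parser, so the sketch below only imitates the "undefined element" example by hand. The set `DECLARED_ELEMENTS` is a hypothetical stand-in for a DTD or schema, and `undeclared_elements` is a helper invented here; a real validator checks far more, such as nesting, ordering, and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set standing in for a DTD or schema: the only element
# names a document is allowed to use.
DECLARED_ELEMENTS = {"bookstore", "book", "title", "author", "year", "price"}

def undeclared_elements(text: str) -> list:
    """Return names of elements missing from DECLARED_ELEMENTS, in document order."""
    root = ET.fromstring(text)  # raises ParseError if the text is not well-formed
    return [el.tag for el in root.iter() if el.tag not in DECLARED_ELEMENTS]

ok = "<bookstore><book><title>T</title></book></bookstore>"
bad = "<bookstore><book><isbn>0</isbn></book></bookstore>"

print(undeclared_elements(ok))
print(undeclared_elements(bad))
```

Both strings are well-formed, but the second uses an `isbn` element that the rule set does not declare, so it would not count as valid under these rules.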

Example of a meaningful XML document

XML Root - Bookstore
XML Child Node - book
XML Attributes for book - category
XML Elements - title, author, year, price
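
A document with the structure listed above might look like the one embedded in the sketch below. All values are placeholders, and the lowercase spelling of `bookstore` is an assumption; only the element and attribute names come from the list. The code parses the document with Python's `xml.etree.ElementTree` and reads back the root, the child node, the attribute, and the elements.

```python
import xml.etree.ElementTree as ET

# Placeholder values; only the structure follows the description above.
DOCUMENT = """\
<bookstore>
  <book category="fiction">
    <title>Example Title</title>
    <author>Example Author</author>
    <year>2008</year>
    <price>10.00</price>
  </book>
</bookstore>
"""

root = ET.fromstring(DOCUMENT)
book = root.find("book")

print("root:", root.tag)
print("child node:", book.tag)
print("attributes:", book.attrib)
print("elements:", [child.tag for child in book])
```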

XML elements must follow these naming rules:

1. Names can contain letters, digits, and certain other characters such as hyphens, underscores, and periods
2. Names must start with a letter or underscore, not with a digit or other punctuation character
3. Names must not start with the letters xml (or XML, or Xml, etc.), which are reserved
4. Names cannot contain spaces
5. Any other name can be used; no words are reserved.
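
The naming rules above can be approximated with a short check. This is a simplified sketch, not the full definition from the XML specification: it assumes ASCII-only names, allows a leading underscore, and ignores colons, which XML reserves for namespaces. The function name `follows_naming_rules` is made up for this example.

```python
import re

# Simplified ASCII-only pattern: a letter or underscore first, then letters,
# digits, underscores, periods, or hyphens. Real XML permits many more
# (non-ASCII) characters than this.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def follows_naming_rules(name: str) -> bool:
    """Check a name against the simplified rules in the list above."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False  # bad first character, a space, or another disallowed character
    return not name.lower().startswith("xml")  # the xml prefix is not allowed

for name in ["book_title", "price", "1st", "book title", "xmlData", "-dash"]:
    print(name, "->", follows_naming_rules(name))
```

The first two names pass; the others fail because of a leading digit, a space, the `xml` prefix, and a leading hyphen respectively.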
