XML is a widely-used text-based format for data as well as instructions. This format arranges information in a hierarchical fashion.
An XML document is made of tags. There are two types of tags: start and end. A start tag is of the format <tagname>, and an end tag is of the format </tagname>.
1. Each start tag has a corresponding end tag, with the same tag name. For example:
This is an HTML page
2. As the '<' and '>' characters are used to start and end a tag, they are reserved (i.e. they cannot be included in a valid XML input except for this purpose).
3. Between a start tag and its corresponding end tag, information can be in the form of (a) another tag pair, (b) an arbitrary string, or both. For example:
Some data
4. Tags must be properly nested. That is, if tag A starts before tag B, then B must end before A. Thus B is completely enclosed within A. For example:
5. A tag name can contain only the characters 'a-z', 'A-Z', '0-9', ':', '_', and '-'. However, a tag name cannot start with a digit or '-'.
Valid examples include:
<_xml> … </_xml>
Invalid examples include:
<0body> … </0body>
<-xml> … </-xml>
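
The five rules above can be sketched as a small checker. This is a minimal illustration, not a full XML parser: it assumes tags carry no attributes, that there are no self-closing tags, comments, or declarations, and that text outside any tag pair is tolerated. The names `is_valid_tag_name` and `is_well_formed` are hypothetical helpers introduced here for illustration.

```python
import re

# Rule 5: allowed characters are a-z, A-Z, 0-9, ':', '_' and '-';
# the first character may not be a digit or '-'.
TAG_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9:_-]*")


def is_valid_tag_name(name: str) -> bool:
    """Return True if `name` satisfies rule 5."""
    return TAG_NAME.fullmatch(name) is not None


def is_well_formed(text: str) -> bool:
    """Check rules 1-5 for input made of attribute-free tags and text."""
    open_tags = []  # stack of tag names that are currently open
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch == ">":
            return False  # rule 2: '>' may only close a tag
        if ch != "<":
            pos += 1  # ordinary character data (rule 3)
            continue
        end = text.find(">", pos + 1)
        if end == -1:
            return False  # rule 2: '<' that never closes
        body = text[pos + 1:end]
        is_end_tag = body.startswith("/")
        name = body[1:] if is_end_tag else body
        if not is_valid_tag_name(name):
            return False  # rule 5 (also rejects a stray '<' inside a tag)
        if is_end_tag:
            # Rules 1 and 4: an end tag must match the most recently opened tag.
            if not open_tags or open_tags.pop() != name:
                return False
        else:
            open_tags.append(name)
        pos = end + 1
    return not open_tags  # rule 1: every start tag was closed
```

The stack is what enforces rule 4: an end tag is only accepted if it matches the innermost tag still open, so an input such as `<a><b></a></b>` is rejected, while `<a><b>Some data</b></a>` is accepted.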