XML syntax rules are logical and straightforward. Only one part of a document is strictly mandatory: the root element. All other parts are user-defined and optional.
In this lesson, we are going to discuss the six most important things to know about XML syntax, as follows:
- Root Element
- XML Prolog
- Tags and Elements
- Attributes
- Text
- References
The root element
Every XML document must contain exactly one root element, which is the parent of all other elements in the document, as shown in the code block below.
<root>
<child1>
<subChild1>...</subChild1>
<subChild2>...</subChild2>
</child1>
</root>
Only one root element is allowed, and all other elements must be inside it. Only the XML declaration, comments, processing instructions, and a document type declaration may appear outside it.
Notice that the root element does not necessarily need to be named <root>. It can be any user-defined word and typically gives context to the whole document. It tells what the data represents (like <devices>, <users>, or <fwRules>).
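As a minimal sketch of the single-root rule, the snippet below parses a small document with Python's standard `xml.etree.ElementTree` module. The choice of Python and the sample element names are assumptions made for illustration; they are not part of the lesson itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the root is named <devices>, not <root>.
doc = """
<devices>
  <device1></device1>
  <device2></device2>
</devices>
"""

root = ET.fromstring(doc)
root_name = root.tag                         # name of the root element
child_names = [child.tag for child in root]  # direct children of the root

# A second top-level element breaks the "exactly one root" rule,
# so the parser is expected to reject this text.
two_roots = "<devices></devices><users></users>"
try:
    ET.fromstring(two_roots)
    second_root_accepted = True
except ET.ParseError:
    second_root_accepted = False

print(root_name, child_names, second_root_accepted)
```

The parser reports the root under whatever name the author chose, and it refuses text that has more than one top-level element.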
Like every element, the root element must have a matching closing tag. The following example is wrong because the closing tag </devices> is missing.
<devices>
<device1></device1>
<device2></device2>
XML Prolog
The XML prolog is the first line in an XML document. Strictly speaking, this line is the XML declaration, and the prolog may also hold comments and a document type declaration, but the two terms are often used interchangeably. The declaration is optional; if it exists, it must come first in the document. Including it is good practice.
<?xml version="1.0" encoding="UTF-8"?>
It is largely self-explanatory:
- version is the XML version
- encoding specifies the character encoding used in the document.
In the above example, XML version 1.0 is shown. There is also an XML version 1.1 that allows the use of scripts and characters that were absent from the Unicode version on which XML 1.0 was originally based, but it is rarely used and poorly supported.
Note that the XML prolog does not have a closing tag. This is not an error. The declaration is not an element and not a tag: it is a special instruction for XML parsers and carries none of the document's data.
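The placement rule can be checked with a short sketch, again assuming Python's standard `xml.etree.ElementTree` module and a hypothetical `<devices>` root. It parses one document whose declaration is on the first line and one where a blank line precedes it.

```python
import xml.etree.ElementTree as ET

# Declaration on the very first line, followed by a hypothetical root element.
good = b'<?xml version="1.0" encoding="UTF-8"?>\n<devices></devices>'
root = ET.fromstring(good)
root_name = root.tag       # the declaration is consumed by the parser
child_count = len(root)    # and does not show up as an element

# The same document with a blank line in front of the declaration.
bad = b"\n" + good
try:
    ET.fromstring(bad)
    late_declaration_accepted = True
except ET.ParseError:
    late_declaration_accepted = False

print(root_name, child_count, late_declaration_accepted)
```

The input is passed as bytes so that the parser itself applies the declared encoding.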
XML Tags and Elements
A tag is a markup construct that begins with < and ends with >. There are three types of tags:
- start-tag: It is placed before the content of an element and is typically called an opening tag. An example would be the opening tag <speed> shown in Figure 1.
- end-tag: It is placed after the content of an element and is typically called a closing tag. An example would be the closing tag </speed> shown in Figure 1.
- empty-element tag: It stands alone, such as <line-break />, and represents an element that has no content.
The following diagram shows that tags and elements are closely related but technically not the same thing. A tag is a markup word. An element is everything between a start and end tag, including the tags and the content inside.
In this example, <speed> and </speed> are tags. The whole thing (<speed>1000</speed>) is an element.
The value between the start and end tags, if any, is the element's content and may contain additional XML markup, including other elements called child elements.
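The distinction between a tag name, an element's content, and its child elements can be sketched as follows. The example assumes Python's standard `xml.etree.ElementTree` module; the `<interface>` wrapper and the `<shutdown />` element are hypothetical names added for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <interface> contains two elements with text content
# and one empty-element tag.
doc = """
<interface>
  <speed>1000</speed>
  <duplex>full</duplex>
  <shutdown />
</interface>
"""

interface = ET.fromstring(doc)
speed = interface.find("speed")

speed_tag = speed.tag                                # "speed": the tag name
speed_content = speed.text                           # "1000": the element's content
child_elements = [child.tag for child in interface]  # child elements of <interface>
shutdown_content = interface.find("shutdown").text   # None: no content at all

print(speed_tag, speed_content, child_elements, shutdown_content)
```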
Tags are case-sensitive
XML tags are case-sensitive. This is very important, and it is one of the hot questions in the Network Automation tracks. The tag <Speed> is different from the tag <speed>. Opening and closing tags must be written in the same case.
INCORRECT:
<Speed>1000</speed>
CORRECT:
<speed>1000</speed>
Tags must be properly nested
XML tags must be properly nested, which means that elements must not overlap. The end tag of an element must have the same name as the most recent unmatched start tag.
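A parser enforces both the case rule and the nesting rule. The sketch below assumes Python's standard `xml.etree.ElementTree` module; `is_well_formed` is a hypothetical helper written for this illustration, and the `<interface>` wrapper is an added sample name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

results = {
    # Start tag and end tag differ only in case.
    "case mismatch": is_well_formed("<Speed>1000</speed>"),
    # <speed> is closed before the <duplex> element opened inside it.
    "overlapping": is_well_formed("<speed><duplex>full</speed></duplex>"),
    # The closing tag of the outer element is missing.
    "missing end tag": is_well_formed("<devices><device1></device1>"),
    # Every end tag matches the most recent unmatched start tag.
    "properly nested": is_well_formed(
        "<interface><speed>1000</speed><duplex>full</duplex></interface>"
    ),
}

for label, ok in results.items():
    print(label, ok)
```

Only the last document is accepted; the other three are rejected before any data can be read from them.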
INCORRECT:
<speed>
<duplex> full
</speed>
</duplex>
CORRECT:
<speed>1000</speed>
<duplex>full</duplex>
XML Attributes
XML elements can have attributes, just like HTML elements. Attributes give extra information about an element. They are written inside the start tag as name="value" pairs, and the attribute values must always be in quotes, either single or double. For example:
<device hostname="SW1" ip="10.1.1.1"> ... </device>
Attributes are optional; you can use them if needed. They are designed to hold metadata (extra info), not primary data.
You may be wondering when to use attributes. There is no universal rule; it is a design choice. Consider the following two examples, which represent the same information but are formatted differently, with and without attributes.
EXAMPLE 1 - Using Attributes
<interface name="GigabitEthernet0/0/0">
<address>10.1.1.1</address>
<mask>255.255.255.0</mask>
</interface>
EXAMPLE 2 - Using Elements
<interface>
<name>GigabitEthernet0/0/0</name>
<address>10.1.1.1</address>
<mask>255.255.255.0</mask>
</interface>
Both examples above provide the same information. Whether a value goes into an attribute or into a child element is left to whoever designs the document.
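The practical difference shows up in how a program reads the value. The following sketch assumes Python's standard `xml.etree.ElementTree` module and reuses the two interface examples: an attribute is read with `get()`, while a child element is read with `findtext()`.

```python
import xml.etree.ElementTree as ET

# Example 1: the interface name is stored as an attribute.
with_attribute = """
<interface name="GigabitEthernet0/0/0">
  <address>10.1.1.1</address>
  <mask>255.255.255.0</mask>
</interface>
"""

# Example 2: the interface name is stored as a child element.
with_element = """
<interface>
  <name>GigabitEthernet0/0/0</name>
  <address>10.1.1.1</address>
  <mask>255.255.255.0</mask>
</interface>
"""

name_from_attribute = ET.fromstring(with_attribute).get("name")
name_from_element = ET.fromstring(with_element).findtext("name")

print(name_from_attribute == name_from_element)
```

Both layouts yield the same string, so the choice mostly affects how the document reads and how consumers have to query it.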
XML Text
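In XML documents, whitespace inside element content is not discarded: the parser preserves it and passes it on to the application. XML does not collapse runs of whitespace the way HTML does. Some characters, namely < > ' " and &, are reserved by the XML syntax itself. UTF-8 is the default encoding when none is declared, so saving XML files as UTF-8 is the safest choice.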
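Whitespace handling is easy to observe with a parser. This sketch assumes Python's standard `xml.etree.ElementTree` module and a hypothetical `<description>` element; it shows that the text arrives exactly as written and that any HTML-style collapsing has to be done by the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose content has leading, trailing, and repeated spaces.
element = ET.fromstring("<description>  uplink   to   core  </description>")

raw_text = element.text                 # whitespace is kept exactly as written
collapsed = " ".join(raw_text.split())  # collapsing is an explicit application step

print(repr(raw_text))
print(repr(collapsed))
```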
XML References
By now, you must have noticed that some characters are reserved by XML for the data format itself and cannot be used directly in text. For example:
<!-- characters reserved for use of the language itself -->
>, <, &, ', and "
XML references are special codes used to represent characters that have a special meaning or cannot be typed directly.
References let you include these characters, as well as other text or markup, in an XML document. They always start with the reserved symbol "&" and end with the symbol ";". There are two types of references:
- Character Reference - for example, &#65; refers to the letter "A".
- Entity Reference - for example, &gt; refers to ">".
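Both kinds of reference are resolved by the parser before the application sees the text. The sketch below assumes Python's standard library (`xml.etree.ElementTree` for parsing and `xml.sax.saxutils.escape` for writing); the element names and the sample string are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser turns references back into the characters they stand for.
char_ref = ET.fromstring("<letter>&#65;</letter>").text  # character reference
entity_ref = ET.fromstring("<sign>&gt;</sign>").text     # entity reference

# Writing: escape() replaces &, < and > with entity references so the
# text can be placed safely inside an element.
raw = "speed < 1000 & duplex > half"
escaped = escape(raw)
round_trip = ET.fromstring("<note>" + escaped + "</note>").text

print(char_ref, entity_ref)
print(escaped)
print(round_trip == raw)
```

Escaping on output and letting the parser resolve references on input keeps reserved characters from being mistaken for markup.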