Data is everywhere. Humans talk, devices talk, software talks, and now AI talks. But raw information can be messy and unreadable. We need ways to shape data so machines and people can read it. Text data formats give us that shape. This lesson explains why we use them, how they work, what kinds there are, and shows examples you will see in CCNA automation.

Why do we need Data Formats?

When we communicate, we transmit and receive information. Every time we do it, we follow a set of rules: we speak the same language, structure our speech in sentences, and start the conversation with a greeting, as shown in the diagram below.

Figure 1. English language as a data format.

The English language works a lot like a data format. It is a set of rules that define how to structure and write the information so that the recipient can understand it exactly as you do. 

Think about it this way: your thoughts are data inside your head. You can't send those thoughts directly to someone else. You have to translate them into a common format, a language you both share, such as English, French, or Spanish.

When you speak or write in English, you are serializing your thoughts into a shared language. Anyone who understands English can then "decode" what you said and recover your original thought.

Everyone knows what happens when two people don't share a common data format, that is, a language both of them understand: they cannot understand each other, as shown in the diagram below.

Figure 2. People communicating without using a common language.

The same principle applies to communication between applications. When two software programs exchange data, the information has to be structured using the same language; otherwise, the applications won't be able to understand each other. We humans are very good at making sense of unstructured data, but applications have to be told explicitly how to parse it. Compare the following two representations of the same information:

interfaceFastEthernet0/0.2encapsulationdot1q210.1.1.10255.255.255.0mtu1500

interface FastEthernet0/0.2
 encapsulation dot1q 2
 10.1.1.10 255.255.255.0
 mtu 1500

Suppose an application wants to transmit this data to another application (or a network admin) via the network. Let's imagine, just for the sake of the example, that the application does not use any data format to serialize the information in a structured manner. The following example shows what would happen.

Figure 3. Sending data without data format.

You can easily see what happens if we don't use data formats and data serialization. The receiving application is not able to make sense of the received data. How is it supposed to understand when the interface name ends and the IP address starts? 
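The ambiguity can be made concrete with a short Python sketch. It takes only the digits and dots from the run-together string above (assuming the keywords dot1q and mtu have somehow already been stripped) and lists every way to split them into a VLAN ID, an IP address, and a subnet mask. The helper names are hypothetical and exist only for this illustration.

```python
def parse_octets(text):
    """Return four ints if text is a dotted quad such as 10.1.1.10, else None."""
    parts = text.split(".")
    if len(parts) != 4:
        return None
    octets = []
    for part in parts:
        # Reject empty parts, non-digits, leading zeros, and values above 255.
        if not part.isdigit() or (len(part) > 1 and part[0] == "0") or int(part) > 255:
            return None
        octets.append(int(part))
    return octets


def is_netmask(text):
    """A netmask is a dotted quad whose one-bits all come before its zero-bits."""
    octets = parse_octets(text)
    if octets is None:
        return False
    bits = "".join(f"{octet:08b}" for octet in octets)
    return "01" not in bits


def interpretations(blob):
    """Try every split of blob into (vlan, ip, mask) and keep the valid ones."""
    found = []
    for i in range(1, len(blob)):
        for j in range(i + 1, len(blob)):
            vlan, ip, mask = blob[:i], blob[i:j], blob[j:]
            if (vlan.isdigit() and 1 <= int(vlan) <= 4094
                    and parse_octets(ip) is not None and is_netmask(mask)):
                found.append((int(vlan), ip, mask))
    return found


for candidate in interpretations("210.1.1.10255.255.255.0"):
    print(candidate)
```

Two splits pass every check: VLAN 2 with address 10.1.1.10, and VLAN 21 with address 0.1.1.10. Nothing in the text itself tells the receiver which one the sender meant.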

Of course, the example is a bit exaggerated, but it illustrates the main idea. Computers need an explicit set of rules for reconstructing the received information. Let's look at the same example, but this time using a well-defined data format called XML - Extensible Markup Language, which is one of the most common formats out there.

Figure 4. Apps communicate using XML data format.

You can clearly see how much more structured the data is and how much more readable it is - for humans and for computers as well. This is why we use pre-defined data formats when storing or transmitting data.
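As a rough illustration, here is how a receiving application could read such a message with Python's standard xml.etree.ElementTree module. The XML text and its tag names are assumptions made for this sketch; they are not copied from Figure 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; the tag names are assumptions for illustration only.
message = """
<interface>
  <name>FastEthernet0/0.2</name>
  <encapsulation>dot1q</encapsulation>
  <vlan>2</vlan>
  <address>10.1.1.10</address>
  <mask>255.255.255.0</mask>
  <mtu>1500</mtu>
</interface>
""".strip()

root = ET.fromstring(message)

# Each tag marks where a value starts and ends, so no guessing is needed.
name = root.findtext("name")
address = root.findtext("address")
mtu = int(root.findtext("mtu"))

print(name, address, mtu)
```

Because every value sits between an opening and a closing tag, the receiver can ask for a field by name instead of guessing where one value ends and the next begins.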

How do Data Formats work?

At a high level, data formats do two things:

  • They define structure - Structure is how data is organized. For example, an interface has a name, an IP address, and a description. Structure decides where each value goes.
  • They define syntax - Syntax is the exact characters you use. XML uses tags like <interface>. JSON uses curly braces, quotes, and colons. CSV uses commas and newlines. YAML uses indentation.
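To make the difference concrete, the following Python sketch keeps one structure (a name, an address, and a description) and writes it out with two different syntaxes using the standard json and csv modules. The sample values and variable names are assumptions chosen for illustration.

```python
import csv
import io
import json

# Structure: an interface has a name, an IP address, and a description.
interface = {
    "name": "GigabitEthernet0/0/0",
    "address": "10.1.1.1",
    "description": "VLAN20",
}

# JSON syntax: curly braces, quotes, and colons.
as_json = json.dumps(interface)

# CSV syntax: commas and newlines, with the keys in a header row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(interface), lineterminator="\n")
writer.writeheader()
writer.writerow(interface)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

The dictionary never changes; only the characters wrapped around its values do.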

The following example shows an interface configuration represented in the three most common formats. Notice that the information is the same, but it is "wrapped" differently. Which one is the easiest for you to read?

Figure 5. XML, JSON and YAML structure and syntax example.

Most people find YAML the easiest to read, which is one reason it is so widely used by DevOps and NetDevOps tools like Kubernetes and Ansible.

The diagram above shows the same original information serialized into XML, JSON, and YAML. The data is identical, but each format gives it a different look and feel. We will talk more about each format in the upcoming lessons; for now, let's focus on the fundamentals.

To understand the structure and syntax of each data format discussed in this section, we first need to clear up one of the simplest yet most important concepts of information: the key-value pair.

Key-Value pairs

A key-value pair means that every piece of data is stored and transmitted as a label (the key) and a value (the data linked to that label). Think of it like a label and a box:

  • The key is the label on the box — it tells you what’s inside.
  • The value is what’s actually in the box.

Without a label, you cannot know what is inside the box, right? And vice versa, without a box, you will have a label pointing to nothing, as shown in the diagram below.

Figure 6. Label and Box pair.

It is very important to remember that in the data formats covered here, data is structured as key-value pairs so that it has meaning. This makes it understandable and machine-readable. Raw information without a key (label), such as a single number, word, or raw signal, carries no context.

For example, a sensor might just send the number 75. That is data, but without context we don't know what 75 means. Once we add a key, as in temperature: 75, it becomes meaningful information: the temperature reported by the sensor is 75 degrees.

KEY NOTE: Each data format structures the information into key-value pairs. Value without a key is meaningless.
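In a programming language, the same idea shows up as a dictionary (or map). The following minimal Python sketch contrasts a bare value with a labeled one; the key names and the unit are assumptions added for illustration.

```python
# A bare value: the receiver cannot tell what 75 refers to.
raw_reading = 75

# The same value stored under keys. The key names and the unit are
# assumptions added for illustration.
reading = {"temperature": 75, "unit": "degrees"}

# With a key, the program can ask for the value by its label.
print(f"temperature is {reading['temperature']} {reading['unit']}")
```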

Each format has its own unique way of expressing the key-value pairs, as you can see in the diagram below.

Figure 7. XML, JSON and YAML - key value pairs.

You need a solid understanding of the key-value concept. It is the foundation of CCNA automation. Every time you work with any type of information, you will store, edit, and transmit it in one of the formats that we discuss in this section: XML, JSON, or YAML.

Let's finish this part of the lesson with a few examples that show how a well-known data representation (Cisco CLI) is formatted into XML, JSON, and YAML. Each code section carries the same information, just structured in a different data format and following the rules of that format. They may look different, but the information is the same.

Cisco CLI:

!
interface GigabitEthernet0/0/0
 description VLAN20
 ip address 10.1.1.1 255.255.255.0
 ip mtu 1400
 duplex full
 speed 1000
!

XML:

<?xml version="1.0" encoding="UTF-8"?>
<interfaces>
  <interface>
    <name>GigabitEthernet0/0/0</name>
    <description>VLAN20</description>
    <address>10.1.1.1</address>
    <mask>255.255.255.0</mask>
    <MTU>1400</MTU>
    <duplex>full</duplex>
    <speed>1000</speed>
  </interface>
</interfaces>

JSON:

{
   "interfaces":[
      {
         "name":"GigabitEthernet0/0/0",
         "description":"VLAN20",
         "address":"10.1.1.1",
         "mask":"255.255.255.0",
         "MTU":"1400",
         "duplex":"full",
         "speed":"1000"
      }
   ]
}

YAML:

--- 
- GigabitEthernet0/0/0: 
    description: VLAN20
    address: "10.1.1.1"
    mask: "255.255.255.0"
    MTU: 1400
    duplex: full
    speed: 1000
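One way to check that claim for the XML and JSON versions is to load both into Python dictionaries and compare them. This is a sketch using only the standard library; YAML is left out because Python needs a third-party package to read it. Both documents are copies of the examples above, with the key names written without stray spaces.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """<interfaces>
  <interface>
    <name>GigabitEthernet0/0/0</name>
    <description>VLAN20</description>
    <address>10.1.1.1</address>
    <mask>255.255.255.0</mask>
    <MTU>1400</MTU>
    <duplex>full</duplex>
    <speed>1000</speed>
  </interface>
</interfaces>"""

json_text = """{
  "interfaces": [
    {
      "name": "GigabitEthernet0/0/0",
      "description": "VLAN20",
      "address": "10.1.1.1",
      "mask": "255.255.255.0",
      "MTU": "1400",
      "duplex": "full",
      "speed": "1000"
    }
  ]
}"""

# XML: every child element of <interface> becomes one key-value pair.
from_xml = [
    {child.tag: child.text for child in interface}
    for interface in ET.fromstring(xml_text)
]

# JSON: the parser already returns dictionaries inside a list.
from_json = json.loads(json_text)["interfaces"]

print(from_xml == from_json)  # True
```

Once parsed, the two documents produce the same list of key-value pairs, even though the text on the wire looks nothing alike.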

What are Serialization and Parsing?

Now, let’s move to the programming side and see how data formats fit into the bigger picture. In most cases, information doesn’t just appear by itself — it’s created, processed, or used by a program written in languages like C++, Java, or Python. These programs usually store and handle data as objects in memory.

When a program needs to save, send, or receive that data, it must convert it into a data format such as JSON, XML, or YAML. That’s where two important terms come in:

  • Serialization is the process of converting data from a program (like an object in memory) into a format that can be stored — for example, in a file or database — or transmitted over a network. The data can later be reconstructed, even by another program written in a different language or running on another system.
  • Parsing is the opposite. It means taking structured text — like JSON or XML — and converting it back into data structures that the program can use, such as dictionaries or objects.
Figure 8. Serialization and Parsing example.

In short:

  • Parsing reads text and turns it into data.
  • Serialization takes data and turns it into text.
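The round trip can be sketched in a few lines of Python with the standard json module. The dictionary below stands in for an object in memory; its keys, including the hypothetical "enabled" flag, are assumptions for illustration.

```python
import json

# An object in the program's memory (here, a Python dictionary).
interface = {"name": "GigabitEthernet0/0/0", "mtu": 1400, "enabled": True}

# Serialization: in-memory data -> text that can be stored or sent.
text = json.dumps(interface)

# Parsing: text -> in-memory data again.
restored = json.loads(text)

print(text)
print(restored == interface)  # True
```

Note that the serialized text uses JSON's own spelling (true instead of Python's True). A program written in another language could parse the same text into its own native types.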
