
There are countless file extensions out there, and new ones seem to crop up every day. When you happen to discover one you have never heard of before, I bet your first instinct is to suspect malware rather than to get excited about finding out what it is. Did you ever wonder what an XML file is?

XML stands for Extensible Markup Language. A markup language is used to annotate text or add additional information. These annotations are not shown to the end-user, but are needed by the ‘machine’ to read and subsequently process the text correctly.
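To make the idea of invisible annotations a bit more concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` module. The sample sentence and its tag names (`note`, `important`, `time`) are made up for this illustration and do not come from any standard.

```python
import xml.etree.ElementTree as ET

# A hypothetical piece of marked-up text; the tag names are invented for this example.
marked_up = "<note>Meet me <important>today</important> at <time>noon</time>.</note>"

root = ET.fromstring(marked_up)

# What the end-user would read: the text with all annotations stripped out.
visible_text = "".join(root.itertext())

# What the 'machine' additionally sees: the names of the annotations.
annotations = [child.tag for child in root]

print(visible_text)
print(annotations)
```

The reader only ever gets the plain sentence, while the program also knows which word was marked as important and which one is a time.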

A very well-known example is HTML (HyperText Markup Language). Websites are written in HTML (alongside other languages); however, you (should) never see a trace of the code. What you do see is its interpretation by the browser, for example a certain font formatting, a table, or embedded images. So what does XML do?


Let me start at the very beginning. XML is a more recent language similar to HTML, but it allows for more flexibility. Like HTML, it descends from SGML (Standard Generalized Markup Language), the mother of all markup languages: HTML was defined as an application of SGML, while XML is a simplified subset of it. By definition, XML is a universal format for structured documents and data on the web. In other words, it is used to mark up or describe data.

To describe data formally, XML can rely on a Document Type Definition (DTD). You could say that this is the ‘machine’s’ dictionary, which allows it to understand the markup language. A document that uses a DTD declares it right at the start. The code could look something like this:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

This specific example tells the ‘machine’, in this case a browser, that the DTD is HTML 4.0 and that the DTD itself is written in English. The browser can then go ahead and compare each of the given commands with its DTD, which in turn tells it what to do with that command. That’s how the command <b> translates to bold text or <u> to underlined text.


Now, that was an HTML example and probably doesn’t instantly bring you any closer to understanding XML. Hang on, we’re getting there, but not without looking at HTML a little bit more. The problem with HTML is that it consists of a fixed set of commands, and whenever you need to apply certain formatting, you need to type these commands. Over and over again. In other words, HTML is very simple, very obvious, easy to learn, but also not very flexible.

Let me give you an example. Say you want to change the size or color of a header you have used a dozen times throughout your website. That can be quite annoying. To circumvent this tiresome editing of HTML documents, style sheets were invented. Now you simply call your header a ‘header 1’ in your website and in the style sheet you define what a ‘header 1’ looks like. So when you want to change your header, you only change it in one place, i.e. the style sheet. Problem solved.


XML works similarly to a style sheet in one respect: the definitions live in one place, apart from the content. Yes, now we’re getting there! It’s more flexible than HTML because it lets you create your own building blocks. Of course there is a crucial difference from HTML and style sheets: an XML document is not a style sheet itself. You invent your own commands, and you describe them in your own DTD so the ‘machine’ knows how documents of that type are to be read. Essentially, the command definition is outsourced to the DTD.
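The idea of creating your own building blocks can be sketched in a few lines of Python with the standard library's `xml.etree.ElementTree` module. The `recipe` and `ingredient` tags, along with their `name`, `amount`, and `unit` attributes, are invented for this example; a real project would pick its own vocabulary. Note that this parser only reads the document and does not check it against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML built from tags we made up ourselves;
# HTML has no <recipe> or <ingredient> element.
document = """
<recipe name="Pancakes">
    <ingredient amount="2" unit="cups">flour</ingredient>
    <ingredient amount="2" unit="pieces">eggs</ingredient>
    <ingredient amount="1.5" unit="cups">milk</ingredient>
</recipe>
""".strip()

root = ET.fromstring(document)

# Because the tags say what the data is, a program can pick out
# exactly the pieces it needs.
recipe_name = root.get("name")
ingredients = [
    (item.text, float(item.get("amount")), item.get("unit"))
    for item in root.findall("ingredient")
]

print(recipe_name)
for name, amount, unit in ingredients:
    print(f"{amount} {unit} of {name}")
```

Nothing here says how the recipe should look on screen; the tags only describe what each piece of data means.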

There are two types of XML files. The first relies on a standard DTD that the ‘machine’, e.g. a browser, can read. The second allows you to write your own DTD and create your own command building blocks. So you could become really creative and then provide the ‘machine’ with both your XML and your very own DTD file.


Another extremely important difference between HTML and XML is that HTML defines how data looks, while XML defines what data is. This should make clear that XML does not replace HTML; rather, it complements it. In essence, XML doesn’t really do anything except structure, store, and transport data. Bummer.

So what XML is used for is outsourcing data. Rather than being integrated into the HTML document, the data is stored in separate XML files. Since XML stores data in plain text, the storage is independent of your platform, and your data can be exported, imported, or simply moved much more easily.
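Here is a rough sketch of what storing and transporting data as plain-text XML can look like, again in Python with the standard library's `xml.etree.ElementTree` module. The helper functions `contacts_to_xml` and `xml_to_contacts`, the tag names, and the sample contacts are all hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET


def contacts_to_xml(contacts):
    """Turn a list of contact dictionaries into a plain-text XML string."""
    root = ET.Element("contacts")
    for contact in contacts:
        person = ET.SubElement(root, "person")
        ET.SubElement(person, "name").text = contact["name"]
        ET.SubElement(person, "email").text = contact["email"]
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


def xml_to_contacts(xml_text):
    """Read the same structure back, e.g. on another machine or platform."""
    root = ET.fromstring(xml_text)
    return [
        {"name": person.findtext("name"), "email": person.findtext("email")}
        for person in root.findall("person")
    ]


# Made-up sample data.
contacts = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "email": "grace@example.com"},
]

xml_text = contacts_to_xml(contacts)   # export
restored = xml_to_contacts(xml_text)   # import

print(xml_text)
print(restored == contacts)
```

Because the exported string is ordinary text, it could be saved to a file, sent over a network, or opened in any editor before being read back in.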

Many other languages are based on XML, including XHTML, WML (used by WAP for handheld devices), and RSS for feeds. And here we have reached some actual uses of XML, as well as the end of this article. Please see my resources below for possible applications.

Are you any closer to understanding XML now?

Resources: What is XML? on HTML Goodies, XML on Wikipedia, XML at,

Image credits: flaivoloka, svilen001 (1), svilen001 (2), dsigning, Bessarro
