XML and Excel

Pages in this article

  1. What is XML
  2. Characteristics of XML
  3. Structure of XML
  4. XML Schemas
  5. XML in Excel
  6. XML Validation
  7. Conclusion

What is XML

XML stands for eXtensible Markup Language. It is a standard conceived for the web, intended to make information in web pages easier to find. The idea is to add descriptive "tags" to information, which make that information simple to find and categorize. For example, every joke might be enclosed in the tags "<joke> … </joke>", so that a search engine could simply look for these tags in order to find all jokes on a site. Figure 1 shows a piece of XML "code". As you can see, it resembles HTML code. Does that mean XML is a dialect of HTML?

Figure 1, XML file shown in Internet Explorer
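
The joke example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real search engine works: the sample document, its tag names and the helper find_jokes are all made up for this sketch, and the parsing uses the xml.etree.ElementTree module from Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; every tag name here is invented for illustration.
SAMPLE = """<page>
  <title>Office humour</title>
  <joke>Why did the spreadsheet see a doctor? It had too many cells.</joke>
  <paragraph>Some ordinary text that is not a joke.</paragraph>
  <joke>There are 10 kinds of people: those who read binary and those who do not.</joke>
</page>"""


def find_jokes(xml_text):
    """Return the text of every <joke> element, in document order."""
    root = ET.fromstring(xml_text)
    # iter("joke") walks the whole tree and yields only elements named "joke".
    return [element.text for element in root.iter("joke")]


jokes = find_jokes(SAMPLE)
for joke in jokes:
    print(joke)
```

Because the tags describe what the content is, the program never has to guess which sentences are jokes; it only has to look for the tag name.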

Just like HTML, XML is a so-called markup language or meta language: a language that gives information about another language or, in other words, offers information about its content. Markup codes have been around for quite some time. Editors started using markup to indicate what kind of formatting (e.g. italic) needed to be applied to parts of the text in printed matter.
In the sixties, IBM developed the "Generalized Markup Language" (GML) to be able to do the same with electronic files. In 1986 GML was expanded further and became an ISO standard known as the "Standard Generalized Markup Language" (SGML).

Later, researchers at the CERN institute in Switzerland built further on SGML, because they needed a way to mark up their electronically stored publications with formatting instructions. Their documents contained many links to other sources, mathematical equations and other complex information, which they wanted to look the same on every system and to be easy to search. The structure CERN devised is the basis of today's HTML.

So both HTML and XML are derived from SGML, the "Standard Generalized Markup Language". Both standards are managed by the World Wide Web Consortium (W3C, see http://www.w3.org/).

Both XML and HTML use "tags", but the goal of these tags is different. HTML is primarily a standard for defining how information is formatted. The tags one can use in HTML are pre-defined by the standard; in principle, one does not have the liberty to create one's own tags. Furthermore, HTML tags convey no information about the actual content, only about its presentation. XML, on the other hand, focuses on marking what type of information is there rather than on how it is formatted. In XML it is the user who determines which tags are needed; only the structure of an XML file is fixed by the XML standard.
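
To make the difference concrete, here is a small Python sketch that places the same piece of information in HTML-style tags and in user-defined XML tags. The product data, the tag names and the helper functions are assumptions made up for this illustration; the HTML fragment is kept well-formed so that the same standard-library parser (xml.etree.ElementTree) can read both.

```python
import xml.etree.ElementTree as ET

# HTML-style markup: the tags only say how the text should look.
HTML_LIKE = "<p><b>Widget</b> <i>9.95</i></p>"

# XML with self-chosen (hypothetical) tags: the tags say what the text is.
XML_DATA = "<product><name>Widget</name><price>9.95</price></product>"


def tag_names(text):
    """List the tag names in document order, starting with the root element."""
    return [element.tag for element in ET.fromstring(text).iter()]


def to_record(text):
    """Turn the direct children of the root into a {tag: text} dictionary."""
    return {child.tag: child.text for child in ET.fromstring(text)}


def is_well_formed(text):
    """Check only the structure the XML standard prescribes, not the tag names."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(tag_names(HTML_LIKE))   # ['p', 'b', 'i']
print(tag_names(XML_DATA))    # ['product', 'name', 'price']
print(to_record(XML_DATA))    # {'name': 'Widget', 'price': '9.95'}
print(is_well_formed("<product><name>Widget</product>"))  # False: <name> is never closed
```

The parser accepts any tag names the author invents, but it rejects a document whose tags are not properly nested and closed. That is the sense in which the user chooses the tags while the standard fixes the structure.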


