How to Read and Write XML Files With Code

Would you like to learn how to read and write an XML file from Java?

XML files are used for a variety of purposes, including data storage. Before JSON became popular, XML was the preferred format for representing, storing, and transporting structured data. Even though the popularity of XML has waned in recent years, you will still encounter it occasionally, so it is important to learn how to work with it from code.

Java Standard Edition (SE) includes the Java API for XML Processing (JAXP), which is an umbrella term covering most aspects of XML processing. These include:

  • DOM: The Document Object Model includes classes for working with XML artifacts such as elements, nodes, attributes, etc. The DOM API loads the complete XML document into memory for processing, so it is not well suited to working with large XML files.
  • SAX: The Simple API for XML is an event-driven API for reading XML. The parser fires events as it encounters elements, text, and other constructs while reading the document, and the application handles them in callbacks. The memory requirements of this method are low, but working with the API is more complex than working with the DOM.
  • StAX: The Streaming API for XML is a recent addition to the XML APIs and provides high-performance stream filtering, processing, and modification of XML. While it avoids loading the whole XML document into memory, it provides a pull-type architecture rather than an event-driven architecture, so the application is easier to code and understand than using the SAX API.
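To make the push versus pull distinction above concrete, here is a minimal sketch that counts elements with a given name, once with SAX and once with StAX. The class and method names (StreamingCount, countWithSax, countWithStax) are our own for illustration and are not part of JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper class; the names here are assumptions, not JAXP API.
final class StreamingCount {

    // SAX: the parser pushes events into a handler that we supply.
    static int countWithSax(String xml, final String tagName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(tagName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return count[0];
    }

    // StAX: the application pulls events from the reader in its own loop.
    static int countWithStax(String xml, String tagName) {
        int count = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(tagName)) {
                    count++;
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String xml = "<catalog><book id=\"bk101\"/><book id=\"bk102\"/></catalog>";
        System.out.println(countWithSax(xml, "book"));
        System.out.println(countWithStax(xml, "book"));
    }
}
```

Neither method keeps the document in memory. The difference is who drives the loop: the parser in the SAX version, our own code in the StAX version.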

In this article, we use the DOM API to demonstrate how to read and write XML files from Java. We will cover the other two APIs in future articles.

A Sample XML File

For the purpose of this article, we demonstrate the concepts using the following sample XML, which can be found here:

<?xml version="1.0"?>
<catalog>
 <book id="bk101">
 <author>Gambardella, Matthew</author>
 <title>XML Developer's Guide</title>
 <genre>Computer</genre>
 <price>44.95</price>
 <publish_date>2000-10-01</publish_date>
 <description>An in-depth look at creating applications
 with XML.</description>
 </book>
 <book id="bk102">
 <author>Ralls, Kim</author>
...

Reading an XML File

Let us look at the basic steps required for reading an XML file using the DOM API.

The first step is to get an instance of DocumentBuilder. The builder is used to parse XML documents. For basic usage, we do it like this:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);
factory.setValidating(false);
DocumentBuilder builder = factory.newDocumentBuilder();

We can now load the whole document into memory starting from the XML root element. In our example, it is the catalog element.

File file = ...; // XML file to read
Document document = builder.parse(file);
Element catalog = document.getDocumentElement();
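For experimenting without a file on disk, the same two steps can be combined into a small self-contained program. This is only a sketch: the ParseDemo class and its parseRoot method are our own names, and the XML is parsed from a string through an InputSource rather than from a File.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative class; parseRoot is a hypothetical helper, not part of the DOM API.
final class ParseDemo {

    // Parses XML text and returns the root element of the resulting document.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);
            factory.setValidating(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement();
        } catch (Exception e) {
            // Wrap the checked parser exceptions to keep the demo short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element catalog = parseRoot("<catalog><book id=\"bk101\"/></catalog>");
        System.out.println(catalog.getTagName()); // prints "catalog"
    }
}
```

In real code you would usually let the checked exceptions (ParserConfigurationException, SAXException, IOException) propagate or handle them individually instead of wrapping them all.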

And that’s it, folks! The DOM API for reading an XML file is really simple. You now have access to the whole XML document starting from its root element, catalog. Let us now see how to work with it.

Using the DOM API

Now that we have the XML root Element, we can use the DOM API to extract interesting nuggets of information.

Get all the book children of the root element and loop over them. Note that getChildNodes() returns all children, including text, comments, etc. For our purpose, we need just the child elements, so we skip over the others.

NodeList books = catalog.getChildNodes();
for (int i = 0, n = books.getLength() ; i < n ; i++) {
  Node child = books.item(i);
  if ( child.getNodeType() != Node.ELEMENT_NODE )
    continue;
  Element book = (Element)child;
  // work with the book Element here
}
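If you need this filtering in more than one place, it can be wrapped in a small helper that returns only the element children. The sketch below assumes a hypothetical ChildElements class. The comment and whitespace in the test document are there to show what gets skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative helper class; the names are assumptions made for this example.
final class ChildElements {

    // Returns the direct children of parent that are elements,
    // skipping text, comments and other node types.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0, n = children.getLength(); i < n; i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element catalog = parseRoot(
                "<catalog>\n <!-- first --> <book id=\"bk101\"/>\n <book id=\"bk102\"/>\n</catalog>");
        for (Element book : childElements(catalog)) {
            System.out.println(book.getAttribute("id"));
        }
    }
}
```

With the sample document above, the loop prints bk101 and bk102, even though getChildNodes() also reports the whitespace and the comment.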

How do you find a specific child element, given the parent? The following static method returns the first matching element if found, or null. As you can see, the procedure involves getting the list of child nodes and looping through them picking out element nodes with the specified name.

static private Node findFirstNamedElement(Node parent,String tagName)
{
  NodeList children = parent.getChildNodes();
  for (int i = 0, in = children.getLength() ; i < in ; i++) {
    Node child = children.item(i);
    if ( child.getNodeType() != Node.ELEMENT_NODE )
      continue;
    if ( child.getNodeName().equals(tagName) )
      return child;
  }
  return null;
}
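A reasonable question is why we do not simply call Element.getElementsByTagName(). That method searches all descendants, not just direct children, so it can return more than you expect when names repeat at deeper levels. The sketch below illustrates the difference. The related element is made up for this example and is not part of the sample catalog, and FindDemo and firstChildElement are our own names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative class; the names are assumptions made for this example.
final class FindDemo {

    // Returns the first direct child element with the given name, or null.
    static Element firstChildElement(Node parent, String tagName) {
        NodeList children = parent.getChildNodes();
        for (int i = 0, n = children.getLength(); i < n; i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && child.getNodeName().equals(tagName)) {
                return (Element) child;
            }
        }
        return null;
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // "related" is a hypothetical nested element that also contains a title.
        Element catalog = parseRoot(
                "<catalog><book id=\"bk101\">"
                + "<title>XML Developer's Guide</title>"
                + "<related><title>Other</title></related>"
                + "</book></catalog>");
        Element book = firstChildElement(catalog, "book");

        // Descendant search: finds both title elements.
        System.out.println(book.getElementsByTagName("title").getLength()); // prints 2

        // Direct-child search: finds only the book's own title.
        System.out.println(firstChildElement(book, "title").getTextContent());
    }
}
```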

Note that the DOM API treats text content within an element as a separate node of type TEXT_NODE. In addition, the text content might be split into multiple adjacent text nodes. So the following special processing is required to fetch the text content within an element.

static private String getCharacterData(Node parent)
{
  StringBuilder text = new StringBuilder();
  if ( parent == null )
    return text.toString();
  NodeList children = parent.getChildNodes();
  for (int k = 0, kn = children.getLength() ; k < kn ; k++) {
    Node child = children.item(k);
    if ( child.getNodeType() != Node.TEXT_NODE )
      break;
    text.append(child.getNodeValue());
  }
  return text.toString();
}
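One thing to be aware of: the helper above stops at the first child that is not a TEXT_NODE, so text that follows a CDATA section, a comment, or a nested element is not collected. The DOM also offers Node.getTextContent(), which concatenates the text of all descendants. The following sketch compares the two on an element containing a CDATA section. TextDemo and leadingText are our own names, and leadingText simply mirrors the logic of the helper above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative class; the names are assumptions made for this example.
final class TextDemo {

    // Same approach as the helper above: collect leading TEXT_NODE children only.
    static String leadingText(Node parent) {
        StringBuilder text = new StringBuilder();
        NodeList children = parent.getChildNodes();
        for (int k = 0, kn = children.getLength(); k < kn; k++) {
            Node child = children.item(k);
            if (child.getNodeType() != Node.TEXT_NODE) {
                break;
            }
            text.append(child.getNodeValue());
        }
        return text.toString();
    }

    // Parses XML text with the default (non-coalescing) factory settings,
    // so CDATA sections are kept as separate CDATA_SECTION_NODE children.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element descr = parseRoot(
                "<description>An in-depth look at <![CDATA[<XML>]]> apps</description>");
        System.out.println("[" + leadingText(descr) + "]");
        System.out.println("[" + descr.getTextContent() + "]");
    }
}
```

For simple elements such as author or price in our sample, both approaches return the same string. Which one you want depends on whether nested content should count as part of the text.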

Armed with these convenience functions, let us now look at some code for listing out some information from our sample XML. We would like to show detailed information for each book, such as would be available in a book catalog.

NodeList books = catalog.getChildNodes();
for (int i = 0, ii = 0, n = books.getLength() ; i < n ; i++) {
  Node child = books.item(i);
  if ( child.getNodeType() != Node.ELEMENT_NODE )
    continue;
  Element book = (Element)child;
  ii++;

  String id = book.getAttribute("id");
  String author = getCharacterData(findFirstNamedElement(child,"author"));
  String title = getCharacterData(findFirstNamedElement(child,"title"));
  String genre = getCharacterData(findFirstNamedElement(child,"genre"));
  String price = getCharacterData(findFirstNamedElement(child,"price"));
  String pubdate = getCharacterData(findFirstNamedElement(child,"publish_date"));
  String descr = getCharacterData(findFirstNamedElement(child,"description"));

  System.out.printf("%3d. book id = %s\n" +
  " author: %s\n" +
  " title: %s\n" +
  " genre: %s\n" +
  " price: %s\n" +
  " pubdate: %s\n" +
  " descr: %s\n",
  ii, id, author, title, genre, price, pubdate, descr);
}
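Everything we extract from the DOM is a String. If you need to compute with the values, you have to convert them yourself. As a rough sketch, the code below totals the price of every book as a BigDecimal and reads publish_date as a LocalDate. The class and method names are hypothetical, and we assume prices are plain decimal numbers and dates are in ISO yyyy-MM-dd form, as in the sample.

```java
import java.io.StringReader;
import java.math.BigDecimal;
import java.time.LocalDate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative class; the names are assumptions made for this example.
final class TypedValues {

    // Sums the first <price> of each direct child element of the catalog.
    static BigDecimal totalPrice(Element catalog) {
        BigDecimal total = BigDecimal.ZERO;
        NodeList children = catalog.getChildNodes();
        for (int i = 0, n = children.getLength(); i < n; i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            NodeList prices = ((Element) child).getElementsByTagName("price");
            if (prices.getLength() > 0) {
                total = total.add(new BigDecimal(prices.item(0).getTextContent().trim()));
            }
        }
        return total;
    }

    // Returns the book's publish_date as a LocalDate, or null if it is absent.
    static LocalDate publishDate(Element book) {
        NodeList dates = book.getElementsByTagName("publish_date");
        if (dates.getLength() == 0) {
            return null;
        }
        return LocalDate.parse(dates.item(0).getTextContent().trim());
    }

    // Parses XML text and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element catalog = parseRoot(
                "<catalog><book id=\"bk101\"><price>44.95</price>"
                + "<publish_date>2000-10-01</publish_date></book></catalog>");
        Element book = (Element) catalog.getElementsByTagName("book").item(0);
        System.out.println(totalPrice(catalog));
        System.out.println(publishDate(book));
    }
}
```

BigDecimal is used rather than double so that decimal prices add up exactly.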

Writing XML Output

Java provides the XML Transform API to transform XML data. We use this API with the identity transform to generate output.

As an example, let us add a new book element to the sample catalog presented above. The details of the book (such as author, title, etc.) can be obtained externally, perhaps from a properties file or a database. We use the following properties file to load the data.

id=bk113
author=Jane Austen
title=Pride and Prejudice
genre=Romance
price=6.99
publish_date=2010-04-01
description="It is a truth universally acknowledged, that a single man in possession of a good fortune must be in want of a wife." So begins Pride and Prejudice, Jane Austen's witty comedy of manners-one of the most popular novels of all time-that features splendidly civilized sparring between the proud Mr. Darcy and the prejudiced Elizabeth Bennet as they play out their spirited courtship in a series of eighteenth-century drawing-room intrigues.

The first step is to parse the existing XML file using the method presented above. The code is also shown below.

File file = ...; // XML file to read
Document document = builder.parse(file);
Element catalog = document.getDocumentElement();

We load the data from the properties file using the Properties class provided with Java. The code is quite simple and shown below.

String propsFile = ...;
Properties props = new Properties();
try (FileReader in = new FileReader(propsFile)) {
  props.load(in);
}
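Two details are worth knowing here. FileReader decodes the file using the platform default charset (on Java versions before 18), and getProperty() returns null for a missing key. The sketch below shows one way to deal with both: a reader with an explicit UTF-8 charset, and the two-argument getProperty() with a default value. The PropsDemo class, its methods, and the isbn key are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Illustrative class; the names are assumptions made for this example.
final class PropsDemo {

    // Loads properties from any Reader and closes it afterwards.
    static Properties load(Reader reader) {
        Properties props = new Properties();
        try (Reader in = reader) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Loads a properties file, decoding it explicitly as UTF-8.
    static Properties loadUtf8(Path file) {
        try {
            return load(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties props = load(new StringReader("id=bk113\nauthor=Jane Austen\n"));
        System.out.println(props.getProperty("author"));         // prints "Jane Austen"
        System.out.println(props.getProperty("isbn", "(none)")); // prints "(none)"
    }
}
```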

Once the properties are loaded, we retrieve the values we want to add from the properties file.

String id = props.getProperty("id");
String author = props.getProperty("author");
String title = props.getProperty("title");
String genre = props.getProperty("genre");
String price = props.getProperty("price");
String publish_date = props.getProperty("publish_date");
String descr = props.getProperty("description");

Let us now create an empty book element.

Element book = document.createElement("book");
book.setAttribute("id", id);

Adding the child elements to the book is trivial. For convenience, we collect the required element names in a List and add the values in a loop.

List<String> elnames = Arrays.asList("author", "title", "genre", "price",
  "publish_date", "description");
for (String elname : elnames) {
  Element el = document.createElement(elname);
  Text text = document.createTextNode(props.getProperty(elname));
  el.appendChild(text);
  book.appendChild(el);
}
catalog.appendChild(book);
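The same steps can be packaged into a reusable method. The following is a sketch under a few assumptions of our own: the BookBuilder class and its methods are hypothetical, a missing property becomes an empty element rather than a null text node, and the demo starts from an empty catalog created with newDocument() instead of a parsed file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative class; the names are assumptions made for this example.
final class BookBuilder {

    private static final List<String> ELEMENT_NAMES = Arrays.asList(
            "author", "title", "genre", "price", "publish_date", "description");

    // Creates a book element from the properties and appends it to the root element.
    static Element appendBook(Document document, Properties props) {
        Element book = document.createElement("book");
        book.setAttribute("id", props.getProperty("id", ""));
        for (String elname : ELEMENT_NAMES) {
            Element el = document.createElement(elname);
            // Assumption: a missing key becomes an empty element, not a null text node.
            el.appendChild(document.createTextNode(props.getProperty(elname, "")));
            book.appendChild(el);
        }
        document.getDocumentElement().appendChild(book);
        return book;
    }

    // Creates a new document whose root element is an empty catalog.
    static Document newCatalog() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            document.appendChild(document.createElement("catalog"));
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("id", "bk113");
        props.setProperty("author", "Jane Austen");
        props.setProperty("title", "Pride and Prejudice");

        Document document = newCatalog();
        Element book = appendBook(document, props);
        System.out.println(book.getAttribute("id"));           // prints "bk113"
        System.out.println(book.getChildNodes().getLength());  // prints 6
    }
}
```

Note that elements must be created by the same Document they are appended to, which is why the method takes the document as a parameter.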

And that is how it is done. The catalog element now has the new book element added. All that remains now is to write out the updated XML.

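For writing the XML, we need an instance of Transformer, which is created as shown below. Note that we request indentation of the output XML using the setOutputProperty() method. The indent-amount property is specific to Xalan-based transformers such as the one bundled with the JDK, so other implementations may not honor it.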

TransformerFactory tfact = TransformerFactory.newInstance();
Transformer tform = tfact.newTransformer();
tform.setOutputProperty(OutputKeys.INDENT, "yes");
tform.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "3");

The final step in generating the XML output is to apply the transformation. The result appears on the output stream, System.out.

tform.transform(new DOMSource(document), new StreamResult(System.out));

To write the output directly to a file, use the following.

tform.transform(new DOMSource(document), new StreamResult(new File("output.xml")));
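StreamResult also accepts a Writer, which is handy when you want the XML as a String, for example in a unit test or before sending it over the network. Here is a minimal sketch. The SerializeDemo class, toXml, and sample are our own names, and we omit the XML declaration only to keep the output short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative class; the names are assumptions made for this example.
final class SerializeDemo {

    // Serializes the document to a String using the identity transform.
    static String toXml(Document document) {
        try {
            Transformer tform = TransformerFactory.newInstance().newTransformer();
            tform.setOutputProperty(OutputKeys.INDENT, "yes");
            tform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tform.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    // Builds a tiny catalog with one book, just to have something to write.
    static Document sample() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element catalog = document.createElement("catalog");
            Element book = document.createElement("book");
            book.setAttribute("id", "bk113");
            Element title = document.createElement("title");
            title.appendChild(document.createTextNode("Pride and Prejudice"));
            book.appendChild(title);
            catalog.appendChild(book);
            document.appendChild(catalog);
            return document;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample()));
    }
}
```

The exact whitespace of the indented output can differ between Java versions and transformer implementations, so avoid comparing the result character for character.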

And that wraps up this article on reading and writing XML files using the DOM API.

Have you used the DOM API in your applications? How did it perform? Please let us know in the comments below.
