dd.util
Class XMLHelper

java.lang.Object
  extended by dd.util.XMLHelper

public class XMLHelper
extends java.lang.Object

Utility class for parsing XML. This class provides some convenience functions for creating and using a DOM tree produced by parsing an XML source file.

This class provides a simple syntax for returning a node from a structured XML document. A path to a node is a series of nested tags, separated by periods, such as "outer.middle.inner". You can specify a path from the root or any subpath. The search routine searches the tree recursively from the beginning to the end of the file and returns the first node that matches the path specified. Normally that node contains text that can then be used directly, but you can also retrieve any node, in which case the result contains the target node and any sub-nodes.

Consider the following XML file:

 <?xml version="1.0" encoding="UTF-8"?>

 <order>
   <item>
     <book>
       <title>War and Peace</title>
     </book>
   </item>
   <item>
     <magazine>
       <title>Wired</title>
       <price>19.95</price>
     </magazine>
   </item>
   <item>
     <cd>
       <title>Best of Leo Kottke</title>
     </cd>
   </item>
 </order>
 

The following queries would return these results:

Query                       Result
order.item.magazine.title   Wired
cd.title                    Best of Leo Kottke
order.book                  null
item.cd                     Node: <title>Best of Leo Kottke</title>
title                       War and Peace

Example usage:

 Document myDocument = XMLHelper.makeDocument("myFile.xml");
 double price = XMLHelper.findDouble(myDocument, "magazine.price");
 String cdTitle = XMLHelper.findString(myDocument, "order.item.cd.title");
 

Version:
$Revision: 1.18 $
Author:
Eric Scharff (scharff@ucar.edu)

Constructor Summary
XMLHelper()
           
 
Method Summary
protected static java.lang.String[] convertQueryToArray(java.lang.String query)
          Compiles a query for internal representation.
static org.w3c.dom.Element createNode(org.w3c.dom.Document document, java.lang.String nodeName)
          Creates a new empty XML node.
static org.w3c.dom.Element createNode(org.w3c.dom.Document document, java.lang.String nodeName, org.w3c.dom.Node parentNode)
          Creates a new node and inserts it into the DOM tree.
static org.w3c.dom.Element createNode(org.w3c.dom.Document document, java.lang.String nodeName, java.lang.String attributeName, java.lang.String attributeValue, org.w3c.dom.Node parentNode)
          Creates a new node and inserts it into the DOM tree.
static org.w3c.dom.Element createTextNode(org.w3c.dom.Document document, java.lang.String nodeName, java.lang.String nodeText, org.w3c.dom.Node parentNode)
          Creates a new text node and inserts it into the DOM tree.
static org.w3c.dom.Node findChild(org.w3c.dom.Node n, java.lang.String elementName)
          Queries the immediate children of a node.
static double findDouble(org.w3c.dom.Document d, java.lang.String target)
          Returns the matching node as a double.
static org.w3c.dom.Element findFirstElement(org.w3c.dom.Node n)
          Returns the first child element of the specified node.
static int findInt(org.w3c.dom.Document d, java.lang.String target)
          Returns the matching node as an integer.
static org.w3c.dom.Node findNode(org.w3c.dom.Document d, java.lang.String target)
          Finds a node matching the specified path.
protected static org.w3c.dom.Node findNode(org.w3c.dom.Node n, java.lang.String[] targets, int index)
          Finds the node recursively.
static java.lang.String findString(org.w3c.dom.Document d, java.lang.String target)
          Returns the matching node as a String.
static java.lang.String getNodeAttribute(org.w3c.dom.Node node, java.lang.String attributeName)
          Accesses the attributes of a node.
static void main(java.lang.String[] args)
          Runs the XMLHelper.
static org.w3c.dom.Document makeDocument()
          Creates a new empty XML document.
static org.w3c.dom.Document makeDocument(java.io.Reader reader)
          Creates a new parsed XML document from a reader.
static org.w3c.dom.Document makeDocument(java.lang.String fileName)
          Creates a new parsed XML document from a file.
static java.lang.String nodeText(org.w3c.dom.Node n)
          Returns the string contents of a node.
static java.lang.StringBuffer nodeText(org.w3c.dom.Node n, java.lang.StringBuffer text)
          Returns the string contents of a node.
static double parseXMLDouble(java.lang.String s)
          Converts the String into a double.
static int parseXMLInt(java.lang.String s)
          Converts the String into an integer.
static void printTree(org.w3c.dom.Node n, int lvl)
          Prints the tree representing a node in the XML document.
static void writeXMLFile(java.lang.String fileName, org.w3c.dom.Node mainNode)
          Writes a new XML file from the current node.
static void writeXMLFile(java.io.Writer out, org.w3c.dom.Node mainNode)
          Writes the given XML node to the writer provided.
static java.lang.String writeXMLString(org.w3c.dom.Node mainNode)
          Writes the given node out as a String.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XMLHelper

public XMLHelper()
Method Detail

convertQueryToArray

protected static java.lang.String[] convertQueryToArray(java.lang.String query)
Compiles a query for internal representation. Basically, turns the single string in the form "a.b.c" into the array of strings {"a", "b", "c"}.

Parameters:
query - the incoming dot-path query
Returns:
the array where each element in the path is a single entry.
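
The conversion described above can be sketched with String.split. This is only an illustration: the class and method names (QuerySplitSketch, splitQuery) are hypothetical, and the real method may handle edge cases such as empty segments differently.

```java
import java.util.Arrays;

final class QuerySplitSketch {
    // Hypothetical stand-in for convertQueryToArray: "a.b.c" becomes {"a", "b", "c"}.
    static String[] splitQuery(String query) {
        // split() takes a regular expression, so the dot must be escaped.
        return query.split("\\.");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitQuery("outer.middle.inner")));
    }
}
```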

findNode

protected static org.w3c.dom.Node findNode(org.w3c.dom.Node n,
                                           java.lang.String[] targets,
                                           int index)
Finds the node recursively. This method does the bulk of the search: it carries the expanded query along, checks the current element of the query against the node, and recurses over the child nodes if the name matches. It returns a node that matches, or null if none is found.

Parameters:
n - the node to begin the search
targets - the expanded query string
index - level into the query string of the recursive search
Returns:
the node matching the path target, null if not found.
See Also:
findNode(Document,String)
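
One way such a recursive path search could work is sketched below using only the standard JAXP and DOM classes. It is not the XMLHelper source: PathSearchSketch, parse, find and matchFrom are hypothetical names, and the traversal order is an assumption chosen to agree with the query table in the class description.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class PathSearchSketch {
    // Parses XML text with the standard DOM parser; failures become unchecked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Pre-order walk below n: returns the first match found, or null.
    static Node find(Node n, String[] targets) {
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() != Node.ELEMENT_NODE) continue;
            Node hit = matchFrom(c, targets, 0);    // does the path start at this element?
            if (hit == null) hit = find(c, targets); // otherwise look deeper
            if (hit != null) return hit;
        }
        return null;
    }

    // Matches targets[index..] as a chain of directly nested elements starting at n.
    static Node matchFrom(Node n, String[] targets, int index) {
        if (!n.getNodeName().equals(targets[index])) return null;
        if (index == targets.length - 1) return n;
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() != Node.ELEMENT_NODE) continue;
            Node hit = matchFrom(c, targets, index + 1);
            if (hit != null) return hit;
        }
        return null;
    }
}
```

Run against the order document from the class description, find(doc, "cd.title".split("\\.")) yields the title element under cd, while "order.book" yields null because book is not a direct child of order.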

findChild

public static org.w3c.dom.Node findChild(org.w3c.dom.Node n,
                                         java.lang.String elementName)
Queries the immediate children of a node. This method searches all the children of the current node (without any deeper search) and returns the first node that has the name specified.

Parameters:
n - parent node under which search takes place
elementName - name of the node to find
Returns:
the first node with the name elementName, null if there is no child node with the name specified.

findFirstElement

public static org.w3c.dom.Element findFirstElement(org.w3c.dom.Node n)
Returns the first child element of the specified node. This searches the children of the node until an element is found and returns it. The nodes that would be skipped would be text (probably whitespace) nodes that precede it.

Parameters:
n - parent node under which child element should be found
Returns:
the first element node, null if there are no child element nodes.
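
The two child lookups above (findChild and findFirstElement) can be approximated with plain DOM calls, as in the sketch below. ChildLookupSketch, childNamed and firstElement are hypothetical names used only for illustration; the point is that indentation in the source file shows up as whitespace text nodes that must be skipped.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ChildLookupSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // First direct child with the given node name, or null; grandchildren are not examined.
    static Node childNamed(Node parent, String name) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeName().equals(name)) return c;
        }
        return null;
    }

    // First direct child that is an element, skipping text and comment nodes.
    static Element firstElement(Node parent) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) return (Element) c;
        }
        return null;
    }
}
```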

findNode

public static org.w3c.dom.Node findNode(org.w3c.dom.Document d,
                                        java.lang.String target)
Finds a node matching the specified path. A path is a list of node names, separated by periods. The whole document is searched and the first node to match the target is returned. See the class comment for an example of the node search.

Parameters:
d - document to search
target - the path of the node to be located
Returns:
the node matching the path target, null if not found.

findString

public static java.lang.String findString(org.w3c.dom.Document d,
                                          java.lang.String target)
Returns the matching node as a String. This convenience function takes a document and a query string and returns the text of the node that matches that query. If this is a text node, the parsed text (with XML escapes replaced) is returned.

Parameters:
d - document to search
target - the path of the node to be located
Returns:
the contents of the target node, null if not found.

findDouble

public static double findDouble(org.w3c.dom.Document d,
                                java.lang.String target)
Returns the matching node as a double. This convenience function takes a document and a query string and treats the contents of the found node as a floating point double. This would match something like <target>10.2</target>.

Parameters:
d - document to search
target - the path of the node to be located
Returns:
the contents of the target node, -1.0 if not found or non-numeric value.

findInt

public static int findInt(org.w3c.dom.Document d,
                          java.lang.String target)
Returns the matching node as an integer. This convenience function takes a document and a query string and treats the contents of the found node as an integer. This would match something like <target>10</target>.

Parameters:
d - document to search
target - the path of the node to be located
Returns:
the contents of the target node, -1 if not found or non-numeric value.

getNodeAttribute

public static java.lang.String getNodeAttribute(org.w3c.dom.Node node,
                                                java.lang.String attributeName)
Accesses the attributes of a node. This convenience function takes a node and an attribute name. If that node has an attribute with the name specified, the attribute value is returned. For example, the node <item price="50"/>, when queried for "price", will return "50".

Parameters:
node - node which contains the attribute
attributeName - name of the attribute to query
Returns:
the value of the attribute, null if it does not exist.
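
A minimal sketch of an attribute lookup with the same null-if-missing contract follows. The names AttributeSketch and attribute are hypothetical. Note that the standard Element.getAttribute returns an empty string for a missing attribute, so a presence check is needed to return null instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class AttributeSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the attribute value, or null if the node is not an element or lacks the attribute.
    static String attribute(Node node, String name) {
        if (!(node instanceof Element)) return null;
        Element e = (Element) node;
        return e.hasAttribute(name) ? e.getAttribute(name) : null;
    }
}
```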

parseXMLDouble

public static double parseXMLDouble(java.lang.String s)
Converts the String into a double. This routine may be subclassed if special conversion is necessary. Currently it uses the standard Java number conversion.

Parameters:
s - the string to parse into a double
Returns:
the double value of the string, -1.0 if an error occurs.
See Also:
Double.parseDouble(String)
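
Because -1.0 doubles as the error value, a caller cannot tell a failed parse from a document that really contains -1.0. The sketch below shows the sentinel style next to an OptionalDouble alternative. Both methods and the class name NumberParseSketch are hypothetical and are not part of XMLHelper.

```java
import java.util.OptionalDouble;

final class NumberParseSketch {
    // Sentinel style, as documented above: -1.0 signals a parse failure.
    static double parseOrSentinel(String s) {
        if (s == null) return -1.0;
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return -1.0;
        }
    }

    // Alternative that keeps a failure distinct from a genuine -1.0 in the data.
    static OptionalDouble tryParse(String s) {
        if (s == null) return OptionalDouble.empty();
        try {
            return OptionalDouble.of(Double.parseDouble(s));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrSentinel("19.95"));
        System.out.println(tryParse("not a number").isPresent());
    }
}
```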

parseXMLInt

public static int parseXMLInt(java.lang.String s)
Converts the String into an integer. This routine may be subclassed if special conversion is necessary. Currently it uses the standard Java number conversion.

Parameters:
s - the string to parse into an integer
Returns:
the integer value of the string, -1 if an error occurs.
See Also:
Integer.parseInt(String)

makeDocument

public static org.w3c.dom.Document makeDocument()
Creates a new empty XML document. The document created by this method is the root of the DOM tree. It is important to add one (and only one) child node (the root XML node) to this document since the XML standard says that an XML document should have only one root node that corresponds to the DOCTYPE. There is no DOCTYPE generated yet, but this should be considered.

Returns:
a new empty XML document

makeDocument

public static org.w3c.dom.Document makeDocument(java.io.Reader reader)
Creates a new parsed XML document from a reader. This convenience method does all the work to initialize the readers and XML DOM parsers and returns the parsed XML document if successful.

Parameters:
reader - character input stream from which XML is read
Returns:
the Document representing the DOM tree, null if an error occurs in parsing. Errors are printed to standard error.

makeDocument

public static org.w3c.dom.Document makeDocument(java.lang.String fileName)
Creates a new parsed XML document from a file. This convenience method does all the work to initialize the readers and XML DOM parsers and returns the parsed XML document if successful.

Parameters:
fileName - name of the XML file to read
Returns:
the Document representing the DOM tree, null if an error occurs in parsing. Errors are printed to standard error.
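
The parsing overloads above presumably wrap the standard JAXP DocumentBuilder. The sketch below shows what a null-on-error wrapper could look like. ParseSketch and parseOrNull are hypothetical names, and the exact error text the real method prints is not documented here.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ParseSketch {
    // Returns the parsed DOM tree, or null if parsing fails; the failure is reported on standard error.
    static Document parseOrNull(Reader reader) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(reader));
        } catch (Exception e) {
            System.err.println("XML parse failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        Document ok = parseOrNull(new StringReader("<order><item/></order>"));
        System.out.println(ok.getDocumentElement().getNodeName());
    }
}
```

Callers of this style of API need a null check before using the result, since a malformed file does not raise an exception.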

writeXMLFile

public static void writeXMLFile(java.lang.String fileName,
                                org.w3c.dom.Node mainNode)
Writes a new XML file from the current node. This convenience method does all the work to initialize the Transformer objects. It takes an XML node and writes that node as a self-contained XML file.

Parameters:
fileName - name of the XML file to create
mainNode - root of the DOM tree to write to the file

writeXMLString

public static java.lang.String writeXMLString(org.w3c.dom.Node mainNode)
Writes the given node out as a String. This convenience method does all the work to initialize the Transformer objects. It takes an XML node and returns the String representation of that node.

Parameters:
mainNode - root of the DOM tree to write
Returns:
a String representation of the XML node

writeXMLFile

public static void writeXMLFile(java.io.Writer out,
                                org.w3c.dom.Node mainNode)
Writes the given XML node to the writer provided. This is a lower level convenience method that writes the contents of a node to any Writer. Note that the writer is not closed, so this method is suitable for writing several objects to the same writer.

Parameters:
out - output writer onto which XML should be written
mainNode - root of the DOM tree to write
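
The write methods above mention Transformer objects. A minimal sketch of serializing a node with the standard identity Transformer follows. SerializeSketch, write and toXml are hypothetical names, and omitting the XML declaration is a choice made for this example, not documented behavior of XMLHelper.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SerializeSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the node onto the writer and leaves the writer open for further output.
    static void write(Node node, Writer out) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes"); // assumption for this sketch
            t.transform(new DOMSource(node), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // String variant built on top of the writer variant.
    static String toXml(Node node) {
        StringWriter out = new StringWriter();
        write(node, out);
        return out.toString();
    }
}
```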

nodeText

public static java.lang.String nodeText(org.w3c.dom.Node n)
Returns the string contents of a node. The string contents of a node is the recursive traversal of the current node. This method concatenates all of the nodes that have a non-null node value, so some extra information (like comment text) may appear in this representation.

Parameters:
n - current node
Returns:
all of the string contents of this node

nodeText

public static java.lang.StringBuffer nodeText(org.w3c.dom.Node n,
                                              java.lang.StringBuffer text)
Returns the string contents of a node. This does the work for the nodeText method.

Parameters:
n - current node
text - text accumulated so far
Returns:
text filled in with the string content for the node
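
The accumulating traversal described for the two nodeText methods can be sketched as follows. NodeTextSketch, text and collect are hypothetical names, and StringBuilder is used here in place of StringBuffer. The example shows why comment text leaks into the result: comments carry a non-null node value, whereas the standard Node.getTextContent skips them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodeTextSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Entry point: starts the traversal with an empty accumulator.
    static String text(Node n) {
        return collect(n, new StringBuilder()).toString();
    }

    // Appends this node's value if it has one, then recurses into the children in order.
    static StringBuilder collect(Node n, StringBuilder text) {
        if (n.getNodeValue() != null) text.append(n.getNodeValue());
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, text);
        }
        return text;
    }
}
```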

printTree

public static void printTree(org.w3c.dom.Node n,
                             int lvl)
Prints the tree representing a node in the XML document. Useful for debugging, this method recursively prints the node and its children.

Parameters:
n - current node
lvl - index for the recursive parse
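
A recursive dump of this kind might look like the sketch below, where the level parameter controls indentation. TreeDumpSketch and render are hypothetical names, and the actual output format of printTree is not specified in this documentation. The sketch builds a string rather than printing directly so that the result is easy to inspect.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class TreeDumpSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends one line per node, indented two spaces per level, then recurses into the children.
    static StringBuilder render(Node n, int lvl, StringBuilder out) {
        for (int i = 0; i < lvl; i++) out.append("  ");
        out.append(n.getNodeName()).append('\n');
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            render(c, lvl + 1, out);
        }
        return out;
    }

    public static void main(String[] args) {
        Document d = parse("<a><b><c/></b><d/></a>");
        System.out.print(render(d.getDocumentElement(), 0, new StringBuilder()));
    }
}
```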

createNode

public static org.w3c.dom.Element createNode(org.w3c.dom.Document document,
                                             java.lang.String nodeName)
Creates a new empty XML node. This is a convenience method for creating new DOM elements. Note that unlike the other methods for creating nodes, this method does NOT add the node to the resulting XML tree. If the node is not added manually, it will not appear in the final XML document.

Parameters:
document - XML document into which the node will be inserted
nodeName - name of the XML tag
Returns:
a newly created node

createNode

public static org.w3c.dom.Element createNode(org.w3c.dom.Document document,
                                             java.lang.String nodeName,
                                             org.w3c.dom.Node parentNode)
Creates a new node and inserts it into the DOM tree. This method creates a new empty node, and the new node is added as a child of parentNode.

Parameters:
document - XML document into which the node will be inserted
nodeName - name of the XML tag
parentNode - node to which this node will be added as a child
Returns:
the newly created node, after it is added as the child of the parentNode

createNode

public static org.w3c.dom.Element createNode(org.w3c.dom.Document document,
                                             java.lang.String nodeName,
                                             java.lang.String attributeName,
                                             java.lang.String attributeValue,
                                             org.w3c.dom.Node parentNode)
Creates a new node and inserts it into the DOM tree. This method is a convenience to create a new node with a named attribute. The best way to illustrate is with an example. If nodeName is "my-node", attributeName is "length", and attributeValue is "seven", this method creates a new XML node that looks like <my-node length="seven"/>. If attributeName or attributeValue is null, then the name/value pair is omitted. The new node is inserted into the DOM tree as a child of parentNode. The newly created node is returned in case further manipulation is desired.

Parameters:
document - XML document into which the node will be inserted
nodeName - name of the XML tag
attributeName - optional key to add to XML tag
attributeValue - optional value to be associated with key
parentNode - node to which this node will be added as a child
Returns:
the newly created node, after it is added as the child of the parentNode

createTextNode

public static org.w3c.dom.Element createTextNode(org.w3c.dom.Document document,
                                                 java.lang.String nodeName,
                                                 java.lang.String nodeText,
                                                 org.w3c.dom.Node parentNode)
Creates a new text node and inserts it into the DOM tree. The string provided becomes the contents of the node named nodeName, which is added to the DOM tree as a child of parentNode. Any XML escaping of the string is done automatically by this method.

Parameters:
document - XML document into which the node will be inserted
nodeName - name of the XML tag
nodeText - text which will be added to the new node
parentNode - node to which this node will be added as a child
Returns:
the newly created node, after it is added as a child of the parentNode
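
The node-creation helpers above can be pictured as thin wrappers over the standard DOM factory calls. The sketch below is an assumption about their shape, with hypothetical names (BuildSketch, newDocument, addElement, addTextElement). In the standard DOM, text is stored unescaped in the tree and escaping is applied when the tree is serialized.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class BuildSketch {
    // Creates an empty document; the caller adds exactly one root element to it.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates an element, sets the attribute only if both name and value are given, and attaches it.
    static Element addElement(Document doc, String name, String attrName, String attrValue, Node parent) {
        Element e = doc.createElement(name);
        if (attrName != null && attrValue != null) e.setAttribute(attrName, attrValue);
        parent.appendChild(e);
        return e;
    }

    // Creates an element whose only child is a text node holding the raw, unescaped string.
    static Element addTextElement(Document doc, String name, String text, Node parent) {
        Element e = doc.createElement(name);
        e.appendChild(doc.createTextNode(text));
        parent.appendChild(e);
        return e;
    }
}
```

Passing the document itself as the parent of the first element makes that element the root; later elements take an existing element as their parent.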

main

public static void main(java.lang.String[] args)
Runs the XMLHelper. If one argument is present, it is treated as a local filename and the DOM tree of tags for that file is printed. If there are two command line arguments, the first is a file name of the XML file, the second is a search query. The results of running the search query on the file are printed to standard output.

Parameters:
args - command line arguments