An iterator for traversing the elements and text of an XML document.
Rather than a callback model like SAX, XmlIterator uses an iterator object that is moved forward through the items in an XML document. Because it is designed to work efficiently with an underlying SAX event stream, the iterator does not allow moving backward in the document.
An iterator is always in one of the following states.
(Note that advanceMixed() may be used instead of advance() and the rules are then slightly different, as will be explained later.)
The following method iterates recursively over all elements in a document.
    import java.io.IOException;
    import org.greybird.xmliter.XmlIterator;
    import org.greybird.xmliter.XmlIteratorException;
    ...
    static void traverse(XmlIterator iter, String indent)
        throws IOException, XmlIteratorException {

        while (iter.advance()) {
            System.out.println(indent + "- " + iter.name() +
                               " (" + iter.namespace() + ')');
            XmlIterator children = iter.children();
            if (children != null) {
                traverse(children, indent + " ");
            } else {
                String value = iter.value();
                if (value.length() > 0) {
                    System.out.println(indent + " = " + value);
                }
            }
        }
    }
Given the following XML document, traverse() will output the text below it.
    <Top xmlns="http://namespaceURI/">
      <One>
        <Two/>
      </One>
      <Three>some text...</Three>
      <Four/>
    </Top>
    ------------------------------------
    - Top (http://namespaceURI/)
     - One (http://namespaceURI/)
      - Two (http://namespaceURI/)
     - Three (http://namespaceURI/)
      = some text...
     - Four (http://namespaceURI/)
Notice that the whitespace between element tags is not returned by the iterator. This is because the advance() method returns the next element and skips text that is mixed with elements at the same level of the document. This is convenient for structured XML documents where such mixing only occurs with whitespace that is irrelevant.
Mixed text and elements at the same level of the document are called "mixed content", and are used with unstructured XML documents such as XHTML (XML-ized HTML). Mixed content is supported by calling the advanceMixed() method, which moves to the next element or text item.
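The distinction between element-only and mixed traversal can be sketched without XmlIterator at all. The example below is only an illustration and uses the standard DOM API from the JDK, not this library; the class MixedContentDemo and its items() and parse() helpers are hypothetical names introduced for the sketch. It lists the direct children of a p element twice: once keeping elements only (analogous to what advance() visits) and once keeping text items too (analogous to advanceMixed()).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class; it uses the JDK's DOM parser, not XmlIterator.
public class MixedContentDemo {

    /** Describes each direct child of parent; text items are skipped unless includeText is true. */
    static List<String> items(Element parent, boolean includeText) {
        List<String> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                out.add("element:" + n.getNodeName());
            } else if (includeText && n.getNodeType() == Node.TEXT_NODE) {
                out.add("text:" + n.getNodeValue());
            }
        }
        return out;
    }

    /** Parses an XML string and returns its document element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = parse("<p>This is a <b>bold</b> word.</p>");
        System.out.println(items(p, false)); // [element:b]
        System.out.println(items(p, true));  // [text:This is a , element:b, text: word.]
    }
}
```

The first list contains only the b element, while the second also contains the two text items that surround it, which is the content that makes the p element "mixed".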
The following method iterates recursively over all elements and mixed text items in a document.
    import java.io.IOException;
    import org.greybird.xmliter.XmlIterator;
    import org.greybird.xmliter.XmlIteratorException;
    ...
    static void traverseMixed(XmlIterator iter, String indent)
        throws IOException, XmlIteratorException {

        while (iter.advanceMixed()) {
            if (iter.name() != null) {
                System.out.println(indent + "- " + iter.name() +
                                   " (" + iter.namespace() + ')');
                XmlIterator children = iter.children();
                if (children != null) {
                    traverseMixed(children, indent + " ");
                } else {
                    String value = iter.value();
                    if (value.length() > 0) {
                        System.out.println(indent + " = " + value);
                    }
                }
            } else {
                System.out.println(indent + "= " + iter.value());
            }
        }
    }
Given the following XHTML document, traverseMixed() will output the text below it.
    <html xmlns="http://www.w3.org/1999/xhtml">
      <body>
        <p>This is a <b>bold</b> word.</p>
      </body>
    </html>
    ------------------------------------
    - html (http://www.w3.org/1999/xhtml)
     =
     - body (http://www.w3.org/1999/xhtml)
      =
      - p (http://www.w3.org/1999/xhtml)
       = This is a
       - b (http://www.w3.org/1999/xhtml)
        = bold
       = word.
      =
     =
If a document does not have a mixed content model, it is preferable to use advance() instead of advanceMixed() to avoid dealing with irrelevant whitespace. After advance() is called, you can be sure that a non-null element name() will always be returned.
But if you do need to use advanceMixed(), there are a couple of things to keep in mind.
The following output from running traverse() with the XHTML input above shows that mixed text is ignored by the advance() method. Notice that "This is a" and "word" are not listed because they are mixed with elements, while "bold" is listed because no elements appear at the same level.
    - html (http://www.w3.org/1999/xhtml)
     - body (http://www.w3.org/1999/xhtml)
      - p (http://www.w3.org/1999/xhtml)
       - b (http://www.w3.org/1999/xhtml)
        = bold
Method Summary

boolean advance()
    Moves to the next element at the current level and returns true, or returns false if there are no more elements.

boolean advanceMixed()
    Moves to the next element or text item at the current level and returns true, or returns false if there are no more items.

java.lang.String attribute(java.lang.String namespace, java.lang.String name)
    Returns the value of the attribute with the given name belonging to the current element, or null if no such attribute exists, or null if positioned on a mixed text item.

XmlAttributes attributes()
    Returns an iterator over the attributes belonging to the current element, or null if positioned on a mixed text item.

XmlIterator children()
    Returns an iterator positioned before the first child of the current element, or null if the current element does not contain any element children, or null if positioned on a mixed text item.

java.lang.String name()
    Returns the local name of the current element, or null if positioned on a mixed text item.

java.lang.String namespace()
    Returns the namespace URI of the current element, or an empty string if the current element has no namespace, or null if positioned on a mixed text item.

java.lang.String toString()
    Returns a string for error reporting that identifies the current element.

java.lang.String value()
    Returns the text contained by the current element or text item, or null if positioned on an element that contains one or more element children.

Method Detail

public boolean advance() throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Throws:
    java.lang.IllegalStateException - if an element would be returned out of depth-first tree order.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public boolean advanceMixed() throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Throws:
    java.lang.IllegalStateException - if an element or text item would be returned out of depth-first tree order.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String name() throws java.lang.IllegalStateException
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.

public java.lang.String namespace() throws java.lang.IllegalStateException
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.

public java.lang.String value() throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
This method always returns the complete text between elements, which may be the result of concatenating multiple SAX character events or DOM text nodes.
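To see why this concatenation matters, here is a minimal sketch using the JDK's own SAX parser rather than XmlIterator. SAX allows a parser to deliver one run of text through several characters() callbacks, so a handler has to accumulate them itself. The class SaxTextDemo, its TextCollector handler, and the collect() helper are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; it uses the JDK's SAX parser, not XmlIterator.
public class SaxTextDemo {

    /** Accumulates character data, however many characters() callbacks deliver it. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        int events = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            events++;                       // the number of callbacks is parser-dependent
            text.append(ch, start, length); // append only the slice the parser points at
        }
    }

    /** Parses the XML string and returns the handler holding the collected text. */
    static TextCollector collect(String xml) {
        try {
            TextCollector handler = new TextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TextCollector c = collect("<Three>fish &amp; chips</Three>");
        System.out.println(c.text + " (from " + c.events + " characters() event(s))");
    }
}
```

The collected text is always "fish & chips", but the event count may differ between parsers, which is the bookkeeping that value() is described as hiding from the caller.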
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public XmlIterator children() throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
If this method is called for an element more than once it will return the same iterator instance as was returned the first time it was called, and therefore the returned iterator may no longer be positioned before its first child.
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String attribute(java.lang.String namespace, java.lang.String name) throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
WARNING: Unlike the DOM methods Element.getAttribute() and Element.getAttributeNS(), this method returns null, not an empty string, when the attribute does not exist.
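The DOM behavior this warning refers to can be checked with the JDK's DOM parser alone. In the sketch below, a missing attribute and an attribute whose value is empty both come back as an empty string from getAttribute(), and hasAttribute() is needed to tell them apart. The class DomAttributeDemo and its attributeOrNull() helper are hypothetical; the helper only mimics the null-for-missing convention described above and is not part of XmlIterator.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical demo class; it uses the JDK's DOM parser, not XmlIterator.
public class DomAttributeDemo {

    /** Parses an XML string and returns its document element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the attribute value, or null when the attribute is absent. */
    static String attributeOrNull(Element e, String name) {
        return e.hasAttribute(name) ? e.getAttribute(name) : null;
    }

    public static void main(String[] args) {
        Element item = parse("<item id=\"\"/>");
        // DOM: present-but-empty and missing attributes look the same.
        System.out.println("[" + item.getAttribute("id") + "]");      // []
        System.out.println("[" + item.getAttribute("missing") + "]"); // []
        // Null-for-missing convention: the two cases are distinguishable.
        System.out.println(attributeOrNull(item, "id") != null);      // true
        System.out.println(attributeOrNull(item, "missing") != null); // false
    }
}
```

With a null return for missing attributes, a single null check replaces the separate hasAttribute() call that DOM code needs.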
Parameters:
    namespace - is the namespace of the attribute or may be null or an empty string if the attribute has the empty namespace.
    name - is the local name of the attribute.
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public XmlAttributes attributes() throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String toString()
[/ns:TopElement/ns:ChildElement]
Overrides:
    toString in class java.lang.Object
Copyright (c) 2003 Mark T. Hayes; All Rights Reserved