java.lang.Object
  |
  +--org.greybird.xmliter.SaxIterator
An XmlIterator that obtains its data from a SaxEventSource. SaxEventSource.generateEvents() is called to generate SAX events, which are then queued by this class and returned during iteration.
SaxIterator adds between 20% and 30% processing overhead compared to direct use of the SAX API. The added overhead depends on whether a XercesSaxEventSource (around 20% overhead) or a ThreadedSaxEventSource (around 30% overhead) is used. Compared to using the DOM API directly, SaxIterator takes only 50% of the time. These percentages are relative to the total time to parse and traverse a document. They come from the PerformanceTest included with this package.
As events are iterated they are removed from the queue and more events are incrementally added. This allows iterating over large documents while keeping only a portion of the document in memory at one time. Therefore documents of any size may be processed.
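The incremental queueing described above can be pictured with a bounded queue between a parsing thread and the consuming loop. This is only a sketch under stated assumptions, not the SaxIterator implementation: the class BoundedEventQueueSketch, its method elementNames, and the END marker are hypothetical names, and only element start events are queued to keep the example short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: a producer thread parses with SAX and blocks when the
// queue is full, so only `capacity` events are held in memory at one time.
final class BoundedEventQueueSketch {
    // '#' cannot appear in an XML name, so this marker never collides with an element.
    private static final String END = "#end-of-document";

    static List<String> elementNames(String xml, int capacity) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                DefaultHandler handler = new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local, String qName, Attributes atts) {
                        put(queue, qName); // blocks while the consumer is behind
                    }
                };
                SAXParserFactory.newInstance().newSAXParser()
                        .parse(new InputSource(new StringReader(xml)), handler);
            } catch (Exception e) {
                // A real event source would hand the failure to the consumer.
            } finally {
                put(queue, END);
            }
        });
        producer.setDaemon(true);
        producer.start();

        // Consumer: each take() frees a slot, which lets the parser continue.
        List<String> names = new ArrayList<>();
        for (String event = queue.take(); !event.equals(END); event = queue.take()) {
            names.add(event);
        }
        return names;
    }

    private static void put(BlockingQueue<String> queue, String event) {
        try {
            queue.put(event);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c><d/></c></a>", 2));
    }
}
```

With a capacity of 2 the parser can never run more than two events ahead of the consuming loop, regardless of document size.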
Additionally, when advance() or advanceMixed() is called and the end tag (endElement event) of the prior element has not been received, new events for the prior element's children will not be queued. This allows skipping over a subtree without wasting resources unnecessarily.
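The effect of not queueing a skipped element's children can be approximated with a depth counter in a plain SAX handler. The sketch below is an assumption about the general idea rather than a description of how SaxIterator does it; SubtreeSkipSketch and childrenOfRoot are hypothetical names. The parser still reads the whole input, but nothing below the level being iterated is stored.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: only direct children of the root are queued; events
// for deeper descendants are dropped as they arrive.
final class SubtreeSkipSketch {
    static List<String> childrenOfRoot(String xml) throws Exception {
        List<String> queued = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                depth++;
                if (depth == 2) {
                    queued.add(qName); // a direct child of the root
                }
                // depth > 2: inside a subtree that is being skipped, nothing is queued
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                depth--;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return queued;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childrenOfRoot("<root><a><x/><y/></a><b/><c><z/></c></root>"));
    }
}
```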
There are two ways to use a SaxIterator. The first way is to use the SaxIterator(InputSource) constructor, and let SaxIterator choose a default SAX event source class and parser. A XercesSaxEventSource is used if the Xerces Native Interface parser is available, and a ThreadedSaxEventSource is used otherwise. This technique is easy to use but does not allow any control over the parser.
    import org.greybird.xmliter.SaxIterator;
    import org.greybird.xmliter.XmlIterator;
    import org.xml.sax.InputSource;
    ...
    //
    // Create a SaxIterator from a SAX InputSource.
    //
    InputSource input = new InputSource(...);
    XmlIterator iter = new SaxIterator(input);
    while (iter.advance()) {
        ...
The second way is to use the SaxIterator(SaxEventSource) constructor, and explicitly specify the event source you wish to use. The event source may be a XercesSaxEventSource, a ThreadedSaxEventSource, or any other implementation of the SaxEventSource interface. The following example shows the use of a ThreadedSaxEventSource with a SAX parser explicitly created using JAXP.
    import javax.xml.parsers.SAXParserFactory;
    import org.greybird.xmliter.SaxEventSource;
    import org.greybird.xmliter.SaxIterator;
    import org.greybird.xmliter.XmlIterator;
    import org.greybird.xmliter.ThreadedSaxEventSource;
    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;
    ...
    //
    // Create a ThreadedSaxEventSource from a SAX InputSource and XMLReader
    //
    InputSource input = new InputSource(...);
    XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    SaxEventSource eventSource = new ThreadedSaxEventSource(input, parser);
    //
    // Create a SaxIterator from a ThreadedSaxEventSource
    //
    XmlIterator iter = new SaxIterator(eventSource);
    while (iter.advance()) {
        ...
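When the parser is created explicitly through JAXP, its configuration is the caller's responsibility. A SAXParserFactory produces parsers that are not namespace aware unless asked to, in which case SAX reports empty local names and namespace URIs. Whether ThreadedSaxEventSource requires a namespace-aware XMLReader is an assumption here, made because name() and namespace() return local names and namespace URIs. The following sketch uses only JAXP and SAX; NamespaceAwareReaderSketch and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: build an XMLReader that reports namespace URIs and
// local names, suitable for passing to an event source constructor.
final class NamespaceAwareReaderSketch {
    static XMLReader newNamespaceAwareReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // the JAXP default is false
        return factory.newSAXParser().getXMLReader();
    }

    /** Returns {namespace URI, local name} of the document's root element. */
    static String[] rootName(String xml) throws Exception {
        final String[] result = new String[2];
        XMLReader reader = newNamespaceAwareReader();
        reader.setContentHandler(new DefaultHandler() {
            private boolean seen = false;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (!seen) {
                    result[0] = uri;
                    result[1] = local;
                    seen = true;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return result;
    }

    public static void main(String[] args) throws Exception {
        String[] name = rootName("<p:doc xmlns:p='urn:example'><p:item/></p:doc>");
        System.out.println(name[0] + " " + name[1]);
    }
}
```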
Constructor Summary
SaxIterator(org.xml.sax.InputSource inputSource)
    Creates a SAX iterator for parsing a given input source using a default SAX parser.
SaxIterator(SaxEventSource eventSource)
    Creates a SAX iterator from a given SAX event source.
Method Summary
boolean advance()
    Moves to the next element at the current level and returns true, or returns false if there are no more elements.
boolean advanceMixed()
    Moves to the next element or text item at the current level and returns true, or returns false if there are no more items.
java.lang.String attribute(java.lang.String namespace, java.lang.String name)
    Returns the value of the attribute with the given name belonging to the current element, or null if no such attribute exists, or null if positioned on a mixed text item.
XmlAttributes attributes()
    Returns an iterator over the attributes belonging to the current element, or null if positioned on a mixed text item.
XmlIterator children()
    Returns an iterator positioned before the first child of the current element, or null if the current element does not contain any element children, or null if positioned on a mixed text item.
java.lang.String name()
    Returns the local name of the current element, or null if positioned on a mixed text item.
java.lang.String namespace()
    Returns the namespace URI of the current element, or an empty string if the current element has no namespace, or null if positioned on a mixed text item.
java.lang.String toString()
    Returns a string for error reporting that identifies the current element.
java.lang.String value()
    Returns the text contained by the current element or text item, or null if positioned on an element that contains one or more element children.
Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Constructor Detail

public SaxIterator(org.xml.sax.InputSource inputSource)
        throws java.io.IOException,
               org.xml.sax.SAXException,
               org.apache.xerces.xni.XNIException,
               javax.xml.parsers.ParserConfigurationException
If XercesSaxEventSource.isAvailable() returns true, then XercesSaxEventSource.XercesSaxEventSource(InputSource) is used; otherwise, ThreadedSaxEventSource.ThreadedSaxEventSource(InputSource) is used. This behavior may change in a future version, but a default/standard parser configuration will always be used.
Parameters:
    inputSource - is the input document to be parsed.

public SaxIterator(SaxEventSource eventSource)
Parameters:
    eventSource - is a SAX event source.

Method Detail

public boolean advance()
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Specified by:
    advance in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if an element would be returned out of depth-first tree order.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public boolean advanceMixed()
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Specified by:
    advanceMixed in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if an element or text item would be returned out of depth-first tree order.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String name()
        throws java.lang.IllegalStateException
Specified by:
    name in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.

public java.lang.String namespace()
        throws java.lang.IllegalStateException
Specified by:
    namespace in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.

public java.lang.String attribute(java.lang.String namespace, java.lang.String name)
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
WARNING: Unlike the DOM methods Element.getAttribute() and Element.getAttributeNS(), this method returns null, not an empty string, when the attribute does not exist.
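The difference matters when code is ported from DOM. The sketch below uses only the JDK DOM API to show the empty-string result that DOM gives for a missing attribute, next to a null-returning variant that mirrors the contract described in the warning. MissingAttributeSketch and its methods are hypothetical names; the code does not call SaxIterator.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch contrasting the two conventions for a missing attribute.
final class MissingAttributeSketch {
    private static Element root(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    /** DOM convention: a missing attribute yields an empty string. */
    static String domValue(String xml, String name) throws Exception {
        return root(xml).getAttribute(name);
    }

    /** Convention described in the warning: a missing attribute yields null. */
    static String nullIfAbsent(String xml, String name) throws Exception {
        Element element = root(xml);
        return element.hasAttribute(name) ? element.getAttribute(name) : null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<e a='1'/>";
        System.out.println("DOM: [" + domValue(xml, "missing") + "]");
        System.out.println("null convention: " + nullIfAbsent(xml, "missing"));
    }
}
```

Code that tests a DOM result with isEmpty() or equals("") therefore needs a null check instead when it is moved to attribute().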
Specified by:
    attribute in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Parameters:
    namespace - is the namespace of the attribute or may be null or an empty string if the attribute has the empty namespace.
    name - is the local name of the attribute.
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public XmlAttributes attributes()
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
Specified by:
    attributes in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String value()
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
This method always returns the complete text between elements, which may be the result of concatenating multiple SAX character events or DOM text nodes.
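A SAX parser is free to deliver one run of text as several characters() callbacks, for example around an entity reference, so a consumer that wants the complete text has to accumulate it. The sketch below shows that accumulation with the plain SAX API; it illustrates the concatenation described above and is not taken from SaxIterator. TextAccumulatorSketch and rootText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: join every characters() chunk into one string.
final class TextAccumulatorSketch {
    static String rootText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once for a single run of text.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootText("<a>hello &amp; world</a>"));
    }
}
```

However the parser splits the callbacks, the accumulated result is the single string "hello & world".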
Specified by:
    value in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public XmlIterator children()
        throws java.lang.IllegalStateException, java.io.IOException, XmlIteratorException
If this method is called more than once for the same element, it returns the same iterator instance that was returned the first time. The returned iterator may therefore no longer be positioned before its first child.
Specified by:
    children in interface XmlIterator (org.greybird.xmliter.XmlIterator)
Throws:
    java.lang.IllegalStateException - if advance() or advanceMixed() has not yet been called.
    java.io.IOException - if an error occurs retrieving input data.
    XmlIteratorException - if an error occurs processing input data.

public java.lang.String toString()
Example of the returned string: [/ns:TopElement/ns:ChildElement]
Specified by:
    toString in interface XmlIterator
Overrides:
    toString in class java.lang.Object
Copyright (c) 2003 Mark T. Hayes; All Rights Reserved