XMLTokenizer

hmi.xml
Class XMLTokenizer

java.lang.Object
  hmi.xml.XMLTokenizer

public class XMLTokenizer
extends Object

A scanner of XML input streams.

An XML scanner enforces only the simple lexical well-formedness constraints of XML 1.0. An XML stream is a sequence of lexical tokens. These lexical tokens have an external (string) representation, and an internal representation. The recognized lexical tokens, and their external representations are:

STag : Start tags, of the form <identifier ... >. Here, the dots indicate a sequence of attributes, externally represented in the form ATTRIBUTENAME = ATTRIBUTEVALUE
ETAG : End tags, of the form </identifier>
CHARDATA : Character data, in the form of character strings, see below.
PI : Processing instructions, of the form <?identifier ... ?>
DECL : Declarations, of the form <! ... >. Special cases are:
- comments of the form <!-- string -->, where string must not contain --
- document type declarations: <!DOCTYPE ... >
EndOfData : EndOfData is a pseudo token that signals that no more tokens are available.

Identifiers consist exclusively of the following characters: a-zA-Z0-9-_.: and must start with one of the characters a-zA-Z_
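The identifier rule above can be written as a regular expression. The following is a minimal sketch, not the tokenizer's own code; the class name XmlIdentifierCheck and the method isIdentifier are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of hmi.xml: checks the identifier rule stated above.
final class XmlIdentifierCheck {
    // First character: a-zA-Z_ ; remaining characters: a-zA-Z0-9-_.:
    private static final Pattern IDENTIFIER =
            Pattern.compile("[a-zA-Z_][a-zA-Z0-9\\-_.:]*");

    static boolean isIdentifier(String s) {
        return s != null && IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("ns:tag-1")); // true
        System.out.println(isIdentifier("1tag"));     // false: starts with a digit
    }
}
```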

A start tag immediately followed by the corresponding end tag can be represented externally as an "empty tag" of the form: <identifier ... />

CHARDATA, or "content" is considered to be "parsed character data", which means the following:

The external representation is assumed not to contain < characters.
& characters are assumed to start an entity reference of the form: &lt; &gt; &amp; &quot; &apos;. Such entity references are translated to their internal representation: < > & " '
The XML standard also assumes that the character sequence ]]> does not occur in character data.

Note that the characters > " ' are not forbidden in character data. However, the easiest way to translate arbitrary character strings into legal XML character data is to do the following:

replace & by &amp; (do this first, so that the & introduced by the other replacements is not escaped again)
replace < by &lt;
replace > by &gt; (This automatically takes care of ]]> sequences)
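The replacement rules and the entity translation described above can be sketched as follows. This is an illustration under stated assumptions, not the tokenizer's implementation: the class CharDataCodecSketch and its methods are hypothetical, and leaving unknown entity references untouched in decode is an assumption (the real scanner may treat them as errors). A single pass over the characters avoids escaping the same & twice.

```java
// Hypothetical helper, not part of hmi.xml.
final class CharDataCodecSketch {
    /** Escapes <, > and & so that the result is legal XML character data. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;   // also neutralizes ]]>
                case '&': out.append("&amp;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Maps the five predefined entity names to their character, or -1. */
    private static int entityChar(String name) {
        switch (name) {
            case "lt":   return '<';
            case "gt":   return '>';
            case "amp":  return '&';
            case "quot": return '"';
            case "apos": return '\'';
            default:     return -1;
        }
    }

    /** Translates the five predefined entity references to single characters. */
    static String decode(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            if (c == '&') {
                int semi = escaped.indexOf(';', i);
                int repl = semi < 0 ? -1 : entityChar(escaped.substring(i + 1, semi));
                if (repl >= 0) {
                    out.append((char) repl);
                    i = semi + 1;
                    continue;
                }
            }
            out.append(c); // assumption: anything unrecognized is copied unchanged
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "a < b && c ]]> d";
        String escaped = escape(raw);
        System.out.println(escaped);                     // a &lt; b &amp;&amp; c ]]&gt; d
        System.out.println(decode(escaped).equals(raw)); // true
    }
}
```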

The regular expression that describes the possible streams of lexical tokens is: ( (Stag (AttrName AttrValue)*) | ETAG | CHARDATA | PI | DECL )* EndOfData
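To make the shape of a token stream concrete, the expression above can be checked mechanically by encoding each token as one letter. The sketch below is only an illustration; the letter encoding (S = STag, N = AttrName, V = AttrValue, E = ETAG, C = CHARDATA, P = PI, D = DECL, Z = EndOfData) and the class name are assumptions made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of hmi.xml.
final class TokenStreamShapeSketch {
    // ( (STag (AttrName AttrValue)*) | ETAG | CHARDATA | PI | DECL )* EndOfData
    private static final Pattern STREAM = Pattern.compile("(S(NV)*|E|C|P|D)*Z");

    static boolean isValidShape(String encodedTokens) {
        return STREAM.matcher(encodedTokens).matches();
    }

    public static void main(String[] args) {
        // <a x="1">text</a> followed by the end of data
        System.out.println(isValidShape("SNVCEZ")); // true
        // attribute name without a value
        System.out.println(isValidShape("SNZ"));    // false
    }
}
```

Note that this expression only describes the lexical sequence; it does not check that start and end tags are properly nested.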

The scanner can work with two different interfaces:

Author:: Dennis Reidsma, Twente University, Job Zwiers, Twente University

Nested Class Summary
`private class`	`XMLTokenizer.TokenizerState` TokenizerState objects are used to save and restore the current "state" of this XMLTokenizer on the stack.

Field Summary
`private String`	`attributeName`
`private StringBuilder`	`attributeNameBuffer`
`private String`	`attributePrefix`
`private HashMap<String,String>`	`attributes`
`private StringBuilder`	`attributeValueBuffer`
`private StringBuilder`	`buf`
`private static int`	`BUFSIZELARGE`
`private static int`	`BUFSIZESMALL`
`private StringBuilder`	`cDataBuffer`
`private static int`	`CDATABUFFERSIZE`
`static int`	`CDSECT`
`private char`	`ch`
`static int`	`CHARDATA`
`private static int`	`CHARDATA_MODE`
`private StringBuilder`	`charDataBuffer`
`private int`	`charPos`
`private int`	`ci`
`static int`	`COMMENT`
`private StringBuilder`	`commentDataBuffer`
`static int`	`CONSUMED`
`private static boolean`	`debug`
`static int`	`DECL`
`private String`	`defaultNamespace`
`private int`	`defaultScanExceptionMode`
`static int`	`DISCARDED_TOKEN_LIMIT`
`static int`	`DOCTYPE`
`private static int`	`DOCTYPEBUFFERSIZE`
`private String`	`doctypeName`
`static int`	`ENDOFDOCUMENT`
`private static int`	`ENDOFDOCUMENT_MODE`
`static int`	`EOS`
`private static int`	`ERRORBUFSIZE`
`static int`	`ERRORFILE`
`static int`	`ERRORFILELINE`
`static int`	`ERRORFULL`
`static int`	`ERRORLINE`
`static int`	`ERRORPOS`
`static int`	`ERRORTOKEN`
`static int`	`ERRORTOKENLINE`
`static int`	`ERRORTOKENPOS`
`static int`	`ERRORURL`
`static int`	`ETAG`
`private File`	`file`
`private BufferedReader`	`in`
`private int`	`line`
`static boolean`	`LOG`
`private static Logger`	`logger`
`private XMLNameSpaceStack`	`namespaceStack`
`static int`	`NOERRORPOSITION`
`static int`	`NULLTOKEN`
`private static int`	`PENDING_ETAG_MODE`
`static int`	`PI`
`private StringBuilder`	`piDataBuffer`
`private boolean`	`popOnEndOfDocument`
`private String`	`pubidLiteral`
`private boolean`	`recognizeNamespaces`
`static boolean`	`RECOGNIZENAMESPACES`
`private StringBuilder`	`sectionBuffer`
`private boolean`	`sectionBuffering`
`static int`	`SECTIONBUFSIZE`
`private boolean`	`skipComment`
`static boolean`	`SKIPCOMMENT`
`private boolean`	`skipDoctype`
`static boolean`	`SKIPDOCTYPE`
`private boolean`	`skipPI`
`static boolean`	`SKIPPI`
`static int`	`STAG`
`private String`	`systemLiteral`
`private String`	`tagName`
`private StringBuilder`	`tagNameBuffer`
`private String`	`tagNamespace`
`private String`	`tagPrefix`
`private ArrayList<String>`	`tagStack`
`private static int`	`TAGSTACKSIZE`
`private int`	`token`
`private boolean`	`tokenBuffered`
`private int`	`tokenCharPos`
`private boolean`	`tokenConsumed`
`private ArrayList<XMLTokenizer.TokenizerState>`	`tokenizerStateStack`
`private int`	`tokenLine`
`private int`	`tokenMode`
`private URL`	`url`
`private ArrayList<Integer>`	`xmlnsCountStack`

Constructor Summary
`XMLTokenizer()` Create XMLTokenizer for a null Reader.
`XMLTokenizer(File inFile)` Create a XMLTokenizer for a (buffered) Reader constructed from the specified File.
`XMLTokenizer(InputStream in)` Create a XMLTokenizer for the specified InputStream.
`XMLTokenizer(Reader in)` Create a XMLTokenizer for the specified Reader.
`XMLTokenizer(String xmlString)` Like XMLTokenizer(Reader), with a Reader constructed from a StringReader for xmlString.
`XMLTokenizer(URL url)` Create a XMLTokenizer for a (buffered) Reader constructed from the specified URL.

Method Summary
`boolean`	`atCDSect()` tests whether the scanner is positioned at a CDATA section.
`boolean`	`atCharData()` tests whether the scanner is positioned at CHARDATA
`boolean`	`atComment()` tests whether the scanner is positioned at a comment.
`boolean`	`atDoctype()` tests whether the scanner is positioned at a DOCTYPE declaration.
`boolean`	`atDoctype(String name)` tests whether the scanner is positioned at a DOCTYPE declaration.
`boolean`	`atEndOfDocument()` Tests whether the scanner is positioned at the end of the document.
`boolean`	`atETag()` Tests whether the scanner is positioned at an end tag.
`boolean`	`atETag(String tName)` Tests whether the scanner is positioned at an end tag with the given name.
`boolean`	`atPI()` Tests whether the scanner is positioned at a processing instruction.
`boolean`	`atSTag()` Tests whether the scanner is positioned at an STAG or an OPENSTAG.
`boolean`	`atSTag(String tagName)` Tests whether the scanner is positioned at a start tag with the given name.
`private void`	`attributePrefixFixup(String nsPrefix, String ns)`
`private void`	`checkEmptyTagStack()` checks whether the tagStack is empty.
`private void`	`checkSequence(String seq)`
`private void`	`clearBuffer(StringBuilder b)`
`private void`	`clearSectionBuffer()`
`int`	`currentToken()`
`String`	`currentTokenString()`
`String`	`getAttribute(String attributeName)`
`Iterator`	`getAttributeIterator()` returns an Iterator for the attributes HashMap, which has properly defined name/value pairs iff the current token is either OpenSTag or STAG.
`HashMap<String,String>`	`getAttributes()` returns the attributes HashMap, which has properly defined name/value pairs iff the current token is either OpenSTag or STAG.
`String`	`getCDSect()` Reads the current CDATA.
`String`	`getCharData()` Reads the current CHARDATA.
`int`	`getCharPos()` returns the current character position within the current line.
`String`	`getComment()` reads the current comment, without advancing the scanner to the next token.
`String`	`getDoctypeName()` returns the doctype name, or null if not defined.
`String`	`getErrorMessage(String message)` returns an error message String, containing the message String, but also including positional information, depending on available information, and settings.
`String`	`getErrorMessage(String message, int mode)` returns an error message String, containing the message String, but also including positional information, depending on available information, and settings.
`File`	`getFile()` returns the current file, which could be null
`int`	`getLine()` returns the current line number; line counts start at 1.
`String`	`getNamespace()` Returns the namespace of the current tag.
`String`	`getPI()` reads the current PI, without advancing the scanner to the next token.
`boolean`	`getpopOnEndOfDocument()`
`String`	`getPubidLiteral()`
`Reader`	`getReader()` Gets the Reader that this tokenizer is currently using.
`boolean`	`getRecognizeNamespaces()` returns the current status of namespace recognition.
`private String`	`getSectionBuffer()`
`private String`	`getStrippedSectionBuffer()`
`String`	`getSystemLiteral()`
`String`	`getTagName()` Returns the current start tag.
`int`	`getToken()` returns the current token, without consuming it (unlike nextToken) If the current token was consumed, nextToken is called first.
`int`	`getTokenCharPos()` returns the character position of the start of the current token.
`int`	`getTokenLine()` returns the line number of the start of the current token.
`String`	`getTokenString()` returns the current token in String format, without consuming it.
`String`	`getTrimmedCharData()` Reads the current CHARDATA, and trims away surrounding blank space.
`URL`	`getURL()` returns the current URL, which could be null, for instance in the case of a XMLTokenizer constructed from a String.
`XMLScanException`	`getXMLScanException(String message)` returns an XMLScanException, containing the message String, but also including positional information, depending on available information, and settings.
`XMLScanException`	`getXMLScanException(String message, int mode)` returns an XMLScanException, containing the message String, but also including positional information, depending on available information, and settings.
`String`	`getXMLSection()` assuming that we are at an STag, gets the XML text until the corresponding closing ETag.
`String`	`getXMLSectionContent()` assuming that we are just beyond an STag, specified by means of the tag parameter, gets the XML text until the corresponding closing ETag.
`private void`	`initState()` Starts the scanner on the current input
`private boolean`	`isNameChar()`
`private boolean`	`isNamespaceSepChar()`
`private boolean`	`isNameStartChar()`
`private boolean`	`isSpaceChar()`
`private int`	`nextChar()`
`void`	`nextParsedChar()`
`private int`	`nextToken()` Called to move the tokenizer to a next token.
`private void`	`parseAttribute()`
`private int`	`parseCDSect()`
`private int`	`parseCharData()`
`private int`	`parseComment()`
`private int`	`parseDeclaration()`
`private int`	`parseDoctype()`
`private int`	`parseETag()`
`private int`	`parseMarkup()`
`private int`	`parsePI()`
`private int`	`parseSTag()`
`private int`	`parseString(StringBuilder buf)`
`void`	`popReader()` assuming that a previous call to pushReader has been made, this call will restore the previous reader, and the state of the XMLTokenizer to the state when the pushReader call was made.
`void`	`popState()`
`private void`	`popTag(String tag)`
`void`	`pushReader(BufferedReader in)` Pushes the current XMLTokenizer status on the stack, and then starts reading from the newly specified Reader.
`void`	`pushReader(String urlSpec)` Pushes the current XMLTokenizer status on the stack, and then starts reading from the Reader specified in URL form.
`void`	`pushState()`
`private void`	`pushTag(String tag, int namespaceDeclarationCount)`
`int`	`read()` reads one (unparsed) character from the input stream.
`boolean`	`recoverAfterETag(String etag)` discards input until an ETAG with specified tag name is reached, the end of document is reached, or an IOException is thrown.
`boolean`	`recoverAfterETag(String etag, int tokenLimit)` discards input until an ETAG with specified tag name is reached, the end of document is reached, or an IOException is thrown.
`boolean`	`recoverAtSTag(String stag)` discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown.
`boolean`	`recoverAtSTag(String stag, int tokenLimit)` discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown.
`void`	`setBaseURL(String urlSpec)` like setURL, in that it defines the current URL, but unlike setURL, setBaseURL does not attempt to open a new Reader for the URL.
`void`	`setBaseURL(URL url)` like setURL, in that it defines the current URL, but does not attempt to open a new Reader for the URL.
`static void`	`setDebug(boolean mode)`
`void`	`setDefaultModes()` sets the modes to their current default values.
`void`	`setFile(File inFile)` Sets the current File, and opens a Reader for the File.
`void`	`setpopOnEndOfDocument(boolean mode)` sets the mode of popping the status stack when the ENDOFDOCUMENT token is encountered within the input stream: auto popping enabled will silently suppress the End-Of-Document tokens until the stack is empty and the ENDOFDOCUMENT is reached
`BufferedReader`	`setReader(Reader in)` replaces the Reader that this tokenizer should process.
`boolean`	`setRecognizeNamespaces(boolean recnsp)` Sets the status of namespace recognition.
`private void`	`setSectionBuffering(boolean buffering)`
`boolean`	`setSkipComment(boolean skipped)` Used to set if COMMENTS should be skipped.
`boolean`	`setSkipDoctype(boolean skipped)` Used to set if DOCTYPE should be skipped.
`boolean`	`setSkipPI(boolean skipped)` Used to set if PIDATA should be skipped.
`private void`	`setTokenPos()`
`void`	`setURL(String urlSpec)` starts reading from a new URL. The urlSpec String is regarded as relative to the current URL, unless it has the form of an absolute URL.
`void`	`setURL(URL url)` sets the current URL, and opens a Reader for the new URL.
`void`	`setXMLScanExceptionMode(int mode)` determines what to include in XMLScanExceptions, generated by getXMLScanException().
`void`	`showTokenizerStack()`
`void`	`showTokenizerStack(String message)`
`void`	`showTokenizerState()`
`void`	`showTokenizerState(String message)`
`private void`	`skipDoctype()`
`private void`	`skipSpaceChars()`
`void`	`skipTag()` assuming that we are at a STAG, skips the remainder of the current tag, up to and including the matching ETAG.
`String`	`takeCDSect()`
`String`	`takeCharData()`
`String`	`takeComment()`
`double`	`takeDoubleElement(String tagName)` expects an XML element of the form <tagName> value </tagName>, where value encodes a double value.
`HashMap<String,String>`	`takeEmptyElement(String tagName)` expects an XML element of the form <tagName attributes /> The attributes are returned, in the form of a HashMap.
`String`	`takeETag()`
`void`	`takeETag(String tagName)`
`float`	`takeFloatElement(String tagName)` expects an XML element of the form <tagName> value </tagName>, where value encodes a float value.
`int`	`takeIntElement(String tagName)` expects an XML element of the form <tagName> value </tagName>, where value encodes an integer value.
`long`	`takeLongElement(String tagName)` expects an XML element of the form <tagName> value </tagName>, where value encodes a long value.
`String`	`takePI()`
`String`	`takeSTag()` checks whether the current token is an STAG, and consumes it.
`void`	`takeSTag(String tagName)` checks whether the current token is an STAG, and consumes it.
`String`	`takeString(int len)` reads unparsed characters from the input stream, and returns them as String.
`String`	`takeTextElement(String tagName)` expects an XML element of the form <tagName> value </tagName>, where value is PCData content.
`String`	`takeTrimmedCharData()`
`static String`	`tokenString(int token)` returns a String representation for each integer that represents a lexical token, like the String "STAG" for the integer STAG constant etc.
`private String`	`topTag()`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

SKIPDOCTYPE

public static final boolean SKIPDOCTYPE

See Also:: Constant Field Values

SKIPCOMMENT

public static final boolean SKIPCOMMENT

See Also:: Constant Field Values

SKIPPI

public static final boolean SKIPPI

See Also:: Constant Field Values

RECOGNIZENAMESPACES

public static final boolean RECOGNIZENAMESPACES

See Also:: Constant Field Values

LOG

public static final boolean LOG

See Also:: Constant Field Values

SECTIONBUFSIZE

public static final int SECTIONBUFSIZE

See Also:: Constant Field Values

DISCARDED_TOKEN_LIMIT

public static final int DISCARDED_TOKEN_LIMIT

See Also:: Constant Field Values

logger

private static Logger logger

ERRORTOKENLINE

public static final int ERRORTOKENLINE

See Also:: Constant Field Values

ERRORTOKENPOS

public static final int ERRORTOKENPOS

See Also:: Constant Field Values

ERRORLINE

public static final int ERRORLINE

See Also:: Constant Field Values

ERRORPOS

public static final int ERRORPOS

See Also:: Constant Field Values

ERRORFILE

public static final int ERRORFILE

See Also:: Constant Field Values

ERRORURL

public static final int ERRORURL

See Also:: Constant Field Values

ERRORFULL

public static final int ERRORFULL

See Also:: Constant Field Values

ERRORFILELINE

public static final int ERRORFILELINE

See Also:: Constant Field Values

NOERRORPOSITION

public static final int NOERRORPOSITION

See Also:: Constant Field Values

defaultScanExceptionMode

private int defaultScanExceptionMode

ERRORBUFSIZE

private static final int ERRORBUFSIZE

See Also:: Constant Field Values

CDATABUFFERSIZE

private static final int CDATABUFFERSIZE

See Also:: Constant Field Values

DOCTYPEBUFFERSIZE

private static final int DOCTYPEBUFFERSIZE

See Also:: Constant Field Values

debug

private static boolean debug

NULLTOKEN

public static final int NULLTOKEN

See Also:: Constant Field Values

STAG

public static final int STAG

See Also:: Constant Field Values

ETAG

public static final int ETAG

See Also:: Constant Field Values

CHARDATA

public static final int CHARDATA

See Also:: Constant Field Values

CDSECT

public static final int CDSECT

See Also:: Constant Field Values

COMMENT

public static final int COMMENT

See Also:: Constant Field Values

PI

public static final int PI

See Also:: Constant Field Values

DECL

public static final int DECL

See Also:: Constant Field Values

DOCTYPE

public static final int DOCTYPE

See Also:: Constant Field Values

ENDOFDOCUMENT

public static final int ENDOFDOCUMENT

See Also:: Constant Field Values

ERRORTOKEN

public static final int ERRORTOKEN

See Also:: Constant Field Values

skipDoctype

private boolean skipDoctype

skipComment

private boolean skipComment

skipPI

private boolean skipPI

recognizeNamespaces

private boolean recognizeNamespaces

in

private BufferedReader in

url

private URL url

file

private File file

ci

private int ci

ch

private char ch

line

private int line

charPos

private int charPos

tokenLine

private int tokenLine

tokenCharPos

private int tokenCharPos

token

private int token

sectionBuffer

private StringBuilder sectionBuffer

sectionBuffering

private boolean sectionBuffering

tokenConsumed

private boolean tokenConsumed

tokenBuffered

private boolean tokenBuffered

tokenMode

private int tokenMode

tagName

private String tagName

tagPrefix

private String tagPrefix

tagNamespace

private String tagNamespace

defaultNamespace

private String defaultNamespace

attributeName

private String attributeName

attributePrefix

private String attributePrefix

doctypeName

private String doctypeName

pubidLiteral

private String pubidLiteral

systemLiteral

private String systemLiteral

BUFSIZESMALL

private static final int BUFSIZESMALL

See Also:: Constant Field Values

BUFSIZELARGE

private static final int BUFSIZELARGE

See Also:: Constant Field Values

tagNameBuffer

private StringBuilder tagNameBuffer

attributeNameBuffer

private StringBuilder attributeNameBuffer

attributeValueBuffer

private StringBuilder attributeValueBuffer

charDataBuffer

private StringBuilder charDataBuffer

cDataBuffer

private StringBuilder cDataBuffer

piDataBuffer

private StringBuilder piDataBuffer

commentDataBuffer

private StringBuilder commentDataBuffer

buf

private StringBuilder buf

TAGSTACKSIZE

private static final int TAGSTACKSIZE

See Also:: Constant Field Values

tagStack

private ArrayList<String> tagStack

namespaceStack

private XMLNameSpaceStack namespaceStack

xmlnsCountStack

private ArrayList<Integer> xmlnsCountStack

tokenizerStateStack

private ArrayList<XMLTokenizer.TokenizerState> tokenizerStateStack

popOnEndOfDocument

private boolean popOnEndOfDocument

attributes

private HashMap<String,String> attributes

CHARDATA_MODE

private static final int CHARDATA_MODE

See Also:: Constant Field Values

PENDING_ETAG_MODE

private static final int PENDING_ETAG_MODE

See Also:: Constant Field Values

ENDOFDOCUMENT_MODE

private static final int ENDOFDOCUMENT_MODE

See Also:: Constant Field Values

EOS

public static final int EOS

See Also:: Constant Field Values

CONSUMED

public static final int CONSUMED

See Also:: Constant Field Values

Constructor Detail

XMLTokenizer

public XMLTokenizer(Reader in)

Create a XMLTokenizer for the specified Reader. Note that in general a BufferedReader is preferred; a non-buffered Reader will be wrapped internally into a BufferedReader.

XMLTokenizer

public XMLTokenizer(InputStream in)

Create a XMLTokenizer for the specified InputStream.

XMLTokenizer

public XMLTokenizer()

Create XMLTokenizer for a null Reader.

XMLTokenizer

public XMLTokenizer(String xmlString)

Like XMLTokenizer(Reader), with a Reader constructed from a StringReader for xmlString. No file name or URL is defined, so error messages cannot refer to these attributes.

XMLTokenizer

public XMLTokenizer(File inFile)
             throws FileNotFoundException

Create a XMLTokenizer for a (buffered) Reader constructed from the specified File. Also, the base URL is set to the file URL derived from the File.

Throws:: FileNotFoundException

XMLTokenizer

public XMLTokenizer(URL url)

Create a XMLTokenizer for a (buffered) Reader constructed from the specified URL.

Method Detail

setFile

public void setFile(File inFile)
             throws FileNotFoundException

Sets the current File, and opens a Reader for the File. Also, the base URL is set to the file URL derived from the File.

Throws:: FileNotFoundException

getFile

public final File getFile()

returns the current file, which could be null

setBaseURL

public void setBaseURL(URL url)

like setURL, in that it defines the current URL, but does not attempt to open a new Reader for the URL. Therefore, it is allowed to use a URL that specifies a directory name, ending with a slash character, rather than a file name. This URL will effectively be used as the "base URL" when setURL is called with a relative URL specification.

setBaseURL

public void setBaseURL(String urlSpec)

like setURL, in that it defines the current URL, but unlike setURL, setBaseURL does not attempt to open a new Reader for the URL. Therefore, it is allowed to use a URL that specifies a directory name, ending with a slash character, rather than a file name. This URL will effectively be used as the "base URL" when setURL is called next, with a relative URL specification. The base URL specification can be absolute, or relative to the current URL.

setURL

public void setURL(URL url)
            throws IOException

sets the current URL, and opens a Reader for the new URL. The current URL is ignored, that is, it is not regarded as a "base URL".

Throws:: IOException

setURL

public void setURL(String urlSpec)
            throws IOException

starts reading from a new URL. The urlSpec String is regarded as relative to the current URL, unless it has the form of an absolute URL.

Throws:: IOException
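The distinction between relative and absolute URL specifications can be illustrated with the standard java.net.URI class. This sketch only demonstrates the resolution rule; it is an assumption that XMLTokenizer resolves its URLs in a comparable way, and the example.org addresses and the class name are made up for the example.

```java
import java.net.URI;

// Hypothetical helper, not part of hmi.xml.
final class RelativeUrlSketch {
    /** Resolves urlSpec against base; an absolute urlSpec is returned as is. */
    static URI resolve(URI base, String urlSpec) {
        return base.resolve(urlSpec);
    }

    public static void main(String[] args) {
        // A base URL may name a directory, ending with a slash.
        URI base = URI.create("http://example.org/data/");
        System.out.println(resolve(base, "scene.xml"));           // http://example.org/data/scene.xml
        System.out.println(resolve(base, "file:/tmp/other.xml")); // file:/tmp/other.xml
    }
}
```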

getURL

public URL getURL()

returns the current URL, which could be null, for instance in the case of a XMLTokenizer constructed from a String. XMLTokenizers constructed from a File object do have a defined (file) URL.

setReader

public final BufferedReader setReader(Reader in)

replaces the Reader that this tokenizer should process. The previous Reader is returned.

getReader

public final Reader getReader()

Gets the Reader that this tokenizer is currently using.

pushReader

public final void pushReader(BufferedReader in)

Pushes the current XMLTokenizer status on the stack, and then starts reading from the newly specified Reader.

pushReader

public final void pushReader(String urlSpec)
                      throws IOException

Pushes the current XMLTokenizer status on the stack, and then starts reading from the Reader specified in URL form. The URL specification can be relative, in which case the current base URL is taken into account, or it can be an absolute URL. The method is declared void, so a failure to open the Reader for the specified URL is presumably signalled through the declared IOException rather than through a boolean result.

Throws:: IOException

popReader

public final void popReader()

assuming that a previous call to pushReader has been made, this call will restore the previous reader, and the state of the XMLTokenizer to the state when the pushReader call was made. Note that popReader calls are implied (made automatically) when the pop-on-ENDOFDOCUMENT mode is true.
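The push/pop behaviour, including the automatic pop at the end of a nested input, can be pictured with a simplified stack of Readers. This is a sketch under assumptions: the real tokenizer saves its complete state (see TokenizerState), whereas this hypothetical ReaderStackSketch class saves only the Reader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simplification, not part of hmi.xml.
final class ReaderStackSketch {
    private final Deque<Reader> saved = new ArrayDeque<>();
    private final boolean popOnEnd;
    private Reader current;

    ReaderStackSketch(Reader initial, boolean popOnEnd) {
        this.current = initial;
        this.popOnEnd = popOnEnd;
    }

    void pushReader(Reader next) {
        saved.push(current);   // remember where we were
        current = next;        // continue with the nested input
    }

    void popReader() {
        current = saved.pop(); // resume the interrupted input
    }

    /** Returns the next character, or -1 when all input is exhausted. */
    int read() throws IOException {
        int c = current.read();
        while (c < 0 && popOnEnd && !saved.isEmpty()) {
            popReader();       // silently suppress the end of the nested input
            c = current.read();
        }
        return c;
    }

    static String demo() {
        try {
            ReaderStackSketch s = new ReaderStackSketch(new StringReader("AB"), true);
            StringBuilder out = new StringBuilder();
            out.append((char) s.read());           // reads 'A' from the outer input
            s.pushReader(new StringReader("xy"));  // switch to the nested input
            int c;
            while ((c = s.read()) >= 0) {
                out.append((char) c);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // AxyB
    }
}
```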

setXMLScanExceptionMode

public void setXMLScanExceptionMode(int mode)

determines what to include in XMLScanExceptions generated by getXMLScanException(). The mode must be a bitwise OR of a selection of the following masks:

- ERRORTOKENLINE (line of the error token)
- ERRORTOKENPOS (starting position of the error token)
- ERRORLINE (line where the error was detected)
- ERRORPOS (character position where the error was detected)
- ERRORFILE (file name, if available)
- ERRORURL (URL, if available, but not if the file name has been included already)

The default mode is ERRORFULL, which is simply the OR of all masks. Also available are ERRORFILELINE, which equals ERRORFILE|ERRORURL|ERRORLINE, and NOERRORPOSITION, which is simply 0.
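How such masks combine can be sketched as follows. The numeric values below are hypothetical and chosen only for this example; the actual values are those listed under Constant Field Values for XMLTokenizer.

```java
// Hypothetical stand-in for the mask constants, not part of hmi.xml.
final class ScanExceptionModeSketch {
    // Assumed bit values: one distinct bit per mask.
    static final int ERRORTOKENLINE = 1;
    static final int ERRORTOKENPOS  = 2;
    static final int ERRORLINE      = 4;
    static final int ERRORPOS       = 8;
    static final int ERRORFILE      = 16;
    static final int ERRORURL       = 32;

    // Derived modes, following the relations stated in the description.
    static final int ERRORFULL = ERRORTOKENLINE | ERRORTOKENPOS | ERRORLINE
            | ERRORPOS | ERRORFILE | ERRORURL;
    static final int ERRORFILELINE = ERRORFILE | ERRORURL | ERRORLINE;
    static final int NOERRORPOSITION = 0;

    /** Tests whether the given mode includes the given mask. */
    static boolean includes(int mode, int mask) {
        return (mode & mask) != 0;
    }

    public static void main(String[] args) {
        int mode = ERRORLINE | ERRORPOS;
        System.out.println(includes(mode, ERRORLINE));         // true
        System.out.println(includes(mode, ERRORFILE));         // false
        System.out.println(includes(ERRORFILELINE, ERRORURL)); // true
    }
}
```

With the real class, a call of the form setXMLScanExceptionMode(XMLTokenizer.ERRORLINE | XMLTokenizer.ERRORPOS) would select the corresponding pieces of positional information.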

getXMLScanException

public XMLScanException getXMLScanException(String message)

returns an XMLScanException, containing the message String, but also including positional information, depending on available information, and settings. In principle, the file or URL (if available), the token line and character position, and the current line and position are included.

getXMLScanException

public XMLScanException getXMLScanException(String message,
                                            int mode)

getErrorMessage

public String getErrorMessage(String message)

returns an error message String, containing the message String, but also including positional information, depending on available information, and settings. In principle, the file or URL (if available), the token line and character position, and the current line and position are included.

getErrorMessage

public String getErrorMessage(String message,
                              int mode)

getLine

public final int getLine()

returns the current line number; line counts start at 1.

getCharPos

public final int getCharPos()

returns the current character position within the current line. Character counts start at 1 for the first character within the line. After a \n character, the character position is 0.
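The counting convention for lines and character positions can be sketched as a small counter. This is an illustration only; the class PositionCounterSketch is hypothetical, and treating \n as the only line terminator is an assumption.

```java
// Hypothetical helper, not part of hmi.xml.
final class PositionCounterSketch {
    int line = 1;     // line counts start at 1
    int charPos = 0;  // 0 until the first character of the line has been read

    /** Updates the position for one character that has just been read. */
    void advance(char c) {
        if (c == '\n') {
            line++;
            charPos = 0;  // position is 0 directly after a newline
        } else {
            charPos++;    // first character of a line gets position 1
        }
    }

    public static void main(String[] args) {
        PositionCounterSketch p = new PositionCounterSketch();
        for (char c : "ab\nc".toCharArray()) {
            p.advance(c);
        }
        System.out.println(p.line + ":" + p.charPos); // 2:1
    }
}
```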

getTokenLine

public final int getTokenLine()

returns the line number of the start of the current token. Line counting starts at 1.

getTokenCharPos

public final int getTokenCharPos()

returns the character position of the start of the current token.

currentToken

public final int currentToken()

currentTokenString

public final String currentTokenString()

recoverAtSTag

public final boolean recoverAtSTag(String stag)

discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. In the first situation, true is returned, otherwise false.

recoverAtSTag

public final boolean recoverAtSTag(String stag,
                                   int tokenLimit)

discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. In the first situation, true is returned, otherwise false. When showDiscardedTokens is true, discarded tokens are shown on the Console, up to a certain limit (DISCARDED_TOKEN_LIMIT).

recoverAfterETag

public final boolean recoverAfterETag(String etag)

recoverAfterETag

public final boolean recoverAfterETag(String etag,
                                      int tokenLimit)

discards input until an ETAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. In the first situation, the ETAG is consumed, and true is returned, otherwise false is returned. When showDiscardedTokens is true, discarded tokens are shown on the Console, up to a certain limit (DISCARDED_TOKEN_LIMIT).

skipTag

public final void skipTag()
                   throws IOException

assuming that we are at a STAG, skips the remainder of the current tag, up to and including the matching ETAG. The current token must be STAG, OPENSTAG, or CLOSESTAG. The XML part that is skipped must be well formed, implying that STags and ETags should be properly matched. An Exception is thrown if the ENDOFDOCUMENT is reached while skipping.

Throws:: IOException
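One way to skip up to the matching ETAG in well-formed input is to count nesting depth. The sketch below works on a plain list of token kinds rather than on a real input stream; the enum Tok, the class name, and the use of IllegalStateException are assumptions for the example and do not describe the actual implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not part of hmi.xml.
final class SkipTagSketch {
    enum Tok { STAG, ETAG, CHARDATA, END_OF_DOCUMENT }

    /**
     * Starting at an STAG, returns the index of the first token after the
     * matching ETAG. Properly matched tags are assumed.
     */
    static int skipTag(List<Tok> tokens, int start) {
        if (tokens.get(start) != Tok.STAG) {
            throw new IllegalStateException("not positioned at an STAG");
        }
        int depth = 0;
        for (int i = start; i < tokens.size(); i++) {
            switch (tokens.get(i)) {
                case STAG:
                    depth++;
                    break;
                case ETAG:
                    depth--;
                    if (depth == 0) {
                        return i + 1; // just past the matching ETAG
                    }
                    break;
                case END_OF_DOCUMENT:
                    throw new IllegalStateException("end of document reached while skipping");
                default:
                    break;            // character data and the like are simply skipped
            }
        }
        throw new IllegalStateException("input ended while skipping");
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
                Tok.STAG, Tok.CHARDATA, Tok.STAG, Tok.ETAG, Tok.ETAG,
                Tok.CHARDATA, Tok.END_OF_DOCUMENT);
        System.out.println(skipTag(tokens, 0)); // 5
    }
}
```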

getXMLSection

public final String getXMLSection()
                           throws IOException

assuming that we are at an STag, gets the XML text until the corresponding closing ETag. An Exception is thrown if the ENDOFDOCUMENT is reached while skipping.

Throws:: IOException

getXMLSectionContent

public final String getXMLSectionContent()
                                  throws IOException

assuming that we are just beyond an STag, specified by means of the tag parameter, gets the XML text until the corresponding closing ETag. An Exception is thrown if the ENDOFDOCUMENT is reached while skipping.

Throws:: IOException

nextToken

private final int nextToken()
                     throws IOException,
                            XMLScanException

Called to move the tokenizer to the next token. The effect depends on the current token and on several settings.

Throws:: IOException; XMLScanException

parseCharData

private int parseCharData()
                   throws IOException

Throws:: IOException

parseMarkup

private int parseMarkup()
                 throws IOException

Throws:: IOException

parseSTag

private int parseSTag()
               throws IOException

Throws:: IOException

attributePrefixFixup

private void attributePrefixFixup(String nsPrefix,
                                  String ns)

parseETag

private int parseETag()
               throws IOException

Throws:: IOException

parseAttribute

private void parseAttribute()
                     throws IOException

Throws:: IOException

parseString

private int parseString(StringBuilder buf)
                 throws IOException

Throws:: IOException

parseDeclaration

private int parseDeclaration()
                      throws IOException

Throws:: IOException

parseCDSect

private int parseCDSect()
                 throws IOException

Throws:: IOException

parseComment

private int parseComment()
                  throws IOException

Throws:: IOException

skipDoctype

private void skipDoctype()
                  throws IOException

Throws:: IOException

parseDoctype

private int parseDoctype()
                  throws IOException

Throws:: IOException

checkSequence

private void checkSequence(String seq)
                    throws IOException

Throws:: IOException

parsePI

private int parsePI()
             throws IOException

Throws:: IOException

setSkipPI

public final boolean setSkipPI(boolean skipped)

Sets whether processing instructions (PI) should be skipped.

setSkipComment

public final boolean setSkipComment(boolean skipped)

Sets whether comments should be skipped.

setSkipDoctype

public final boolean setSkipDoctype(boolean skipped)

Sets whether DOCTYPE declarations should be skipped.

getRecognizeNamespaces

public final boolean getRecognizeNamespaces()

Returns the current namespace recognition setting.

setRecognizeNamespaces

public final boolean setRecognizeNamespaces(boolean recnsp)

Sets the status of namespace recognition. When set to true (the default), somens:sometag is split into a namespace part somens and a tag part sometag. When set to false, the : character is considered part of a (non-standard) tag somens:sometag. Returns the boolean value of the old setting.
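
The split described above can be sketched in isolation as follows. This is only an illustration of the lexical split, under the assumption that the first : separates the two parts; the class NamespaceSplitSketch and its split method are hypothetical and not part of XMLTokenizer.

```java
// Hypothetical stand-in, not part of hmi.xml.
final class NamespaceSplitSketch {

    /**
     * Splits a raw tag name into {namespacePart, tagPart}.
     * The namespace part is "" when there is no colon or when recognition is off.
     */
    static String[] split(String rawName, boolean recognizeNamespaces) {
        int colon = rawName.indexOf(':');
        if (!recognizeNamespaces || colon < 0) {
            // The colon, if any, stays part of the (non-standard) tag name.
            return new String[] {"", rawName};
        }
        return new String[] {rawName.substring(0, colon), rawName.substring(colon + 1)};
    }

    public static void main(String[] args) {
        String[] on = split("somens:sometag", true);
        String[] off = split("somens:sometag", false);
        System.out.println("[" + on[0] + "] [" + on[1] + "]");
        System.out.println("[" + off[0] + "] [" + off[1] + "]");
    }
}
```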

setSectionBuffering

private final void setSectionBuffering(boolean buffering)

clearSectionBuffer

private final void clearSectionBuffer()

getStrippedSectionBuffer

private final String getStrippedSectionBuffer()

getSectionBuffer

private final String getSectionBuffer()

atSTag

public final boolean atSTag()
                     throws IOException

Tests whether the scanner is positioned at an STAG or an OPENSTAG. (Note that the value of completeSTags determines whether the actual token is an STAG or an OPENSTAG; atOpenSTag() can be used to ensure that the token is an OPENSTAG rather than an STAG, if desired.)

Returns:: true if the scanner's token is STAG or OPENSTAG.
Throws:: IOException

atSTag

public final boolean atSTag(String tagName)
                     throws IOException

Tests whether the scanner is positioned at a start tag with the given name. The actual token can be STAG or OPENSTAG.

Parameters:: tagName - The element name to be tested
Returns:: true if the scanner's token is a start tag and the name in the tag is equal to tagName.
Throws:: IOException

atETag

public final boolean atETag()
                     throws IOException

Tests whether the scanner is positioned at an end tag.

Returns:: true if the scanner's token is ETAG.
Throws:: IOException

atETag

public final boolean atETag(String tName)
                     throws IOException

Tests whether the scanner is positioned at an end tag with the given name.

Parameters:: tName - The element name to be tested
Returns:: true if the scanner's token is an end tag and the name in the tag is equal to tName.
Throws:: IOException

atPI

public final boolean atPI()
                   throws IOException

Tests whether the scanner is positioned at a processing instruction.

Returns:: true if the scanner's token is PI.
Throws:: IOException

atComment

public final boolean atComment()
                        throws IOException

Tests whether the scanner is positioned at a comment.

Returns:: true if the scanner's token is COMMENT.
Throws:: IOException

atDoctype

public final boolean atDoctype()
                        throws IOException

Tests whether the scanner is positioned at a doctype declaration.

Returns:: true if the scanner's token is DOCTYPE.
Throws:: IOException

atCDSect

public final boolean atCDSect()
                       throws IOException

Tests whether the scanner is positioned at a CDATA section.

Returns:: true if the scanner's token is CDSECT.
Throws:: IOException

atDoctype

public final boolean atDoctype(String name)
                        throws IOException

Tests whether the scanner is positioned at a doctype declaration and whether the doctype name equals name.

Throws:: IOException

atCharData

public final boolean atCharData()
                         throws IOException

Tests whether the scanner is positioned at CHARDATA.

Returns:: true if the scanner's token is CHARDATA.
Throws:: IOException

atEndOfDocument

public final boolean atEndOfDocument()
                              throws IOException

Tests whether the scanner is positioned at the end of the document.

Returns:: true if the scanner's token is ENDOFDOCUMENT.
Throws:: IOException

getToken

public final int getToken()
                   throws IOException

Returns the current token without consuming it (unlike nextToken). If the current token was already consumed, nextToken is called first.

Returns:: The current token
Throws:: IOException
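
The "peek without consuming" behaviour of getToken can be pictured with the following minimal sketch. It is not the tokenizer's implementation: PeekableTokensSketch, its take method, and the use of String tokens are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in, not part of hmi.xml.
final class PeekableTokensSketch {
    private final Iterator<String> source;
    private String current;
    private boolean consumed = true; // nothing has been read yet

    PeekableTokensSketch(Iterator<String> source) {
        this.source = source;
    }

    /** Returns the current token without consuming it, advancing first if it was consumed. */
    String getToken() {
        if (consumed) {
            current = source.hasNext() ? source.next() : "ENDOFDOCUMENT";
            consumed = false;
        }
        return current;
    }

    /** Returns the current token and marks it as consumed. */
    String take() {
        String token = getToken();
        consumed = true;
        return token;
    }

    public static void main(String[] args) {
        PeekableTokensSketch tokens =
                new PeekableTokensSketch(Arrays.asList("STAG", "CHARDATA", "ETAG").iterator());
        System.out.println(tokens.getToken()); // peek
        System.out.println(tokens.getToken()); // still the same token
        System.out.println(tokens.take());     // consume it
        System.out.println(tokens.getToken()); // next token
    }
}
```

Repeated calls to getToken are idempotent; only a consuming call makes the next getToken advance.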

getTokenString

public final String getTokenString()
                            throws IOException

Returns the current token in String format without consuming it. If the current token was already consumed, nextToken is called first.

Returns:: The current token as String
Throws:: IOException

getTagName

public final String getTagName()
                        throws IOException

Returns the name of the current start tag.

Returns:: The name in the current start tag
Throws:: IOException

getNamespace

public final String getNamespace()
                          throws IOException

Returns the namespace of the current tag.

Returns:: The namespace of the current start tag
Throws:: IOException

getComment

public final String getComment()
                        throws IOException

Reads the current comment without advancing the scanner to the next token.

Returns:: The current comment
Throws:: IOException

getDoctypeName

public final String getDoctypeName()

Returns the doctype name, or null if not defined.

getPubidLiteral

public final String getPubidLiteral()

getSystemLiteral

public final String getSystemLiteral()

getPI

public final String getPI()
                   throws IOException

Reads the current PI without advancing the scanner to the next token.

Returns:: The current PI
Throws:: IOException

getCharData

public final String getCharData()
                         throws IOException

Reads the current CHARDATA.

Returns:: current CHARDATA.
Throws:: IOException

getTrimmedCharData

public final String getTrimmedCharData()
                                throws IOException

Reads the current CHARDATA, and trims away surrounding blank space.

Returns:: current CHARDATA.
Throws:: IOException

getCDSect

public final String getCDSect()
                       throws IOException

Reads the current CDATA.

Returns:: current CDATA.
Throws:: IOException

getAttributes

public final HashMap<String,String> getAttributes()
                                           throws IOException

Returns the attributes HashMap, which has properly defined name/value pairs iff the current token is either OPENSTAG or STAG.

Throws:: IOException

getAttribute

public final String getAttribute(String attributeName)
                          throws IOException

Throws:: IOException

getAttributeIterator

public final Iterator getAttributeIterator()
                                    throws IOException

Returns an Iterator for the attributes HashMap, which has properly defined name/value pairs iff the current token is either OPENSTAG or STAG. The Iterator elements are of type Map.Entry, with accessor methods getKey() and getValue().

Throws:: IOException
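
Walking such an Iterator of Map.Entry elements might look like the sketch below. It uses only java.util and builds its own HashMap instead of calling XMLTokenizer; the class AttributeIterationSketch and the method describe are hypothetical. Note that the documented method returns a raw Iterator, so real code would need a cast that the typed sketch avoids.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not part of hmi.xml.
final class AttributeIterationSketch {

    /** Renders each entry as name="value"; sorted, because HashMap order is unspecified. */
    static String describe(Iterator<Map.Entry<String, String>> it) {
        List<String> parts = new ArrayList<>();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            parts.add(entry.getKey() + "=\"" + entry.getValue() + "\"");
        }
        Collections.sort(parts);
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        HashMap<String, String> attributes = new HashMap<>();
        attributes.put("id", "a1");
        attributes.put("class", "x");
        System.out.println(describe(attributes.entrySet().iterator()));
    }
}
```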

takeSTag

public final String takeSTag()
                      throws IOException

Checks whether the current token is an STAG, and consumes it. When the current token is OPENSTAG, the remainder of the STAG will be read and consumed. The tag name of the STAG is returned.

Throws:: IOException

takeSTag

public final void takeSTag(String tagName)
                    throws IOException

Checks whether the current token is an STAG, and consumes it. When the current token is OPENSTAG, the remainder of the STAG will be read and consumed. Unlike takeSTag(), this variant takes the expected tag name as a parameter and returns nothing.

Throws:: IOException

takeETag

public final String takeETag()
                      throws IOException

Throws:: IOException

takeETag

public final void takeETag(String tagName)
                    throws IOException

Throws:: IOException

takeCharData

public final String takeCharData()
                          throws IOException

Throws:: IOException

takeTrimmedCharData

public final String takeTrimmedCharData()
                                 throws IOException

Throws:: IOException

takeCDSect

public final String takeCDSect()
                        throws IOException

Throws:: IOException

takePI

public final String takePI()
                    throws IOException

Throws:: IOException

takeComment

public final String takeComment()
                         throws IOException

Throws:: IOException

takeTextElement

public String takeTextElement(String tagName)
                       throws IOException

Expects an XML element of the form <tagName> value </tagName>, where value is PCData content. In particular, mixed content is not allowed. The (trimmed) text content is returned.

Throws:: IOException

takeIntElement

public int takeIntElement(String tagName)
                   throws IOException

Expects an XML element of the form <tagName> value </tagName>, where value encodes an integer value. The parsed integer is returned.

Throws:: IOException

takeLongElement

public long takeLongElement(String tagName)
                     throws IOException

Expects an XML element of the form <tagName> value </tagName>, where value encodes a long value. The parsed long is returned.

Throws:: IOException

takeFloatElement

public float takeFloatElement(String tagName)
                       throws IOException

Expects an XML element of the form <tagName> value </tagName>, where value encodes a float value. The parsed float is returned.

Throws:: IOException

takeDoubleElement

public double takeDoubleElement(String tagName)
                         throws IOException

Expects an XML element of the form <tagName> value </tagName>, where value encodes a double value. The parsed double is returned.

Throws:: IOException
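
The take...Element family above amounts to "read a text-only element, trim its content, then convert it". A rough stand-alone sketch of that idea follows. It is not how XMLTokenizer is implemented: TextElementSketch and its methods are hypothetical, a regular expression stands in for the tokenizer, entity references are not decoded, and the choice of IllegalArgumentException for malformed input is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not part of hmi.xml.
final class TextElementSketch {

    /** Returns the trimmed text of <tagName> value </tagName>; rejects mixed content. */
    static String textOf(String xml, String tagName) {
        String name = Pattern.quote(tagName);
        // [^<]* allows character data only, so nested elements make the match fail.
        Matcher m = Pattern
                .compile("\\s*<" + name + "\\s*>([^<]*)</" + name + "\\s*>\\s*")
                .matcher(xml);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "expected <" + tagName + "> value </" + tagName + ">");
        }
        return m.group(1).trim();
    }

    static int intOf(String xml, String tagName) {
        return Integer.parseInt(textOf(xml, tagName));
    }

    static double doubleOf(String xml, String tagName) {
        return Double.parseDouble(textOf(xml, tagName));
    }

    public static void main(String[] args) {
        System.out.println(intOf("<count> 42 </count>", "count"));
        System.out.println(doubleOf("<weight>2.5</weight>", "weight"));
    }
}
```

Trimming before conversion matters because Integer.parseInt rejects surrounding blank space.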

takeEmptyElement

public HashMap<String,String> takeEmptyElement(String tagName)
                                        throws IOException

Expects an XML element of the form <tagName attributes />. The attributes are returned in the form of a HashMap.

Throws:: IOException

read

public final int read()
               throws IOException

Reads one (unparsed) character from the input stream. This is a low-level method that should be used only in exceptional situations, for instance when embedded data that does not conform to XML standards has to be processed.

Throws:: IOException

takeString

public final String takeString(int len)
                        throws IOException

Reads unparsed characters from the input stream and returns them as a String. Exactly len characters will be read, including "blank space".

Throws:: IOException
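
Reading exactly len characters from a character stream usually requires a loop, because a single read call may return fewer characters than requested. The sketch below shows that loop on a plain java.io.Reader. ExactReadSketch and readExactly are hypothetical names, and throwing EOFException when the input ends early is an assumption; the behaviour of takeString in that case is not documented here.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in, not part of hmi.xml.
final class ExactReadSketch {

    /** Reads exactly len characters, blank space included, or fails with EOFException. */
    static String readExactly(Reader in, int len) throws IOException {
        char[] buffer = new char[len];
        int filled = 0;
        while (filled < len) {
            int n = in.read(buffer, filled, len - filled);
            if (n < 0) {
                throw new EOFException("input ended after " + filled + " of " + len + " characters");
            }
            filled += n;
        }
        return new String(buffer);
    }

    public static void main(String[] args) throws IOException {
        // Leading blanks are returned as-is; nothing is trimmed or parsed.
        System.out.println("[" + readExactly(new StringReader("  ab cd"), 4) + "]");
    }
}
```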

isNameChar

private boolean isNameChar()

isNamespaceSepChar

private boolean isNamespaceSepChar()

isNameStartChar

private boolean isNameStartChar()

isSpaceChar

private boolean isSpaceChar()

skipSpaceChars

private void skipSpaceChars()
                     throws IOException

Throws:: IOException

nextChar

private int nextChar()
              throws IOException

Throws:: IOException

nextParsedChar

public final void nextParsedChar()
                          throws IOException

Throws:: IOException

setTokenPos

private void setTokenPos()

clearBuffer

private void clearBuffer(StringBuilder b)

pushTag

private void pushTag(String tag,
                     int namespaceDeclarationCount)

topTag

private String topTag()

popTag

private void popTag(String tag)

checkEmptyTagStack

private void checkEmptyTagStack()

Checks whether the tagStack is empty. If not, an XMLScanException is thrown.

setpopOnEndOfDocument

public final void setpopOnEndOfDocument(boolean mode)

Sets the mode for popping the status stack when the ENDOFDOCUMENT token is encountered within the input stream. With auto popping enabled, End-Of-Document tokens are silently suppressed until the stack is empty and the ENDOFDOCUMENT is reached.

getpopOnEndOfDocument

public final boolean getpopOnEndOfDocument()

tokenString

public static String tokenString(int token)

Returns a String representation for each integer that represents a lexical token, such as the String "STAG" for the integer STAG constant. Useful for printing token names in console messages.
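
A mapping of this kind can be sketched as a simple switch. The integer values below are invented for the example and are not the values of the XMLTokenizer constants; TokenNamesSketch is a hypothetical class, and the fallback for unknown integers is likewise an assumption.

```java
// Hypothetical stand-in, not part of hmi.xml.
final class TokenNamesSketch {
    // Invented values; the real constants are defined by XMLTokenizer.
    static final int STAG = 1;
    static final int ETAG = 2;
    static final int CHARDATA = 3;
    static final int ENDOFDOCUMENT = 4;

    /** Maps a token constant to its name, for use in diagnostic messages. */
    static String tokenString(int token) {
        switch (token) {
            case STAG: return "STAG";
            case ETAG: return "ETAG";
            case CHARDATA: return "CHARDATA";
            case ENDOFDOCUMENT: return "ENDOFDOCUMENT";
            default: return "UNKNOWN(" + token + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println("expected " + tokenString(ETAG) + " but found " + tokenString(CHARDATA));
    }
}
```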

setDebug

public static void setDebug(boolean mode)

setDefaultModes

public final void setDefaultModes()

Sets the modes to their current default values.

initState

private void initState()

Starts the scanner on the current input.

pushState

public final void pushState()

popState

public final void popState()

showTokenizerStack

public final void showTokenizerStack()

showTokenizerStack

public final void showTokenizerStack(String message)

showTokenizerState

public final void showTokenizerState()

showTokenizerState

public final void showTokenizerState(String message)
