java.lang.Object
  hmi.xml.XMLTokenizer
public class XMLTokenizer
A scanner of XML input streams.
An XML scanner enforces only the simple lexical well-formedness constraints of XML 1.0. An XML stream is a sequence of lexical tokens. These lexical tokens have an external (string) representation, and an internal representation. The recognized lexical tokens, and their external representations are:
A start tag immediately followed by the corresponding end tag can be represented
externally as an "empty tag" of the form:
CHARDATA, or "content" is considered to be "parsed character data", which means the following:
The regular expression that describes the possible streams of lexical tokens is: ( (Stag (AttrName AttrValue)*) | ETAG | CHARDATA | PI | DECL )* EndOfData
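As a rough illustration of the token stream described above, the following sketch splits a small XML fragment into STAG, ETAG and CHARDATA tokens, followed by an end marker. It is independent of XMLTokenizer: the class `TokenStreamSketch` and its `scan` method are hypothetical, and attributes, PI, DECL, comments and CDATA sections are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenStreamSketch {

    /** Splits a tiny XML fragment into "KIND:text" strings (hypothetical helper). */
    static List<String> scan(String xml) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < xml.length()) {
            if (xml.charAt(i) == '<') {
                int close = xml.indexOf('>', i);
                if (close < 0) {
                    throw new IllegalArgumentException("unterminated tag at position " + i);
                }
                String body = xml.substring(i + 1, close).trim();
                if (body.startsWith("/")) {
                    tokens.add("ETAG:" + body.substring(1).trim());
                } else if (body.endsWith("/")) {
                    // An empty tag such as <a/> is reported as a start tag
                    // immediately followed by the corresponding end tag.
                    String name = body.substring(0, body.length() - 1).trim();
                    tokens.add("STAG:" + name);
                    tokens.add("ETAG:" + name);
                } else {
                    tokens.add("STAG:" + body);
                }
                i = close + 1;
            } else {
                // Everything up to the next '<' is treated as character data.
                int next = xml.indexOf('<', i);
                if (next < 0) {
                    next = xml.length();
                }
                tokens.add("CHARDATA:" + xml.substring(i, next));
                i = next;
            }
        }
        tokens.add("ENDOFDOCUMENT");
        return tokens;
    }

    public static void main(String[] args) {
        // prints [STAG:a, CHARDATA:hi, STAG:b, ETAG:b, ETAG:a, ENDOFDOCUMENT]
        System.out.println(scan("<a>hi<b/></a>"));
    }
}
```

The sketch performs no well-formedness checks beyond detecting an unterminated tag; a real scanner would also verify that end tags match the open start tags.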
The scanner can work with two different interfaces:
| Nested Class Summary | |
|---|---|
private class |
XMLTokenizer.TokenizerState
TokenizerState objects are used to save and restore the current "state" of this XMLTokenizer on the stack. |
| Constructor Summary | |
|---|---|
XMLTokenizer()
Create an XMLTokenizer for a null Reader. |
|
XMLTokenizer(File inFile)
Create an XMLTokenizer for a (buffered) Reader constructed from the specified File. |
|
XMLTokenizer(InputStream in)
Create an XMLTokenizer for the specified InputStream. |
|
XMLTokenizer(Reader in)
Create an XMLTokenizer for the specified Reader. |
|
XMLTokenizer(String xmlString)
Like XMLTokenizer(Reader), with a Reader constructed from a StringReader for xmlString. |
|
XMLTokenizer(URL url)
Create an XMLTokenizer for a (buffered) Reader constructed from the specified URL. |
|
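The constructors above all reduce their input to a Reader. The sketch below shows one way a String or an InputStream can be wrapped in a buffered Reader using only `java.io`. It does not show XMLTokenizer internals: the class `ReaderSourcesSketch` and its helpers are hypothetical, and the UTF-8 charset is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReaderSourcesSketch {

    /** Wraps an in-memory String in a buffered Reader. */
    static BufferedReader fromString(String xml) {
        return new BufferedReader(new StringReader(xml));
    }

    /** Wraps a byte stream in a buffered Reader; UTF-8 is assumed here. */
    static BufferedReader fromStream(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Reads the first line, converting the checked IOException for brevity. */
    static String firstLine(BufferedReader reader) {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "<doc/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(fromString("<doc/>")));
        System.out.println(firstLine(fromStream(new ByteArrayInputStream(bytes))));
    }
}
```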
| Method Summary | |
|---|---|
boolean |
atCDSect()
tests whether the scanner is positioned at a CDATA section. |
boolean |
atCharData()
tests whether the scanner is positioned at CHARDATA |
boolean |
atComment()
tests whether the scanner is positioned at a comment. |
boolean |
atDoctype()
tests whether the scanner is positioned at a doctype declaration. |
boolean |
atDoctype(String name)
tests whether the scanner is positioned at a doctype declaration with the given name. |
boolean |
atEndOfDocument()
Tests whether the scanner is positioned at the end of the document. |
boolean |
atETag()
Tests whether the scanner is positioned at an end tag. |
boolean |
atETag(String tName)
Tests whether the scanner is positioned at an end tag with the given name. |
boolean |
atPI()
Tests whether the scanner is positioned at a processing instruction. |
boolean |
atSTag()
Tests whether the scanner is positioned at an STAG or an OPENSTAG. |
boolean |
atSTag(String tagName)
Tests whether the scanner is positioned at a start tag with the given name. |
private void |
attributePrefixFixup(String nsPrefix,
String ns)
|
private void |
checkEmptyTagStack()
checks whether the tagStack is empty. |
private void |
checkSequence(String seq)
|
private void |
clearBuffer(StringBuilder b)
|
private void |
clearSectionBuffer()
|
int |
currentToken()
|
String |
currentTokenString()
|
String |
getAttribute(String attributeName)
|
Iterator |
getAttributeIterator()
returns an Iterator for the attributes HashMap, which has properly defined name/value pairs iff the current token is either OpenSTag or STAG. |
HashMap<String,String> |
getAttributes()
returns the attributes HashMap, which has properly defined name/value pairs iff the current token is either OpenSTag or STAG. |
String |
getCDSect()
Reads the current CDATA. |
String |
getCharData()
Reads the current CHARDATA. |
int |
getCharPos()
returns the current character position within the current line. |
String |
getComment()
reads the current comment, without advancing the scanner to the next token. |
String |
getDoctypeName()
returns the doctype name, or null if not defined. |
String |
getErrorMessage(String message)
returns an error message String that contains the given message together with positional information, depending on the available information and the current settings. |
String |
getErrorMessage(String message,
int mode)
returns an error message String that contains the given message together with positional information, depending on the available information and the current settings. |
File |
getFile()
returns the current file, which could be null |
int |
getLine()
returns the current line number; line counts start at 1. |
String |
getNamespace()
Returns the namespace of the current tag. |
String |
getPI()
reads the current PI, without advancing the scanner to the next token. |
boolean |
getpopOnEndOfDocument()
|
String |
getPubidLiteral()
|
Reader |
getReader()
Gets the Reader that this tokenizer is currently using. |
boolean |
getRecognizeNamespaces()
returns the current status of namespace recognition. |
private String |
getSectionBuffer()
|
private String |
getStrippedSectionBuffer()
|
String |
getSystemLiteral()
|
String |
getTagName()
Returns the current start tag. |
int |
getToken()
returns the current token without consuming it (unlike nextToken). If the current token was consumed, nextToken is called first. |
int |
getTokenCharPos()
returns the character position of the start of the current token. |
int |
getTokenLine()
returns the line number of the start of the current token. |
String |
getTokenString()
returns the current token in String format, without consuming it. |
String |
getTrimmedCharData()
Reads the current CHARDATA, and trims away surrounding blank space. |
URL |
getURL()
returns the current URL, which could be null, for instance in the case of an XMLTokenizer constructed from a String. |
XMLScanException |
getXMLScanException(String message)
returns an XMLScanException, containing the message String, but also including positional information, depending on available information, and settings. |
XMLScanException |
getXMLScanException(String message,
int mode)
returns an XMLScanException, containing the message String, but also including positional information, depending on available information, and settings. |
String |
getXMLSection()
assuming that we are at an STag, gets the XML text until the corresponding closing ETag. |
String |
getXMLSectionContent()
assuming that we are just beyond an STag, specified by means of the tag parameter, gets the XML text until the corresponding closing ETag. |
private void |
initState()
Starts the scanner on the current input |
private boolean |
isNameChar()
|
private boolean |
isNamespaceSepChar()
|
private boolean |
isNameStartChar()
|
private boolean |
isSpaceChar()
|
private int |
nextChar()
|
void |
nextParsedChar()
|
private int |
nextToken()
Called to move the tokenizer to a next token. |
private void |
parseAttribute()
|
private int |
parseCDSect()
|
private int |
parseCharData()
|
private int |
parseComment()
|
private int |
parseDeclaration()
|
private int |
parseDoctype()
|
private int |
parseETag()
|
private int |
parseMarkup()
|
private int |
parsePI()
|
private int |
parseSTag()
|
private int |
parseString(StringBuilder buf)
|
void |
popReader()
assuming that a previous call to pushReader has been made, this call will restore the previous reader, and the state of the XMLTokenizer to the state when the pushReader call was made. |
void |
popState()
|
private void |
popTag(String tag)
|
void |
pushReader(BufferedReader in)
Pushes the current XMLTokenizer status on the stack, and then starts reading from the newly specified Reader. |
void |
pushReader(String urlSpec)
Pushes the current XMLTokenizer status on the stack, and then starts reading from the Reader specified in URL form. |
void |
pushState()
|
private void |
pushTag(String tag,
int namespaceDeclarationCount)
|
int |
read()
reads one (unparsed) character from the input stream. |
boolean |
recoverAfterETag(String etag)
discards input until an ETAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. |
boolean |
recoverAfterETag(String etag,
int tokenLimit)
discards input until an ETAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. |
boolean |
recoverAtSTag(String stag)
discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. |
boolean |
recoverAtSTag(String stag,
int tokenLimit)
discards input until an STAG with specified tag name is reached, the end of document is reached, or an IOException is thrown. |
void |
setBaseURL(String urlSpec)
like setURL, in that it defines the current URL, but unlike setURL, setBaseURL does not attempt to open a new Reader for the URL. |
void |
setBaseURL(URL url)
like setURL, in that it defines the current URL, but does not attempt to open a new Reader for the URL. |
static void |
setDebug(boolean mode)
|
void |
setDefaultModes()
sets the modes to their current default values. |
void |
setFile(File inFile)
Sets the current File, and opens a Reader for the File. |
void |
setpopOnEndOfDocument(boolean mode)
sets whether the status stack is popped when the ENDOFDOCUMENT token is encountered in the input stream. With auto popping enabled, End-Of-Document tokens are silently suppressed until the stack is empty and the final ENDOFDOCUMENT is reached. |
BufferedReader |
setReader(Reader in)
replaces the Reader that this tokenizer should process. |
boolean |
setRecognizeNamespaces(boolean recnsp)
Sets the status of namespace recognition. |
private void |
setSectionBuffering(boolean buffering)
|
boolean |
setSkipComment(boolean skipped)
Used to set if COMMENTS should be skipped. |
boolean |
setSkipDoctype(boolean skipped)
Used to set if DOCTYPE should be skipped. |
boolean |
setSkipPI(boolean skipped)
Used to set if PIDATA should be skipped. |
private void |
setTokenPos()
|
void |
setURL(String urlSpec)
starts reading from a new URL. The urlSpec String is regarded as relative to the current URL, unless it has the form of an absolute URL. |
void |
setURL(URL url)
sets the current URL, and opens a Reader for the new URL. |
void |
setXMLScanExceptionMode(int mode)
determines what to include in XMLScanExceptions, generated by getXMLScanException(). |
void |
showTokenizerStack()
|
void |
showTokenizerStack(String message)
|
void |
showTokenizerState()
|
void |
showTokenizerState(String message)
|
private void |
skipDoctype()
|
private void |
skipSpaceChars()
|
void |
skipTag()
assuming that we are at an STAG, skips the remainder of the current tag, up to and including the matching ETAG. |
String |
takeCDSect()
|
String |
takeCharData()
|
String |
takeComment()
|
double |
takeDoubleElement(String tagName)
expects an XML element of the form <tagName> value </tagName>, where value encodes a double value. |
HashMap<String,String> |
takeEmptyElement(String tagName)
expects an XML element of the form <tagName attributes />. The attributes are returned in the form of a HashMap. |
String |
takeETag()
|
void |
takeETag(String tagName)
|
float |
takeFloatElement(String tagName)
expects an XML element of the form <tagName> value </tagName>, where value encodes a float value. |
int |
takeIntElement(String tagName)
expects an XML element of the form <tagName> value </tagName>, where value encodes an integer value. |
long |
takeLongElement(String tagName)
expects an XML element of the form <tagName> value </tagName>, where value encodes a long value. |
String |
takePI()
|
String |
takeSTag()
checks whether the current token is an STAG, and consumes it. |
void |
takeSTag(String tagName)
checks whether the current token is an STAG, and consumes it. |
String |
takeString(int len)
reads unparsed characters from the input stream, and returns them as String. |
String |
takeTextElement(String tagName)
expects an XML element of the form <tagName> value </tagName>, where value is PCData content. |
String |
takeTrimmedCharData()
|
static String |
tokenString(int token)
returns a String representation for each integer that represents a lexical token, like the String "STAG" for the integer STAG constant etc. |
private String |
topTag()
|
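The summaries above suggest a naming convention: `at...` methods test the current token without consuming it, `get...` methods read it without advancing, and `take...` methods check and consume. The sketch below imitates that convention over a pre-scanned token list. It is not the XMLTokenizer implementation: the class `TakeSketch`, its "KIND:text" token strings and the private `take` helper are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class TakeSketch {
    // Pre-scanned tokens in "KIND:text" form; a stand-in for a real scanner.
    private final Deque<String> tokens;

    TakeSketch(String... toks) {
        tokens = new ArrayDeque<>(Arrays.asList(toks));
    }

    /** Peek: true if the current token is a start tag with this name; consumes nothing. */
    boolean atSTag(String name) {
        return ("STAG:" + name).equals(tokens.peekFirst());
    }

    /** Consumes the current token, failing if it is not of the expected kind or text. */
    private String take(String kind, String expectedText) {
        String tok = tokens.pollFirst();
        if (tok == null || !tok.startsWith(kind + ":")) {
            throw new IllegalStateException("expected " + kind + " but found " + tok);
        }
        String text = tok.substring(kind.length() + 1);
        if (expectedText != null && !expectedText.equals(text)) {
            throw new IllegalStateException("expected " + kind + " " + expectedText + " but found " + tok);
        }
        return text;
    }

    /** Consumes <tagName> value </tagName> and returns value parsed as an int. */
    int takeIntElement(String tagName) {
        take("STAG", tagName);
        int value = Integer.parseInt(take("CHARDATA", null).trim());
        take("ETAG", tagName);
        return value;
    }

    public static void main(String[] args) {
        TakeSketch s = new TakeSketch("STAG:count", "CHARDATA: 42 ", "ETAG:count");
        if (s.atSTag("count")) {
            System.out.println(s.takeIntElement("count")); // prints 42
        }
    }
}
```

Testing with `atSTag` before calling a `take...` method lets a caller branch on the upcoming element without committing to it.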
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final boolean SKIPDOCTYPE
public static final boolean SKIPCOMMENT
public static final boolean SKIPPI
public static final boolean RECOGNIZENAMESPACES
public static final boolean LOG
public static final int SECTIONBUFSIZE
public static final int DISCARDED_TOKEN_LIMIT
private static Logger logger
public static final int ERRORTOKENLINE
public static final int ERRORTOKENPOS
public static final int ERRORLINE
public static final int ERRORPOS
public static final int ERRORFILE
public static final int ERRORURL
public static final int ERRORFULL
public static final int ERRORFILELINE
public static final int NOERRORPOSITION
private int defaultScanExceptionMode
private static final int ERRORBUFSIZE
private static final int CDATABUFFERSIZE
private static final int DOCTYPEBUFFERSIZE
private static boolean debug
public static final int NULLTOKEN
public static final int STAG
public static final int ETAG
public static final int CHARDATA
public static final int CDSECT
public static final int COMMENT
public static final int PI
public static final int DECL
public static final int DOCTYPE
public static final int ENDOFDOCUMENT
public static final int ERRORTOKEN
private boolean skipDoctype
private boolean skipComment
private boolean skipPI
private boolean recognizeNamespaces
private BufferedReader in
private URL url
private File file
private int ci
private char ch
private int line
private int charPos
private int tokenLine
private int tokenCharPos
private int token
private StringBuilder sectionBuffer
private boolean sectionBuffering
private boolean tokenConsumed
private boolean tokenBuffered
private int tokenMode
private String tagName
private String tagPrefix
private String tagNamespace
private String defaultNamespace
private String attributeName
private String attributePrefix
private String doctypeName
private String pubidLiteral
private String systemLiteral
private static final int BUFSIZESMALL
private static final int BUFSIZELARGE
private StringBuilder tagNameBuffer
private StringBuilder attributeNameBuffer
private StringBuilder attributeValueBuffer
private StringBuilder charDataBuffer
private StringBuilder cDataBuffer
private StringBuilder piDataBuffer
private StringBuilder commentDataBuffer
private StringBuilder buf
private static final int TAGSTACKSIZE
private ArrayList<String> tagStack
private XMLNameSpaceStack namespaceStack
private ArrayList<Integer> xmlnsCountStack
private ArrayList<XMLTokenizer.TokenizerState> tokenizerStateStack
private boolean popOnEndOfDocument
private HashMap<String,String> attributes
private static final int CHARDATA_MODE
private static final int PENDING_ETAG_MODE
private static final int ENDOFDOCUMENT_MODE
public static final int EOS
public static final int CONSUMED
| Constructor Detail |
|---|
public XMLTokenizer(Reader in)
public XMLTokenizer(InputStream in)
public XMLTokenizer()
public XMLTokenizer(String xmlString)
public XMLTokenizer(File inFile)
throws FileNotFoundException
FileNotFoundException
public XMLTokenizer(URL url)
| Method Detail |
|---|
public void setFile(File inFile)
throws FileNotFoundException
FileNotFoundException
public final File getFile()
public void setBaseURL(URL url)
public void setBaseURL(String urlSpec)
public void setURL(URL url)
throws IOException
IOException
public void setURL(String urlSpec)
throws IOException
IOException
public URL getURL()
public final BufferedReader setReader(Reader in)
public final Reader getReader()
public final void pushReader(BufferedReader in)
public final void pushReader(String urlSpec)
throws IOException
IOException
public final void popReader()
public void setXMLScanExceptionMode(int mode)
public XMLScanException getXMLScanException(String message)
public XMLScanException getXMLScanException(String message,
int mode)
public String getErrorMessage(String message)
public String getErrorMessage(String message,
int mode)
public final int getLine()
public final int getCharPos()
public final int getTokenLine()
public final int getTokenCharPos()
public final int currentToken()
public final String currentTokenString()
public final boolean recoverAtSTag(String stag)
public final boolean recoverAtSTag(String stag,
int tokenLimit)
public final boolean recoverAfterETag(String etag)
public final boolean recoverAfterETag(String etag,
int tokenLimit)
public final void skipTag()
throws IOException
IOException
public final String getXMLSection()
throws IOException
IOException
public final String getXMLSectionContent()
throws IOException
IOException
private final int nextToken()
throws IOException,
XMLScanException
IOException
XMLScanException
private int parseCharData()
throws IOException
IOException
private int parseMarkup()
throws IOException
IOException
private int parseSTag()
throws IOException
IOException
private void attributePrefixFixup(String nsPrefix,
String ns)
private int parseETag()
throws IOException
IOException
private void parseAttribute()
throws IOException
IOException
private int parseString(StringBuilder buf)
throws IOException
IOException
private int parseDeclaration()
throws IOException
IOException
private int parseCDSect()
throws IOException
IOException
private int parseComment()
throws IOException
IOException
private void skipDoctype()
throws IOException
IOException
private int parseDoctype()
throws IOException
IOException
private void checkSequence(String seq)
throws IOException
IOException
private int parsePI()
throws IOException
IOException
public final boolean setSkipPI(boolean skipped)
public final boolean setSkipComment(boolean skipped)
public final boolean setSkipDoctype(boolean skipped)
public final boolean getRecognizeNamespaces()
public final boolean setRecognizeNamespaces(boolean recnsp)
private final void setSectionBuffering(boolean buffering)
private final void clearSectionBuffer()
private final String getStrippedSectionBuffer()
private final String getSectionBuffer()
public final boolean atSTag()
throws IOException
IOException
public final boolean atSTag(String tagName)
throws IOException
tagName - The element name to be tested
IOException
public final boolean atETag()
throws IOException
IOException
public final boolean atETag(String tName)
throws IOException
tName - The element name to be tested
IOException
public final boolean atPI()
throws IOException
IOException
public final boolean atComment()
throws IOException
IOException
public final boolean atDoctype()
throws IOException
IOException
public final boolean atCDSect()
throws IOException
IOException
public final boolean atDoctype(String name)
throws IOException
IOException
public final boolean atCharData()
throws IOException
IOException
public final boolean atEndOfDocument()
throws IOException
IOException
public final int getToken()
throws IOException
IOException
public final String getTokenString()
throws IOException
IOException
public final String getTagName()
throws IOException
IOException
public final String getNamespace()
throws IOException
IOException
public final String getComment()
throws IOException
IOException
public final String getDoctypeName()
public final String getPubidLiteral()
public final String getSystemLiteral()
public final String getPI()
throws IOException
IOException
public final String getCharData()
throws IOException
IOException
public final String getTrimmedCharData()
throws IOException
IOException
public final String getCDSect()
throws IOException
IOException
public final HashMap<String,String> getAttributes()
throws IOException
IOException
public final String getAttribute(String attributeName)
throws IOException
IOException
public final Iterator getAttributeIterator()
throws IOException
IOException
public final String takeSTag()
throws IOException
IOException
public final void takeSTag(String tagName)
throws IOException
IOException
public final String takeETag()
throws IOException
IOException
public final void takeETag(String tagName)
throws IOException
IOException
public final String takeCharData()
throws IOException
IOException
public final String takeTrimmedCharData()
throws IOException
IOException
public final String takeCDSect()
throws IOException
IOException
public final String takePI()
throws IOException
IOException
public final String takeComment()
throws IOException
IOException
public String takeTextElement(String tagName)
throws IOException
IOException
public int takeIntElement(String tagName)
throws IOException
IOException
public long takeLongElement(String tagName)
throws IOException
IOException
public float takeFloatElement(String tagName)
throws IOException
IOException
public double takeDoubleElement(String tagName)
throws IOException
IOException
public HashMap<String,String> takeEmptyElement(String tagName)
throws IOException
IOException
public final int read()
throws IOException
IOException
public final String takeString(int len)
throws IOException
IOException
private boolean isNameChar()
private boolean isNamespaceSepChar()
private boolean isNameStartChar()
private boolean isSpaceChar()
private void skipSpaceChars()
throws IOException
IOException
private int nextChar()
throws IOException
IOException
public final void nextParsedChar()
throws IOException
IOException
private void setTokenPos()
private void clearBuffer(StringBuilder b)
private void pushTag(String tag,
int namespaceDeclarationCount)
private String topTag()
private void popTag(String tag)
private void checkEmptyTagStack()
public final void setpopOnEndOfDocument(boolean mode)
public final boolean getpopOnEndOfDocument()
public static String tokenString(int token)
public static void setDebug(boolean mode)
public final void setDefaultModes()
private void initState()
public final void pushState()
public final void popState()
public final void showTokenizerStack()
public final void showTokenizerStack(String message)
public final void showTokenizerState()
public final void showTokenizerState(String message)
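The pushReader, popReader and setpopOnEndOfDocument descriptions can be pictured as a stack of input sources. The sketch below is a simplified model that saves only the Reader, whereas the documented methods also save and restore the tokenizer state. The class `ReaderStackSketch` and its members are hypothetical and are not part of the documented API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

public class ReaderStackSketch {
    private final Deque<Reader> saved = new ArrayDeque<>();
    private Reader current;
    private final boolean popOnEnd;

    ReaderStackSketch(Reader initial, boolean popOnEnd) {
        this.current = initial;
        this.popOnEnd = popOnEnd;
    }

    /** Saves the current reader and continues reading from the new one. */
    void pushReader(Reader in) {
        saved.push(current);
        current = in;
    }

    /** Restores the reader that was active when pushReader was called. */
    void popReader() {
        if (saved.isEmpty()) {
            throw new IllegalStateException("no saved reader");
        }
        current = saved.pop();
    }

    /** Returns the next character, or -1 once no further input is available. */
    int read() {
        try {
            int c = current.read();
            // With auto popping, the end of a pushed input is suppressed
            // and reading resumes from the previously saved reader.
            while (c == -1 && popOnEnd && !saved.isEmpty()) {
                popReader();
                c = current.read();
            }
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads everything that remains and returns it as a String. */
    static String drain(ReaderStackSketch s) {
        StringBuilder sb = new StringBuilder();
        for (int c = s.read(); c != -1; c = s.read()) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ReaderStackSketch s = new ReaderStackSketch(new StringReader("outer"), true);
        s.read(); // consume 'o' from the outer input
        s.pushReader(new StringReader("<inner/>"));
        System.out.println(drain(s)); // prints <inner/>uter
    }
}
```

With `popOnEnd` set to false, `read()` would return -1 at the end of the pushed input and the caller would have to call `popReader()` explicitly to continue with the outer input.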