Class XMLParser

java.lang.Object
de.pdark.decentxml.XMLParser

public class XMLParser extends Object
The class uses the XMLTokenizer to parse an XMLSource into a Document.
  • Field Details

    • entityResolver

      private EntityResolver entityResolver
      The entity resolver used to expand entities in the input.
    • expandEntities

      private boolean expandEntities
      Whether entities should be expanded. Use this to temporarily disable entity expansion even if a resolver is registered.
    • treatEntitiesAsText

      private boolean treatEntitiesAsText
      Whether the parser treats entities as text (true) or returns entity nodes (false). Default is true.
    • charValidator

      private CharValidator charValidator
      The character validator to use.
  • Constructor Details

    • XMLParser

      public XMLParser()
  • Method Details

    • setEntityResolver

      public XMLParser setEntityResolver(EntityResolver entityResolver)
    • getEntityResolver

      public EntityResolver getEntityResolver()
    • setExpandEntities

      public XMLParser setExpandEntities(boolean expandEntities)
    • isExpandEntities

      public boolean isExpandEntities()
    • setTreatEntitiesAsText

      public XMLParser setTreatEntitiesAsText(boolean treatEntitiesAsText)
    • isTreatEntitiesAsText

      public boolean isTreatEntitiesAsText()
    • getCharValidator

      public CharValidator getCharValidator()
    • setCharValidator

      public XMLParser setCharValidator(CharValidator charValidator)
    • parse

      public Document parse(XMLSource source)
      Parse an XML source into a Document.
    • parseDocType

      protected DocType parseDocType(XMLTokenizer tokenizer)
    • createDTDTokenizer

      protected XMLTokenizer createDTDTokenizer(XMLSource source, int startOffset)
    • skipOptionalWhitespace

      protected Token skipOptionalWhitespace(XMLTokenizer tokenizer, Token startToken, DocType docType)
      If startToken is whitespace, skip it.
      Parameters:
      startToken - The token to check; it might be whitespace.
      Returns:
      The current token, or the next token if the current one was skipped.
    • parseDocTypeSubSet

      protected Token parseDocTypeSubSet(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • parseDocTypeNotation

      protected void parseDocTypeNotation(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • parseDocTypeEntity

      protected void parseDocTypeEntity(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • stripQuotes

      protected String stripQuotes(Token token)
    • parseDocTypeAttList

      protected void parseDocTypeAttList(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • isValidName

      protected boolean isValidName(XMLTokenizer tokenizer, String name)
    • parseAttListNameTokens

      protected Token parseAttListNameTokens(XMLTokenizer tokenizer, Token token, DocTypeAttributeList attList)
    • parseAttListTypeGroup

      protected Token parseAttListTypeGroup(XMLTokenizer tokenizer, Token token, DocTypeAttributeList attList)
    • skipWhiteSpaceAndComments

      protected Token skipWhiteSpaceAndComments(XMLTokenizer tokenizer, Token token, DocTypeNode n)
    • parseDocTypeSubElement

      protected void parseDocTypeSubElement(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • parsePublicLiteral

      protected Token parsePublicLiteral(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • parseSystemLiteral

      protected Token parseSystemLiteral(XMLTokenizer tokenizer, Token startToken, DocType docType)
    • expect

      protected Token expect(XMLTokenizer tokenizer, Token startToken, XMLTokenizer.Type[] expected, String errorMessage)
      Fetch the next token and make sure its type is one of the expected types. If it is not, create an XMLParseException using errorMessage.
    • expect

      protected Token expect(XMLTokenizer tokenizer, Token startToken, XMLTokenizer.Type expected, String errorMessage)
      Fetch the next token and make sure its type is the expected type. If it is not, create an XMLParseException using errorMessage.
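
      The contract of the two expect variants can be illustrated with a minimal, self-contained sketch. It does not use the decentxml classes: the Type enum, the Tok class, and the Iterator-based token stream are hypothetical stand-ins, and IllegalStateException stands in for XMLParseException. How the real method handles the end of input is not stated here, so that branch is an assumption.

```java
import java.util.Arrays;
import java.util.Iterator;

public class ExpectSketch {
    // Hypothetical token types; the real XMLTokenizer.Type defines its own constants.
    enum Type { TEXT, WHITESPACE, ELEMENT }

    // Hypothetical stand-in for a token: a type plus its source text.
    static final class Tok {
        final Type type;
        final String text;
        Tok(Type type, String text) { this.type = type; this.text = text; }
    }

    // Fetch the next token and verify that its type is one of 'expected'.
    static Tok expect(Iterator<Tok> tokens, Type[] expected, String errorMessage) {
        if (!tokens.hasNext()) {
            // Assumption: running out of input counts as a failed expectation.
            throw new IllegalStateException(errorMessage + " (end of input)");
        }
        Tok next = tokens.next();
        for (Type t : expected) {
            if (t == next.type) {
                return next;
            }
        }
        throw new IllegalStateException(errorMessage + " (found " + next.type + ")");
    }

    // Single-type variant, delegating to the array variant.
    static Tok expect(Iterator<Tok> tokens, Type expected, String errorMessage) {
        return expect(tokens, new Type[] { expected }, errorMessage);
    }

    public static void main(String[] args) {
        Iterator<Tok> tokens = Arrays.asList(
                new Tok(Type.WHITESPACE, " "),
                new Tok(Type.TEXT, "abc")).iterator();
        Tok ws = expect(tokens, Type.WHITESPACE, "Expected whitespace");
        Tok any = expect(tokens, new Type[] { Type.TEXT, Type.ELEMENT }, "Expected text or element");
        System.out.println(ws.type + " then " + any.type);
    }
}
```

      Delegating the single-type variant to the array variant keeps the error handling in one place, which is one plausible reason for offering both overloads.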
    • createTokenizer

      protected XMLTokenizer createTokenizer(XMLSource source)
    • parseElement

      protected void parseElement(XMLTokenizer tokenizer, Element parent)
      Parse all tokens up to the end tag recursively into an element.
    • parseElementContent

      protected Token parseElementContent(XMLTokenizer tokenizer, Element parent, Set<String> recursionTrap)
    • expandEntity

      protected void expandEntity(Element parent, XMLTokenizer parentTokenizer, Token entityToken, Set<String> recursionTrap)
    • toNode

      protected Node toNode(Token token)
      This turns a token into a node.

      Override this to implement custom node types.
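
      The override pattern might look like the following self-contained sketch. It uses hypothetical stand-in classes (Tok, Node, BaseParser) rather than the decentxml types, and it assumes that toNode dispatches on the token type to the create* factory methods listed below it; that dispatch is not confirmed by this page.

```java
public class ToNodeSketch {
    // Hypothetical token types and token class, not the decentxml ones.
    enum Type { TEXT, COMMENT }

    static final class Tok {
        final Type type;
        final String text;
        Tok(Type type, String text) { this.type = type; this.text = text; }
    }

    // Hypothetical stand-in for a document node.
    static class Node {
        final String text;
        Node(String text) { this.text = text; }
    }

    // A custom node type that stores its text without surrounding whitespace.
    static class TrimmedText extends Node {
        TrimmedText(String text) { super(text.trim()); }
    }

    // Stand-in parser: toNode dispatches to one factory method per token type (assumed design).
    static class BaseParser {
        protected Node toNode(Tok token) {
            switch (token.type) {
                case TEXT: return createText(token);
                case COMMENT: return createComment(token);
                default: throw new IllegalArgumentException("Unsupported token type: " + token.type);
            }
        }
        protected Node createText(Tok token) { return new Node(token.text); }
        protected Node createComment(Tok token) { return new Node(token.text); }
    }

    // Subclass that swaps in the custom node type for text tokens only.
    static class TrimmingParser extends BaseParser {
        @Override
        protected Node createText(Tok token) { return new TrimmedText(token.text); }
    }

    // Small helper so callers outside the parser hierarchy can trigger the conversion.
    static Node convert(BaseParser parser, Tok token) {
        return parser.toNode(token);
    }

    public static void main(String[] args) {
        Node n = convert(new TrimmingParser(), new Tok(Type.TEXT, "  hello  "));
        System.out.println(n.getClass().getSimpleName() + ": [" + n.text + "]");
    }
}
```

      Overriding a single factory method changes one node type and leaves the rest untouched; overriding toNode itself would be the choice when the mapping from token to node must change as a whole.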

    • createDocTypeText

      protected Node createDocTypeText(Token token)
    • createProcessingInstruction

      protected Node createProcessingInstruction(Token token)
    • createElementWhitespace

      protected Node createElementWhitespace(Token token)
    • createComment

      protected Node createComment(Token token)
    • createCData

      protected Node createCData(Token token)
    • createElement

      protected Node createElement(Token token)
    • createAttribute

      protected Node createAttribute(Token token)
    • createEntity

      protected Node createEntity(Token token)
    • createText

      protected Node createText(Token token)
    • parse

      public static Document parse(String xml)
      Convenience method to parse an XML String into a Document.

      In this case, the encoding declared in the XML declaration is ignored; the string must already be Unicode. After parsing, the encoding from the XML declaration (if there was one) is still available in the Document.
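
      Because the declared encoding is ignored for String input, any byte-to-character decoding has to happen before the call. The sketch below shows that step using only the Java standard library; the class name DecodeBeforeParse and the sample document are made up for illustration, and the actual XMLParser.parse call appears only as a comment because it needs decentxml on the classpath.

```java
import java.nio.charset.StandardCharsets;

public class DecodeBeforeParse {
    // Decode raw bytes with the charset the caller knows the data uses.
    // The parser will not do this for String input.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>J\u00fcrgen</name>";
        // Simulate bytes as they might arrive from a file or socket.
        byte[] raw = original.getBytes(StandardCharsets.ISO_8859_1);

        String xml = decode(raw);

        // With decentxml available, the decoded string would then be parsed:
        // Document doc = XMLParser.parse(xml);
        System.out.println(xml.equals(original));
    }
}
```

      Decoding with the wrong charset would silently corrupt non-ASCII characters before the parser ever sees them, so the charset passed to the decoder should match the actual byte encoding of the data.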

    • parse

      public static Document parse(File file) throws IOException
      Convenience method to parse an XML file into a Document.
      Throws:
      IOException