Package Bio :: Module NetCatch :: Class ExtractUrls

Class ExtractUrls

ParserBase --+    
             |    
    SGMLParser --+
                 |
                ExtractUrls


Method Summary
  __init__(self)
  __str__(self)
  end_a(self)
  extract_urls(self, handle)
  feed(self, handle)
Feed in data for scanning.
  handle_data(self, data)
  reset(self)
Reset this instance.
  start_a(self, attrs)
    Inherited from SGMLParser
  close(self)
Handle the remaining data.
  convert_charref(self, name)
Convert character reference, may be overridden.
  convert_codepoint(self, codepoint)
  convert_entityref(self, name)
Convert entity references.
  error(self, message)
  finish_endtag(self, tag)
  finish_shorttag(self, tag, data)
  finish_starttag(self, tag, attrs)
  get_starttag_text(self)
  goahead(self, end)
  handle_charref(self, name)
Handle character reference, no need to override.
  handle_comment(self, data)
  handle_decl(self, decl)
  handle_endtag(self, tag, method)
  handle_entityref(self, name)
Handle entity references, no need to override.
  handle_pi(self, data)
  handle_starttag(self, tag, method, attrs)
  parse_endtag(self, i)
  parse_pi(self, i)
  parse_starttag(self, i)
  report_unbalanced(self, tag)
  setliteral(self, *args)
Enter literal mode (CDATA).
  setnomoretags(self)
Enter literal mode (CDATA) till EOF.
  unknown_charref(self, ref)
  unknown_endtag(self, tag)
  unknown_entityref(self, ref)
  unknown_starttag(self, tag, attrs)
    Inherited from ParserBase
  getpos(self)
Return current line number and offset.
  parse_comment(self, i, report)
  parse_declaration(self, i)
  parse_marked_section(self, i, report)
  unknown_decl(self, data)
  updatepos(self, i, j)

Class Variable Summary
    Inherited from SGMLParser
SRE_Pattern entity_or_charref = &(?:([a-zA-Z][-\.a-zA-Z0-9]*)|#([0-9...
dict entitydefs = {'amp': '&', 'lt': '<', 'gt': '>', 'apos': ...

Method Details

feed(self, handle)

Feed in data for scanning. The handle argument is a file-like object containing HTML, not a string as in the overridden method.
Overrides:
sgmllib.SGMLParser.feed
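
The override described above changes what feed accepts: a file-like handle rather than a string. This page does not show the implementation, so the following is only a minimal sketch of that pattern. It assumes Python 3, where sgmllib is no longer available, and substitutes html.parser.HTMLParser as the base class. The class name UrlExtractorSketch, the urls attribute, and the handle_starttag logic are hypothetical and are not taken from Bio.NetCatch.

```python
import io
from html.parser import HTMLParser


class UrlExtractorSketch(HTMLParser):
    """Hypothetical stand-in for ExtractUrls, built on html.parser."""

    def reset(self):
        # Reset this instance; unprocessed data and collected URLs are lost.
        # HTMLParser.__init__ calls reset(), so urls exists after construction.
        super().reset()
        self.urls = []

    def feed(self, handle):
        # Unlike the base class, accept a file-like object, not a string.
        super().feed(handle.read())

    def handle_starttag(self, tag, attrs):
        # Assumption: only the href attribute of <a> tags counts as a URL.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)

    def extract_urls(self, handle):
        # Start clean, scan the whole handle, then flush remaining data.
        self.reset()
        self.feed(handle)
        self.close()
        return self.urls


page = io.StringIO(
    '<html><body><a href="http://example.org/a">A</a> '
    '<a name="x">no href</a> <a href="/b">B</a></body></html>'
)
found = UrlExtractorSketch().extract_urls(page)
print(found)
```

Any object with a read() method that returns text can serve as the handle in this sketch, for example an open file or an io.StringIO wrapper around a downloaded page.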

reset(self)

Reset this instance. Loses all unprocessed data.
Overrides:
sgmllib.SGMLParser.reset (inherited documentation)

Generated by Epydoc 2.1 on Wed Jan 31 09:59:40 2007 http://epydoc.sf.net