Package Martel :: Module Iterator
Iterate over records of an XML parse tree.
The standard parser is callback-based and runs over all the elements of a file. If the file contains records, many people would like to iterate over the records and use the callback parser only to analyze each record.
If the expression is a 'ParseRecords', then the code to do this is easy: use its make_reader to grab records and its record_expression to parse them. However, this isn't general enough. The use of a ParseRecords in the format definition should be strictly an implementation decision made for better memory use. So there needs to be an API which allows both full and record-oriented parsers.
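The record-oriented idea can be sketched without Martel at all. The following is only an illustration under stated assumptions: `read_records`, `parse_record` and `collect_ac` are hypothetical names that do not come from Martel, the sample text is made up, and the "//" terminator line is assumed as the record boundary. The caller controls the loop over records, and only the per-record step is callback-based.

```python
# Hypothetical helpers for illustration only; none of these names come from Martel.
def read_records(text, terminator="//"):
    """Yield one record at a time; each record ends with a terminator line."""
    lines = []
    for line in text.splitlines():
        lines.append(line)
        if line.strip() == terminator:
            yield "\n".join(lines)
            lines = []
    if lines:
        # Trailing text without a terminator is yielded as a final record.
        yield "\n".join(lines)


def parse_record(record, callback, terminator="//"):
    """Callback-style parse of a single record: report (code, value) pairs."""
    for line in record.splitlines():
        if line.strip() == terminator:
            continue
        code, _, value = line.partition(" ")
        callback(code, value.strip())


# Made-up sample data with two records.
sample = "ID   ABC_HUMAN\nAC   P12345;\n//\nID   XYZ_MOUSE\nAC   Q67890;\n//\n"

accessions = []


def collect_ac(code, value):
    # Keep only the accession lines.
    if code == "AC":
        accessions.append(value)


for record in read_records(sample):
    # The loop is under the caller's control; the callback sees one record at a time.
    parse_record(record, collect_ac)
```

Because `read_records` is a generator, only one record needs to be held at a time, which is the memory benefit the paragraph above attributes to record-based parsing.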
Here's an example use of the API:

    >>> import sys
    >>> import swissprot38  # one is in Martel/test/testformats
    >>> from xml.dom import pulldom
    >>> iterator = swissprot38.format.make_iterator("swissprot38_record")
    >>> text = open("sample.swissprot").read()
    >>> for record in iterator.iterateString(text, pulldom.SAX2DOM()):
    ...     print "Read a record with the following AC numbers:"
    ...     for acc in record.document.getElementsByTagName("ac_number"):
    ...         acc.writexml(sys.stdout)
    ...         sys.stdout.write(" ")
    ...
There are two parts to the API. One is the EventStream. This contains a single method called "next()" which returns a list of SAX events, each a 2-tuple (event_name, args). It is called multiple times to return successive event lists and returns None when no events are available.
The other is the Iterator.
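The next() protocol described for the EventStream can be sketched with a small stand-in class. This is a minimal sketch, not Martel's implementation: `ListEventStream` and `drain` are hypothetical names, and the exact shape of `args` for each event is an assumption made for the example.

```python
# Hypothetical stand-in for illustration; this is not Martel's EventStream class.
class ListEventStream:
    """Hand out pre-built batches of SAX events through a next() method."""

    def __init__(self, batches):
        # Each batch is a list of (event_name, args) 2-tuples.
        self._batches = list(batches)
        self._pos = 0

    def next(self):
        # Return the next list of events, or None when no events are available.
        if self._pos >= len(self._batches):
            return None
        batch = self._batches[self._pos]
        self._pos += 1
        return batch


def drain(stream):
    """Call next() until it returns None, collecting every event in order."""
    events = []
    while True:
        batch = stream.next()
        if batch is None:
            break
        events.extend(batch)
    return events


# The args layout below is assumed for the example, not taken from Martel.
stream = ListEventStream([
    [("startElement", ("ac_number", {})), ("characters", ("P12345",))],
    [("endElement", ("ac_number",))],
])
all_events = drain(stream)
```

A consumer only needs the loop in `drain`: keep calling next() and stop at the first None.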
Sean McGrath has a RAX parser (Record API for XML) which uses a concept similar to this.

Classes |
---|
EventStream |
HeaderFooterEventStream |
Iterate |
Iterator |
IteratorHeaderFooter |
IteratorRecords |
RecordEventStream |
StoreEvents |

Function Summary |
---|
_get_next_text(reader) |
Generated by Epydoc 2.1 on Thu Jun 30 22:06:02 2005 | http://epydoc.sf.net |