Package nltk_lite :: Package parse :: Module rd :: Class SteppingRecursiveDescent

Type SteppingRecursiveDescent

object --+            
         |            
    ParseI --+        
             |        
 AbstractParse --+    
                 |    
  RecursiveDescent --+
                     |
                    SteppingRecursiveDescent


A RecursiveDescent that allows you to step through the parsing process, performing a single operation at a time.

The initialize method is used to start parsing a text. expand expands the first element on the frontier using a single CFG production, and match matches the first element on the frontier against the next text token. backtrack undoes the most recent expand or match operation. step performs a single expand, match, or backtrack operation. parses returns the set of parses that have been found by the parser.
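The interplay of these operations can be illustrated with a small, self-contained toy. This is a sketch under stated assumptions, not the nltk_lite implementation: the names ToyStepper, subtree and replaced are hypothetical, the grammar is a plain dict from nonterminal to right-hand sides, trees are nested tuples whose first slot is the node label, and the grammar is assumed to have no empty right-hand sides and no left recursion.

```python
def subtree(tree, loc):
    """Return the node reached by following child indices from the root."""
    for i in loc:
        tree = tree[i + 1]              # slot 0 of each tuple is the label
    return tree

def replaced(tree, loc, new):
    """Return a copy of tree with the node at loc swapped for new."""
    if not loc:
        return new
    i = loc[0] + 1
    return tree[:i] + (replaced(tree[i], loc[1:], new),) + tree[i + 1:]

class ToyStepper:
    """Toy stepping recursive-descent parser (illustrative only)."""

    def __init__(self, grammar, start):
        self.grammar = grammar          # {nonterminal: [rhs tuple, ...]}
        self.start = start

    def initialize(self, tokens):
        self.rtext = tuple(tokens)      # text not yet covered by the tree
        self.tree = (self.start,)       # the unexpanded start symbol
        self.frontier = ((),)           # location of the root node
        self.history = []               # earlier (rtext, tree, frontier) states
        self.tried = {}                 # state -> operations already tried
        self.found = []                 # complete parses found so far

    def _advance(self, rtext, tree, frontier):
        # Save the old state for backtrack, then move to the new one.
        self.history.append((self.rtext, self.tree, self.frontier))
        self.rtext, self.tree, self.frontier = rtext, tree, frontier
        if not rtext and not frontier:
            self.found.append(tree)

    def match(self):
        if not self.frontier or not self.rtext:
            return None
        leaf = subtree(self.tree, self.frontier[0])
        tried = self.tried.setdefault((self.tree, self.frontier), set())
        if not isinstance(leaf, str) or "match" in tried:
            return None
        tried.add("match")
        if leaf != self.rtext[0]:
            return None
        self._advance(self.rtext[1:], self.tree, self.frontier[1:])
        return leaf

    def expand(self):
        if not self.frontier:
            return None
        loc = self.frontier[0]
        node = subtree(self.tree, loc)
        if isinstance(node, str):       # a leaf cannot be expanded
            return None
        tried = self.tried.setdefault((self.tree, self.frontier), set())
        for rhs in self.grammar[node[0]]:
            if rhs in tried:
                continue
            tried.add(rhs)
            children = tuple((s,) if s in self.grammar else s for s in rhs)
            new_tree = replaced(self.tree, loc, (node[0],) + children)
            new_frontier = tuple(loc + (i,) for i in range(len(rhs)))
            self._advance(self.rtext, new_tree, new_frontier + self.frontier[1:])
            return (node[0], rhs)
        return None

    def backtrack(self):
        if not self.history:
            return False
        self.rtext, self.tree, self.frontier = self.history.pop()
        return True

    def step(self):
        # Prefer an untried match, then an untried expansion, then undo.
        return self.match() or self.expand() or self.backtrack()

    def parses(self):
        return list(self.found)

grammar = {
    "S": [("NP", "VP")],
    "NP": [("dogs",), ("cats",)],
    "VP": [("V", "NP"), ("V",)],
    "V": [("chase",)],
}
parser = ToyStepper(grammar, "S")
parser.initialize(["dogs", "chase", "cats"])
while parser.step():                    # stops once nothing is left to try or undo
    pass
print(parser.parses())
```

Keying the record of tried operations on the whole state, rather than on a position in the text, is what lets backtrack return to an earlier state and continue with the next untried alternative instead of repeating the same one.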

See Also: nltk.cfg

Method Summary
  __init__(self, grammar, trace)
Create a new RecursiveDescent that uses grammar to parse texts.
boolean backtrack(self)
Return the parser to its state before the most recent match or expand operation.
boolean currently_complete(self)
Return whether the parser's current state represents a complete parse.
Production or None expand(self, production)
Expand the first element of the frontier.
list of Production expandable_productions(self)
Return a list of all the productions for which expansions are available for the current parser state.
list of tuple of int frontier(self)
Return a list of the tree locations of all subtrees that have not yet been expanded, and all leaves that have not yet been matched.
  get_parse_list(self, tokens)
  initialize(self, tokens)
Start parsing a given text.
String or None match(self)
Match the first element of the frontier.
list of Tree parses(self)
Return a list of the parses that have been found by this parser so far.
list of String remaining_text(self)
Return the portion of the text that is not yet covered by the tree.
  set_grammar(self, grammar)
Change the grammar used to parse texts.
Production or String or boolean step(self)
Perform a single parsing operation.
Tree tree(self)
Return a partial structure for the text that is currently being parsed.
list of Production untried_expandable_productions(self)
Return a list of all the untried productions for which expansions are available for the current parser state.
boolean untried_match(self)
Return whether the first element of the frontier is a token that has not yet been matched.
  _freeze(self, tree)
list of int _parse(self, remaining_text, tree, frontier)
A stub version of _parse that sets the parser's current state to the given arguments.
Inherited from RecursiveDescent: trace, _expand, _match, _production_to_tree, _trace_backtrack, _trace_expand, _trace_fringe, _trace_match, _trace_start, _trace_succeed, _trace_tree
Inherited from AbstractParse: get_parse, grammar, parse
Inherited from ParseI: get_parse_probs
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Instance Variable Summary
  _history: A list of (rtext, tree, frontier) triples, containing the previous states of the parser.
  _tried_e: A record of all productions that have been tried for a given tree.
  _tried_m: A record of what tokens have been matched for a given tree.

Method Details

__init__(self, grammar, trace=0)
(Constructor)

Create a new RecursiveDescent that uses grammar to parse texts.
Parameters:
grammar - The grammar used to parse texts.
           (type=Grammar)
trace - The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.
           (type=int)
Overrides:
nltk_lite.parse.rd.RecursiveDescent.__init__ (inherited documentation)

backtrack(self)

Return the parser to its state before the most recent match or expand operation. Calling backtrack repeatedly returns the parser to successively earlier states. If no match or expand operations have been performed, backtrack makes no changes.
Returns:
true if an operation was successfully undone.
           (type=boolean)

currently_complete(self)

Returns:
Whether the parser's current state represents a complete parse.
           (type=boolean)

expand(self, production=None)

Expand the first element of the frontier. In particular, if the first element of the frontier is a subtree whose node type is equal to production's left hand side, then add a child to that subtree for each element of production's right hand side. If production is not specified, then use the first untried expandable production. If all expandable productions have been tried, do nothing.
Returns:
The production used to expand the frontier, if an expansion was performed. If no expansion was performed, return None.
           (type=Production or None)

expandable_productions(self)

Returns:
A list of all the productions for which expansions are available for the current parser state.
           (type=list of Production)

frontier(self)

Returns:
A list of the tree locations of all subtrees that have not yet been expanded, and all leaves that have not yet been matched.
           (type=list of tuple of int)
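
A tree location is a sequence of child indices read from the root downward, so the empty tuple names the root itself. The following sketch shows how such a location can be resolved. It assumes trees stored as nested tuples with the node label in slot 0; the helper name subtree is hypothetical and not part of this module.

```python
def subtree(tree, location):
    """Follow a tuple of child indices down from the root."""
    for index in location:
        tree = tree[index + 1]      # slot 0 holds the node label
    return tree

# A partial tree: NP is expanded, VP is still waiting on the frontier.
tree = ("S", ("NP", "dogs"), ("VP",))
frontier = [(1,)]                   # location of the unexpanded VP node

root = subtree(tree, ())            # the empty location is the root
leaf = subtree(tree, (0, 0))        # first child of the first child
pending = subtree(tree, frontier[0])
```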

initialize(self, tokens)

Start parsing a given text. This sets the parser's tree to the start symbol, its frontier to the root node, and its remaining text to tokens.

match(self)

Match the first element of the frontier. In particular, if the first element of the frontier has the same type as the next text token, then substitute the text token into the tree.
Returns:
The token matched, if a match operation was performed. If no match was performed, return None.
           (type=String or None)

parses(self)

Returns:
A list of the parses that have been found by this parser so far.
           (type=list of Tree)

remaining_text(self)

Returns:
The portion of the text that is not yet covered by the tree.
           (type=list of String)

set_grammar(self, grammar)

Change the grammar used to parse texts.
Parameters:
grammar - The new grammar.
           (type=CFG)

step(self)

Perform a single parsing operation. If an untried match is possible, then perform the match, and return the matched token. If an untried expansion is possible, then perform the expansion, and return the production that it is based on. If backtracking is possible, then backtrack, and return 1. Otherwise, return 0.
Returns:
0 if no operation was performed; a token if a match was performed; a production if an expansion was performed; and 1 if a backtrack operation was performed.
           (type=Production or String or boolean)
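
Because step returns values of several kinds, a caller has to inspect the result to learn which operation ran. The sketch below shows one way to do that. It assumes tokens are plain strings, as the return type above states; classify_step and FakeProduction are hypothetical names used only for illustration, and treating None as "done" is a defensive assumption rather than documented behaviour.

```python
class FakeProduction:
    """Stand-in for a production object (hypothetical)."""

def classify_step(result):
    """Name the operation that a step()-style return value stands for."""
    if isinstance(result, str):
        return "match"                      # a matched token
    if result is None:
        return "done"                       # assumption: treat like 0
    if isinstance(result, int):             # also covers True and False
        return "backtrack" if result else "done"
    return "expand"                         # anything else: a production

labels = [classify_step(r) for r in ("dogs", FakeProduction(), 1, 0)]
```

Checking for a string before checking for an integer keeps the tests independent of how the production class defines truthiness or equality.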

tree(self)

Returns:
A partial structure for the text that is currently being parsed. The elements specified by the frontier have not yet been expanded or matched.
           (type=Tree)

untried_expandable_productions(self)

Returns:
A list of all the untried productions for which expansions are available for the current parser state.
           (type=list of Production)

untried_match(self)

Returns:
Whether the first element of the frontier is a token that has not yet been matched.
           (type=boolean)

_parse(self, remaining_text, tree, frontier)

A stub version of _parse that sets the parser's current state to the given arguments. In RecursiveDescent, the _parse method is used to recursively continue parsing a text. SteppingRecursiveDescent overrides it to capture these recursive calls. It records the parser's old state in the history (to allow for backtracking), and updates the parser's new state using the given arguments. Finally, it returns [1], which is used by match and expand to detect whether their operations were successful.
Returns:
[1]
           (type=list of int)
Overrides:
nltk_lite.parse.rd.RecursiveDescent._parse
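
The override pattern described above can be sketched with two toy classes. This is an illustration under simplifying assumptions, not the module's code: ToyRecursive and ToyStepping are hypothetical names, the "tree" is just the list of tokens matched so far, and the "frontier" is a list of expected tokens.

```python
class ToyRecursive:
    """Stand-in for a recursive parser: each operation recurses via _parse."""

    def _match(self, rtext, tree, frontier):
        # Consume one token if it equals the first frontier item.
        if rtext and frontier and rtext[0] == frontier[0]:
            return self._parse(rtext[1:], tree + [rtext[0]], frontier[1:])
        return []

    def _parse(self, rtext, tree, frontier):
        # Keep going until both the text and the frontier are used up.
        if not rtext and not frontier:
            return [tree]
        return self._match(rtext, tree, frontier)

class ToyStepping(ToyRecursive):
    """Overrides _parse so that every operation stops after one step."""

    def __init__(self, rtext, frontier):
        self._history = []
        self._rtext, self._tree, self._frontier = rtext, [], frontier

    def _parse(self, rtext, tree, frontier):
        # Record the old state, adopt the new one, and do not recurse.
        self._history.append((self._rtext, self._tree, self._frontier))
        self._rtext, self._tree, self._frontier = rtext, tree, frontier
        return [1]

full = ToyRecursive()._parse(["a", "b"], [], ["a", "b"])

stepper = ToyStepping(["a", "b"], ["a", "b"])
one = stepper._match(stepper._rtext, stepper._tree, stepper._frontier)
```

Here the inherited _match runs unchanged in both classes; only the target of its recursive call differs, which is why a non-empty return value is enough to tell the caller that the single operation succeeded.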

Instance Variable Details

_history

A list of (rtext, tree, frontier) triples, containing the previous states of the parser. This history is used to implement the backtrack operation.

_tried_e

A record of all productions that have been tried for a given tree. This record is used by expand to select the next untried production.
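
A per-tree record of this kind needs a hashable key, which is presumably the role of the private _freeze method listed in the summary. The sketch below shows the general idea only. It assumes trees held as nested lists and productions represented by plain strings; freeze, tried_e and next_untried are hypothetical names.

```python
def freeze(tree):
    """Turn a nested-list tree into nested tuples so it can be a dict key."""
    if isinstance(tree, list):
        return tuple(freeze(child) for child in tree)
    return tree

tried_e = {}    # frozen tree -> productions already tried for that tree

def next_untried(tree, candidates):
    """Return the first candidate not yet tried for this tree, or None."""
    tried = tried_e.setdefault(freeze(tree), [])
    for production in candidates:
        if production not in tried:
            tried.append(production)    # remember it before returning
            return production
    return None

candidates = ["NP -> dogs", "NP -> cats"]
first = next_untried(["S", ["NP"]], candidates)
second = next_untried(["S", ["NP"]], candidates)
third = next_untried(["S", ["NP"]], candidates)
```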

_tried_m

A record of what tokens have been matched for a given tree. This record is used by step to decide whether or not to match a token.

Generated by Epydoc 2.1 on Tue Sep 5 09:37:22 2006 http://epydoc.sf.net