
Module nltk_lite.utilities

Classes
Counter A counter that auto-increments each time its value is read.
MinimalSet Find contexts where more than one possible target value can appear.
SortedDict A very rudimentary sorted dictionary, whose main purpose is to allow dictionaries to be displayed in a consistent order in regression tests.

Function Summary
  edit_dist(s1, s2)
Calculate the Levenshtein edit-distance between two strings.
  filestring(f)
  pr(data, start, end)
Pretty print a sequence of data items.
  print_string(s, width)
Pretty print a string, breaking lines on whitespace.
string re_show(regexp, string)
Search string for substrings matching regexp and wrap the matches with braces.
  _edit_dist_init(len1, len2)
  _edit_dist_step(lev, i, j, c1, c2)

Function Details

edit_dist(s1, s2)

Calculate the Levenshtein edit-distance between two strings. The edit distance is the number of characters that need to be substituted, inserted, or deleted, to transform s1 into s2. For example, transforming "rain" to "shine" requires three steps, consisting of two substitutions and one insertion: "rain" -> "sain" -> "shin" -> "shine". These operations could have been done in other orders, but at least three steps are needed.
Parameters:
s1, s2 - The strings to be analysed
           (type=string)
Returns:
The edit distance between s1 and s2.
           (type=int)
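
The description above can be illustrated with a minimal dynamic-programming sketch. This is not the module's source: the name edit_dist_sketch and the table layout are assumptions made for illustration, and the private helpers _edit_dist_init and _edit_dist_step may divide the work differently.

```python
def edit_dist_sketch(s1, s2):
    """Levenshtein distance between s1 and s2 (illustrative sketch only)."""
    len1, len2 = len(s1), len(s2)
    # lev[i][j] holds the distance between the prefixes s1[:i] and s2[:j].
    lev = [[0] * (len2 + 1) for _ in range(len1 + 1)]
    for i in range(len1 + 1):
        lev[i][0] = i          # delete all i characters of s1[:i]
    for j in range(len2 + 1):
        lev[0][j] = j          # insert all j characters of s2[:j]
    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            lev[i][j] = min(
                lev[i - 1][j] + 1,         # deletion
                lev[i][j - 1] + 1,         # insertion
                lev[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return lev[len1][len2]
```

With this sketch, edit_dist_sketch("rain", "shine") evaluates to 3, matching the example in the description.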

pr(data, start=0, end=None)

Pretty print a sequence of data items.
Parameters:
data - the data stream to print
           (type=sequence or iterator)
start - the start position
           (type=int)
end - the end position
           (type=int)
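
One plausible reading of these parameters is sketched below using only the standard library. The names select_items and pr_sketch are hypothetical, and treating end as exclusive (as in Python slicing) is an assumption that this page does not confirm.

```python
from itertools import islice
from pprint import pprint


def select_items(data, start=0, end=None):
    """Return the items from position start up to (not including) end.

    islice works on both sequences and iterators, which fits the
    documented parameter type "sequence or iterator".
    """
    return list(islice(data, start, end))


def pr_sketch(data, start=0, end=None):
    """Pretty print the selected items (illustrative sketch only)."""
    pprint(select_items(data, start, end))
```

For example, pr_sketch(iter(range(10)), 2, 5) would print the list [2, 3, 4] under these assumptions.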

print_string(s, width=70)

Pretty print a string, breaking lines on whitespace.
Parameters:
s - the string to print, consisting of words and spaces
           (type=string)
width - the display width
           (type=int)
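
A comparable effect can be sketched with the standard textwrap module. This is a stand-in technique, not the module's implementation: the name wrap_on_whitespace is hypothetical, and the handling of words longer than width (left unbroken here) is an assumption.

```python
import textwrap


def wrap_on_whitespace(s, width=70):
    """Break s into lines of at most width characters, splitting only on whitespace."""
    lines = textwrap.wrap(
        s,
        width=width,
        break_long_words=False,  # never split inside a word
        break_on_hyphens=False,  # treat hyphenated words as single units
    )
    return "\n".join(lines)
```

Calling print(wrap_on_whitespace(text, 40)) displays text in lines of at most 40 characters, provided no single word is longer than that.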

re_show(regexp, string)

Search string for substrings matching regexp and wrap the matches with braces. This is convenient for learning about regular expressions.
Parameters:
regexp - The regular expression.
string - The string being matched.
Returns:
A string with braces surrounding the matched substrings.
           (type=string)
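
The behaviour described above can be approximated with the standard re module. The function name re_show_sketch is hypothetical, and details such as regular expression flags or whitespace trimming in the real function are not stated on this page, so none are applied here.

```python
import re


def re_show_sketch(regexp, string):
    """Return string with every match of regexp wrapped in braces (illustrative sketch only)."""
    # A function replacement avoids any escaping issues in the replacement text.
    return re.sub(regexp, lambda m: "{" + m.group(0) + "}", string)
```

For instance, re_show_sketch(r"[aeiou]+", "rain shine") returns "r{ai}n sh{i}n{e}", which makes the extent of each match easy to see.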

Generated by Epydoc 2.1 on Tue Sep 5 09:37:21 2006 http://epydoc.sf.net