
Module Bio.expressions.genbank

Martel based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the 'regular expressions on steroids' capabilities of Martel.

Documentation for GenBank format that I found:

o GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

o There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Function Summary
  define_block(identifier, block_tag, block_data, std_block_tag, std_tag)
Define a Martel grouping which can parse a block of text.

Variable Summary
Group accession = <Martel.Expression.Group instance at 0x2aaaa...
Group accession_block = <Martel.Expression.Group instance at 0...
Group authors_block = <Martel.Expression.Group instance at 0x2...
Group base_count = <Martel.Expression.Group instance at 0x2aaa...
Group base_count_line = <Martel.Expression.Group instance at 0...
Group base_number = <Martel.Expression.Group instance at 0x2aa...
Str big_indent_space = <Martel.Expression.Str instance at 0x...
MaxRepeat blank_space = <Martel.Expression.MaxRepeat instance at 0...
Group comment_block = <Martel.Expression.Group instance at 0x2...
Group consrtm_block = <Martel.Expression.Group instance at 0x2...
Group contig_block = <Martel.Expression.Group instance at 0x2a...
Group contig_location = <Martel.Expression.Group instance at 0...
Group data_file_division = <Martel.Expression.Group instance a...
Group date = <Martel.Expression.Group instance at 0x2aaaae674e...
Group db_source_block = <Martel.Expression.Group instance at 0...
Group definition_block = <Martel.Expression.Group instance at ...
list divisions = [<Martel.Expression.Str instance at 0x2aaaae...
Group feature = <Martel.Expression.Group instance at 0x2aaaae6...
Group feature_block = <Martel.Expression.Group instance at 0x2...
Group feature_key = <Martel.Expression.Group instance at 0x2aa...
int FEATURE_KEY_INDENT = 5                                                                     
Group feature_key_line = <Martel.Expression.Group instance at ...
int FEATURE_QUALIFIER_INDENT = 21                                                                    
Group features_line = <Martel.Expression.Group instance at 0x2...
ParseRecords format = <Martel.Expression.ParseRecords instance at 0x2...
Group gi = <Martel.Expression.Group instance at 0x2aaaae67ecb0...
Seq header = <Martel.Expression.Seq instance at 0x2aaaae6a5f...
int INDENT = 12                                                                    
Group journal_block = <Martel.Expression.Group instance at 0x2...
Group keywords_block = <Martel.Expression.Group instance at 0x...
Group location = <Martel.Expression.Group instance at 0x2aaaae...
Group locus = <Martel.Expression.Group instance at 0x2aaaae674...
Group locus_line = <Martel.Expression.Group instance at 0x2aaa...
Group medline_line = <Martel.Expression.Group instance at 0x2a...
HeaderFooter ncbi_format = <Martel.Expression.HeaderFooter instance a...
Group nid = <Martel.Expression.Group instance at 0x2aaaae67e68...
Group nid_line = <Martel.Expression.Group instance at 0x2aaaae...
Group organism = <Martel.Expression.Group instance at 0x2aaaae...
Group organism_block = <Martel.Expression.Group instance at 0x...
Group origin_line = <Martel.Expression.Group instance at 0x2aa...
Group pid = <Martel.Expression.Group instance at 0x2aaaae67e90...
Group pid_line = <Martel.Expression.Group instance at 0x2aaaae...
Group primary = <Martel.Expression.Group instance at 0x2aaaae6...
Group primary_line = <Martel.Expression.Group instance at 0x2a...
Group primary_ref_line = <Martel.Expression.Group instance at ...
Group pubmed_line = <Martel.Expression.Group instance at 0x2aa...
Group qualifier = <Martel.Expression.Group instance at 0x2aaaa...
Alt qualifier_space = <Martel.Expression.Alt instance at 0x2...
Str quote = <Martel.Expression.Str instance at 0x2aaaae6a051...
Group quoted_chars = <Martel.Expression.Group instance at 0x2a...
Seq quoted_string = <Martel.Expression.Seq instance at 0x2aa...
Group record = <Martel.Expression.Group instance at 0x2aaaae6a...
Group record_end = <Martel.Expression.Group instance at 0x2aaa...
Group reference = <Martel.Expression.Group instance at 0x2aaaa...
Group reference_bases = <Martel.Expression.Group instance at 0...
Group reference_line = <Martel.Expression.Group instance at 0x...
Group reference_num = <Martel.Expression.Group instance at 0x2...
Group remark_block = <Martel.Expression.Group instance at 0x2a...
list residue_prefixes = [<Martel.Expression.Str instance at 0...
Group residue_type = <Martel.Expression.Group instance at 0x2a...
list residue_types = [<Martel.Expression.Str instance at 0x2a...
Group segment = <Martel.Expression.Group instance at 0x2aaaae6...
Group segment_line = <Martel.Expression.Group instance at 0x2a...
Group sequence = <Martel.Expression.Group instance at 0x2aaaae...
Group sequence_entry = <Martel.Expression.Group instance at 0x...
Group sequence_line = <Martel.Expression.Group instance at 0x2...
Group sequence_plus_spaces = <Martel.Expression.Group instance...
Group size = <Martel.Expression.Group instance at 0x2aaaae6745...
Str small_indent_space = <Martel.Expression.Str instance at ...
Group source_block = <Martel.Expression.Group instance at 0x2a...
Group taxonomy = <Martel.Expression.Group instance at 0x2aaaae...
Group title_block = <Martel.Expression.Group instance at 0x2aa...
Seq unquoted_string = <Martel.Expression.Seq instance at 0x2...
list valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'P...
list valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
list valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rR...
Group version = <Martel.Expression.Group instance at 0x2aaaae6...
Group version_line = <Martel.Expression.Group instance at 0x2a...

Function Details

define_block(identifier, block_tag, block_data, std_block_tag=None, std_tag=None)

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

Where blah blah blah can wrap for multiple lines. This function makes it easy to define these blocks consistently.

Arguments:

o identifier - The identifier that begins the block (like DEFINITION).

o block_tag - A callback tag for the entire block.

o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).

o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.

o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
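The block layout described above (an identifier, then data that may wrap onto indented continuation lines) can be sketched with the standard library's re module. This is only an illustration of the idea, not the Martel implementation: make_block_pattern and parse_block are hypothetical helpers, the sample record is made up, and the sketch assumes continuation lines are indented by INDENT (12) spaces.

```python
import re

INDENT = 12  # assumed continuation-line indent, mirroring the module's INDENT value


def make_block_pattern(identifier):
    """Build a regex for 'IDENTIFIER data' followed by indented continuation lines.

    Hypothetical helper; not part of Bio.expressions.genbank.
    """
    first_line = r"^%s +(?P<first>[^\n]*)\n" % re.escape(identifier)
    continuation = r"(?P<rest>(?:^ {%d}[^\n]*\n)*)" % INDENT
    return re.compile(first_line + continuation, re.MULTILINE)


def parse_block(identifier, text):
    """Return the block's data joined into one string, or None if the block is absent."""
    match = make_block_pattern(identifier).search(text)
    if match is None:
        return None
    parts = [match.group("first")]
    parts.extend(match.group("rest").splitlines())
    return " ".join(part.strip() for part in parts if part.strip())


# Made-up record fragment, for illustration only.
record = (
    "DEFINITION  Homo sapiens example gene,\n"
    + " " * INDENT + "complete cds.\n"
    + "ACCESSION   X00000\n"
)
definition = parse_block("DEFINITION", record)
accession = parse_block("ACCESSION", record)
```

Here the DEFINITION data spans two lines and is joined into a single string, while ACCESSION has no continuation lines. In the real module, define_block also attaches the callback tags and the optional Bio.Std tags, which this sketch does not attempt to model.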

Variable Details

accession

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67dcb0>                   

accession_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67e560>                   

authors_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69b950>                   

base_count

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a13b0>                   

base_count_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a14d0>                   

base_number

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1998>                   

big_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaae674248>                     

blank_space

Type:
MaxRepeat
Value:
<Martel.Expression.MaxRepeat instance at 0x2aaaae674290>               

comment_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69e098>                   

consrtm_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69bf38>                   

contig_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a3710>                   

contig_location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a3320>                   

data_file_division

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67d518>                   

date

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae674ea8>                   

db_source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae680680>                   

definition_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67e170>                   

divisions

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaae674e18>,
 <Martel.Expression.Str instance at 0x2aaaae674f80>,
 <Martel.Expression.Str instance at 0x2aaaae674fc8>,
 <Martel.Expression.Str instance at 0x2aaaae67d050>,
 <Martel.Expression.Str instance at 0x2aaaae67d098>,
 <Martel.Expression.Str instance at 0x2aaaae67d0e0>,
 <Martel.Expression.Str instance at 0x2aaaae67d128>,
 <Martel.Expression.Str instance at 0x2aaaae67d170>,
...                                                                    

feature

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a11b8>                   

feature_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1290>                   

feature_key

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69ed88>                   

FEATURE_KEY_INDENT

Type:
int
Value:
5                                                                     

feature_key_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a0440>                   

FEATURE_QUALIFIER_INDENT

Type:
int
Value:
21                                                                    
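The two feature indent values correspond to the column layout of the GenBank feature table, where feature keys start after 5 spaces and qualifiers after 21. A minimal sketch of how such indents can tell the two kinds of line apart is given below. It uses plain string handling rather than Martel, classify_feature_line is a hypothetical helper, and the sample lines are made up.

```python
FEATURE_KEY_INDENT = 5         # same value as the module constant
FEATURE_QUALIFIER_INDENT = 21  # same value as the module constant


def classify_feature_line(line):
    """Classify a feature-table line by its leading indent.

    Hypothetical helper for illustration. Lines at the qualifier indent may
    also be wrapped locations or wrapped qualifier values; this sketch does
    not distinguish those cases.
    """
    indent = len(line) - len(line.lstrip(" "))
    if indent == FEATURE_KEY_INDENT:
        key, _, location = line.strip().partition(" ")
        return ("key", key, location.strip())
    if indent == FEATURE_QUALIFIER_INDENT:
        return ("qualifier", line.strip())
    return ("other", line.strip())


# Made-up lines: the key is padded so the location starts at column 22.
key_line = " " * FEATURE_KEY_INDENT + "CDS".ljust(16) + "1..206"
qualifier_line = " " * FEATURE_QUALIFIER_INDENT + '/gene="example"'

key_result = classify_feature_line(key_line)
qualifier_result = classify_feature_line(qualifier_line)
```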

features_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69ec20>                   

format

Type:
ParseRecords
Value:
<Martel.Expression.ParseRecords instance at 0x2aaaae6a2050>            

gi

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67ecb0>                   

header

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaae6a5fc8>                     

INDENT

Type:
int
Value:
12                                                                    

journal_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69cb48>                   

keywords_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae680c68>                   

location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69efc8>                   

locus

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6743f8>                   

locus_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67d5f0>                   

medline_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69cbd8>                   

ncbi_format

Type:
HeaderFooter
Value:
<Martel.Expression.HeaderFooter instance at 0x2aaaae6a3ea8>            

nid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67e680>                   

nid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67e7a0>                   

organism

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69a9e0>                   

organism_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69add0>                   

origin_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1878>                   

pid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67e908>                   

pid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67ea28>                   

primary

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69eab8>                   

primary_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69e440>                   

primary_ref_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69ea70>                   

pubmed_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69d050>                   

qualifier

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1098>                   

qualifier_space

Type:
Alt
Value:
<Martel.Expression.Alt instance at 0x2aaaae674320>                     

quote

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaae6a0518>                     

quoted_chars

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a05f0>                   

quoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaae6a0998>                     

record

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a3d88>                   

record_end

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a3908>                   

reference

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69d9e0>                   

reference_bases

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69afc8>                   

reference_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69b2d8>                   

reference_num

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69aef0>                   

remark_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69d830>                   

residue_prefixes

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaae6745a8>,
 <Martel.Expression.Str instance at 0x2aaaae674710>,
 <Martel.Expression.Str instance at 0x2aaaae674758>]                   

residue_type

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae674b90>                   

residue_types

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaae6747a0>,
 <Martel.Expression.Str instance at 0x2aaaae6747e8>,
 <Martel.Expression.Str instance at 0x2aaaae674830>,
 <Martel.Expression.Str instance at 0x2aaaae674878>,
 <Martel.Expression.Str instance at 0x2aaaae6748c0>,
 <Martel.Expression.Str instance at 0x2aaaae674908>,
 <Martel.Expression.Str instance at 0x2aaaae674950>,
 <Martel.Expression.Str instance at 0x2aaaae674998>,
...                                                                    

segment

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae680dd0>                   

segment_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69a248>                   

sequence

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1908>                   

sequence_entry

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a3128>                   

sequence_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1ef0>                   

sequence_plus_spaces

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae6a1dd0>                   

size

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae674560>                   

small_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaae674200>                     

source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69a908>                   

taxonomy

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69a5f0>                   

title_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae69c560>                   

unquoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaae6a0b90>                     

valid_divisions

Type:
list
Value:
['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']        

valid_residue_prefixes

Type:
list
Value:
['ss-', 'ds-', 'ms-']                                                  

valid_residue_types

Type:
list
Value:
['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA', 'scRNA', 'snRNA', 'snoRNA']
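As a rough illustration of how the valid_residue_prefixes and valid_residue_types lists might be turned into a matcher, the sketch below builds an alternation with the standard re module. It assumes a residue type is an optional prefix followed by one type (as in "ss-RNA"). That assumption, and the helper names alternation and is_residue_type, are not taken from the module itself.

```python
import re

valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA',
                       'scRNA', 'snRNA', 'snoRNA']


def alternation(options):
    """Return a non-capturing regex group matching any one of the options."""
    return "(?:%s)" % "|".join(re.escape(option) for option in options)


# Assumed structure: optional strandedness prefix, then exactly one residue type.
residue_type_re = re.compile(
    alternation(valid_residue_prefixes) + "?" + alternation(valid_residue_types)
)


def is_residue_type(text):
    """True if text is a valid residue type under the assumption above."""
    return residue_type_re.fullmatch(text) is not None
```

Using fullmatch means the order of the alternatives does not matter, so 'snoRNA' is not mistaken for a shorter type followed by trailing text.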

version

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67eb00>                   

version_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaae67efc8>                   

Generated by Epydoc 2.1 on Thu Jun 30 22:06:07 2005 http://epydoc.sf.net