
Module Bio.Blast.NCBIStandalone

This module provides code to work with the standalone version of BLAST, either blastall or blastpgp, provided by the NCBI. http://www.ncbi.nlm.nih.gov/BLAST/

Classes:
LowQualityBlastError     Exception that indicates low quality query sequences.
BlastParser              Parses output from blast.
BlastErrorParser         Parses output and tries to diagnose possible errors.
PSIBlastParser           Parses output from psi-blast.
Iterator                 Iterates over a file of blast results.

_Scanner                 Scans output from standalone BLAST.
_BlastConsumer           Consumes output from blast.
_PSIBlastConsumer        Consumes output from psi-blast.
_HeaderConsumer          Consumes header information.
_DescriptionConsumer     Consumes description information.
_AlignmentConsumer       Consumes alignment information.
_HSPConsumer             Consumes hsp information.
_DatabaseReportConsumer  Consumes database report information.
_ParametersConsumer      Consumes parameters information.

Functions:
blastall                 Execute blastall.
blastpgp                 Execute blastpgp.
rpsblast                 Execute rpsblast.

Classes
BlastErrorParser Attempt to catch and diagnose BLAST errors while parsing.
BlastParser Parses BLAST data into a Record.Blast object.
Iterator Iterates over a file of multiple BLAST results.
PSIBlastParser Parses BLAST data into a Record.PSIBlast object.
_AlignmentConsumer  
_BlastConsumer  
_BlastErrorConsumer  
_DatabaseReportConsumer  
_DescriptionConsumer  
_HeaderConsumer  
_HSPConsumer  
_ParametersConsumer  
_PSIBlastConsumer  
_Scanner Scan BLAST output from blastall or blastpgp.

Exceptions
LowQualityBlastError Error caused by running a low quality sequence through BLAST.
ShortQueryBlastError Error caused by running a short query sequence through BLAST.

Function Summary
  blastall(blastcmd, program, database, infile, align_view, **keywds)
blastall(blastcmd, program, database, infile, align_view='7', **keywds) -> read, error Undohandles. Execute and retrieve data from blastall.
  blastpgp(blastcmd, database, infile, align_view, **keywds)
blastpgp(blastcmd, database, infile, align_view='7', **keywds) -> read, error Undohandles. Execute and retrieve data from blastpgp.
  rpsblast(blastcmd, database, infile, align_view, **keywds)
rpsblast(blastcmd, database, infile, align_view='7', **keywds) -> read, error Undohandles. Execute and retrieve data from standalone RPS-BLAST.
  _get_cols(line, cols_to_get, ncols, expected)
  _re_search(regex, line, error_msg)
  _safe_float(str)
  _safe_int(str)

Function Details

blastall(blastcmd, program, database, infile, align_view='7', **keywds)

blastall(blastcmd, program, database, infile, align_view='7', **keywds)
-> read, error Undohandles

Execute and retrieve data from blastall.  blastcmd is the command
used to launch the 'blastall' executable.  program is the blast program
to use, e.g. 'blastp', 'blastn', etc.  database is the path to the database
to search against.  infile is the path to the file containing
the sequence to search with.
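
The calling convention described above (launch an executable, get back one handle for its output and one for its error stream) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the module's implementation: `run_and_capture` is a hypothetical helper, it returns plain `io.StringIO` objects rather than the undo handles named in the signature, and the Python interpreter stands in for the `blastall` executable so the snippet runs without BLAST installed.

```python
import io
import subprocess
import sys


def run_and_capture(argv):
    """Run a command and return (read, error) file-like handles.

    Hypothetical helper: it mimics the "-> read, error" return shape
    documented for blastall, using in-memory text handles.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # communicate() drains both pipes together, which avoids the deadlock
    # that can occur when one pipe fills while the other is being read.
    out, err = proc.communicate()
    return io.StringIO(out), io.StringIO(err)


# Stand-in command; a real call would pass the blastall command line instead.
read, error = run_and_capture(
    [sys.executable, "-c",
     "import sys; print('result'); sys.stderr.write('warn')"]
)
stdout_text = read.read()
stderr_text = error.read()
```

Reading the error handle as well as the output handle matters in practice, since diagnostics from the executable arrive on the error stream.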

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by blastall.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.

    Scoring
matrix              Matrix to use.
gap_open            Gap open penalty.
gap_extend          Gap extension penalty.
nuc_match           Nucleotide match reward.  (BLASTN)
nuc_mismatch        Nucleotide mismatch penalty.  (BLASTN)
query_genetic_code  Genetic code for Query.
db_genetic_code     Genetic code for database.  (TBLAST[NX])

    Algorithm
gapped              Whether to do a gapped alignment. T/F (not for TBLASTX)
expectation         Expectation value cutoff.
wordsize            Word size.
strands             Query strands to search against database.([T]BLAST[NX])
keep_hits           Number of best hits from a region to keep.
xdrop               Dropoff value (bits) for gapped alignments.
hit_extend          Threshold for extending hits.
region_length       Length of region used to judge hits.
db_length           Effective database length.
search_length       Effective length of search space.

    Processing
filter              Filter query sequence?  T/F
believe_query       Believe the query defline.  T/F
restrict_gi         Restrict search to these GI's.
nprocessors         Number of processors to use.
oldengine           Force use of old engine T/F

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-11, passed as a string.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
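
To make the keyword list above more concrete, here is a minimal sketch of how such keywords could be turned into a legacy blastall command line. Everything in it is an assumption for illustration: `build_blastall_args` and `FLAG_FOR_KEYWORD` are hypothetical names, only a small subset of keywords is covered, and converting Python booleans to T/F is a convenience of this sketch rather than documented behavior of the module.

```python
# Assumed subset of legacy blastall flags; not the module's own table.
FLAG_FOR_KEYWORD = {
    "matrix": "-M",
    "gap_open": "-G",
    "gap_extend": "-E",
    "expectation": "-e",
    "wordsize": "-W",
    "filter": "-F",
    "nprocessors": "-a",
    "descriptions": "-v",
    "alignments": "-b",
}


def build_blastall_args(blastcmd, program, database, infile,
                        align_view="7", **keywds):
    """Return an argument list for blastall (hypothetical helper)."""
    args = [blastcmd, "-p", program, "-d", database, "-i", infile,
            "-m", str(align_view)]
    # Sort the keywords so the resulting command line is deterministic.
    for name in sorted(keywds):
        if name not in FLAG_FOR_KEYWORD:
            raise ValueError("unsupported keyword in this sketch: %s" % name)
        value = keywds[name]
        if isinstance(value, bool):
            # Assumption: T/F options are spelled 'T' or 'F' on the command line.
            value = "T" if value else "F"
        args.extend([FLAG_FOR_KEYWORD[name], str(value)])
    return args


args = build_blastall_args("blastall", "blastp", "nr", "query.fasta",
                           expectation=10, filter=False)
```

Building a list of arguments rather than a single shell string sidesteps quoting problems when database or input paths contain spaces.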

blastpgp(blastcmd, database, infile, align_view='7', **keywds)

blastpgp(blastcmd, database, infile, align_view='7', **keywds) ->
read, error Undohandles

Execute and retrieve data from blastpgp.  blastcmd is the command
used to launch the 'blastpgp' executable.  database is the path to the
database to search against.  infile is the path to the file containing
the sequence to search with.

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by blastpgp.
The Blast output is by default in XML format. Use the align_view keyword
for output in a different format.

    Scoring
matrix              Matrix to use.
gap_open            Gap open penalty.
gap_extend          Gap extension penalty.
window_size         Multiple hits window size.
npasses             Number of passes.
passes              Hits/passes.  Integer 0-2.

    Algorithm
gapped              Whether to do a gapped alignment.  T/F
expectation         Expectation value cutoff.
wordsize            Word size.
keep_hits           Number of best hits from a region to keep.
xdrop               Dropoff value (bits) for gapped alignments.
hit_extend          Threshold for extending hits.
region_length       Length of region used to judge hits.
db_length           Effective database length.
search_length       Effective length of search space.
nbits_gapping       Number of bits to trigger gapping.
pseudocounts        Pseudocounts constants for multiple passes.
xdrop_final         X dropoff for final gapped alignment.
xdrop_extension     Dropoff for blast extensions.
model_threshold     E-value threshold to include in multipass model.
required_start      Start of required region in query.
required_end        End of required region in query.

    Processing
(Default values for these options are not yet documented.)
program             The blast program to use. (PHI-BLAST)
filter              Filter query sequence with SEG?  T/F
believe_query       Believe the query defline?  T/F
nprocessors         Number of processors to use.

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-11, passed as a string.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
align_outfile       Output file for alignment.
checkpoint_outfile  Output file for PSI-BLAST checkpointing.
restart_infile      Input file for PSI-BLAST restart.
hit_infile          Hit file for PHI-BLAST.
matrix_outfile      Output file for PSI-BLAST matrix in ASCII.
align_infile        Input alignment file for PSI-BLAST restart.
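
Several of the options above have a restricted set of values (T/F flags, `passes` as an integer 0-2, `align_view` as a string holding an integer 0-11). The following sketch shows one way a caller might check these before launching blastpgp. `check_blastpgp_options` is a hypothetical helper, and it assumes the T/F options are supplied as the strings 'T' or 'F'; the module itself may accept other spellings.

```python
def check_blastpgp_options(**keywds):
    """Return a list of problems found in a few blastpgp options.

    Hypothetical helper covering only the ranges stated in the list above.
    """
    errors = []
    # Options documented as T/F; assumed here to be the strings 'T' or 'F'.
    for name in ("gapped", "filter", "believe_query", "html", "show_gi"):
        if name in keywds and keywds[name] not in ("T", "F"):
            errors.append("%s must be 'T' or 'F'" % name)
    if "passes" in keywds:
        passes = keywds["passes"]
        is_plain_int = isinstance(passes, int) and not isinstance(passes, bool)
        if not (is_plain_int and 0 <= passes <= 2):
            errors.append("passes must be an integer from 0 to 2")
    if "align_view" in keywds:
        allowed_views = [str(i) for i in range(12)]  # '0' through '11'
        if keywds["align_view"] not in allowed_views:
            errors.append("align_view must be a string holding an integer 0-11")
    return errors


problems = check_blastpgp_options(passes=3, align_view=7, html="yes")
```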

rpsblast(blastcmd, database, infile, align_view='7', **keywds)

rpsblast(blastcmd, database, infile, align_view='7', **keywds) ->
read, error Undohandles

Execute and retrieve data from standalone RPS-BLAST.  blastcmd is the
command used to launch the 'rpsblast' executable.  database is the path
to the database to search against.  infile is the path to the file
containing the sequence to search with.

You may pass more parameters to **keywds to change the behavior of
the search.  Otherwise, optional values will be chosen by rpsblast.

Please note that this function will give XML output by default, by
setting align_view to seven (i.e. command line option -m 7).
You should use the NCBIXML.BlastParser() to read the resulting output.
This is because NCBIStandalone.BlastParser() does not understand the
plain text output format from rpsblast.
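
As a rough picture of what the XML output holds, the sketch below walks a small hand-written fragment with the standard library. The fragment is made up for illustration and uses only a few element names from the BLAST XML format (`Hit`, `Hit_def`, `Hit_len`, `Hsp_evalue`); real output contains many more fields. `summarize_hits` is a hypothetical helper, and the parser recommended above remains the appropriate tool for real results.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for "-m 7" output; values are invented.
SAMPLE_XML = """
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example domain (made up)</Hit_def>
          <Hit_len>120</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-20</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def summarize_hits(xml_text):
    """Return (definition, length, best e-value) for each Hit element."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for hit in root.iter("Hit"):
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        best = min(evalues) if evalues else None
        hits.append((hit.findtext("Hit_def"),
                     int(hit.findtext("Hit_len")),
                     best))
    return hits


hits = summarize_hits(SAMPLE_XML)
```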

WARNING - The following text and associated parameter handling has not
received extensive testing.  Please report any errors we might have made...

    Algorithm/Scoring
gapped              Whether to do a gapped alignment.  T/F
multihit            0 for multiple hit (default), 1 for single hit
expectation         Expectation value cutoff.
range_restriction   Range restriction on query sequence (Format: start,stop) blastp only
                    0 in 'start' refers to the beginning of the sequence
                    0 in 'stop' refers to the end of the sequence
                    Default = 0,0
xdrop               Dropoff value (bits) for gapped alignments.
xdrop_final         X dropoff for final gapped alignment (in bits).
xdrop_extension     Dropoff for blast extensions (in bits).
search_length       Effective length of search space.
nbits_gapping       Number of bits to trigger gapping.
protein             Query sequence is protein.  T/F
db_length           Effective database length.

    Processing
filter              Filter query sequence with SEG?  T/F
case_filter         Use lower case filtering of FASTA sequence T/F, default F
believe_query       Believe the query defline.  T/F
nprocessors         Number of processors to use.
logfile             Name of log file to use, default rpsblast.log

    Formatting
html                Produce HTML output?  T/F
descriptions        Number of one-line descriptions.
alignments          Number of alignments.
align_view          Alignment view.  Integer 0-9.
show_gi             Show GI's in deflines?  T/F
seqalign_file       seqalign file to output.
align_outfile       Output file for alignment.
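
The `range_restriction` value described above is a "start,stop" string in which 0 means the beginning (for start) or the end (for stop) of the query. A minimal sketch of building that string follows; `format_range_restriction` is a hypothetical helper, and the sanity checks are assumptions of this sketch, not checks the module is known to perform.

```python
def format_range_restriction(start=0, stop=0):
    """Build a 'start,stop' string for range_restriction.

    0 for start means the beginning of the sequence and 0 for stop means
    the end, so the default "0,0" covers the whole query.
    """
    if start < 0 or stop < 0:
        raise ValueError("start and stop must not be negative")
    # A stop of 0 means "to the end", so only compare when stop is explicit.
    if stop != 0 and start > stop:
        raise ValueError("start must not exceed stop")
    return "%d,%d" % (start, stop)


whole_query = format_range_restriction()
sub_range = format_range_restriction(10, 50)
```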

Generated by Epydoc 2.1 on Mon Aug 27 16:12:12 2007 http://epydoc.sf.net