rebaseextract

 

Function

Extract data from REBASE

Description

The Restriction Enzyme database (REBASE) is a collection of information about restriction enzymes and related proteins. It contains published and unpublished references, recognition and cleavage sites, isoschizomers, commercial availability, methylation sensitivity, crystal and sequence data. DNA methyltransferases, homing endonucleases, nicking enzymes, specificity subunits and control proteins are also included. Most recently, putative DNA methyltransferases and restriction enzymes, as predicted from analysis of genomic sequences, are also listed.

The home page of REBASE is: http://rebase.neb.com/

This program derives recognition site and cleavage information from the "withrefm" file of a REBASE distribution. It creates three files in the EMBOSS data subdirectory REBASE: a pattern file, a reference file and a supplier file.

By default it also produces an 'embossre.equ' file, calculated from the restriction enzyme prototypes in the "withrefm" file. Set the -equivalences option to false (-noequivalences) to turn this off. The 'embossre.equ' file lists preferred isoschizomers; you may edit it so that it contains the restriction enzymes available to you.

The EMBOSS programs that find restriction cutting sites use the data files produced by this program and will not work without them.

Running this program may be the job of your system manager.

Usage

Here is a sample session with rebaseextract


% rebaseextract 
Extract data from REBASE
REBASE database withrefm file: withrefm
REBASE database proto file: proto

Command line arguments

   Standard (Mandatory) qualifiers:
  [-infile]            infile     REBASE database withrefm file
  [-protofile]         infile     REBASE database proto file

   Additional (Optional) qualifiers:
   -[no]equivalences   boolean    [Y] This option calculates an embossre.equ
                                  file using restriction enzyme prototypes in
                                  the withrefm file.

   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers: (none)
   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Standard (Mandatory) qualifiers:
  Qualifier                    Description                     Allowed values        Default
  [-infile] (Parameter 1)      REBASE database withrefm file   Input file            Required
  [-protofile] (Parameter 2)   REBASE database proto file      Input file            Required

Additional (Optional) qualifiers:
  Qualifier                    Description                     Allowed values        Default
  -[no]equivalences            This option calculates an       Boolean value Yes/No  Yes
                               embossre.equ file using
                               restriction enzyme prototypes
                               in the withrefm file.

Advanced (Unprompted) qualifiers: (none)
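
The qualifiers listed above are enough to script a non-interactive run. The following is a minimal Python sketch, not part of EMBOSS: the helper names build_rebaseextract_command and run_rebaseextract are hypothetical, and it assumes that rebaseextract is installed and on the PATH. It uses only qualifiers documented in this section (-infile, -protofile, -[no]equivalences and -auto).

```python
import subprocess
from typing import List


def build_rebaseextract_command(withrefm: str, proto: str,
                                equivalences: bool = True) -> List[str]:
    """Assemble an argument list for a non-interactive rebaseextract run.

    Hypothetical helper: only the qualifier names come from the documentation.
    """
    cmd = ["rebaseextract", "-infile", withrefm, "-protofile", proto]
    if not equivalences:
        # Negated form of the -[no]equivalences qualifier: skip embossre.equ.
        cmd.append("-noequivalences")
    # -auto turns off prompts, so the program does not wait for input.
    cmd.append("-auto")
    return cmd


def run_rebaseextract(withrefm: str, proto: str,
                      equivalences: bool = True) -> None:
    """Run rebaseextract; needs an EMBOSS installation to succeed."""
    # check=True raises CalledProcessError if the exit status is not 0.
    subprocess.run(build_rebaseextract_command(withrefm, proto, equivalences),
                   check=True)
```

Calling run_rebaseextract("withrefm", "proto") would correspond to the sample session above, with both file names given on the command line instead of at the prompts.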

Input file format

The first input file must be the "withrefm" file of a REBASE distribution. The second input file is the "proto" file, as shown in the usage example.

For example, the withrefm file for REBASE version 005 is at: ftp://ftp.neb.com/pub/rebase/withrefm.005

Input files for usage example

File: withrefm

 
REBASE version 106                                              withrefm.106
 
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    REBASE, The Restriction Enzyme Database   http://rebase.neb.com
    Copyright (c)  Dr. Richard J. Roberts, 2001.   All rights reserved.
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 
Rich Roberts                                                    May 31 2001
 

<ENZYME NAME>   Restriction enzyme name.
<ISOSCHIZOMERS> Other enzymes with this specificity.
<RECOGNITION SEQUENCE> 
                These are written from 5' to 3', only one strand being given.
                If the point of cleavage has been determined, the precise site
                is marked with ^.  For enzymes such as HgaI, MboII etc., which
                cleave away from their recognition sequence the cleavage sites
                are indicated in parentheses.  

                For example HgaI GACGC (5/10) indicates cleavage as follows:
                                5' GACGCNNNNN^      3'
                                3' CTGCGNNNNNNNNNN^ 5'

                In all cases the recognition sequences are oriented so that
                the cleavage sites lie on their 3' side.

                REBASE Recognition sequences representations use the standard 
                abbreviations (Eur. J. Biochem. 150: 1-5, 1985) to represent 
                ambiguity.
                                R = G or A
                                Y = C or T
                                M = A or C
                                K = G or T
                                S = G or C
                                W = A or T
                                B = not A (C or G or T)
                                D = not C (A or G or T)
                                H = not G (A or C or T)
                                V = not T (A or C or G)
                                N = A or C or G or T



                ENZYMES WITH UNUSUAL CLEAVAGE PROPERTIES:  

                Enzymes that cut on both sides of their recognition sequences,
                such as BcgI, Bsp24I, CjeI and CjePI, have 4 cleavage sites
                each instead of 2.



  [Part of this file has been deleted for brevity]

<6>S.A. Thompson
<7>N
<8>Morgan, R.D., Unpublished observations.
Morgan, R.D., Xu, Q., US Patent Office, 2001.
Xu, Q., Morgan, R., Blaser, M., Unpublished observations.

<1>HspAI
<2>HhaI,AspLEI,BcaI,BspLAI,BstHHI,CcoP95I,CfoI,Csp1470I,FnuDIII,Hin6I,Hin7I,HinGUI,HinP1I,HinS1I,HinS2I,Hpy99III,HpyF10I,HsoI,MnnIV,NgoEII,SciNI
<3>G^CGC
<4>
<5>Haemophilus species A
<6>S.K. Degtyarev
<7>I
<8>Rechkunova, N.I., Prikhod'ko, E.A., Shevchenko, A.V., Degtyarev, S.K., Unpublished observations.

<1>KpnI
<2>Acc65I,AhaB8I,Asp718I,BspJ106I,Eco149I,Esp19I,KpnK14I,MvsI,MvsAI,MvsBI,MvsCI,MvsDI,MvsEI,NmiI,Sau10I,SthI,SthAI,SthBI,SthCI,SthDI,SthEI,SthFI,SthGI,SthHI,SthJI,SthKI,SthLI,SthMI,SthNI,Uba76I,Uba85I,Uba86I,Uba87I,Uba1201I
<3>GGTAC^C
<4>4(6)
<5>Klebsiella pneumoniae OK8
<6>ATCC 49790
<7>ABCDEFGHIJKLMNOQRSTU
<8>Kiss, A., Finta, C., Venetianer, P., (1991) Nucleic Acids Res., vol. 19, pp. 3460.
Smith, D.I., Blattner, F.R., Davies, J., (1976) Nucleic Acids Res., vol. 3, pp. 343-353.
Tomassini, J., Roychoudhury, R., Wu, R., Roberts, R.J., (1978) Nucleic Acids Res., vol. 5, pp. 4055-4064.

<1>NotI
<2>CciNI,CspBI,MchAI
<3>GC^GGCCGC
<4>?(4)
<5>Nocardia otitidis-caviarum
<6>ATCC 14630
<7>ABCDEFGHJKLMNOQRSTU
<8>Borsetti, R., Wise, D., Qiang, B.-Q., Schildkraut, I., Unpublished observations.
Morgan, R.D., Unpublished observations.
Morgan, R.D., Benner, J.S., Claus, T.E., US Patent Office, 1994.
Qiang, B.-Q., Schildkraut, I., (1987) Methods Enzymol., vol. 155, pp. 15-21.

<1>TaqI
<2>CviSIII,EsaBC3I,HpyV,Hpy26II,HpyF14III,HpyF16I,HpyF23I,HpyF24I,HpyF26III,HpyF30I,HpyF35I,HpyF40II,HpyF42IV,HpyF45I,HpyF49I,HpyF52I,HpyF59III,HpyF62II,HpyF64I,HpyF65II,HpyF66IV,HpyF71I,HpyF73II,HpyJP26II,PpaAII,Taq20I,Tbr51I,TfiA3I,TfiTok4A2I,TfiTok6A1I,TflI,Tsc4aI,Tsp32I,Tsp32II,Tsp358I,Tsp505I,Tsp510I,TspAK13D21I,TspAK16D24I,TspNI,TspVi4AI,TspVil3I,Tth24I,TthHB8I,TthRQI
<3>T^CGA
<4>4(6)
<5>Thermus aquaticus YTI
<6>J.I. Harris
<7>ABCDEFGIJLMNOQRSTU
<8>Anton, B.P., Brooks, J.E., Unpublished observations.
Fomenkov, A., Xiao, J.-P., Dila, D., Raleigh, E., Xu, S.-Y., (1994) Nucleic Acids Res., vol. 22, pp. 2399-2403.
McClelland, M., (1981) Nucleic Acids Res., vol. 9, pp. 6795-6804.
Sato, S., Hutchison, C.A. III, Harris, J.I., (1977) Proc. Natl. Acad. Sci. U. S. A., vol. 74, pp. 542-546.
Zebala, J.A., (1993) Diss. Abstr., vol. 54, pp. 1394-1398.
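
The record layout visible in the sample above can be read with a few lines of Python. This is a minimal sketch based only on that sample, not the parser rebaseextract itself uses. It assumes that <1> opens a record, that a blank line closes it, that unnumbered lines continue the previous field, and that a recognition sequence carries at most one ^ marker. The function names are hypothetical. Fields are keyed by number because only the first three (enzyme name, isoschizomers, recognition sequence) are explained in the part of the header shown here.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches numbered field lines such as "<3>GGTAC^C".
FIELD_RE = re.compile(r"^<([1-8])>(.*)$")


def parse_withrefm_records(text: str) -> List[Dict[int, str]]:
    """Group numbered field lines into one dict per enzyme record."""
    records: List[Dict[int, str]] = []
    current: Optional[Dict[int, str]] = None
    last_field: Optional[int] = None
    for raw in text.splitlines():
        line = raw.rstrip()
        match = FIELD_RE.match(line)
        if match:
            number, value = int(match.group(1)), match.group(2)
            if number == 1:            # <1> opens a new record
                current = {}
                records.append(current)
            if current is not None:    # ignore fields of a truncated record
                current[number] = value
                last_field = number
        elif not line:                 # a blank line closes the record
            current, last_field = None, None
        elif current is not None and last_field is not None:
            # Unnumbered lines continue the previous field (extra references).
            current[last_field] += "\n" + line
    return records


def split_cut_site(site: str) -> Tuple[str, Optional[int]]:
    """Return the bare sequence and the 0-based offset of the ^ marker."""
    pos = site.find("^")
    if pos < 0:
        return site, None              # cleavage point not marked
    return site.replace("^", ""), pos


SAMPLE = """<7>N
<8>Morgan, R.D., Unpublished observations.

<1>KpnI
<2>Acc65I,Asp718I
<3>GGTAC^C
<4>4(6)
<8>Kiss, A., Finta, C., Venetianer, P., (1991) Nucleic Acids Res., vol. 19, pp. 3460.
Smith, D.I., Blattner, F.R., Davies, J., (1976) Nucleic Acids Res., vol. 3, pp. 343-353.
"""

if __name__ == "__main__":
    for record in parse_withrefm_records(SAMPLE):
        print(record[1], split_cut_site(record[3]))
```

The leading <7> and <8> lines in SAMPLE imitate the truncated record at the top of the listing; the sketch skips them because no <1> line has been seen yet.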

File: proto

 
REBASE version 305                                              proto.305
 
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    REBASE, The Restriction Enzyme Database   http://rebase.neb.com
    Copyright (c)  Dr. Richard J. Roberts, 2003.   All rights reserved.
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 
Rich Roberts                                                    Apr 30 2003
 



	    TYPE II ENZYMES
	    ---------------

BseYI                          CCCAGC (-5/-1)
BsiYI                          CCNNNNN^NNGG
BsrI                           ACTGG (1/-1)
HaeIII                         GG^CC
HpaII                          C^CGG
Ksp632I                        CTCTTC (1/4)
MaeII                          A^CGT



	    TYPE I ENZYMES
	    ---------------

EcoAI                          GAGNNNNNNNGTCA
EcoBI                          TGANNNNNNNNTGCT
EcoDI                          TTANNNNNNNGTCY
EcoDR2                         TCANNNNNNGTCG
EcoDR3                         TCANNNNNNNATCG
EcoDXXI                        TCANNNNNNNRTTC
EcoEI                          GAGNNNNNNNATGC
EcoKI                          AACNNNNNNGTGC



	    TYPE III ENZYMES
	    ---------------

EcoPI                          AGACC
EcoP15I                        CAGCAG (25/27)
HinfIII                        CGAAT
StyLTI                         CAGAG
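
The proto entries above pair an enzyme name with a recognition sequence written in the ambiguity codes listed in the withrefm header. The following Python sketch shows one way to read such lines and turn a recognition sequence into a regular expression. It is an illustration under stated assumptions, not the EMBOSS implementation: the names ProtoEntry, parse_proto and site_to_regex are hypothetical, entry lines are assumed to start in the first column, and the two numbers in parentheses are stored as given without interpreting them.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Ambiguity codes as listed in the withrefm header.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[GA]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[GC]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Name, recognition sequence, optional "(a/b)" cleavage offsets.
PROTO_RE = re.compile(
    r"^(\S+)\s+([" + "".join(IUPAC) + r"^]+)(?:\s+\((-?\d+)/(-?\d+)\))?$")


class ProtoEntry(NamedTuple):
    name: str
    site: str
    cuts: Optional[Tuple[int, int]]


def parse_proto(text: str) -> List[ProtoEntry]:
    """Collect enzyme lines; headings and banner lines do not match."""
    entries = []
    for raw in text.splitlines():
        match = PROTO_RE.match(raw.rstrip())
        if match:
            name, site, first, second = match.groups()
            cuts = (int(first), int(second)) if first is not None else None
            entries.append(ProtoEntry(name, site, cuts))
    return entries


def site_to_regex(site: str) -> "re.Pattern[str]":
    """Translate a recognition sequence into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in site.replace("^", "")))


SAMPLE_PROTO = """\t    TYPE II ENZYMES
\t    ---------------

BseYI                          CCCAGC (-5/-1)
BsiYI                          CCNNNNN^NNGG
EcoDI                          TTANNNNNNNGTCY
"""

if __name__ == "__main__":
    for entry in parse_proto(SAMPLE_PROTO):
        print(entry.name, site_to_regex(entry.site).pattern, entry.cuts)
```

The compiled pattern matches one strand only; searching the reverse complement as well is left out of the sketch.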

Output file format

Output files for usage example

File: embossre.equ

Bsc4I BsiYI
Bse1I BsrI
BshI HaeIII
BsiSI HpaII
Bsu6I Ksp632I
HpyCH4IV MaeII

Directory: REBASE

This directory contains output files.

The output files are held in the REBASE subdirectory of the EMBOSS data directory. There are three: a pattern file, a reference file and a supplier file.

By default, rebaseextract also produces an 'embossre.equ' file of preferred isoschizomers in the EMBOSS data directory, calculated from the restriction enzyme prototypes in the "withrefm" file. Set the -equivalences option to false to turn this off. You may edit the file so that it contains the restriction enzymes available to you.
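
Editing 'embossre.equ' by hand is practical for a few lines; for a longer list a short script may help. The sketch below is one possible approach in Python, with hypothetical function names. It assumes the two-names-per-line layout of the sample file and keeps a line only when its first name is in your own list of available enzymes. That selection rule is an assumption made for illustration; if your installation expects the second column to hold the enzyme you stock, filter on that column instead.

```python
from typing import Iterable, List, Tuple


def parse_equ(text: str) -> List[Tuple[str, str]]:
    """Read 'name name' pairs, skipping lines that do not hold two names."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def keep_available(pairs: Iterable[Tuple[str, str]],
                   available: Iterable[str]) -> List[Tuple[str, str]]:
    """Keep pairs whose first name is in the available set (assumed policy)."""
    stock = set(available)
    return [(first, second) for first, second in pairs if first in stock]


def format_equ(pairs: Iterable[Tuple[str, str]]) -> str:
    """Write the pairs back in the one-pair-per-line layout."""
    return "".join(f"{first} {second}\n" for first, second in pairs)


SAMPLE_EQU = """Bsc4I BsiYI
Bse1I BsrI
BshI HaeIII
"""

if __name__ == "__main__":
    kept = keep_available(parse_equ(SAMPLE_EQU), ["Bse1I", "BshI"])
    print(format_equ(kept), end="")
```

Keep a copy of the generated file before overwriting it, since rerunning rebaseextract with equivalences enabled regenerates 'embossre.equ'.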

Data files

The "withrefm" file of a REBASE distribution is the input file for this program.

Notes

The ready-made files produced by this program may already be available at the REBASE web site: http://rebase.neb.com/rebase/rebase.files.html or http://rebase.neb.com/rebase/rebase.f37.html

References

  1. Nucleic Acids Research 27: 312-313 (1999).

Warnings

The program will warn you if the input file is incorrectly formatted.

Diagnostic Error Messages

Exit status

It exits with status 0 unless an error is reported.

Known bugs

See also

Program name     Description
aaindexextract   Extract data from AAINDEX
cutgextract      Extract data from CUTG
printsextract    Extract data from PRINTS
prosextract      Build the PROSITE motif database for use by patmatmotifs
tfextract        Extract data from TRANSFAC

Author(s)

Alan Bleasby (ajb © ebi.ac.uk)
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

History

Completed 12th April 1999

Target users

This program is intended to be used by administrators responsible for software and database installation and maintenance.

Comments

None