Friday, June 13, 2008

BiRG MINUTES June 13, 2008

Protein and specialized Sequence Databases

 Types of Organisms: Prokaryotic, Eukaryotic, and Archaea

Protein Maturation

Deciphering a Swiss-Prot entry

Specialized protein databases: KEGG (the metabolic pathways database) or PDB (structure database)

2 ways to predict genetics

1.       Genes to proteins or translation (genomics)

2.       DNA

We must merge the two

From Gene to functional Protein

                DNA > mRNA > proteins > upon maturing > transportation > destination
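The first two arrows of this chain (DNA > mRNA > protein) can be sketched in a few lines of Python. This is only an illustration, not something shown at the meeting: it assumes the input is the coding strand, uses the standard genetic code, and the function names `transcribe` and `translate` are made up for this sketch. Maturation, transport and destination are not modelled.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace every T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA codon by codon and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        aa = CODON_TABLE[codon]  # raises KeyError on non-ACGU letters
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

For the toy input above, `ATG` codes for M, `GCC` for A, and `TAA` is a stop codon, so the translated protein is `MA`.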

Protein Maturation:

                -removal of some fragments

                -specific protein cleavage

                -chemical modifications

                -Phosphorylation (addition of a phosphate group, which can change the protein's shape and activity)

                -addition of lipids (lipidation) or sugars (glycosylation)
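As a rough illustration of the maturation steps listed above, the sketch below removes a fragment from a precursor sequence and lists residues that could in principle be phosphorylated (serine, threonine and tyrosine are the usual acceptors in eukaryotes). The sequence, the fragment boundaries and the function names are all invented for the example; real modification sites have to be looked up or determined experimentally.

```python
def remove_fragment(precursor, start, end):
    """Return the sequence without residues start..end (1-based, inclusive)."""
    return precursor[:start - 1] + precursor[end:]


def candidate_phosphosites(sequence):
    """List (position, residue) for every S, T or Y, using 1-based positions.

    These are only candidates: most S/T/Y residues are never phosphorylated.
    """
    return [(i + 1, aa) for i, aa in enumerate(sequence) if aa in "STY"]


precursor = "MKTLLAGSYPT"                      # made-up precursor
mature = remove_fragment(precursor, 1, 5)      # drop a hypothetical N-terminal fragment
print(mature, candidate_phosphosites(mature))
```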

-Proteins are often modified to make them active

www.ebi.ac.uk/RESID

-Modification can involve attaching a lipid or a sugar

www.glycosuite.com

-Use these resources to determine the details of the modification

www.lipidbank.jp

Swiss-Prot Database (maintained by the Swiss Institute of Bioinformatics together with the EBI in the UK): entries describe proteins that have known functions

                TrEMBL contains the 4 million putative proteins found in GenBank

                Swiss-Prot contains the subset of TrEMBL with a known function

                It is redundant to create many databases using the same information

                A Swiss-Prot entry: www.expasy.org/uniprot/P00533

Gen Info (accession number), References, Comments, Cross-references, Feature table, Sequence

General Information: Entry Name, Primary Accession Number (for example P00533; many accessions start with P, but O, Q and other letters also occur), Last Modified, Protein name and synonyms, from/taxonomy fields (tell which organism the protein came from), references section
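A small sketch of how a primary accession number could be checked before building the entry address used in these notes. The regular expression is the accession pattern as I recall it from the UniProt documentation, so treat it as an assumption and verify it there; the helper names `is_accession` and `entry_url` are invented for this example.

```python
import re

# Assumed accession pattern (6-character classic form or the longer newer form).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text):
    """True if the whole string looks like a primary accession number."""
    return ACCESSION_RE.fullmatch(text) is not None


def entry_url(accession):
    """Build the entry address in the same form as the example in the notes."""
    if not is_accession(accession):
        raise ValueError(f"not a valid accession: {accession!r}")
    return "www.expasy.org/uniprot/" + accession


print(entry_url("P00533"))
```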

                Comments section lists all the known functions of the protein

                Features section maps every known functional feature of your protein to its precise position on the sequence

                      TRANSMEM: Transmembrane domain (something that passes through the membrane)

                      ACT_SITE: Active sites    (residues directly involved in catalysis)

                      BINDING: Binding sites

                      DISULFID: Disulfide bridge between two cysteines
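The feature keys above come from the FT lines of the flat-file entry. The sketch below parses a few such lines into (key, start, end, description) records. It assumes the older whitespace-separated layout (key, from, to, description); the sample lines and positions are invented for illustration and are not copied from P00533 or any real entry, and `parse_features` is a hypothetical helper, not part of any library.

```python
from collections import namedtuple

Feature = namedtuple("Feature", "key start end description")

# Invented sample lines in the assumed "FT   KEY  FROM  TO  DESCRIPTION" layout.
SAMPLE_FT = """\
ID   EXAMPLE_ENTRY
FT   TRANSMEM     30     52       Potential.
FT   ACT_SITE    120    120       Proton acceptor.
FT   BINDING     145    145
"""


def parse_features(text):
    """Collect one Feature per FT line that has a key and two integer positions."""
    features = []
    for line in text.splitlines():
        if not line.startswith("FT   "):
            continue
        fields = line[5:].split(None, 3)
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip continuation lines and anything unexpected
        description = fields[3].strip() if len(fields) > 3 else ""
        features.append(Feature(fields[0], int(fields[1]), int(fields[2]), description))
    return features


for feature in parse_features(SAMPLE_FT):
    print(feature)
```

Newer flat files write positions as ranges such as `30..52`, so a real parser would need to handle that form as well.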


                Cross-references section links the entry to other databases

          EMBL: Original DNA sequence (EMBL/GenBank)

          PDB: Experimental structure of your protein

          DIP: Proteins interacting with your protein

          GlycoSuiteDB: Glycosylations

          MIM: List of genetic diseases involving your protein

          Ontologies: Function of your protein

          Profiles: Known protein domains in your protein

          ENSEMBL: Genomic location of your protein
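In the flat file these links appear as DR lines, one per cross-reference, with fields separated by semicolons. A minimal sketch of grouping them by database follows. The sample lines use placeholder identifiers invented for this example, and `parse_cross_references` is a hypothetical helper.

```python
from collections import defaultdict

# Invented DR lines; the identifiers are placeholders, not real records.
SAMPLE_DR = """\
DR   EMBL; X00000; CAA00000.1; -; mRNA.
DR   PDB; 1ABC; X-ray; 2.50 A; A=1-100.
DR   PDB; 2XYZ; NMR; -; A=20-80.
DR   MIM; 000000; gene.
"""


def parse_cross_references(text):
    """Map each database name to the list of primary identifiers it is linked to."""
    refs = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith("DR   "):
            continue
        fields = [field.strip() for field in line[5:].rstrip(".").split(";")]
        database, identifiers = fields[0], fields[1:]
        if identifiers:
            refs[database].append(identifiers[0])
    return dict(refs)


print(parse_cross_references(SAMPLE_DR))
```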

 
Through alternative splicing, one gene can produce several protein forms, so the protein can have MANY functions

          To find out about the function of your protein, you will need to determine

        Where your protein works

        Metabolic pathway in which the protein is involved

        The protein's 3D structure

        Which protein family it belongs to

Where do proteins work?

                Part of the metabolic pathway

                                Chain of production linking several different proteins

                                Modify metabolites by passing them from one enzyme to the next

                                On a KEGG pathway, each enzyme appears with its EC number

          www.genome.ad.jp/kegg

        KEGG is the most extensive database of metabolic pathways

        You can use it to compare species (based in Japan)

          www.chem.qmul.ac.uk/iubmb

        The IUBMB assigns the EC numbers used to describe an enzyme activity (site hosted in the UK)

          www.ecocyc.org

        An exhaustive list of all known metabolic pathways in E. coli and other bacteria
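To make the idea of a pathway as a chain of enzymes, each labelled with an EC number, a bit more concrete, here is a small sketch. It uses the first three steps of glycolysis as the example, reads the top-level enzyme class from the first field of the EC number, and checks that each step's product is the next step's substrate. The `Step` type and both function names are invented for this illustration, and the class table only covers the six classic classes.

```python
from typing import NamedTuple

# Top-level EC classes (the six classic ones; later additions are not listed).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


class Step(NamedTuple):
    substrate: str
    enzyme: str
    ec: str
    product: str


def ec_class(ec_number):
    """Return the enzyme class named by the first field of an EC number."""
    digits = ec_number.replace("EC", "").strip()
    return EC_CLASSES[int(digits.split(".")[0])]


def is_connected(pathway):
    """True if every step's product is the substrate of the following step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


GLYCOLYSIS_START = [
    Step("glucose", "hexokinase", "2.7.1.1", "glucose 6-phosphate"),
    Step("glucose 6-phosphate", "glucose-6-phosphate isomerase", "5.3.1.9",
         "fructose 6-phosphate"),
    Step("fructose 6-phosphate", "6-phosphofructokinase", "2.7.1.11",
         "fructose 1,6-bisphosphate"),
]

for step in GLYCOLYSIS_START:
    print(f"{step.substrate} > {step.product}: {step.enzyme} (EC {step.ec}, {ec_class(step.ec)})")
print("connected:", is_connected(GLYCOLYSIS_START))
```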

Some important Protein Families

          www.kinasenet.org

        Kinases regulate a very wide range of processes in our cells; their deregulation is a cause of many cancers

          imgt.cines.fr

        Immunoglobulins are key elements of our natural defenses

          rebase.neb.com

        This site is a key resource on restriction enzymes

      Predicting protein function is a central goal in biology

          Protein databases help organize knowledge

          They provide the material for

        Developing new biological experiments

        Developing new prediction algorithms

        Extrapolating experimental data to unknown sequences

