Friday, July 25, 2008

BiRG Minutes : June 11, 2008

Analyzing Protein Sequences

 In-silico biochemistry

 Sliding-windows techniques – most ancient way of looking at sequences

                -used if the strand of DNA was cut in the middle

                -ND THE WAS where the A was cut off

                                NDT HEW AS

                                DTH EWA S

                                THE WAS

                -Use past experiences and what proteins have been together in the past

-Hydrophobicity is the most popular analysis – a good indicator of transmembrane segments or core regions within a protein.

 
Predicting transmembrane domains

ProtScale allows one to compute and represent the profile produced by any amino acid scale on  a selected protein.

amino acid scale is defined by a numerical value assigned to each type of amino acid.

THMM Transmembrane Helix Prediction is a method for predicting transmembrane helices based on a hidden Markov Model (HMM)

HMM - a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable parameters. The extracted model parameters can then be used to perform further analysis, for example for pattern recognition applications.

THMM creates a prediction, or what it should have been

ProtScale has parameters and shows what it is

Predicting post-translational modifications w/PROSITE

                Proteins get modified between the cell and getting read

                PROSITE motifs are written as patterns

      Short patterns are not very informative by themselves

      They only indicate a possibility

      Combine them with other information to draw a conclusion

NOT EVERYTHING IS IN PROSITE

                Interpreting PROSITE patterns

                                Some patterns may suggest nonexistent protein features

                                Short patterns are more informative if they are conserved across homologous sequences

Domains is defined as " independent globular folding units".  It is a portion of protein that can keep its shape if you remove it from the rest of the protein.  It consists of at least 50 amino acids.  -  Domains are like the various components of our kitchen – such as the oven, the microwave, the refrigerator, etc.  All together they constitute the complete kitchen, but they can also exist separately.  You only need to use microwave when making pop corn - which can be done outside the kitchen.

An average protein consists of 2 or 3 domains.  Usually each domain plays a specific role in the function of the protein.  It  may interact with other proteins, or bind ion like calcium or zinc, or it may contain an active site.  It is common to have a catalytic domain associated with a binding domain and a regulatory domain.   Imagine -   a toaster, where you have the grill [catalytic], the toast holder [binding], and the switch [regulation].

Domains are like independent functions that can be taken out of a program but still function

Researchers

A domain is a multi-sequence alignment similar to a puzzle

Using Domain collections

Scientists have been discovering and characterizing protein domains for more than 20 years

Manual collections are precise but small; where the researchers must document everything on their own

Automatic collections go out and find data in research documents etc

It is probably that only one of these servers will have the information to help you understand your protein

No comments:

IU News: Science

IU News: Technology