Reading - Chapter 5
Practice problems - Chapter 5, #15, 17; simple
heptapeptide sequence problem; Proteins extra
problems
Levels of Structure
The function of a protein can be understood only in terms of
its structure. The three-dimensional structures of many
proteins have been determined, and from these structures a
few general principles can be derived. Protein structure is
discussed in terms of four levels of organization:
- Primary Structure is the amino acid sequence of a protein's polypeptide chain(s). Every protein has a unique amino acid sequence.
- Secondary Structure is the local spatial arrangement of the polypeptide backbone, giving rise to recurring structural patterns, ignoring the conformation of the individual side chains (R groups).
- Tertiary Structure is the three-dimensional structure of the entire polypeptide, including the conformations of the side chains.
- Quaternary Structure refers (only in proteins that are composed of two or more polypeptide chains, called subunits) to the three-dimensional spatial arrangement of the subunits.
- (See Lehninger Principles of Biochemistry, Fig. 5-16 and related text material.)
Primary Structure
- (Figure: primary structure of bovine insulin.) Bovine insulin is composed of two polypeptide chains (A and B). The two polypeptide chains are joined by two interchain disulfide bonds; the A chain also contains an intrachain disulfide bond.
- Determining the amino acid sequence of a protein used
to be a very laborious and time consuming process
involving chemical and enzymatic degradation.
- Today, the amino acid sequence of proteins is usually
determined from the nucleotide sequence of the gene - a
relatively simple and rapid process.
- Comparing the amino acid sequences of the same protein
from many sources, e.g., cytochrome c, shows that some
amino acid residues are conserved in every species,
whereas others are not conserved.
- Such an analysis provides valuable information about
amino acid residues that may be essential for a protein's
function.
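The idea of conserved residues can be made concrete with a small sketch. It assumes the sequences have already been aligned to equal length; the function name `conserved_positions` and the three short example sequences are hypothetical and are not taken from cytochrome c.

```python
def conserved_positions(aligned):
    """Return the 0-based positions at which every aligned sequence
    carries the same residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # A position is conserved when only one distinct residue occurs there.
    return [i for i in range(length)
            if len({seq[i] for seq in aligned}) == 1]


# Hypothetical aligned fragments, for illustration only.
example = ["MKRT", "MKRS", "MKAT"]
for pos in conserved_positions(example):
    print(pos + 1, example[0][pos])  # report 1-based position and residue
```

Positions that never vary across distant species are candidates for residues essential to function, although conservation alone does not prove it.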
The importance of amino acid side chains: Real Life Example - sickle cell hemoglobin
- Hemoglobin is the oxygen transport protein in
blood.
- It is a tetramer containing two α and two β chains
(Hemoglobin).
- Hemoglobin exists in two states: an oxy form and a
deoxy form.
- Several hundred mutant hemoglobins are known to
exist. In most, a single amino acid replacement
occurs in either the α or β chain of normal Hb A.
- Many of these changes cause no known effect, but
several lead to pathologies associated with abnormal
O2 transport.
- In sickle cell hemoglobin, HbS, a single amino acid is
replaced: Val substitutes for Glu at position 6 of the β
chain. This seemingly innocuous change places a hydrophobic
side chain on the surface of the protein. In the deoxy
conformation, the Val side chain of a β chain in one Hb
binds to the β chain of another Hb. This leads to polymer
formation and precipitation of the deoxy Hb, which in turn
leads to red cell lysis and anemia (Hemoglobin S).
Amino Acid Composition
- The amino acid composition is a fundamental
characteristic of any protein.
- Hydrolysis of the protein in acid releases
the amino acids which are then quantitated using
ion exchange chromatography in an automated
amino acid analyzer.
- The amino acid peaks can be detected using
ninhydrin, which reacts with the free amino
groups of amino acids to produce a purple color,
or by reaction with reagents that generate
fluorescent derivatives, permitting detection of
much smaller amounts of each amino acid.
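When the sequence is already known, the composition an analyzer should report can be predicted by simple counting. The sketch below does only that; the function name `composition` and the example peptide are hypothetical. It ignores the chemistry of acid hydrolysis, which in practice deamidates Asn and Gln to Asp and Glu and destroys Trp, so a measured composition will differ for those residues.

```python
from collections import Counter


def composition(sequence):
    """Map each residue (one-letter code) to (count, mole percent)."""
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}


# Hypothetical peptide, for illustration only.
for aa, (count, percent) in composition("AKFGRMWAKA").items():
    print(f"{aa}: {count} ({percent:.1f} mol%)")
```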
Amino Acid Sequence
- The amino acid sequence of each protein is unique, and determination of the amino acid sequence is an important part of characterizing proteins. Today, most protein amino acid sequences are deduced from the sequences of their genes, because sequencing DNA is much easier than sequencing proteins.
- However, determination of protein sequences is still an important tool in Biochemistry. We use an automated process based on the Edman reaction, with chromatographic techniques to identify the PTH (phenylthiohydantoin) derivative of the amino acid released at each cycle.
- Although each step proceeds with > 90% yield, eventually (after about 25-75 cycles) it becomes difficult to detect the newly released product. Thus a single series of Edman degradation reactions is not able to determine the entire sequence of a protein.
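The limit on the number of Edman cycles follows from compounding the per-step yield. The sketch below is a deliberately simple model that assumes the same yield at every cycle; the 0.90 figure comes from the "> 90%" above, while 0.95 and 0.98 are illustrative assumptions, as is the function name `cumulative_yield`.

```python
def cumulative_yield(step_yield, cycles):
    """Fraction of chains still releasing the correct residue after
    `cycles` Edman cycles, assuming a constant per-step yield."""
    return step_yield ** cycles


# Per-step yields of 0.95 and 0.98 are assumed values for comparison.
for step_yield in (0.90, 0.95, 0.98):
    for cycles in (25, 50, 75):
        remaining = cumulative_yield(step_yield, cycles)
        print(f"yield {step_yield:.2f}, {cycles} cycles: {remaining:.3f}")
```

Because the signal falls geometrically, even a small improvement in the per-step yield extends the readable length considerably, but no realistic yield allows a whole protein to be read in one run.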
- What is needed are smaller fragments, with new amino
termini, which can be individually purified and
sequenced. This is accomplished by cleaving the
protein with a proteolytic enzyme, such as trypsin, or a
chemical reagent such as cyanogen bromide, which
generates a set of peptides, fragments of the original
protein, that can be separated and sequenced.
- Trypsin cleaves peptide bonds on the carboxyl side
of Lys or Arg residues, as illustrated below.
- Chymotrypsin cleaves peptide bonds on the carboxyl side
of Phe, Trp or Tyr residues, but also sometimes on the
carboxyl side of other hydrophobic amino acids, e.g.
Val, Leu, Ile, or Met.
- Other proteases have different specificities.
- Cyanogen bromide cleaves on the carboxyl side of
Met residues, but the chemistry of the cleavage
converts the Met residue at the C-terminus of the new
peptide to a derivative that is converted by acid
hydrolysis to homoserine (R group is
-CH2-CH2-OH) rather than Met, so the
amino acid composition of the new peptide would show
homoserine.
- There are thus a variety of ways to fragment the
protein under investigation to determine the sequences
of manageable-size peptides.
- The problem, of course, is that once the proteolysis has been accomplished and the peptides separated and sequenced, you don't know how they are ordered in the original protein. Reestablishing the order is the big problem in protein sequencing. The method is like solving a puzzle: the sequences of the sets of peptides obtained from two different cleavage methods are examined for overlaps. For an example, see the simple practice problem for the sequence of a heptapeptide, and also the strategy for sequencing the B chain of insulin.
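The cleavage rules and the overlap puzzle above can be sketched in a few lines. This is a simplified illustration, not a laboratory protocol: the heptapeptide `AKFGRMW` and the function names are hypothetical, chymotrypsin is restricted to its main targets (Phe, Trp, Tyr), and the ordering step simply tries every arrangement of one peptide set, which is only practical for a handful of fragments.

```python
from itertools import permutations


def cleave_after(sequence, residues):
    """Cut on the carboxyl side of every residue listed in `residues`."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def order_fragments(set_a, set_b):
    """Return every ordering of set_a whose joined sequence contains
    all peptides of set_b, i.e. is consistent with the overlaps."""
    candidates = set()
    for perm in permutations(set_a):
        candidate = "".join(perm)
        if all(peptide in candidate for peptide in set_b):
            candidates.add(candidate)
    return sorted(candidates)


peptide = "AKFGRMW"                        # hypothetical heptapeptide
tryptic = cleave_after(peptide, "KR")      # trypsin: after Lys or Arg
chymotryptic = cleave_after(peptide, "FWY")  # chymotrypsin, main sites only
print(tryptic, chymotryptic)
print(order_fragments(tryptic, chymotryptic))
```

The chymotryptic peptide that spans a tryptic cut site is what fixes the order: only one arrangement of the tryptic peptides contains every chymotryptic peptide.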
Mass Spectrometry
- Recently mass spectrometry has become an important
technique in peptide/protein chemistry. Mass
spectrometers consist of three basic parts:
- an ion source that creates charged molecules in the
gas phase
- a mass analyzer that uses a physical property, e.g.,
time-of-flight (TOF), to separate ions
- a detector.
- Two important methods are used to create protein
ions:
- In matrix-assisted laser desorption ionization
(MALDI) ions are created by using a laser to
excite proteins in a crystalline matrix. MALDI
is particularly suited for determining the molecular
weight of proteins, often to accuracies of a few parts
per million. The spectrum shown above
illustrates the molecular masses of several peptides
in a mixture.
- In electrospray ionization (ESI) ions are
created by applying a potential to a flowing
liquid. This causes the liquid to spray and
protein ions to be created. This method can also
be used to measure molecular weight, but is most
powerful when used in tandem MS/MS.
- A tandem mass spectrometer combines two mass
analyzers with a method to energetically activate ions.
In the first spectrometer a particular ion is isolated
from all other ions that enter the mass analyzer (as
marked above), dissociated, and the m/z values of the
dissociation products determined in the second mass
analyzer. The dissociation process causes covalent bonds
to fragment. In the case of peptide ions,
fragmentation processes predominate at or around the
amide bond, creating a ladder of ions that is indicative
of an amino acid sequence, as illustrated below.
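The ladder of fragment ions can be predicted from a sequence, which shows why the spacing between neighboring peaks identifies each residue. The sketch below uses the conventional b-ion (N-terminal fragment) and y-ion (C-terminal fragment) naming, which is standard nomenclature rather than something defined in these notes. It assumes singly charged ions and monoisotopic masses, and the example peptide is hypothetical.

```python
# Monoisotopic residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ladder(peptide):
    """Return m/z lists of singly charged b- and y-ions, one pair for
    each amide bond that can break."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)          # N-terminal piece
        y_ions.append(sum(masses[i:]) + WATER + PROTON)  # C-terminal piece
    return b_ions, y_ions


b_ions, y_ions = fragment_ladder("AKFGR")  # hypothetical peptide
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"bond {i}: b = {b:.4f}, y = {y:.4f}")
```

Successive b-ions differ by exactly one residue mass, so reading the differences along the ladder spells out the sequence. Leu and Ile have identical masses and cannot be told apart this way.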
Sequence Homology
- Once the amino acid sequence of a protein has been
determined, powerful computer programs can be used to
determine whether the sequence is similar to those of
other proteins. (If you are interested, go to this web
site to see some of the tools available for proteomics.)
Such a search might give the results shown below.
#1 MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA
#2 MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA
#3 MKRTYQPSRVKRNRKFGFRARMKTKGGRLILSRRRAKGRMKLTV
#4 MKRTFQPSILKRNRSHGFRTRMATKNGRYILSRRRAKLRTRLTV
#5 MKRTYQPSKQKRNRTHGFRARMATKNGRQVLNRRRAKGRKRLTV
#6 TKRTFQPNNRRRARKHGFRARMRTRAGRAILSARRGKNRAELSA
#7 SKRTFQPNNRRRAKTHGFRLRMRTRAGRAILANRRAKGRASLSA
#8 GKRTFQPNNRRRARVHGFRLRMRTRAGRSIVSDRRRKGRRTLTA
- The degree of identity between the sequences
can be used to construct a distance matrix,
which indicates how closely related the
different sequences are. Here is one
for cytochrome c from a variety of
species.
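A distance matrix of this kind can be sketched from aligned sequences. The example uses sequences #1 to #3 from the alignment above and defines distance as one minus the fraction of identical positions. That definition is an assumption made for illustration (a published matrix may instead tabulate the number of differing residues), and the function names are hypothetical.

```python
def fraction_identity(a, b):
    """Fraction of aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Pairwise distances, taken here as 1 - fraction identity."""
    n = len(seqs)
    return [[1.0 - fraction_identity(seqs[i], seqs[j]) for j in range(n)]
            for i in range(n)]


# Sequences #1 to #3 from the alignment in these notes.
seqs = [
    "MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA",
    "MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA",
    "MKRTYQPSRVKRNRKFGFRARMKTKGGRLILSRRRAKGRMKLTV",
]
for row in distance_matrix(seqs):
    print("  ".join(f"{d:.3f}" for d in row))
```

The matrix is symmetric with zeros on the diagonal, so only the entries above (or below) the diagonal carry information.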
- Based on such a distance matrix, one can
then construct a phylogenetic tree, as
illustrated here for cytochrome c.
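One simple way to turn a distance matrix into a tree is average-linkage clustering (UPGMA): repeatedly join the two closest clusters and average their distances to everything else. The sketch below is only one of several tree-building methods and is not necessarily how the cytochrome c tree was made. The species labels and distances are hypothetical.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree as nested tuples by average-linkage clustering.
    `dist` maps frozenset({a, b}) to the distance between a and b."""
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: d[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to each remaining cluster is the size-weighted mean.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical distances, for illustration only.
names = ["A", "B", "C"]
dist = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.6,
}
print(upgma(names, dist))
```

The nesting of the tuples mirrors the branching order: the pair joined first sits deepest in the tree and represents the most closely related sequences.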
Genomics and Proteomics
- There has been a great deal of effort directed
towards determining the complete sequence of the human
genome (genomics) and many other genomes (including
yeast, and the fruit fly Drosophila melanogaster).
Once the complete sequence is finished, an important
issue looms: what to do with the data! Being able
to UNDERSTAND (and ultimately to make use of) the
information in the DNA sequence requires figuring out
what the proteins encoded by the genome are and what they
do (proteomics). In many cases we can deduce
the nature of the protein product of a gene by homology
to other proteins already sequenced, but in many other
cases (maybe >30%), we have no clue. We can use
biotechnology techniques to produce the protein, which
can then be purified and studied in order to try to
deduce its function. One important approach is to
determine its three dimensional structure, which may give
a clue to its function. The future of protein
biochemistry is indeed exciting!
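The first step from a DNA sequence to the protein it encodes is translation with the standard genetic code. The sketch below assumes an uninterrupted coding sequence read in frame from its first base; real genes, especially eukaryotic ones with introns, need the coding region identified first. The function name and the example DNA are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGAAACGTACCTAA"))  # hypothetical coding sequence
```

The deduced protein sequence can then be compared with known proteins to look for homology.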
Biochemistry 462a
http://www.biochem.arizona.edu/classes/bioc462/462a/462a.html
Department of
Biochemistry and Molecular
Biophysics
The University of Arizona
mawells@email.arizona.edu
All contents copyright © 1998-2000. All rights
reserved.
Last revision spring/summer 2000