Proteins: Purification and Characterization



Reading - Chapters 5 and 6, especially pp. 130-150
Practice problems - Chapter 5: 8,12-14,18
(primary sequence determination: #15, 17; simple heptapeptide sequence problem)

Key Concepts
  • Proteins can be purified by various methods, mainly chromatographic, given:
    • a source of the protein
    • a detection method specific for the protein of interest

  • Progress of purification can be followed using a purification table to monitor
    • total protein (need a method to measure protein concentration)
    • total activity (or other specific property) of protein of interest
    • Specific activity (total activity/total protein), which indicates relative degree of purity; constant specific activity is one indication that protein may be pure.

  • Separation methods based on:
    • differential centrifugation (to separate soluble from particulate fractions of crude cell lysate)
    • solubility (fractional precipitation, "salting out")
    • column chromatography
      • gel filtration (= size exclusion chromatog. = molecular sieve chromatog.), based on size and shape
      • ion exchange chromatography (based on net charge of protein at the working pH)
      • affinity chromatography (based on specific ligand binding)

  • Protein characterization:
    • molecular weight
      • electrophoresis (SDS-PAGE --> individual polypeptide chain molecular weights)
      • gel filtration (calibrated column --> approx. native molecular weight if column run under nondenaturing conditions)
      • ultracentrifugation (depends on size and shape, but can give very accurate molecular weight)
    • isoelectric point (charge properties)
      • isoelectric focusing (often used as the first dimension in 2-D gel separations to look at ALL the proteins in a complex mixture)
    • spectroscopic properties (give various kinds of structural and functional information)
      • uv-visible spectroscopy
        • absorbance spectroscopy
        • fluorescence spectroscopy
        • circular dichroism spectroscopy
      • nmr spectroscopy
    • determination of primary structure
      • inference from sequence of nucleotides in the gene, and/or
      • chemical methods
        • amino acid composition
        • amino terminal residue determination
        • Edman degradation
        • fragmentation and determination of overlapping fragment sequences
        • mass spectrometry (useful in many other ways, too, e.g. for identifying proteins, even in complex mixtures)
    • complete 3-dimensional structure determination
      • X-ray diffraction from crystals of protein
      • nmr of protein in solution (only for small proteins, with current technology)

PROTEIN PURIFICATION
-Studies on pure proteins are essential for understanding structural and functional properties of proteins.
-Method for each protein worked out by trial and error on small samples

  • goal: separate the protein you want from other proteins and small molecules
  • mild conditions to avoid denaturation (usually low temperature, 0–4° C, and avoiding extremes of pH)
  • need detection method (e.g. biological activity, or spectroscopy)
  • usually use several purification methods, one after another
  • start with (mixture of) proteins in buffered solution, e.g. extract of proteins from cells that have been lysed (broken open)

Source of Protein
In order to purify a protein you need a source: could be blood or some other biological fluid, but most often whole cells, usually a specific type (liver, muscle, yeast, bacteria, etc.) 

  • Cells must be broken open (lysed, e.g., by osmotic shock or by mechanical disruption such as with a "French press" or a tissue homogenizer) to disrupt cell membranes to release proteins in soluble form without damaging the protein.
  • Membrane-bound proteins can also be purified, but different approaches are required.

Detection Method (ASSAY)
have to be able to measure the specific protein of interest in order to separate the "goody" from the "crud" in mixture

  • assay = test for unique property of protein of interest, e.g. specific catalytic activity of an enzyme, or spectroscopic property of a unique prosthetic group
  • also need to measure total amount of protein present in the mixture (goody + crud), by colorimetric measurements or sometimes using absorbance of protein (e.g., A280nm, which really is detecting aromatic sidechains)
  • specific activity = ratio of activity/total protein
    • As protein is being purified, ratio of activity (proportional to amount of "goody") to total protein (goody + crud) should increase, as you keep goody but get rid of crud in stepwise fashion.
    • Thus specific activity should increase until protein is pure (at which point you can't get rid of any more of the total protein without losing a proportional amount of activity, so the specific activity reaches a plateau and becomes constant).
    • Goal of a purification scheme is to maximize specific activity, which is maximal when protein is pure.
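
The bookkeeping described above can be sketched in a few lines of Python. The step names and numbers are hypothetical, chosen only for illustration. Fold purification and percent yield are extra columns often added to such a table; they are not defined in these notes.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Specific activity = total activity / total protein (units per mg)."""
    return total_activity_units / total_protein_mg

# Hypothetical purification steps:
# (name, total protein in mg, total activity in units)
steps = [
    ("crude extract", 1000.0, 50000.0),
    ("salt fractionation", 200.0, 40000.0),
    ("ion exchange", 10.0, 30000.0),
]

def purification_table(steps):
    """Return rows of (name, specific activity, fold purification, % yield).

    Fold purification and yield are both relative to the first step.
    """
    _, protein0, activity0 = steps[0]
    sa0 = specific_activity(activity0, protein0)
    rows = []
    for name, protein, activity in steps:
        sa = specific_activity(activity, protein)
        rows.append((name, sa, sa / sa0, 100.0 * activity / activity0))
    return rows

for name, sa, fold, yield_pct in purification_table(steps):
    print(f"{name:20s} {sa:8.1f} units/mg  {fold:5.1f}-fold  {yield_pct:5.1f}% yield")
```

In this made-up example some activity is lost at every step, but total protein falls much faster, so specific activity rises. A further step that left specific activity unchanged would be one indication that the protein may be pure.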

Initial fractionation of homogenate

  • usually by differential centrifugation --> several fractions (successive pellets, supernatants) of decreasing density, each with lots of proteins
  • assay each fraction to find which fraction contains most of the protein of interest, and fractionate that further by more discriminating methods.

Separation Methods
based on differences in properties of different proteins: differential solubility, or size, or charge, or binding affinity for specific ligands, etc.

  • Fractional Precipitation (based on differences in solubility properties) ("salting out"); often the first step in purification
    • Proteins require H2O molecules interacting with surface groups, in order to stay in aqueous solution (hydration).
    • Salting out usually uses increasing concentrations of ammonium sulfate [(NH4)2SO4] to compete with the protein groups for the available H2O.
    • method is crude (no precise separations) but a good way to rapidly get rid of a lot of "crud"
    • Like all purification methods, salt fractionation has to be worked out empirically for each protein of interest
    • Every protein in the solution has its own solubility limits in ammonium sulfate, independent of the other proteins in the mixture.
      • Solubility is affected by concentration of the protein of interest, pH, and temperature
      • In general,
        • small proteins more soluble than large proteins
        • the larger the number of charged side chains, the more soluble the protein
        • proteins usually least soluble at their isoelectric points (pI, the pH where a molecule's (protein's) net charge is zero)
          • determined by protein's amino acid composition
        • solubility of protein X is totally independent of solubility of protein Y -- solubility depends on the surface properties of each individual kind of protein.
    • Useful method for concentrating the protein (precipitate it out and then redissolve it in smaller
      volume) as well as for crude separation from other proteins.
  • Column Chromatography
    • Invention of column chromatography a critical event in biochemistry, because it was the basis for development of procedures for obtaining pure proteins.
    • Different kinds of chromatographic separations based on one of the following:
      • size of protein (molecular sieve chromatography = gel filtration = size exclusion chromatography), or
      • net charge of protein (ion exchange chromatography), or
      • specific ligand binding properties of protein (affinity chromatography)
    • In column chromatography a solid phase ("matrix", "resin", generally some kind of polymer, often a polysaccharide)(see below) is placed in a glass tube, the column.
    • terminology:
      • adsorbent: solid material/matrix, a "stationary phase" that some molecules bind to (adsorb to)
      • elution: the process of washing something off an adsorbent (with an eluting buffer; the solution coming off the column is the eluate.)
    • See also Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed., Fig. 5-17 (p. 131).
    • Protein mixture is passed into the column.
    • Either due to molecular size differences or different binding affinities for column matrix, some proteins are retained longer on the column (e.g., some bind more tightly than others), and so elute later.
    • Binding properties obviously depend on what type of stationary phase (column matrix) is used
    • By repeating this procedure with several different adsorbents, pure protein can be obtained.  
    • The properties of some different types of column packing materials (for separations based on molecular size, charge or specific ligand binding) are described below.
  • Gel Filtration Chromatography (also called "Molecular Sieve" chromatography, or "Size Exclusion" chromatography)
    • stationary phase (column matrix) = "beads" of a polysaccharide material that separates proteins based on size and shape
      • Different column packing materials (hydrated, porous beads of carbohydrate polymer (e.g. dextran or agarose) or polyacrylamide) available, with wide range of molecular exclusion limits, for separating proteins of all sizes.
      • Solution of mixture of proteins, small molecules, etc. "filters" through the beads:
        • Large molecules can’t get into the smaller pores in the beads and move more rapidly through the column, emerging (eluting) sooner.
        • Smaller molecules and ions can enter all the pores in the beads with the buffer, and thus have more space to "explore" on their way down the column, and elute later.
    • For any particular column dimensions and material, volume of buffer required to elute a specific protein depends mostly on molecular weight of the protein (but shape plays an important role also -- separation is really based on differences in hydrodynamic volume). Thus, one can separate proteins by size.
    • Fig. 5-18a (Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed.): Size exclusion chromatography
    • This animation illustrates how size exclusion chromatography works.  Note how the small red spheres pass into the channels in the beads, whereas the large blue spheres do not.  Thus, the small spheres have a longer "distance" to traverse than the large spheres to get to the bottom of the column, which means that a larger volume of solvent must pass through the column before the red spheres are eluted.

    • The following plot of relative amount of the large solute (blue) and of the smaller solute (red) goes with the animation.
    • Larger solutes elute EARLIER, smaller solutes LATER, from a size exclusion column.

     

     

     

    • calibrate the column:
      • determine elution volumes of proteins with known molecular weights
      • construct a calibration curve relating (known) molecular weight to (measured) elution volume specifically for that column.
    • Such a calibration curve can then be used to estimate the molecular weight of an unknown protein.
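
The calibration procedure above can be sketched as a straight-line fit. This is a minimal sketch, not a prescribed method: it assumes that log10(molecular weight) falls roughly linearly with elution volume over the working range of the column, and the standards listed are hypothetical values.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical standards: (molecular weight in Da, elution volume in ml)
standards = [(200000, 50.0), (66000, 62.0), (29000, 71.0), (12400, 80.0)]

# Calibration curve for THIS column: log10(MW) as a function of elution volume
slope, intercept = fit_line(
    [volume for _, volume in standards],
    [math.log10(mw) for mw, _ in standards],
)

def estimate_mw(elution_volume_ml):
    """Estimate molecular weight (Da) of an unknown from its elution volume."""
    return 10 ** (slope * elution_volume_ml + intercept)

print(f"unknown eluting at 66.0 ml: about {estimate_mw(66.0):.0f} Da")
```

The slope is negative because larger proteins elute earlier. The estimate is only as good as the assumption that the unknown has a shape similar to the standards, since separation really depends on hydrodynamic volume.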

  • Ion Exchange Chromatography
    • Ion exchange resins have charged groups covalently attached to the stationary phase (adsorbent, matrix), either positive or negative.  Obviously, if ionizable groups are weak acids or bases, the pH of the buffer determines the charge state of the matrix.
    • Proteins bind to the matrix by electrostatic interactions
    • Strength of these interactions depends on
      • net charge on the protein (a function of buffer pH and the nature of the ionizable groups on that protein, reflected in the pI of the protein), and
      • salt concentration of the buffer (high salt concentrations reduce the interaction and can be used to elute the proteins by competing with the protein groups for binding to the charged groups on the matrix).
    • The higher the net charge on the protein at the pH of the environment on the column, the more tightly it sticks to an oppositely charged matrix, and the higher the salt concentration required to elute it from the column.
    • The further the "working pH" is from the isoelectric point (pI) of a protein, the greater the net charge on the protein, and the more tightly it will stick to an ion exchanger of opposite charge.
    • By proper choice of eluting buffer (often a gradient with increasing salt concentration, or changing the pH), specific proteins can be eluted from the column and separated from other proteins in the mixture

    • Fig. 5-18b (Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed.): Ion Exchange Chromatography
      • Example in figure is cation exchange chromatography -- column packing beads have covalently attached negatively charged groups
      • Negatively charged solutes move down the column more or less without sticking, so they elute first.
      • Positively charged solutes bind, and the higher the positive charge on a molecule, the tighter it binds, so the later it elutes.
  • Example: Suppose you have 5 different proteins, with relative isoelectric points as indicated on the pH scale below.

pH SCALE (working pH = 6.5 for these examples):

0  -----------pI#5----------pI#4------- 6.5 --------pI#1-----------pI#2----------pI#3-------------  14


Suppose that your column is equilibrated and being eluted at pH 6.5 (the working pH is 6.5), by washing the column with a gradient of buffer of increasing salt concentration.

Protein | What's the RELATIVE net charge at pH 6.5?
1       |
2       |
3       |
4       |
5       |
    • CATION EXCHANGE
      Cation exchange matrix has – charged groups (e.g., carboxymethyl (CM) groups).
      • A molecule with a net - charge won't stick, so will wash on through and elute before anything else (proteins 4 and 5 in the current example).
      • Molecules with net + charge will elute in the order of their pI values, because of differences in net charge: the most + charged one (the one whose pI is furthest from the working pH) sticks the most tightly (elutes last). See elution profile below.
    • ANION EXCHANGE
      Anion exchange matrix has + charged groups (e.g., DEAE (diethylaminoethyl) groups).
      • A molecule with a net + charge won't stick, so will wash on through and elute before anything else (proteins 1, 2 and 3 in the current example).
      • Molecules with net - charge will elute in the order of their pI values, because of differences in net charge: the most - charged one (the one whose pI is furthest from the working pH) sticks the most tightly (elutes last). See elution profile below.

      Label the peaks below with #1, #2, #3, #4, and/or #5, based on the expected order of elution of proteins #1-5 from a cation exchange column, or from an anion exchange column, at pH 6.5.
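
The reasoning in the cation and anion exchange examples can be written out as a small Python function. This is a sketch under the same simplification used in the notes: the distance between pI and working pH stands in for the size of the net charge. The numeric pI values are hypothetical; only their order relative to pH 6.5 follows the pH scale above.

```python
def elution_order(pis, working_ph, exchanger):
    """Predict elution behavior on a salt gradient from pI values alone.

    pis: dict mapping protein label -> isoelectric point
    exchanger: "cation" (matrix has - groups) or "anion" (matrix has + groups)
    Returns (flow_through, bound), with bound listed in order of elution.
    """
    if exchanger == "cation":
        # Protein carries a net + charge when the working pH is below its pI
        binds = lambda pi: pi > working_ph
    elif exchanger == "anion":
        # Protein carries a net - charge when the working pH is above its pI
        binds = lambda pi: pi < working_ph
    else:
        raise ValueError("exchanger must be 'cation' or 'anion'")

    flow_through = sorted(p for p, pi in pis.items() if not binds(pi))
    bound = [p for p, pi in pis.items() if binds(pi)]
    # Further from the working pH -> larger net charge -> elutes later
    bound.sort(key=lambda p: abs(pis[p] - working_ph))
    return flow_through, bound

# Hypothetical pI values consistent with the ordering on the pH scale
example_pis = {1: 7.5, 2: 9.0, 3: 10.5, 4: 5.0, 5: 3.5}

print(elution_order(example_pis, 6.5, "cation"))
print(elution_order(example_pis, 6.5, "anion"))
```

Proteins in the flow-through list come off together before the gradient starts; the bound list gives the order of the later peaks.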
  • Affinity Chromatography
    • a more specific adsorbent in which a ligand specifically recognized by the protein of interest is covalently attached to the column material 
    • When a mixture of proteins is passed through the column, only those few that bind strongly to the ligand stick, while the others pass through the column. 
    • Protein of interest is eluted with a buffer containing the free ligand, which competes with the column ligand to bind to the protein, and protein washes off (with bound ligand).
    • Fig. 5-18c (Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed.): Affinity Chromatography  
    • some variations:
      • immunoaffinity chromatography: an antibody specific for a protein is immobilized on the column and used to affinity purify the specific protein.
      • "polyHis tags" on recombinant proteins: a sequence of His residues is placed (by genetic engineering of a cloned gene) at the C-terminus of a specific recombinant protein to be produced in vivo, and that protein can be purified on a column with Ni2+ ions (or Cu2+ or Co2+ or Zn2+) held in chelated form on an affinity column; the His imidazole groups on the end of the recombinant protein bind with high affinity, but other proteins don't stick. The recombinant protein can then be eluted with an imidazole buffer.
  • Dialysis/Ultrafiltration
    • "bags" made of semipermeable membranes
      • allow passage of small molecules but exclude the passage of proteins
    • Sacs made of such material allow the salt and buffer components of a protein solution to be changed to another buffer
    • very convenient when protein elutes from one column in a high salt buffer and you need to transfer it to a lower salt (or different pH, etc.) buffer for the next column

Monitoring progress of a purification scheme: the "Purification Table":

  • Table 5-5 (Nelson & Cox, Lehninger Principles of Biochemistry):
  • If one more purification step (e.g., another chromatographic column) resulted in the following,

    volume (ml) | total protein (mg) | activity (units) | specific activity (units/mg protein)
    5           | 2.4                | 36,000           | 15,000

    What would that suggest about the purity of the protein of interest, and why?
    (Do problem #14, p. 156, in your textbook, for practice.)

PROTEIN CHARACTERIZATION
(These methods are used more for characterization than for purification, though some might sometimes be used for purification.)

Electrophoresis

  • In an electric field, a protein or other charged macromolecule will move with a velocity that depends directly on the charge on the macromolecule and inversely on its size and shape.
  • pH obviously important in determining net charge
  • Gel electrophoresis is carried out in some supporting medium, usually polyacrylamide or agarose, with pores big enough to allow passage of the macromolecule.
  • Electric field is applied, and molecules move toward electrode opposite to their net charge, but they’re slowed down ("friction") by the gel
    • larger or more elongated shaped molecules move the most slowly
    • smaller, most compact molecules move faster.
  • The proteins in the gel are easily stained for detection purposes. 
  • Because the net charge on a protein and its molecular weight are characteristic properties of a protein, electrophoresis is a powerful method for characterizing degree of purity of a protein preparation, but can also be used for purification of small amounts of proteins
  • Discontinuous Gel Electrophoresis ("disc gel electrophoresis")
    3 experimental variations to ordinary gel electrophoresis:
    1) 2 gel layers, a lower or resolving gel and an upper or stacking gel
    2) The buffers used to prepare the 2 gel layers are of different ionic strengths and pH
    3) The stacking gel (upper gel) has a lower acrylamide concentration, so its pore sizes are larger.
    These variations cause formation of highly concentrated bands of sample in stacking gel and greater resolution of sample components in lower (resolving) gel.
    • The following copyrighted figures are from a course (Biology 3515/Chemistry 3515) taught at the University of Utah by Dr. David P. Goldenberg.
    • Stacking and separation in a discontinuous gel:

    • Buffer compositions control stacking and separation:

    • Glycine equilibria:
    • Formation of an ion front:
    • The voltage gradient sharpens the ion boundary:
    • What happens to the proteins?
      Proteins have mobilities between those of Gly and Cl-.
    • In separating gel,
    • Glycine mobility increases, becomes greater than protein mobility, but still slower than Cl-.
    • Protein sample, now in a narrow band, encounters both the increase in pH and decrease in pore size.
      Increase in pH would tend to increase electrophoretic mobility, but smaller pores decrease mobility.
      Relative rate of movement of ions in lower gel is chloride > glycinate > protein.
      Proteins separate based on charge/mass ratio and on size and shape parameters.

     

  • SDS-PAGE (Sodium Dodecyl Sulfate-PolyAcrylamide Gel Electrophoresis)
    a variant of electrophoresis in which the buffers contain SDS, a detergent that binds to proteins.

    Sodium dodecyl sulfate, SDS
    CH3(CH2)10CH2-SO4-, Na+
  • Most proteins bind SDS at a constant ratio of about 1.4 g SDS/g protein, i.e., about 1 SDS for every 2 amino acid residues, unfolding the proteins
  • Sample treatment before running gel includes β-mercaptoethanol reduction (so no disulfide bonds left) and heating to ensure complete unfolding and complete separation of different polypeptide chains
  • large negative charge resulting from the bound SDS masks the native charge on the protein, so that all proteins have essentially the same charge to mass ratio (very negative), and same shape ("random coil") so
  • rate of movement in the electric field (toward the + pole because of – charge on sulfates) depends only on the molecular weight of individual polypeptide chains (which travel separately)
  • Protein mobility DECREASES approximately linearly with the log of the MASS of individual polypeptide chains (larger chains move more slowly), and net charge of protein itself hardly makes any difference at all.
  • SDS-PAGE often used to
    • ESTIMATE PURITY (number of stained or radioactive or fluorescent bands on the gel) and to
    • DETERMINE MOLECULAR WEIGHT of INDIVIDUAL POLYPEPTIDE SUBUNITS of proteins (using standards of known polypeptide chain mass)
    • Purification of small amounts of polypeptide for sequence analysis
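
The statement that 1.4 g SDS/g protein corresponds to about 1 SDS per 2 residues can be checked with a short calculation. The average residue mass of 110 g/mol is an assumed round figure, not a value given in these notes.

```python
SDS_MOLAR_MASS = 288.4      # g/mol, sodium dodecyl sulfate (C12H25NaO4S)
AVG_RESIDUE_MASS = 110.0    # g/mol, assumed average amino acid residue mass

def sds_per_residue(g_sds_per_g_protein=1.4):
    """Convert a mass binding ratio into SDS molecules bound per residue."""
    mol_sds = g_sds_per_g_protein / SDS_MOLAR_MASS   # mol SDS per g protein
    mol_residues = 1.0 / AVG_RESIDUE_MASS            # mol residues per g protein
    return mol_sds / mol_residues

print(round(sds_per_residue(), 2))   # about 0.53, i.e. roughly 1 SDS per 2 residues
```

Because the ratio is per residue, the bound negative charge grows in proportion to chain length, which is why the charge to mass ratio comes out about the same for all polypeptides.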

Fig. 5-20 (Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed.): Estimating protein molecular weight from SDS gel electrophoresis
a) Diagram of a stained SDS gel: standards of known molecular weight (lane 1) and pure protein of unknown M.W. in lane 2
b) "standard curve" (calibration) to relate M.W. to mobility on THIS GEL

  • This figure illustrates several of the techniques discussed above. It is taken from "Isolation, Characterization, and cDNA Sequence of Two Fatty Acid-Binding Proteins from the Midgut of Manduca sexta Larvae". A. F. Smith, K. Tsuchida, E. Hanneman, T. C. Suzuki, and M. A. Wells, J. Biol. Chem. 267, 380-384 (1992).
    • Elution profile from an anion exchange resin (binds negatively charged proteins)
    • Proteins were eluted by increasing NaCl concentration in the eluting buffer.
    • Total protein was measured by determining the absorbance at 280 nm.
    • In order to "assay" (identify) the fatty acid-binding proteins, they were labeled by binding radioactive fatty acids (CPM=counts per minute - gray shading).
    • Purity of each peak was assessed using SDS-PAGE (insert/overlay).
    • There are two nearly pure proteins that bind fatty acids.
    • The two proteins were obtained in pure form following one additional step (not shown).




  • Western blotting is an immunological technique for detecting a specific protein in a mixture separated by gel electrophoresis: the separated proteins are transferred (blotted) from the gel to a membrane, and antibodies specific for the protein of interest are used to detect it there.

Isoelectric Focusing  

  • separation based on differences in ISOELECTRIC POINT (pI) (so based on CHARGE DIFFERENCES)
  • Fig. 5-21 (Nelson & Cox, Lehninger Principles of Biochemistry, 3rd ed.): Isoelectric Focusing
  • pH gradient set up first (using purchased mixture of ampholytes, different molecules designed to have range of pIs, which are first electrophoresed on the gel to form the pH gradient)
  • Mixture of molecules (proteins) is then applied, electric field is turned on, and each protein moves to the position (pH) at which its net charge is zero, i.e., its pI.
  • Two-dimensional Electrophoresis
    isoelectric focusing in first dimension, followed by SDS-PAGE at 90° to that (2nd dimension)

Ultracentrifugation 

  • Molecular Weight and Shape = fundamental physical properties of a protein. 
  • Estimates of molecular weight can be obtained using SDS-PAGE or gel filtration, as described above.
  • One very useful technique for measuring molecular weight and shape is centrifugation.
  • A particle that's subjected to a centrifugal field by being spun in a centrifuge is subjected to a force,

    $$F = m\,\omega^{2}\,r\,(1 - \bar{v}\rho)$$

    where m is the mass of the particle, r is the distance of the particle from the center of rotation, and $\omega$ is the angular velocity.
    $(1 - \bar{v}\rho)$ = buoyancy factor, which accounts for the fact that particle is buoyed up by the surrounding solvent of density $\rho$ (g/ml).
    $\bar{v}$ is the specific volume of the particle (ml/g) (= 1/density of the particle).
    If $1/\bar{v} = \rho$ (particle density equals solvent density) then the particle will not move.

  • Movement of particle through the solvent is resisted by a frictional coefficient, f, that depends on the shape of the particle.
    Frictional coefficient is an important factor in any transport process, such as centrifugation or gel filtration.
    Expressed relative to a sphere of the same volume (the frictional ratio), a spherical particle has a value of 1.0, whereas a cigar-shaped or cylindrically-shaped particle will have a value > 1.0.
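  • Movement of any particle under the influence of a centrifugal field is characterized by its sedimentation coefficient, S, which is directly proportional to its molecular mass, M, and inversely proportional to f:

    $$S = \frac{M\,(1 - \bar{v}\rho)}{N f}$$

    where N is Avogadro's number, $\bar{v}$ is the specific volume of the particle, and $\rho$ is the density of the solvent.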

  • Ultracentrifugation is used in two ways to characterize proteins:

In sedimentation equilibrium experiments, the centrifuge is operated at a relatively low speed so that the forces of sedimentation and diffusion balance and the protein distributes in the centrifuge cell in a manner that depends on its molecular weight.

In sedimentation velocity experiments, the centrifuge is operated at maximal speed, which causes the protein to sediment to the bottom of the tube.  The rate at which the boundary moves gives S, which when combined with M gives f, a measure of the shape of the protein.
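
How S and M combine to give f can be sketched as follows, taking the relation S = M(1 - v̄ρ)/(N f) and rearranging it for f. The numerical inputs describe a hypothetical protein and are not data from these notes. Units are CGS: M in g/mol, specific volume in ml/g, density in g/ml, S in seconds (1 Svedberg unit = 1e-13 s), and f in g/s.

```python
AVOGADRO = 6.022e23  # molecules per mole

def sedimentation_coefficient(M, v_bar, rho, f):
    """S = M (1 - v_bar * rho) / (N f), in seconds."""
    return M * (1.0 - v_bar * rho) / (AVOGADRO * f)

def frictional_coefficient(M, v_bar, rho, s):
    """Rearranged form: f = M (1 - v_bar * rho) / (N S), in g/s."""
    return M * (1.0 - v_bar * rho) / (AVOGADRO * s)

# Hypothetical protein: M = 64,500 g/mol, specific volume 0.75 ml/g,
# in a solvent of density 1.00 g/ml, with a measured S of 4.5 Svedberg units
M, v_bar, rho = 64500.0, 0.75, 1.00
s_measured = 4.5e-13  # seconds

f = frictional_coefficient(M, v_bar, rho, s_measured)
print(f"f = {f:.2e} g/s")
```

Note that when the specific volume times the solvent density equals 1, the buoyancy factor is zero and S is zero regardless of M: the particle does not sediment.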

 

Spectroscopic Methods
Spectroscopy = the study of the interactions between (proteins) and electromagnetic radiation.

FYI, there are good, brief explanations of different types of spectroscopy for biochemical applications, with nice examples, in a textbook that used to be used for this course: C. K. Mathews & K. E. van Holde, Biochemistry, 2nd ed. (1996), Benjamin/Cummings Publishing Co., pp. 204-210.     The discussion below comes from that source.

    • Basic principles of absorption of radiation, using a diatomic molecule for illustration:
      • When 2 atoms interact to form a molecule, the potential energy curve for the lowest-energy electronic state (the ground state) will look like the lower curve in Fig. 6A.1 below.
      • Excited electronic states will have similar curves for energy vs. interatomic distance, but at higher energies.
      • For each electronic state of the molecule, there will be a series of allowed vibrational states, with energies indicated by horizontal lines in the figure.
    • Basics of molecular spectroscopy -- 2 simple rules:
      1. Transitions are possible only between allowed energy states of the molecule (energy levels are quantized); and
      2. The energy ($\Delta E$) that has to be absorbed in any transition determines the wavelength ($\lambda$) of the radiation that is absorbed to accomplish that transition. The energy in a quantum of radiation is inversely proportional to $\lambda$:
        $$\Delta E = \frac{hc}{\lambda}, \qquad \Delta E = E_{\text{final state}} - E_{\text{initial state}}$$
        where h is Planck's constant ($6.626 \times 10^{-34}$ J·s), and c is the velocity of light ($3 \times 10^{8}$ m/s).
    • High-energy transitions between electronic states of a molecule lead to absorption in the visible or ultraviolet region of the spectrum, whereas low-energy transitions between different vibrational levels correspond to absorption of infrared energy -- see Fig. 6A.1b below.
    • Fig. 6A.1 from Mathews and van Holde, Biochemistry, 2nd ed., 1996: The principles of absorption spectroscopy. (a) Electronic and vibrational transitions in a diatomic molecule. (b) The electromagnetic spectrum.
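
As a quick numerical illustration of the inverse relation between transition energy and wavelength (energy = hc/wavelength), the sketch below uses the constants quoted above. The two wavelengths are arbitrary examples from the ultraviolet and visible regions.

```python
PLANCK = 6.626e-34      # J*s
LIGHT_SPEED = 3.0e8     # m/s

def photon_energy_joules(wavelength_nm):
    """Energy of one quantum of radiation, E = h*c / wavelength."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)

for wavelength_nm in (280.0, 560.0):
    energy = photon_energy_joules(wavelength_nm)
    print(f"{wavelength_nm:.0f} nm -> {energy:.3e} J per photon")
```

Halving the wavelength doubles the energy per photon, which is why electronic transitions (large energy gaps) absorb in the uv-visible range while vibrational transitions (small gaps) absorb in the infrared.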
  • Ultraviolet-visible spectroscopy (uv-vis frequency range)
    • ABSORPTION SPECTROSCOPY
      • terminology:
        Absorption
        = transfer of energy from a photon (light) to a molecule
        Chromophore = a molecule or a group on a molecule that absorbs light
      • Chromophores in proteins include
        • the peptide bond (maximum wavelength of absorbed light, λmax, ~220 nm, "far" uv)
        • aromatic a.a. residues (λmax ~280 nm for Trp, "near" uv; aromatics also absorb ~220 nm)
        • some prosthetic groups (tightly bound non-amino acid components in proteins, e.g., the heme in hemoglobin and myoglobin is red -- it absorbs visible light.)
      • USES of absorbance spectroscopy:
        • determine concentration (Beer's Law)
        • conformational changes (environment of chromophore affects λmax and absorbance)
        • detect and quantitate ligand binding (e.g., O2 binding to hemoglobin changes absorbance of the heme)
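
Determining concentration with Beer's Law can be sketched as below. The law states that absorbance equals extinction coefficient times concentration times path length. The extinction coefficient and absorbance reading used here are hypothetical values for an imaginary protein.

```python
def concentration_molar(absorbance, extinction_coeff, path_length_cm=1.0):
    """Beer's Law, A = epsilon * c * l, solved for concentration c.

    extinction_coeff is the molar extinction coefficient in M^-1 cm^-1.
    """
    return absorbance / (extinction_coeff * path_length_cm)

# Hypothetical protein with epsilon(280 nm) = 40,000 M^-1 cm^-1,
# measured A280 = 0.80 in a 1 cm cuvette
c = concentration_molar(0.80, 40000.0)
print(f"{c * 1e6:.1f} micromolar")
```

The calculation is only meaningful in the range where absorbance stays proportional to concentration, so very concentrated samples are usually diluted before reading.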

Example: Absorption spectra of deoxy- and oxyhemoglobin

  • Stryer, Biochemistry, 4th ed. (1995), Fig. 7.12: The visible absorption spectrum of hemoglobin changes markedly upon binding of O2 or CO
    • DeoxyHb (blue) has single absorbance maximum ~550 nm.
    • OxyHb (red) has 2 absorbance peaks, at about 540 nm and 575 nm.
    • Maximum difference between deoxy and oxy spectra is seen at about 576 nm.
    • For a given Hb solution with no O2 present, value of A576nm indicates all "empty" sites (all deoxy, so θ = [occupied sites]/[total sites] = 0)
    • When [O2] has been increased to a concentration sufficient to essentially saturate the binding sites on the Hb (θ = 1), that maximal A576nm indicates that all binding sites in the solution are "occupied".
    • As O2 concentration increases from 0 to saturating, θ increases and can be monitored by the CHANGE in A576nm, ΔA576nm, up to the maximum ΔA576nm, which occurs when all the sites have O2 bound (θ = 1).
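
Monitoring fractional saturation from the absorbance change can be sketched as follows. The sketch assumes the change in absorbance at 576 nm is directly proportional to the fraction of occupied sites, and the absorbance readings are hypothetical.

```python
def fractional_saturation(a_observed, a_deoxy, a_oxy):
    """Fraction of sites occupied = (change in A) / (maximum change in A).

    a_deoxy: absorbance with no O2 present (all sites empty)
    a_oxy:   absorbance at saturating O2 (all sites occupied)
    """
    delta_a = a_observed - a_deoxy
    delta_a_max = a_oxy - a_deoxy
    return delta_a / delta_a_max

# Hypothetical A576nm readings for one Hb solution
A_DEOXY, A_OXY = 0.40, 0.90
for a in (0.40, 0.65, 0.90):
    print(f"A576 = {a:.2f} -> fraction occupied = {fractional_saturation(a, A_DEOXY, A_OXY):.2f}")
```

Repeating the measurement over a series of O2 concentrations gives the points for a binding curve of fraction occupied versus [O2].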
  • FLUORESCENCE SPECTROSCOPY
    • Fig. 6A.4 from Mathews and van Holde, Biochemistry, 2nd ed., 1996: Fluorescence. (a) The principle of fluorescence. (b) Absorption and fluorescence emission spectra of tyrosine.
    • In most cases, molecules raised to an excited electronic state by absorption of radiant energy return to ground state by radiationless transfer of the excitation energy to the surrounding molecules in the form of heat.
    • Sometimes an excited-state molecule will lose only part of its energy by transfer (yellow arrow below), and will re-radiate the larger part as light (green arrow below). That emitted light is fluorescence.
    • Since energy of emitted light is always lower than energy of absorbed light, fluorescence emission is always at a longer wavelength than wavelength of the exciting (absorbed) light (Fig. 6A.4(b) below).
    • terminology
      Fluorophore
      = a molecule that absorbs light but then returns to the ground state by emitting some of the light as a photon rather than losing all the energy as heat
    • wavelength and intensity of emitted light both very sensitive to the environment of the fluorophore (e.g., hydrophobic vs. aqueous environment can shift emission spectrum)
    • measurements very sensitive so can detect small amounts of protein or other fluorophore
    • Fluorophores in proteins
      • Trp (wavelength of maximum fluorescence emission, λmax, ~340 nm) is the strongest source of intrinsic fluorescence in proteins without fluorescent prosthetic groups, but tyrosine also contributes to intrinsic fluorescence (see Fig. 6A.4(b) above).
      • Some ligands and prosthetic groups are fluorescent, e.g. the chromophore in green fluorescent protein
    • USES of fluorescence spectroscopy -- examples:
      • detect conformational changes
        • e.g. during protein folding (environment of chromophore affects λmax and intensity of Trp fluorescence; the more hydrophobic the environment, e.g. as Trp residues get buried in the interior of the protein during folding, the shorter the wavelength of maximum fluorescence emission)
      • detect and quantitate ligand binding


  • CIRCULAR DICHROISM (CD) SPECTROSCOPY
    • CD measures interactions of polarized light with chiral protein components that are formed into different types of 2° structure (asymmetric), and with other chromophores (ligands or prosthetic groups or aromatic R groups) that are asymmetrically bound.
      • Unpolarized light consists of waves vibrating in all planes perpendicular to the direction of travel.
      • Plane polarized light has waves vibrating in a single plane (Fig. 6a.5(a) below, top diagram). In plane polarized light, the varying electric field of the radiation has a fixed orientation.
      • In circularly polarized light, the direction of polarization rotates with the frequency of the radiation (Fig. 6A.5(a) below, bottom diagram). If you observe a circularly polarized beam of light coming toward you, the electric field can be rotating in either a clockwise direction (right circularly polarized light) or a counterclockwise direction (left circularly polarized light).
    • Asymmetric molecules and components of molecules (e.g. D- and L-amino acids; right- and left-handed protein helices; etc.) preferentially absorb either left or right circularly polarized light.
    • CD = different degrees of absorption of left and right circularly polarized light:
    • where A_L is the absorbance for left circularly polarized light, A_R is the absorbance for right circularly polarized light, and A is the absorbance for unpolarized light. Since ΔA can be either positive or negative, a CD spectrum is unlike a normal absorption spectrum (see Fig. 6A.5(b) below).
    • wavelength region for CD signal = same as absorbance for that chromophore:
      ~185-240 nm ("far uv") for 2° and 3° structure (peptide bond absorbance, aromatic R groups, disulfide bonds)
      ~260-290 nm ("near uv") for 3° structure (aromatic R groups)
    • Fig. 6A.5 from Mathews and van Holde, Biochemistry, 2nd ed., 1996: Circular Dichroism. (a) Polarization of light. Top, plane polarized light; bottom, circularly polarized light. (b) Circular dichroism spectra for polypeptides in various conformations. Which of these CD spectra would you expect to most resemble the CD spectrum for myoglobin?
    • ligand binding: wavelength of CD signal depends on ligand's absorbance properties
    • USES of CD: (NOT for determination of complete 3-D structure)
      • estimate content (amount) of various secondary structural elements (CD spectrum in 220 nm region different for α-helix, β conformation, and "random" conformation) -- see Fig. 6A.5(b) above.
      • conformational changes (e.g., changes in 2° structure from changes in CD ~220nm region, or changes in 3° structure from changes in environment of aromatic R groups shown by CD in 280 nm region, during time course of refolding)
      • ligand binding
  • NUCLEAR MAGNETIC RESONANCE (NMR) SPECTROSCOPY (radio frequency range)
    • Basis: A spinning charged particle (in this case, a nucleus) behaves as a magnet, and can interact with an externally imposed magnetic field such that absorbance of electromagnetic radiation of appropriate energy (in the radio frequency range) can flip the spin.
    • Nuclei used in biochemical studies include 1H (proton NMR), 2H, 13C, 14N, 17O, 31P, and 19F (in 19F-Tyr).
    • To get an NMR spectrum (in ppm), you "sit" on a magnetic field strength and change the radio frequency to get resonance.
      • The type of nucleus you're observing, but also the molecular structural environment of the nucleus (including its solvent and surroundings in 3 dimensional space) affect the width and position (position = "chemical shift") of the NMR signal (peak) for that nucleus.
      • Interaction with a nearby nucleus within the molecule can cause spin coupling, which is seen as splitting of the NMR signal (double peak).
      • Altering the spin on one nucleus can affect the spin on a nearby nucleus (< ~5Å away), and for small proteins it is possible by NMR to do enough distance measurements between nuclei within the tertiary structure to determine the entire 3-dimensional structure.
    • USES of NMR:
      • complete 3-D structure of small proteins in solution (< 25,000 daltons)
      • conformational changes (e.g., during folding)
      • determination of pKa of an ionizable group, e.g. His
      • follow ligand binding
      • dynamics (motion in solution), e.g. Tyr and Phe ring flips

  • BRIEF SUMMARY OF SPECTROSCOPIC METHODS FOR PROTEINS
TYPE / PROTEIN COMPONENTS / USES

1. uv-vis spectroscopy

a) absorbance
  • Protein components:
    • peptide bonds (~220 nm)
    • aromatic residues [esp. W (280 nm), (Y)]
    • some ligands & prosthetic groups
  • Uses:
    • determine protein concentration
    • conformational changes
    • ligand binding

b) fluorescence
  • Protein components:
    • W [λmax,ex = 280 nm, λmax,em = ~340 nm]
    • some ligands & prosthetic groups
  • Uses:
    • conformational changes
    • ligand binding

c) circular dichroism (CD)
  • Protein components:
    • secondary structure (180-240 nm) (peptide bonds)
    • tertiary structure (environment of aromatic R groups) (260-300 nm)
    • ligands & prosthetic groups
  • Uses:
    • 2° structure (amount & type) (far uv)
    • conformational changes (2° & 3° structural changes)
    • ligand binding

2. NMR
  • Protein components:
    • nuclei: protons (1H, esp. in aromatic residues & His), but also 2H, 13C, 14N, 17O, 31P, and 19F (natural abundance or isotopically labeled proteins or ligands)
  • Uses:
    • complete 3-D structure (only of small proteins)
    • conformational changes
    • ligand binding, including pH titrations (pKa determination)
    • dynamics (motions in structure)

Protein Primary Structure Determination

[Figure: primary structure of bovine insulin, A and B chains]

 

This is the primary structure of bovine insulin, which is composed of two polypeptide chains (A and B).  The two polypeptide chains are joined by two interchain disulfide bonds - the A chain also contains an intrachain disulfide bond.

 

  • Determining the amino acid sequence of a protein used to be a very laborious and time consuming process involving chemical and enzymatic degradation. 
  • Today, the amino acid sequence of proteins is usually determined from the nucleotide sequence of the gene, a relatively simple and rapid process.
  • The amino acid sequence of the same protein from many sources, e.g., cytochrome c, shows that some amino acid residues are conserved among all the proteins, whereas others are not conserved. 
  • Such a sequence comparison provides valuable information about possible roles of specific amino acid residues that may be either
    • essential for protein's function, or
    • essential for protein's structure (e.g., residues needed for correct tertiary folding, or needed for interacting with another subunit)

The importance of amino acid side chains: real life example, sickle cell hemoglobin

  • Hemoglobin is the oxygen transport protein in blood. 
    • HbA (human adult hemoglobin) is a tetramer containing two α and two β chains, α2β2
    • exists in two conformational states: an oxy form and a deoxy form.
  • Several hundred mutant hemoglobins are known to exist. 
    • In most, there's a single amino acid substitution in either the α or β chain of normal HbA. 
    • Many of these changes cause no known effect, but several lead to pathologies associated with abnormal O2 transport. 
  • Sickle cell hemoglobin, HbS, has a single amino acid replacement: Val replaces Glu at position 6 of the β chain.
    • This seemingly innocuous change places a hydrophobic sidechain on the surface of the protein.  
    • In the deoxy conformation the Val sidechain of a β chain in one Hb binds to the β chain of another Hb,
    • leading to polymer formation and precipitation of the deoxy Hb,
    • which causes red cell lysis and anemia.

Amino Acid Composition

  • The amino acid composition is a fundamental characteristic of any protein.
  • Total acid hydrolysis (aqueous acid: 6N HCl, 110 °C, 10-100 hrs in vacuo) of the protein
    • releases all the free amino acids
    • amino acids in hydrolysate quantitated using ion exchange chromatography or HPLC (automated amino acid analyzer)
  • Amino acid peaks can be detected using ninhydrin, which reacts with the free amino groups of amino acids to produce a purple color, or (much more sensitive detection method) by reaction with reagents that generate fluorescent derivatives, permitting detection of much smaller amounts of each amino acid
  • NOTE: All sequence information is lost upon total acid hydrolysis.
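
The NOTE above can be illustrated with a minimal sketch (not part of the original notes). The two hexapeptides and the function name `composition` are hypothetical, chosen only to show that different sequences can give the same amino acid composition; the chemistry of hydrolysis itself is not modeled.

```python
from collections import Counter

def composition(sequence):
    """Count each residue (one-letter code), as an amino acid analysis would."""
    return Counter(sequence)

# Two hypothetical hexapeptides: same residues, different order.
peptide_1 = "GAVLKF"
peptide_2 = "FKLVAG"

print(composition(peptide_1) == composition(peptide_2))  # True: same composition
print(peptide_1 == peptide_2)                            # False: different sequence
```

The composition tells you how many of each residue are present, but any ordering of those residues is equally consistent with it.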

  Amino Acid Sequence Determination

  • Amino-terminal residue can be determined by
    • derivatization of whole peptide or protein, e.g. by reaction of amino groups with either
      • 1-fluoro-2,4-dinitrobenzene, FDNB ("Sanger's reagent") --> yellow dinitrophenyl derivatives,
      • or reagents that give fluorescent derivatives (detect smaller quantities)
      • followed by total acid hydrolysis, releasing
        • the derivatized N-terminal residue (plus all the previously "internal" free amino acids, including the ε-amino derivatives of Lys residues)
        • α-amino derivative (the original N-terminal residue) identified by chromatographic analysis/comparison with known standards
      • but rest of sequence is destroyed by the total acid hydrolysis -- can only determine N-terminal residue this way

  • Edman Degradation:
    • One residue at a time from the amino terminus can be chemically derivatized, removed and identified.
    • The amino-terminal amino acid residue is derivatized and removed (cleaved off) for subsequent identification, leaving the peptide or protein one residue shorter, ready for another round of derivatization, removal and identification of the next amino acid residue in the sequence, and so on (for up to about 75 residues from the amino terminus).

  • Coupling (labeling): chemical modification of the α-amino group of a peptide or protein by the Edman reagent (PITC, phenylisothiocyanate)
  • Cleavage ("release"): anhydrous acid (e.g., anhydrous trifluoroacetic acid)
    Why is it crucial that the acid be anhydrous? (What happens to all the peptide bonds in a protein in strong aqueous acid?)
  • Conversion of the initial unstable derivative to the more stable phenylthiohydantoin (PTH) derivative for identification (by chromatographic behavior and comparison with known PTH standards) (Berg, Tymoczko & Stryer Fig. 4.21 below doesn't mention this step, and Nelson & Cox Lehninger text Fig. 5-25 doesn't mention the crucial ANHYDROUS acid step for the cleavage!)
  • Whole procedure has been automated, done by a single machine (sequenator), with output to a computer.

Fig. 4.21 (Berg, Tymoczko & Stryer, Biochemistry, 5th ed., 2000): Edman Degradation (You're not responsible for the chemical structures, but should know the NAME of the Edman reagent, the general outline of how the degradation is done, and the conditions permitting the derivative to be cleaved off the rest of the peptide ("release" step), leaving the rest of the peptide intact.)

  • Although these reactions proceed to > 99% yields at each step, eventually (about 25-75 cycles) it becomes difficult to detect the newly released product.  Thus a single series of Edman degradation reactions is not able to determine the entire sequence of a protein.  (FYI: calculate what would happen to the YIELD of the derivative for the 20th residue after 20 steps if the procedure is only 90% efficient at each step.)
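
The FYI calculation above can be sketched as follows. This assumes a deliberately simple model in which the same fraction of chains reacts correctly at every cycle, so the yield at cycle n is just the per-cycle efficiency raised to the power n; the function name `edman_yield` is made up for this illustration.

```python
def edman_yield(efficiency, cycle):
    """Fraction of chains still releasing the correct residue at a given cycle,
    assuming the same per-cycle efficiency at every step (simplified model)."""
    return efficiency ** cycle

for efficiency in (0.90, 0.99):
    for cycle in (20, 75):
        print(f"efficiency {efficiency:.2f}, cycle {cycle}: "
              f"yield = {edman_yield(efficiency, cycle):.3f}")
```

At 90% efficiency only about 12% of the chains give the correct derivative at residue 20, whereas at 99% efficiency roughly 47% still do at cycle 75. This is why very high per-step yields are needed before long reads become practical.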

What is needed are smaller fragments, with new amino termini, which can be individually purified and sequenced. 

  • This is accomplished by cleaving the protein with a proteolytic enzyme, such as trypsin, or a chemical reagent such as cyanogen bromide, which generates a set of peptides, fragments of the original protein, that can be separated and sequenced. 
    • Trypsin cleaves peptide bonds on the carboxyl side of Lys or Arg residues, as illustrated below.
      • Chymotrypsin cleaves peptide bonds on the carboxyl side of Phe, Trp or Tyr residues, but also sometimes on the carboxyl side of other hydrophobic amino acids, e.g., Val, Leu, Ile, or Met.
      • Other proteases have different specificities.
      • Cyanogen bromide cleaves on the carboxyl side of Met residues, but the chemistry of the cleavage converts the Met residue at the C-terminus of the new peptide to a derivative that is converted by acid hydrolysis to homoserine (R group is -CH2-CH2-OH) rather than Met, so amino acid composition of the new peptide would show homoserine.
    • There are thus a variety of ways to fragment the protein under investigation to determine the sequences of manageable-size peptides.
  • Problem: once the proteolysis has been accomplished and the peptides separated and sequenced, how were they ordered in the original protein?
    • Reestablishing the order is the big problem in protein sequencing.
    • The method is like solving a puzzle -- the sequences of the families of peptides obtained from two different cleavage methods are examined for OVERLAPS.
    • For an example, see simple practice problem for sequence of a heptapeptide, and also the strategy for sequencing the B chain of insulin.
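
The overlap strategy described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the practice problem itself: the 11-residue peptide is hypothetical, the cleavage rules are the idealized ones stated in these notes (trypsin after Lys/Arg, chymotrypsin after Phe/Trp/Tyr only), and the helper names `cleave` and `order_by_overlap` are invented here. Trying every ordering is only feasible for a handful of peptides.

```python
from itertools import permutations

def cleave(sequence, after):
    """Cut on the carboxyl side of every residue listed in `after`."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):          # C-terminal peptide, if any remains
        peptides.append(sequence[start:])
    return peptides

def order_by_overlap(peptides, overlap_peptides):
    """Return every ordering of `peptides` whose concatenation contains
    each peptide from the second digest as a substring."""
    solutions = []
    for order in permutations(peptides):
        candidate = "".join(order)
        if all(p in candidate for p in overlap_peptides):
            solutions.append(candidate)
    return solutions

unknown = "GAKFVRMYELK"                       # hypothetical peptide
tryptic = sorted(cleave(unknown, "KR"))       # sorting discards the original order
chymotryptic = sorted(cleave(unknown, "FWY"))

print(tryptic)        # ['FVR', 'GAK', 'MYELK']
print(chymotryptic)   # ['ELK', 'GAKF', 'VRMY']
print(order_by_overlap(tryptic, chymotryptic))  # ['GAKFVRMYELK']
```

The chymotryptic peptide GAKF spans the junction between GAK and FVR, and VRMY spans the junction between FVR and MYELK, so only one ordering of the tryptic peptides is consistent with both digests.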

Mass Spectrometry 

  • Recently mass spectrometry has become an important technique in peptide/protein chemistry, for sequence determination and for identification of "unknown" proteins
  • VERY accurate determination of mass of a protein or peptide, needing only tiny amounts of material
  • Peptide mass fingerprinting (for identification of "unknown" proteins)
    • a small sample of an "unknown" protein, e.g. from an unidentified spot on a 2-D gel, is cleaved specifically into pieces (peptides),
    • the masses of the components in the mixture are analyzed by mass spec, giving a pattern of masses (a fingerprint) characteristic of that protein.
    • That fingerprint can be compared by a computer with the "virtual fingerprints" of a whole database of proteins that have been "cleaved" electronically by computer simulation of the same cleavage method, and often the "unknown" protein sample can be matched to a known protein sequence, so it can be identified.
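
A minimal sketch of the fingerprint-matching idea, under stated assumptions: the three "database" sequences are hypothetical toy peptides, the digest is an idealized tryptic cleavage after every Lys/Arg, masses are neutral monoisotopic masses (standard residue masses plus one water), and the scoring simply counts observed masses that fall within a tolerance of a theoretical mass. Real search engines use far more elaborate scoring; all function names here are invented for illustration.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini

def tryptic_peptides(sequence):
    """Idealized trypsin: cut after every K or R."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER

def fingerprint(sequence):
    """'Virtual' fingerprint: sorted masses of the tryptic peptides."""
    return sorted(peptide_mass(p) for p in tryptic_peptides(sequence))

def best_match(observed, database, tolerance=0.01):
    """Score each database entry by how many observed masses it explains."""
    scores = {}
    for name, sequence in database.items():
        theoretical = fingerprint(sequence)
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed
        )
    return max(scores, key=scores.get), scores

# Hypothetical toy database and a simulated measurement of toy_2
# (each mass shifted by 0.002 Da to mimic a small measurement error).
database = {
    "toy_1": "GAKFVRMYELK",
    "toy_2": "MTEYKLVVVGAGGVGK",
    "toy_3": "AGCKNFFWKTFTSC",
}
observed = [mass + 0.002 for mass in fingerprint(database["toy_2"])]
name, scores = best_match(observed, database)
print(name)  # toy_2
```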

  • Mass spectrometers consist of three basic parts:

  1. An ion source that creates charged molecules in the gas phase
  2. a mass analyzer that uses a physical property, e.g., time-of-flight (TOF), to separate ions
  3. a detector.
  • Two important methods are used to create protein ions:
    • In matrix-assisted laser desorption ionization (MALDI) ions are created by using a laser to excite proteins in a crystalline matrix.  MALDI is particularly suited for determining the molecular weight of proteins, often to accuracies of a few parts per million.  The spectrum shown above illustrates the molecular masses of several peptides in a mixture. 
    • In electrospray ionization (ESI) ions are created by applying a potential to a flowing liquid.  This causes the liquid to spray and protein ions to be created.  This method can also be used to measure molecular weight, but is most powerful when used in tandem MS/MS.

 

 

  • A tandem mass spectrometer combines two mass analyzers with a method to energetically activate ions. In the first spectrometer a particular ion is isolated from all other ions that enter the mass analyzer (as marked above), dissociated, and the m/z values of the dissociation products determined in the second mass analyzer. The dissociation process causes covalent bonds to fragment.  In the case of peptide ions, fragmentation processes predominate at or around the amide bond, creating a ladder of ions that is indicative of an amino acid sequence, as illustrated below.
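
How a fragment ladder indicates a sequence can be sketched as follows. This is a simplified illustration, not a description of real spectrum interpretation: it assumes a complete, noise-free series of N-terminal fragments, ignores charge and terminal-group offsets (so the "ladder" is just cumulative residue masses), and uses standard monoisotopic residue masses. The function names and example peptides are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def ladder(peptide):
    """Cumulative residue masses of successively longer N-terminal fragments
    (a simplified stand-in for a fragment-ion series)."""
    total, masses = 0.0, []
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses

def read_ladder(masses, tolerance=0.01):
    """Infer residues from the mass difference between neighboring rungs.
    Residues with the same mass (Leu/Ile) cannot be told apart."""
    residues, previous = [], 0.0
    for mass in masses:
        step = mass - previous
        matches = [r for r, m in RESIDUE_MASS.items() if abs(m - step) <= tolerance]
        residues.append("/".join(matches) if matches else "?")
        previous = mass
    return residues

print("".join(read_ladder(ladder("GAKFVR"))))  # GAKFVR
print(read_ladder(ladder("MYELK")))            # ['M', 'Y', 'E', 'L/I', 'K']
```

Each step up the ladder differs from the previous one by the mass of exactly one residue, which is what makes the sequence readable. Lys and Gln differ by only about 0.036 Da, so telling them apart depends on the mass accuracy of the instrument.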

Sequence Homology

  • Once the amino acid sequence of a protein has been determined, powerful computer programs can be used to determine whether the sequence is similar to those of other proteins. (If you are interested, go to this web site to see some of the tools available for proteomics.)  Such a search might give the results shown below.

#1 MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA
#2 MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA
#3 MKRTYQPSRVKRNRKFGFRARMKTKGGRLILSRRRAKGRMKLTV
#4 MKRTFQPSILKRNRSHGFRTRMATKNGRYILSRRRAKLRTRLTV
#5 MKRTYQPSKQKRNRTHGFRARMATKNGRQVLNRRRAKGRKRLTV
#6 TKRTFQPNNRRRARKHGFRARMRTRAGRAILSARRGKNRAELSA
#7 SKRTFQPNNRRRAKTHGFRLRMRTRAGRAILANRRAKGRASLSA
#8 GKRTFQPNNRRRARVHGFRLRMRTRAGRSIVSDRRRKGRRTLTA

 

 

  • The degree of identity between the sequences can be used to construct a distance matrix, which indicates how closely related the different sequences are.  Here is one for  cytochrome c from a variety of species.

  • Based on such a distance matrix, one can then construct a phylogenetic tree, as illustrated here for cytochrome c.
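
A distance matrix of the kind described above can be sketched as follows. The sketch uses the first three example sequences (#1 to #3) from the search results listed earlier, and it assumes they are already aligned without gaps, which is why they must have equal length. Distance is taken here as 1 minus the fraction of identical positions, which is only one of several possible definitions; the function names are invented for this illustration.

```python
sequences = {
    "#1": "MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA",
    "#2": "MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA",
    "#3": "MKRTYQPSRVKRNRKFGFRARMKTKGGRLILSRRRAKGRMKLTV",
}

def fraction_identical(seq_a, seq_b):
    """Fraction of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def distance_matrix(seqs):
    """Pairwise distances, defined here as 1 - fraction of identical positions."""
    return {
        (m, n): 1.0 - fraction_identical(seqs[m], seqs[n])
        for m in seqs for n in seqs
    }

distances = distance_matrix(sequences)
names = list(sequences)
print("     " + "  ".join(f"{n:>5}" for n in names))
for m in names:
    print(f"{m:>5}" + "  ".join(f"{distances[(m, n)]:5.2f}" for n in names))
```

Pairs with small distances are the ones that tree-building methods place on neighboring branches.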

Three Dimensional Structure 

  • 3-D structure very important in understanding function of protein
  • 2 major methods:
    • X-ray crystallography (excellent online tutorial can be found here)
      • can be used for any size protein or even a huge macromolecular complex the size of a ribosomal subunit (many proteins + RNA molecules) IF the molecule or complex can be crystallized (combination of scientific skill, art, and luck!)
    • NMR (currently only useable for total structure determination for small proteins, < ~25 kilodaltons)
  • more than 10,000 structures have been determined, most in the last decade as new, more powerful instruments have become available.

Genomics and Proteomics

  • There has been a great deal of effort directed towards determining the complete sequence of the human genome (genomics) and many other genomes (including E. coli, yeast, and the fruit fly Drosophila melanogaster -- the list grows rapidly).
  • Once the complete sequence is finished, an important issue looms: what to do with the data! 
  • Being able to UNDERSTAND (and ultimately to make use of) the information in the DNA sequence requires figuring out what the proteins encoded by the genome are and what they do (proteomics). 
  • In many cases we can deduce the nature of the protein product of a gene by homology to other proteins already sequenced, but in many other cases (maybe >30%), we have no clue. 
  • We can use biotechnology techniques to produce the protein, which can then be purified and studied in order to try to deduce its function.  One important approach is to determine its three dimensional structure, which may give a clue to its function.
  • The future of protein biochemistry is indeed exciting!

 

 

Structural Homology 

  • In addition to sequence homology for proteins with identical functions from different organisms, there are often domains in a protein that are conserved. 
    • e.g., most proteins that bind nucleotides, such as ADP, have a common nucleotide-binding motif.

  • There are even a few cases in which proteins with entirely different functions have very similar three dimensional structures, as shown below for lysozyme, an enzyme, and α-lactalbumin, a milk protein.

Lysozyme

α-Lactalbumin

 




Biochemistry 462a
http://www.biochem.arizona.edu/classes/bioc462/462a/462a.html
Department of Biochemistry and Molecular Biophysics
The University of Arizona
zieglerm@u.arizona.edu 
All contents copyright © 1998-2003. All rights reserved.
Last revision fall 2003