Symmetry issues in PDB files
A PDB file contains the minimum amount of information needed to describe the crystal structure of the molecule or molecules completely. In some cases, including some protein-DNA complexes, this information may not be sufficient to display the biologically relevant structure.
Consider a protein dimer bound to a completely symmetrical DNA sequence. If the two subunits of the dimer are also related to each other by the same symmetry (called two-fold rotational symmetry), the complex may crystallize with only part of the biologically relevant molecule in the crystal's asymmetric unit, which is the unique portion of the crystal. In these cases a symmetry element, here a two-fold rotation axis, runs through the crystal and also through the molecule, so that a complete molecule consists of two asymmetric units related by the two-fold rotation. The PDB file, however, contains only the one asymmetric unit. Clearly, you want both DNA strands and both protein subunits. Here is how to get them, using an example that illustrates the problem.
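To make the idea of "two asymmetric units related by two-fold rotation" concrete, here is a minimal Python sketch that generates the symmetry mate of a set of atom coordinates by rotating them 180 degrees about an axis. The function names (two_fold_mate, complete_dimer), the choice of the z axis, and the coordinates are all made up for illustration; in a real crystal the position and direction of the axis come from the space group, not from anything assumed here.

```python
import math

def two_fold_mate(point, axis, origin=(0.0, 0.0, 0.0)):
    """Rotate `point` by 180 degrees about the line through `origin` along `axis`.

    A 180-degree rotation about a unit vector n maps a displacement d to
    2*n*(n.d) - d, so no trigonometry is needed.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("axis must be a non-zero vector")
    n = [a / norm for a in axis]
    d = [p - o for p, o in zip(point, origin)]       # displacement from the axis point
    proj = sum(di * ni for di, ni in zip(d, n))      # component of d along the axis
    return tuple(o + 2 * proj * ni - di for o, ni, di in zip(origin, n, d))

def complete_dimer(asymmetric_unit, axis, origin=(0.0, 0.0, 0.0)):
    """Return the original atoms followed by their two-fold symmetry mates."""
    mates = [two_fold_mate(p, axis, origin) for p in asymmetric_unit]
    return list(asymmetric_unit) + mates

# Hypothetical coordinates standing in for one asymmetric unit.
asu = [(1.0, 2.0, 3.0), (4.0, 0.5, -1.0)]
dimer = complete_dimer(asu, axis=(0.0, 0.0, 1.0))
for atom in dimer:
    print(atom)
```

Applying the operation twice returns every atom to its starting position, which is what makes the axis two-fold: the complete molecule has exactly twice as many atoms as the asymmetric unit.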
When you go to the PDB, you get a home page that invites you to enter a PDB accession number, if you know it, in a box on the right under "Search". Enter "2DGC" and click on "Explore". This gives you the page for this structure (GCN4 bound to DNA). Notice that only one alpha helix is visible in the thumbnail view of the structure at the upper right. This is a tip-off that you have the symmetry problem, since you expect to find two helices. Note that it really helps to know something about your protein if you want to be sure to avoid this problem. All the PDB files that I've put on the server are complete.
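If you already have a PDB file on disk, one rough way to check for the same problem without looking at a picture is to count the chains it contains. The sketch below relies only on the standard PDB format, in which the chain identifier of an ATOM record is in column 22. The function name chains_in_pdb and the two sample lines are made up for illustration and are not taken from the 2DGC entry.

```python
def chains_in_pdb(lines):
    """Return the distinct chain identifiers found in ATOM records, in order of appearance."""
    chains = []
    for line in lines:
        # Column 22 of an ATOM record (index 21 in Python) holds the chain ID.
        if line.startswith("ATOM") and len(line) > 21:
            chain_id = line[21]
            if chain_id not in chains:
                chains.append(chain_id)
    return chains

# Made-up records: one protein atom in chain A and one DNA atom in chain B.
sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "ATOM      2  P    DT B   1      10.000  12.000  14.000",
]
found = chains_in_pdb(sample)
print(len(found), "chain(s):", found)
```

For a protein dimer bound to double-stranded DNA you would expect four chains (two protein subunits and two DNA strands). Finding only two suggests that the file holds just the asymmetric unit. To run this on a real file, pass an open file object instead of the sample list, for example chains_in_pdb(open("myfile.pdb")), where the file name is a placeholder.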
Instead of clicking on "Download/display file", as you ordinarily would to get the PDB file, click on "Other sources". The large gray area in the middle of the resulting page has a row starting with "EBI MSD Macromolecule File Server". The last column in this row shows the same accession number, 2dgc; if you click on it, you get a PDB file that contains both DNA strands and both protein subunits.
If you try this for other proteins, let me know if it works!