|
|
showing DNA-protein interactions |
Scripts are files that contain a series of commands like the ones you type in yourself in the RasMol Command Line window. Usually they contain pauses for you to study the image, and then you hit a key to proceed. One advantage of scripts is that you can view them without knowing the commands or anything about the molecule. The disadvantage is that you don't have much control over what is displayed.
However, you can create your own scripts. As stated, these are files, like any other text files, and they can be edited using either a text editor (such as SimpleText on the Mac or Notepad on the PC) or a word processing program (such as Word or WordPerfect). As will become clear, when you save a script in RasMol (using the command "write script <filename>"), the resulting script contains some useful information, as well as a lot of extra lines that can be replaced with a relatively small number of commands.
Hence, if you want to write a script that shows several views, such as the ones used in class (and downloaded from the RasMol page), it is a two-step process. First, make a view of the protein that you like by entering commands at the command line, and save the script as above. Then delete most of the saved information, retaining certain commands near the beginning (as spelled out below), and replace the deleted lines with the commands you already entered at the command line.
One way to see the difference is to look at two scripts, one made by RasMol and one I modified in this way. The view is similar to one that was generated midway through the script "contacts.txt", without some of the zooming and rotating and so forth. The RasMol-generated script is Rasmol466.txt; the one I made is lys466.txt. These two scripts are 48 KB and 2 KB in size, respectively, and both generate the same image (put them into your "motifs" folder to run them). Most of the RasMol-generated one consists of instructions for how to color or display particular atoms or groups of atoms. It also does not contain any comments or pauses; I put a couple of these into lys466.txt. Comparing these two scripts will help you see what needs to be retained and what can be deleted. See also comments at these links:
|
Commands |
Comments |
|
script "cIload.spt" |
This script can be obtained from the link above. It loads the molecule and positions it appropriately. |
|
Alternatively, you can position it and color various parts by entering the following commands: |
|
|
Click on File/Open, highlight 1lmb.pdb |
This loads the lambda repressor:DNA complex |
|
rotate z 9 |
This positions the molecule appropriately |
|
select dna |
This simplifies the coloring scheme. Some of this could be done with the menus as well. This should give you a view of lambda CI bound to its specific operator site, with the protein in dark gray and the DNA in red. Note: the color command tells it how much red, green, and blue to color it on a scale of 0-255; so it's "color [R,G,B]" |
|
zoom 400 |
Get up close and personal. |
|
Whether you used the script or entered the above commands, you should see a rather jumbled close-up view of part of the protein and DNA.
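As a rough sketch, the setup steps above could be collected into a short script of their own. The file location, the exact gray value, and the "#" comment lines are assumptions for illustration; this is not necessarily what cIload.spt contains.

```
# Sketch only: assumes 1lmb.pdb is in the current folder
load pdb 1lmb.pdb
# Position the molecule
rotate z 9
# Simplify the coloring: DNA in red, protein in dark gray
select dna
color red
select protein
# [80,80,80] is an assumed RGB value for "dark gray"
color [80,80,80]
# Close-up view
zoom 400
```

Saved as a text file, a script like this can be run with the "script" command in the same way as the ones supplied for the class.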
Examples of the use of "define" commands: These commands let you refer to particular groups of atoms as a single unit. They save a lot of time when you want to refer to such a group repeatedly. They will be used here together with "select" commands to create a view. |
|
|
define g14 dna and 14 and not backbone |
This defines the base of guanine 14 as a single unit. The syntax is "define x [as the atoms that meet the criteria] y", where y are the criteria. |
|
select g14 |
These commands highlight this particular base in blue and give partial spacefilling. |
|
define asn55 *:4 and 55 and not backbone |
This defines the Asn55 of one subunit; the *:4 means "any atom in chain 4". NOTE: this expression contains a colon ":", not a period. The colon is used to refer to a chain in the file, while the period is used in defining a particular atom. |
|
select asn55 |
Highlight this residue side chain |
|
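The two definitions above, together with the display commands that the comments describe, might be entered like this. The spacefill radius and the color chosen for Asn55 are assumptions; adjust them to taste.

```
# Base of guanine 14: blue, partly spacefilled
define g14 dna and 14 and not backbone
select g14
color blue
# 200 RasMol units (0.8 angstrom) is an assumed radius
spacefill 200
# Side chain of Asn55 in chain 4
define asn55 *:4 and 55 and not backbone
select asn55
# [255,100,50] is an assumed color; any contrasting one will do
color [255,100,50]
spacefill 200
```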
Exercise: Now define Lys4 using the same syntax, select it, color it orange ("color orange"), and spacefill it. Don't be confused by the fact that it's residue 4 in chain 4! When you've done this, you should see Lys4 in orange close to the other two groups. |
|
|
define triple g14, asn55, lys4 |
This defines all three groups as a single unit. The "," means "or"; in other words, to be a member of "triple", it has to be g14 OR asn55 OR lys4. |
|
At this point, it's hard to see much; we're looking at the base edge-on, and we can't see the contacts. |
|
|
restrict triple |
The "restrict" command means that only the selected material will be displayed. |
|
There are two ways to rotate it. The easy way is to enter "rotate a b", where a is an axis (x, y, or z) and b is an angle in degrees. The more complicated way, for which you need the scripts (they are on the web site), lets you rotate it gradually. We'll try both ways. Before rotating, however, we must tell it where the center of rotation is. Otherwise it often rotates out of view! |
|
|
center selected |
This says to make the center of rotation the selected atoms ("triple" in this case). You won't see any changes, but it will make a difference in the next step. |
|
rotate y 90 |
This rotates about the y (vertical) axis. (See note at the end about "rotate" and "translate" commands.) |
|
rotate y -90 |
This puts it back to the previous position so we can try the script. |
|
script y90-2.scr |
There are six scripts like this; each rotates it in 2-degree increments about one axis, giving a fairly smooth rotation. They are at the motifs link. If you don't have the scripts, you can use the "rotate" command to achieve the same purpose; in this case, "rotate y 90". |
|
script y-90-2.scr |
This does rotation in the opposite direction. |
|
script y90-2.scr |
Back to the position of choice. |
|
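The contents of y90-2.scr are not shown here, but a script that rotates in 2-degree steps could plausibly look like the following sketch. The "refresh" command asks RasMol to redraw the image after each step. The five steps below give only 10 degrees, so a full 90-degree turn would need 45 such pairs.

```
# Sketch of an incremental rotation (not the actual y90-2.scr)
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
```

A script for the opposite direction would use "rotate y -2" in each step.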
define c28 dna and 28 and not backbone |
Define the base paired with g14 |
|
select c28 |
Select c28, color it a darker blue, and display it. Now you can see the base pair; unfortunately there's no good way to draw the hydrogen bonds. |
|
Now let's highlight the specific contacts between the amino acid side chains and the base pairs. To identify particular atoms, click on them, and the command line tells you their identity. You could select them with the command "select atomno=x", where x is the atom number you discovered by clicking on the atom. For this exercise, the atoms are identified for you (since you don't know which are important). Note: in the expressions that follow, *.nd2, for example, contains a period and not a colon. |
|
|
select asn55 and *.nd2 (or "select atomno=1908"), then color [255,150,100] |
Select an atom in Asn55 (ND2, the side-chain amide nitrogen) and make its color a bit lighter. Now it's a peach tone. Again, the numbers in the "color" command refer to the amount of red, green, and blue respectively on a scale of 0 to 255. You have to experiment to get the most pleasing shade. |
|
select g14 and *.n7 |
Same thing for its counterpart in G14. A hydrogen bond is formed between the two atoms. Click on each atom to identify its number, which is used in the next command. |
|
monitor 277 1908 |
This gives a poor substitute for drawing in a bond; it also gives the distance between the two atoms. This particular distance (2.90 angstroms) is about ideal for an H bond. Note: the color of the number is that of the first atom (277 in this case) listed. |
|
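Put together, the contact between Asn55 and G14 might be marked as follows. The color given to N7 is an assumption, and the atom numbers are the ones found by clicking on the atoms in 1lmb.pdb.

```
# Lighten the two atoms that form the hydrogen bond
select asn55 and *.nd2
color [255,150,100]
select g14 and *.n7
# [100,150,255] is an assumed lighter blue for N7
color [100,150,255]
# Distance monitor between N7 of G14 and ND2 of Asn55
monitor 277 1908
```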
rotate z -90 |
Rotate about the z axis (which points out of the screen) so you can see the distance better. |
|
A view like this would be fine for showing the specific contact (for instance in a problem set).
|
|
|
echo " " |
Note that when this is entered it is "echoed" at the command line. In a script, the RasMol prompt would not be present; hence, the first line (echo " ") is useful for spreading out your commentary. How long your lines are depends on how big the window is on the viewer's screen; you can make multiline comments by having several sequential "echo" statements. In a script, you would have a command "pause" following the echo statements, so the viewer would have a chance to read the comments. Look at some of the scripts I wrote for examples. |
|
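In a script, a block of commentary followed by a pause might look like this minimal sketch. The wording of the messages is only an example.

```
echo " "
echo "The blue base is G14; the side chain next to it is Asn55."
echo "A hydrogen bond is formed between ND2 of Asn55 and N7 of G14."
echo "Hit any key to continue."
pause
```

Each "echo" prints one line, so long comments are best split into several short statements that fit in a small command line window.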
Now let's draw back and look at the context. |
|
|
script x90-2.scr |
First rotate it to give a view in the plane of the bases ("rotate x 90" is equivalent) |
|
select dna and not triple and not c28 |
Select all the DNA except the bases of G14 and C28, turn on the wireframe. |
|
select protein and not triple |
Same thing for the protein. |
|
zoom 100 |
This repositions the molecule. You could do this with the mouse and keyboard, but by following these commands it will be in a defined place for what follows. |
|
define arm *:4 and 2-10 |
This selects aa residues 2-10 of the subunit, including Lys4, and colors them green |
|
script y-90-2.scr (or rotate y -90) |
This shows how this part of the protein, including Lys4, is an arm reaching around the "back side" of the DNA. This feature of CI is unusual. |
|
rotate y 90 |
This selects the part of the protein including Asn55, colors it purple, and shows that it is not in an alpha-helical region, indicated by a free-form loop. The point is that not all contacts are with residues in an alpha helix. |
|
ribbons off |
De-emphasize this part of the protein, except for Asn55. Note: If you want to repeat commands quickly, the up-arrow key scrolls through previous ones (useful here with the color command) |
|
|
|
|
select *:4 and 44-51 |
This selects the "recognition helix", and shows that it is an alpha helix. It also shows that Asn55 is not part of the recognition helix. |
|
select *:4 and (44,45,46,48,49) and sidechain |
This selects the side chains in the recognition helix that make specific contacts with the bases, and shows more or less how they lie in the major groove. Note: Many of the contacts are made with protein backbone carbons, and are not shown here. |
|
define ser45 *:4 and 45 and not backbone |
Now focus on one of the contacts.
|
|
restrict sg |
Simplify the image |
|
monitor 1841 319 |
This Ser -OH makes not one but two H bonds with the guanine. |
|
Exercise: How would you turn on the base that is paired to G16? Hint: First you have to find it. Turn on the DNA as above (select dna and not g16), so that anything you do to the DNA doesn't affect G16. |
|
1. The rotate commands are "relative"; they rotate the image relative to its current orientation. By contrast, the translate commands are not relative; they translate the image relative to a particular frame of reference, which is defined by the PDB file and doesn't change when you manipulate the image. So, if you enter "rotate y 90" twice, it will rotate it both times. But if you enter "translate x 10", the second time it won't move, because it isn't relative to the previous position.
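The difference can be tried directly at the command line. This short sequence is only a sketch and assumes a molecule is already loaded.

```
# rotate is cumulative: the two commands add up to 180 degrees
rotate y 90
rotate y 90
# translate is absolute: the second command changes nothing
translate x 10
translate x 10
# moving further requires a different value
translate x 20
```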
The occasional student is tempted to cut the paper out of the journal so he/she can have a copy that's in color. This has happened in the past to journals in the library. Don't do this; anybody wanting to use the journal later won't even find the paper.
When you read a paper and try to use the information to look at a PDB file, you need to be able to relate the residue numbers in the paper to those in the file.
Proteins: Generally, the residues in the PDB file have the same numbering scheme as those in the paper. These numbers refer to the residue numbers in the native protein. Often, in fact almost always, the proteins used for x-ray crystallography are not the intact protein, but some fragment of the protein (this is because the intact proteins usually don't crystallize well; the catch-all explanation for this is that they are "floppy"). Nonetheless, the residues aren't numbered starting from 1 in the fragment, but in the intact protein. For instance, the glucocorticoid receptor fragment used for structure determination (1glu.pdb) comprises residues 440 to 525 of the intact protein, and these are numbered as such. You can find this information by looking at the PDB file with a text editor (see below).
If there are several subunits of the same protein, the numbering scheme is consistent even when different subunits don't have the same residues showing (as is the case with lambda repressor, for which only one arm is visible).
DNA: Here the case is more complicated. Co-crystal structures always contain synthetic DNA oligonucleotides. The numbering scheme for these DNA molecules isn't standardized; every paper has its own nomenclature, and it varies a lot. A typical example is lambda repressor, for which the paper uses a nomenclature like position 1 or 1' to refer to the base pairs, not the individual bases. Other papers might call a base 1 and its complement 1' or -1. The PDB files generally don't follow these conventions. In turn, there are several different ways that the bases are numbered. Examples:
For lambda repressor (1lmb.pdb), the bases are numbered such that one strand starts at 1 and goes to 21, the other strand starts at 22 and goes to 42. In this case, you can identify a base uniquely by its number; you don't need to specify the strand (using something like *:2).
For glucocorticoid receptor (1glu.pdb), the bases on one strand are numbered from -10 to 9; those on the other strand are also numbered from -10 to 9. In this case, if you enter "select dna and -10", you get two residues, not one; you would have to enter "select dna and -10 and *:c" to get the one in strand c.
To determine how the PDB file for your DNA is numbered, open it with a text editor or word processing program and look at it. Be sure that you don't save it in the format of your word processor, because RasMol won't be able to use it; you don't need to save it at all, but if you do, save it as a text file. It might be helpful to print it out if it's not too long.
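You can also check the numbering from inside RasMol. The sketch below assumes both PDB files are in the current folder; "show sequence" lists the residues in each chain, and "zap" clears the first molecule before the second is loaded.

```
load pdb 1lmb.pdb
show sequence
# strands are numbered 1-21 and 22-42, so the number is enough
select dna and 14
zap
load pdb 1glu.pdb
show sequence
# both strands are numbered -10 to 9, so name the chain as well
select dna and -10 and *:c
```

After each "select", RasMol reports how many atoms were selected, which is a quick check that you picked up one base and not two.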
To establish the correspondence between the paper's nomenclature and that of the PDB file, you have to study the sequence in each and make a table showing the relationship. This will allow you to translate the information in the paper into RasMol selections that identify the bases correctly. Most papers have a figure showing the sequence of the oligonucleotide used; you can photocopy this figure and write on the photocopy. Don't write in the journal.
1. When you edit a script, you will probably do so using a text editor or word processor like Word or WordPerfect. When you save it, be sure that you save it as a text file (in WordPerfect, for example, under the save menu it gives you the option of saving as a variety of types).
2. You can get ideas for how to achieve certain effects by looking at the scripts I wrote. Again, you can look at these scripts (or at PDB files) using Word or WordPerfect; but don't change them, or save them as Word documents, when you're done looking at them, or they won't work in RasMol.
3. When you write a script, it is a good idea to get a view that you like, then save it. The script has all the information needed to load the molecule and set things up right, and to recreate the image. Often, however, the program will put in information for hundreds of atoms! This information can be much more simply entered by a series of commands such as those we've gone through above.
You can delete the information for the atoms and replace it with commands, using a word processor. This isn't essential if you want to present only a single view, but if you want to have a progression of views you have to do it this way. It also makes the file shorter.
4. At certain points, you may want to give the viewer the option of rotating the molecule to check out certain features, as is done in "motifs" for example. (This is a refinement that isn't necessary for problem sets!) In any case, the script has commands of the form
echo "Move it as you like, hit a key to restore"
pause
Although this is a very useful tool, the problem is that, if the molecule is moved around, it won't wind up back in the same orientation after this process.
You can restore the orientation by the following steps:
a. Save the view that you want to restore as a temporary script (enter the command "write script tempscr.txt" or whatever name you like). The goal is not to use that script, but to obtain from the script a set of commands that are located near the start of the script, of the form
rotate z 63
rotate y 52
rotate x 4
translate x -1
translate y 40
zoom 150
These commands position the molecule as it was in the view you wanted to restore. You can excise them from this temporary script with your word processor. That's all you need this temporary script for; it can then be deleted.
b. In the script you're developing, insert the command "reset", followed by this series of commands. The view will be restored, so that you can control what the viewer sees in the following steps.
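Taken together, the relevant part of a script might look like the sketch below. The orientation values are the ones from the example above; yours will come from your own temporary script.

```
echo "Move it as you like, hit a key to restore"
pause
# return to the default orientation, then reapply the saved view
reset
rotate z 63
rotate y 52
rotate x 4
translate x -1
translate y 40
zoom 150
```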
http://www.biochem.arizona.edu/classes/bioc568/bioc568.htm
Last modified September 4, 2009
All contents copyright © 2009. All rights reserved.