Tutorial for making scripts
showing DNA-protein interactions

BIOC/MCB 568 -- Fall 2009
John W. Little--University of Arizona

Files for this tutorial

Scripts are files that contain a series of commands like the ones you type in yourself in the RasMol Command Line window. Usually they contain pauses for you to study the image, and then you hit a key to proceed. One advantage of scripts is that you can view them without knowing the commands or anything about the molecule. The disadvantage is that you don't have much control over what is displayed.  

However, you can create your own scripts. As stated, these are files, like any other text files, and they can be edited using either a text editor (such as SimpleText on the Mac or Notepad on the PC) or a word processing program (such as Word or WordPerfect). As will become clear, when you save a script in RasMol (using the command "write script <filename>"), the resulting script contains some useful information, as well as a lot of extra lines that can be replaced with a relatively small number of commands.

Hence, if you want to write a script that shows several views, such as the ones used in class (and downloaded from the RasMol page), it is a two-step process. First, make a view of the protein that you like by entering commands at the command line, and save the script as above. Then delete most of the saved information, retaining certain commands near the beginning (as spelled out below), and put in place of the deleted lines the commands you already entered at the command line.

One way to see the difference is to look at two scripts, one made by RasMol and one I modified in this way. The view is similar to one that was generated midway through the script "contacts.txt", without some of the zooming and rotating and so forth. The RasMol-generated script is Rasmol466.txt; the one I made is lys466.txt. These two scripts are 48 KB and 2 KB in size, respectively, and both generate the same image (put them into your "motifs" folder to run them). Most of the RasMol-generated one consists of instructions for how to color or display particular atoms or groups of atoms. It also does not contain any comments or pauses; I put a couple of these into lys466.txt. Comparing these two scripts will help you see what needs to be retained and what can be deleted.

Additional comments and suggestions for writing scripts are at the end of the tutorial.

In the walkthrough below, each command (or group of commands) is followed by a comment on what it does.

script "cIload.spt"

This script can be obtained from the link above. It loads the molecule and positions it appropriately. Alternatively, you can load, position, and color the molecule yourself by entering the following commands:

Click on File/Open, highlight 1lmb.pdb

This loads the lambda repressor:DNA complex

rotate z 9
rotate y -64
rotate x -38
zoom 115

This positions the molecule appropriately

select dna
color red
select protein
color [100,100,100]

This simplifies the coloring scheme. Some of this could be done with the menus as well. This should give you a view of lambda CI bound to its specific operator site, with the protein in dark gray and the DNA in red. Note: the three numbers in the color command give the amounts of red, green, and blue, each on a scale of 0-255; so it's "color [R,G,B]".
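
A few more color triples show how the scale works. This is only an illustration; each line recolors whatever is currently selected (the protein at this point), so try them one at a time, and the last line puts the protein back to the dark gray used in this tutorial. Lines starting with # are comments.

```rasmol
# A named color
color red
# The same red written as an [R,G,B] triple
color [255,0,0]
# Equal amounts of all three give a gray; larger numbers are lighter
color [200,200,200]
# Blue only, at a little over half intensity, gives a dark blue
color [0,0,150]
# Back to the dark gray used in this tutorial
color [100,100,100]
```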

zoom 400
translate x 50
translate y -30

Get up close and personal.


Whether you used the script or entered the above commands, you should see a rather jumbled close-up view of part of the protein and DNA.
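
If you'd rather keep the setup in a file of your own, the commands entered so far can be gathered into a short script, roughly as follows. This is a sketch and not necessarily what cIload.spt contains; the file name "mysetup.txt" is made up, and it assumes 1lmb.pdb is in the folder RasMol is working in. Lines starting with # are comments, which RasMol ignores.

```rasmol
# mysetup.txt (hypothetical name): load and position lambda repressor:DNA
load 1lmb.pdb

# Orientation and overall size
rotate z 9
rotate y -64
rotate x -38
zoom 115

# Simple color scheme: DNA red, protein dark gray
select dna
color red
select protein
color [100,100,100]

# Close-up
zoom 400
translate x 50
translate y -30
```

Save it as a text file and run it by entering "script mysetup.txt" at the command line.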


Examples of the use of "define" commands: These commands let you refer to particular groups of atoms as a single unit. They save a lot of time when you want to refer to such a group repeatedly.

They will be used here together with "select" commands to create a view
of a particular interaction in lambda repressor.

define g14 dna and 14 and not backbone

This defines the base of guanine 14 as a single unit. The syntax is "define x [as the atoms that meet the criteria] y", where y are the criteria.

select g14
color blue
spacefill 250

These commands highlight this particular base in blue and give partial spacefilling.

define asn55 *:4 and 55 and not backbone

This defines the Asn55 of one subunit; the *:4 means "any atom in chain 4". NOTE: this expression contains a colon " : " not a period. The colon is used to refer to a chain in the file, while the period is used in defining a particular atom.
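
To see the colon and the period side by side, you can try the following selections. This is just an illustration using the same chain and atom name as in this tutorial; after each "select", RasMol reports how many atoms were selected, which is a quick check that the expression means what you intended.

```rasmol
# Colon: chain. Every atom in chain 4
select *:4
# Period: atom name. Every atom named ND2, in any residue of any chain
select *.nd2
# Combined with "and": the ND2 atom of residue 55 in chain 4
select *:4 and 55 and *.nd2
```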

select asn55
color red
spacefill 250

Highlight this residue side chain

Exercise: Now define Lys4 using the same syntax, select it, color it orange ("color orange"), and spacefill it. Don't be confused by the fact that it's residue 4 in chain 4!

When you've done this, you should see Lys4 in orange close to the other two groups.

define triple g14, asn55, lys4

This defines all three groups as a single unit. The "," means "or"; in other words, to be a member of "triple", it has to be g14 OR asn55 OR lys4.
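
The word "or" can be written out in place of the comma, and "and" narrows a group instead of widening it. A small sketch, using the two groups defined so far; the name "duo" is made up for this example:

```rasmol
# Atoms that belong to g14 OR asn55
define duo g14 or asn55
# Atoms that are in duo AND are nitrogen atoms
select duo and nitrogen
```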

At this point, it's hard to see much; we're looking at the base edge-on, and we can't see the contacts.
To simplify life for now, we can get rid of the rest of the molecule and rotate it.

restrict triple

The "restrict" command means that only the selected material will be displayed.

There are two ways to rotate it. The easy way is to enter "rotate a b" where a is an axis (x, y, or z) and b is an angle in degrees. The more complicated way, for which you need the scripts (they are on the web site), lets you rotate it gradually. We'll try both ways. Before rotating, however, we must tell it where the center of rotation is. Otherwise it often rotates out of view!

center selected

This says to make the center of rotation the selected atoms ("triple" in this case). You won't see any changes, but it will make a difference in the next step.

rotate y 90

This rotates about the y (vertical) axis. (See note at the end about "rotate" and "translate" commands.)

rotate y -90

This puts it back to the previous position so we can try the script.

script y90-2.scr

There are six scripts like this; each moves it in 2 degree increments around one axis, giving a fairly smooth rotation. They are at the motifs link. If you don't have the scripts, you can use the "rotate" command to achieve the same purpose; in this case, "rotate y 90".
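
If you are curious how a gradual rotation can be written, here is a shortened sketch. It is not the contents of y90-2.scr, which I am not reproducing here; the file name "y10-2.scr" is made up, and it turns the molecule only 10 degrees. The "refresh" command asks RasMol to redraw the image; whether it is needed after every step may depend on your version of RasMol, so treat that as an assumption.

```rasmol
# y10-2.scr (hypothetical): rotate 10 degrees about y in 2-degree steps
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
rotate y 2
refresh
```

A full 90 degree version written this way would need 45 such steps.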

script y-90-2.scr

This does rotation in the opposite direction.

script y90-2.scr

Back to the position of choice.

define c28 dna and 28 and not backbone

Define the base paired with g14

select c28
color [0,0,150]
spacefill 250

Select c28, color it a darker blue, and display it. Now you can see the base pair; unfortunately there's no good way to make the hydrogen bonds.

Now let's highlight the specific contacts between the aa side chains and the base pairs. To identify particular atoms, click on them, and the command line tells you their identity. You could select them by the command "select atomno=x" where x is the atom number you discovered by clicking on the atom. For this exercise, they will be identified (since you don't know which are important). Note: in the expressions that follow, the *.nd2, for example, contains a period and not a colon. The colon refers to an entire chain, not an atom.

select asn55 and *.nd2 (or "select atomno=1908")
color [255,150,100]

Select an atom in Asn55 and make its color a bit lighter. Now it's a peach tone. Again, the numbers in the "color" command refer to the amount of red, green and blue respectively on a scale of 0 to 255. You have to experiment to get the most pleasing shade.

select g14 and *.n7
color [100,150,255]

Same thing for its counterpart in G14. A hydrogen bond is formed between the two atoms. Click on each atom to identify its number, which is used in the next command.

monitor 277 1908

This gives a poor substitute for drawing in a bond; it also gives the distance between the two atoms. This particular distance (2.90 angstroms) is about ideal for an H bond. Note: the color of the number is that of the first atom (277 in this case) listed.
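
If the number gets in the way of the picture, the label can be hidden while the dotted line stays. A short sketch, assuming the "set monitor" option behaves in your version of RasMol as it is described in the RasMol manual:

```rasmol
# Dotted line and distance between atoms 277 and 1908
monitor 277 1908
# Hide the distance label but keep the dotted line
set monitor off
# Show the label again
set monitor on
```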

rotate z -90

Rotate about the z axis (which points out of the screen) so you can see the distance better.

The other hydrogen bond is formed between Lys4 (in orange) and another atom in G14. The two atoms in question are those that look the closest in this view. Click on them to identify their atom numbers, then use the monitor command to measure the distance. What value do you get? Are the identities of these atoms chemically reasonable for making a hydrogen bond?

A view like this would be fine for showing the specific contact (for instance in a problem set).
All it needs is a little commentary.


echo " "
echo "Specific contacts between Guanine 14 and aa's Lys4 and Asn55"

Note that when this is entered it is "echoed" at the command line. In a script, the RasMol prompt would not be present; hence, the first line (echo " ") is useful for spreading out your commentary. How long your lines are depends on how big the window is on the viewer's screen; you can make multiline comments by having several sequential "echo" statements. In a script, you would have a command "pause" following the echo statements, so the viewer would have a chance to read the comments. Look at some of the scripts I wrote for examples.
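
Inside a script, a block of commentary of this kind might look as follows. The wording of the third line is only an example:

```rasmol
# A blank line first, to set the commentary apart
echo " "
echo "Specific contacts between Guanine 14 and aa's Lys4 and Asn55"
echo "Hit any key to continue"
# Wait for a keypress so the viewer can read the text
pause
```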

Now let's draw back and look at the context.

script x90-2.scr

First rotate it to give a view in the plane of the bases ("rotate x 90" is equivalent)

select dna and not triple and not c28
wireframe

Select all the DNA except the bases of G14 and C28, turn on the wireframe.

select protein and not triple
wireframe

Same thing for the protein.

zoom 100
translate x 0
translate y 0

This repositions the molecule. You could do this with the mouse and keyboard, but by following these commands it will be in a defined place for what follows.

define arm *:4 and 2-10
select arm
wireframe 100
color green

This selects aa residues 2-10 of the subunit, including Lys4, and colors them green

script y-90-2.scr (or rotate y -90)

This shows how this part of the protein, including Lys4, is an arm reaching around the "back side" of the DNA. This feature of CI is unusual.

rotate y 90
select *:4 and 52-60
color purple
wireframe off
ribbons
rotate y -90

This selects the part of the protein including Asn55, colors it purple, and shows that it is not in an alpha-helical region: the ribbon here is a free-form loop and not a helix. The point is that not all contacts are with residues in an alpha helix.

ribbons off
wireframe
color [100,100,100]
select arm
wireframe
color [100,100,100]

De-emphasize this part of the protein, except for Asn55.

Note: If you want to repeat commands quickly, the up-arrow key scrolls through previous ones (useful here with the color command)

Now let's look at the alpha helix.

select *:4 and 44-51
color yellow
wireframe off
ribbons

This selects the "recognition helix", and shows that it is an alpha helix. It also shows that Asn55 is not part of the recognition helix.

select *:4 and (44,45,46,48,49) and sidechain
wireframe 50
rotate x 10
zoom 200
translate y -50
rotate y 90

This selects the side chains in the recognition helix that make specific contacts with the bases, and shows more or less how they lie in the major groove. Note: Many of the contacts are made with protein backbone carbons, and are not shown here.

define ser45 *:4 and 45 and not backbone
select ser45
spacefill 250
define g16 dna and 16 and not backbone
select g16
spacefill 250
define sg ser45,g16
select sg
center selected
zoom 400
translate x -30
translate y -80

Now focus on one of the contacts.

restrict sg
monitors off
rotate x 90

Simplify the image

monitor 1841 319
monitor 1841 322

This Ser -OH makes not one but two H bonds with the guanine.

Exercise: How would you turn on the base that is paired to G16? Hint: First you have to find it. Turn on the DNA as above (select dna and not g16), so that anything you do to the DNA doesn't affect G16.




Additional comments about certain commands:

1. The rotate commands are "relative"; they rotate the image relative to its current orientation. By contrast, the translate commands are not relative; they translate the image relative to a particular frame of reference, which is defined by the PDB file and doesn't change when you manipulate the image.

So, if you enter "rotate y 90" twice, it will rotate it both times. But if you enter "translate x 10", the second time it won't move, because it isn't relative to the previous position.
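
You can check this for yourself with a few commands. The numbers are arbitrary; the last two lines put the molecule back the way it was turned and move it back to the center.

```rasmol
# Rotations accumulate: after these two lines the molecule has turned 90 degrees
rotate y 45
rotate y 45
# Translations do not: the second line leaves the molecule where the first put it
translate x 10
translate x 10
# To move further, give the new position
translate x 20
# Undo the rotation with an opposite rotation, and re-center with translate x 0
rotate y -90
translate x 0
```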

Caution for Reading Papers

The occasional student is tempted to cut the paper out of the journal so he/she can have a copy that's in color. This has happened in the past to journals in the library. Don't do this; anybody wanting to use the journal later won't even find the paper.

Tips for interpreting PDB files and relating them to papers

When you read a paper and try to use the information to look at a PDB file, you need to be able to relate the residue numbers in the paper to those in the file.

Proteins: Generally, the residues in the PDB file have the same numbering scheme as those in the paper. These numbers refer to the residue numbers in the native protein. Often, in fact almost always, the proteins used for x-ray crystallography are not the intact protein, but some fragment of the protein (this is because the intact proteins usually don't crystallize well; the catch-all explanation for this is that they are "floppy"). Nonetheless, the residues aren't numbered starting from 1 in the fragment, but in the intact protein. For instance, the glucocorticoid receptor fragment used for structure determination (1glu.pdb) comprises residues 440 to 525 of the intact protein, and these are numbered as such. You can find this information by looking at the PDB file with a text editor (see below).

If there are several subunits of the same protein, the numbering scheme is consistent even when different subunits don't have the same residues showing (as is the case with lambda repressor, for which only one arm is visible).

DNA: Here the case is more complicated. Co-crystal structures always contain synthetic DNA oligonucleotides. The numbering scheme for these DNA molecules isn't standardized; every paper has its own nomenclature, and it varies a lot. A typical example is lambda repressor, for which the paper uses a nomenclature like position 1 or 1' to refer to the base pairs, not the individual bases. Other papers might call a base 1 and its complement 1' or -1. The PDB files generally don't follow these conventions. In turn, there are several different ways that the bases are numbered. Examples:

For lambda repressor (1lmb.pdb), the bases are numbered such that one strand starts at 1 and goes to 21, the other strand starts at 22 and goes to 42. In this case, you can identify a base uniquely by its number; you don't need to specify the strand (using something like *:2).

For glucocorticoid receptor (1glu.pdb), the bases on one strand are numbered from -10 to 9; those on the other strand are also numbered -10 to 9. In this case, if you select dna and -10, you get two residues, not one; you would have to say select dna and -10 and *:c to get the one in strand c.

To determine how the PDB file for your DNA does it, open the PDB file with a text editor or word processing program and look at it. You don't need to save the file; if you do save it, be sure to save it as a text file and not in the format of your word processor, or RasMol won't be able to use it. It might be helpful to print it out if it's not too long.
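
RasMol itself can also give you a first look at how a file is organized, once the molecule is loaded. As a sketch (both commands are part of RasMol, but the exact layout of what they print may differ between versions, so this is a supplement to reading the file, not a replacement):

```rasmol
# Summary of the loaded file, such as its name and the number of chains and atoms
show information
# The residues, listed chain by chain
show sequence
```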

To establish the correspondence between the paper's nomenclature and that of the PDB file, you have to study the sequence of the two, and make a table showing the relationship. This will allow you to translate the information in the paper to using RasMol properly to identify the bases. Most papers have a figure showing the sequence of the oligonucleotide used; you can Xerox this figure and write on the xerox. Don't write in the journal.


Additional suggestions for scripts:

1. When you edit a script, you will probably do so using a text editor or word processor like Word or WordPerfect. When you save it, be sure that you save it as a text file (in WordPerfect, for example, under the save menu it gives you the option of saving as a variety of types).

2. You can get ideas for how to achieve certain effects by looking at the scripts I wrote. Again, you can look at these scripts (or at PDB files) using Word or WordPerfect; but don't change them, or save them as Word documents, when you're done looking at them, or they won't work in RasMol.

3. When you write a script, it is a good idea to get a view that you like, then save it. The script has all the information needed to load the molecule and set things up right, and to recreate the image. Often, however, the program will put in information for hundreds of atoms! This information can be much more simply entered by a series of commands such as those we've gone through above.

You can delete the information for the atoms and substitute commands for it, using a word processor. This isn't essential if you want to present only a single view, but if you want to have a progression of views you have to do it this way. It also makes the file shorter.

4. At certain points, you may want to give the viewer the option of rotating the molecule to check out certain features, as is done in "motifs" for example. (This is a refinement that isn't necessary for problem sets!) In any case, the script has commands of the form

echo "Move it as you like, hit a key to restore"
pause

Although this is a very useful tool, the problem is that, if the molecule is moved around, it won't wind up back in the same orientation after this process.

You can restore the orientation by the following steps:

a. Save the view that you want to restore as a temporary script (enter the command "write script tempscr.txt" or whatever name you like). The goal is not to use that script, but to obtain from the script a set of commands that are located near the start of the script, of the form

rotate z 63
rotate y 52
rotate x 4
translate x -1
translate y 40
zoom 150

These commands position the molecule as it was in the view you wanted to restore. You can excise them from this temporary script with your word processor. That's all you need this temporary script for; it can then be deleted.

b. In the script you're developing, insert the command "reset", followed by this series of commands. The view will be restored, so that you can control what the viewer sees in the following steps.
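
Put together, the relevant part of the script you're developing would look something like this. The rotate, translate, and zoom values are the example values shown in step a; yours will come from your own temporary script.

```rasmol
echo "Move it as you like, hit a key to restore"
pause
# Back to the default orientation, then rebuild the saved view
reset
rotate z 63
rotate y 52
rotate x 4
translate x -1
translate y 40
zoom 150
```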




http://www.biochem.arizona.edu/classes/bioc568/bioc568.htm
Last modified September 4, 2009
All contents copyright © 2009. All rights reserved.