
Below are instructions for working your way through a demonstration of the `xgenproc' script, which is an almost fully automated way of running the X-GEN data processing software package.
What you need to do.
First, log in to one of the Linux machines with your official login. Bring up a Unix terminal. Most of you can do this by clicking on the icon on your task bar at the bottom of the screen that has a cartoon of a terminal screen on it.
Once you have a terminal window somewhere on your screen, enlarge it to 132 columns wide by grabbing the lower right-hand corner of the terminal window and dragging it to the right until the size of the box appears as 24x132. Then prepare to type in the terminal window by clicking on its top stripe. This should cause a dark rectangle to appear to the right of your prompt message in the terminal window. You're now ready to type into that window.
Here's what you type:
mkdir ZnMyo
(In this and all other operating-system commands,
you should end the line by hitting the ENTER key).
This command tells the operating system that you want to create a
directory (folder) called ZnMyo.
cd ZnMyo
This makes ZnMyo your working directory.
ln -s /xsdata/howard/07_07_14_IIT/ZnMyo/*.0* .
This tells the system to make a symbolic link to each of the detector-image files in a repository in my login directory. Thus, you can work with them as if they were local to your directory without actually making a new physical copy of each image.
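To see what the `ln -s` step does without touching the real repository, here is a minimal sketch that uses throwaway directories. The temporary directories and the demo_1.* file names are invented for illustration; only the `ln -s .../*.0* .` pattern comes from the command above.

```shell
#!/bin/sh
# Sketch: link every image file from a "repository" into a working directory.
# The directories and file names below are made up for illustration.
repo=$(mktemp -d)            # stands in for the shared image repository
work=$(mktemp -d)            # stands in for your ZnMyo directory
touch "$repo/demo_1.0001" "$repo/demo_1.0002" "$repo/demo_1.0003"

cd "$work" || exit 1
ln -s "$repo"/*.0* .         # same pattern as the command above

# Each new entry is a symbolic link pointing back at the repository copy.
nlinks=$(find . -type l | wc -l | tr -d ' ')
echo "created $nlinks symbolic links"
ls -l
```

The `ls -l` listing shows each link with an arrow to its target, which is a quick way to confirm that no image data was copied.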
Now you're ready to actually run the application. To do so, type
xgenproc -x2027 -y2004 -i19 -c2000 -r6 -j8000
This tells the system to use the xgenproc script to process this ZnMyo dataset. The only parameters you're feeding to it are the (X,Y) coordinates of the beam center (X=2027, Y=2004), the spacegroup (19), the minimum number of spots it should search for during refinement (2000), the number of images it should bundle during on-the-fly refinement during integration (6), and an (X,Y) error parameter during integration that you can read about on the X-GEN webpage. Watch the results of this processing as it proceeds by observing the output in your terminal window. The processing would probably succeed even if you didn't do this step, but it'll perform more smoothly this way.
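As a reading aid, the option string can be spelled out flag by flag. The sketch below uses the standard shell `getopts` builtin and the flag meanings given in the text; the `describe` function is a hypothetical helper, and none of this is meant to show how xgenproc itself parses its arguments.

```shell
#!/bin/sh
# Sketch: decode the option string used above into labelled values.
# "describe" is a made-up helper; the labels paraphrase the tutorial text.
describe() {
    OPTIND=1
    while getopts "x:y:i:c:r:j:" opt; do
        case "$opt" in
            x) echo "beam center X: $OPTARG" ;;
            y) echo "beam center Y: $OPTARG" ;;
            i) echo "spacegroup number: $OPTARG" ;;
            c) echo "minimum spots for refinement: $OPTARG" ;;
            r) echo "images bundled per on-the-fly refinement: $OPTARG" ;;
            j) echo "integration (X,Y) error parameter: $OPTARG" ;;
            *) return 1 ;;
        esac
    done
}

summary=$(describe -x2027 -y2004 -i19 -c2000 -r6 -j8000)
echo "$summary"
```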
Deviations of recomputed (inverse-spline) points from inputs:
RMS Dev(x) RMS Dev(y) Max Dev # pts
All Points: -0.0007 -0.0007 0.00081 0.00081 0.00207 3969
Observed: -0.0007 -0.0007 0.00081 0.00081 0.00207 3969
Interpolated: 0.0000 0.0000 0.00000 0.00000 0.00000 0
calculated beam center is -0.113 0.113
X-GEN command calibrate -j 2027 2004 exited with status 0
Number of properly summed spots = 939
Number of spots close to XY edge = 1
Number of spots near 1st/last fm = 162
Number of spots with many Ct-
After this the code will auto-index and refine the data. X-GEN uses a difference-vector granularity approach (first developed by R. Sparks in the 1960s for small-molecule data) rather than the Fourier-transform approach described by Jim Pflugrath last week. The program will quickly find a Delaunay-reduced cell for this crystal, and you'll see a line of diagnostic output that looks like
Refining a:Y b:Y c:Y alpha:Y beta:Y gamma:Y omega:Y chi:Y phi:Y
Value: 39.878 47.599 78.160 90.000 90.000 90.000 -121.278 -125.192 90.663
You can tell that you're on the right track from the histogram of integerness for (h,k,l). At any stage of the refinement this histogram looks something like
        -----Root-Mean-Squared Errors In--------  <#frms    # of    # of
   Index       X       Y     phi   Gamma Profile  moved>   refls indices
  0.0315  1.2657  1.4332  0.2672  0.0373  0.0503  0.0416     782    2346
  0.0266  1.1201  1.2508  0.1977                             778    2334

D> 5.13  D> 3.97  D> 3.28  D> 2.93  D> 2.70  D> 2.40  D> 2.16  D> 1.36  Shells
0.0154   0.0213   0.0233   0.0353   0.0277   0.0305   0.0294   0.9493   Err(h)
116/348  107/321  52/156   116/346  81/243   110/330  78/234   122/364  ind/ref

Index err(h)>  0.000  0.050  0.100  0.150  0.200  0.250  0.300  0.350  0.400  0.450
h                762     18      0      2      0      0      0      0      0      0
k                757     23      0      0      0      0      0      0      0      0
l                691     79      6      4      0      0      1      0      0      1
which shows you that (at that stage) 757 of the reflections had k values that were either perfectly integral or within 0.05 of the closest integer, and so on. In general these lists should get better and better, i.e. the histogram should move to the left, the longer the refinement proceeds.
At some point the refinement will attempt to determine the spacegroup. In this case it finds that the lattice is really primitive orthorhombic; the default primitive orthorhombic spacegroup is P222, but because we invoked the script with "-i19", it will override the default and replace it with spacegroup P2(1)2(1)2(1), which is International Tables spacegroup number 19. If the indexing had been very different from what we expected, e.g. if we hadn't gotten an orthorhombic primitive lattice as the first choice initially, the "-i19" would have been ignored and the system would have gone with its best guess. The refinement will proceed somewhat beyond that determination, with the unit cell parameters, the orientation angles, the detector parameters, and the rocking-curve parameters γ0, γ1, and γ2 all being optimized.
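The integerness histogram quoted earlier can be boiled down to one number per index: the fraction of reflections that fall in the first bin (index error below 0.05). The sketch below does this with awk on the three h, k, l rows copied from the sample output. It assumes that each row's bins add up to the number of reflections counted for that index, which is my reading of the table rather than something the X-GEN documentation is quoted as saying.

```shell
#!/bin/sh
# Sketch: fraction of reflections whose index error is below 0.05.
# The three rows are the h, k, l lines from the sample histogram.
hist='h 762 18 0 2 0 0 0 0 0 0
k 757 23 0 0 0 0 0 0 0 0
l 691 79 6 4 0 0 1 0 0 1'

report=$(printf '%s\n' "$hist" | awk '{
    total = 0
    for (i = 2; i <= NF; i++) total += $i      # sum all ten bins
    printf "%s: %d of %d within 0.05 (%.1f%%)\n", $1, $2, total, 100 * $2 / total
}')
echo "$report"
```

A histogram that "moves to the left" as refinement proceeds shows up here as these percentages climbing toward 100.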
After the refinement the program actually determines the integrated intensities of the Bragg diffraction. This is the core of the effort. The integration involves predicting where the spots are, measuring their background-subtracted intensities by both profile-fitting and conventional summation, and periodically refining the crystal and detector parameters and the three-dimensional profiles used to define the integration boxes and the characteristics of the profile-fitting effort. At the end of any given image, the integration program will put out a line of diagnostic output that begins something like
5 ZnMyo11_1.0005 1.140e+10 0.00 224498 10607567 580 516 310 12 -1.27 18546.8 24.69 93.1 27.1 1.33 1.7 1.6 0.2
The columns here are
- image number
- file name
- total counts
- the loss fraction (always zero for non-multiwire detectors)
- number of pixels contained within the reflections
- number of pixels used in background-updating calculations
- number of reflections predicted to peak in the image
- number of reflections integrated
- number of integrated reflections with intensity greater than 12σ
- average change in background since previous image
- average intensity of measured reflections
- average intensity/sigma for measured reflections
- I/sigma at sin(theta)=0, as extrapolated from the pseudo-Wilson plot of I/sigma vs. sin(theta)
- Bsig, i.e. the slope of the pseudo-Wilson plot
- D2, i.e. the resolution value in Ångstroms at which <I/σ> falls to 2
- RMS errors in X, Y, and Z. The errors in X and Y are in pixels; the error in Z is in frames.
After 20 minutes or so the integration will be completed and the program will proceed to the final stages.
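If you would rather not count columns by eye, awk can pull individual fields out of a per-image diagnostic line. This sketch uses the sample line quoted earlier and only the leading fields, whose positions follow directly from the column list (1 = image number, 2 = file name, 7 = reflections predicted, 8 = reflections integrated). It assumes that the fields are whitespace-separated and appear in the listed order.

```shell
#!/bin/sh
# Sketch: summarize one per-image integration line.
# Field positions are taken from the column list in the text.
line='5 ZnMyo11_1.0005 1.140e+10 0.00 224498 10607567 580 516 310 12 -1.27 18546.8 24.69 93.1 27.1 1.33 1.7 1.6 0.2'

summary=$(echo "$line" | awk '{
    printf "image %d (%s): %d of %d predicted reflections integrated (%.0f%%)\n",
           $1, $2, $8, $7, 100 * $8 / $7
}')
echo "$summary"
```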
The final stages involve reformatting the data, computing a model for systematic error-correction on the data (typically termed "scaling" the data), identifying and removing outliers from the data, and producing scaled, merged output data files. The merged output files are in several formats:
- An X-GEN-specific format, called a mulist, in a file called sm1_1_.mu
- A SCALEPACK-style file called ZnMyo11_1.sca
- A SCALEPACK-style file with Friedel-mates separated called ZnMyo11_1.asc
- A Sheldrick (SHELX, XPREP) -compatible file called ZnMyo11_1.hkl
- A CNS-format file called ZnMyo11_1.xcn
- An X-PLOR format file called ZnMyo11_1.xxp
To look at the results in more detail, first move into the xgen subdirectory by typing
pushd xgen
Then you're ready to look. Using vi, for example, you would invoke it with
vi ZnMyo11_1.xlg
and examine the logfile. I won't try to teach you how to use a Unix text editor here. Each of the commands to the operating system will generate at least a few lines of output in that logfile, and the segment of the logfile associated with that run will begin with a line that looks something like
X-GEN V.4.0 Data Reduction: Run of reject on Fri Jul 18 15:36:38 2003
so you can search for strings like that to find the portion of the logfile that summarizes how any particular operating-system command in the package actually worked out. I encourage you to go through that logfile to see what was happening. Additionally, you can examine your data graphically. To do so, make sure you're in the right directory by typing
pwd
which means "print working directory." You should get an answer back that looks like /home/xsblahblah/ZnMyo. If it does, then you can say
pushd xgen
If instead you get back /home/xsblahblah/ZnMyo/xgen then you don't need to change directories. In either case, you then type
source *.com
which tells the operating system that you, too, want to look directly at your data. Having issued those commands, you can now say
pdisplay 4
which puts up an X-Windows screen with a roughly 512x512 reduction of the detector face for image number 4 (hence the 4 in the invocation). The color scheme may not be ideal at this point: if necessary, click the Spectral button and then drag the Top Intensity slider down to a lower value-- say, around 1400. This will probably produce a readable image that contains Bragg spots, but you may need to tweak Top Intensity and Top Multiplier further to get the image you really want. The computational software does not depend on your ability to see clearly what's going on on the individual images, but it's still useful to see what's going on. You can also insert overlays over the image. Try pulling down the list of sub-commands under Overlays and clicking on border: this will produce a graphical indication of where the active area of the detector is. Similarly (after you shut off the overlay border functionality) you can click on refine to find the predicted reflections displayed over the top of the raw data.
Furthermore, you can look at the quality of the processed data. For example, turn off all Overlay functions and click the Multiref menu. Click on DelI/sig. What you will see are the positions on the detector face at which your observations were made, color-coded according to how large (I_ij - <I>_j) / σ_ij is. The color stripe on the right will show you how large your (I_ij - <I>_j) / σ_ij values are. You will see that the plot has features in it, which suggests that perhaps there are unmodeled systematic errors in the data.
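To make the DelI/sig quantity concrete, here is a small worked example with invented numbers. It takes observation i of unique reflection j, subtracts the mean intensity of that reflection, and divides by the observation's sigma. The sketch uses a plain unweighted mean for <I>; that is an assumption made for simplicity, and X-GEN may well weight the mean differently.

```shell
#!/bin/sh
# Sketch: (I - <I>) / sigma for a few made-up observations.
# Columns: reflection id, intensity, sigma. All values are invented.
obs='1 1000 20
1 1040 20
1 960 25
2 500 10
2 520 10'

devs=$(printf '%s\n' "$obs" | awk '
    { id[NR] = $1; I[NR] = $2; sig[NR] = $3; sum[$1] += $2; n[$1]++ }
    END {
        for (k = 1; k <= NR; k++) {
            mean = sum[id[k]] / n[id[k]]       # unweighted mean (assumption)
            printf "refl %s  I=%d  <I>=%.1f  (I-<I>)/sigma=%+.2f\n",
                   id[k], I[k], mean, (I[k] - mean) / sig[k]
        }
    }')
echo "$devs"
```

If the errors were purely random, values like these would scatter evenly around zero across the detector face; clustered patches of one sign are what hint at systematic error.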
Try some of the other options in the Multiref and Urefls menus. Most of them should make sense without explanations if you think about them; if not, ask me!
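One more tip on the logfile: the search for section headers described earlier can also be done from the command line with grep. The sketch below builds a tiny stand-in logfile so that it runs anywhere. The placeholder lines and the timestamp on the calibrate header are invented; only the "Run of reject" header is copied from the example above.

```shell
#!/bin/sh
# Sketch: list the per-command section headers in an X-GEN logfile.
# A stand-in logfile is created here; on your own data you would set
# "log" to ZnMyo11_1.xlg instead.
log=$(mktemp)
cat > "$log" <<'EOF'
X-GEN V.4.0 Data Reduction: Run of calibrate on Fri Jul 18 15:02:11 2003
(placeholder for that command's output)
X-GEN V.4.0 Data Reduction: Run of reject on Fri Jul 18 15:36:38 2003
(placeholder for that command's output)
EOF

# -n prefixes each match with its line number in the file.
headers=$(grep -n 'Data Reduction: Run of' "$log")
echo "$headers"

# Narrow the search to one command, here "reject".
reject_line=$(grep -n 'Run of reject' "$log" | cut -d: -f1)
echo "reject section starts at line $reject_line"
```

Inside vi, typing /Run of reject and hitting ENTER performs the same search interactively.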