This file is a tutorial on how to use the BnP package for AUTOMATIC protein phasing given MAD, SIRAS, SAS (SAD), or SIR data. A procedure sufficient for phasing any structure is described, and it is illustrated by working through test data for a three-wavelength selenomethionine MAD data set. It assumes that BnP has been correctly installed and is in the user's path.
This tutorial is divided into three parts. The first part describes launching the program and supplying the required data and parameters. The second part describes how to follow progress of the structure determination and examine the results. The third part describes how to export the final phases, coordinates, maps, skeletonized electron density, and control files to external programs/packages such as O, CNS, CCP4, RESOLVE and Arp/Warp for subsequent processing. If they are properly installed on your system, RESOLVE and/or Arp/Warp can also be launched directly from within BnP for automatic chain tracing/model building. Use of the RESOLVE, Arp/Warp and MTZ file options requires that CCP4 version 5.0, Arp/Warp 6.1.1, and RESOLVE 2.08 or later be installed on your system and be in your path. Automatic solution of the heavy atom/anomalous scatterer substructure, phasing of the protein, and production of an electron density map suitable for chain tracing, however, do not require any of these external programs.
Important processing considerations!!!
In some situations, particularly those involving modern Linux releases, BnP has been found to run EXTREMELY slowly (more than 20 times slower!) if the data files and the directory from which BnP is launched are not resident on the computer whose cpu is used to run the job. The BnP programs themselves can reside on any accessible machine with little consequence, but the data files and working directories should be local to the cpu used if good performance is to be obtained. If the user's working directory is not local to the cpu (i.e. it sits on a server disk somewhere but is accessible through cross-mounting via NFS), one way to make it local is to use the /tmp directory present on most unix/linux systems. This directory is usually accessible to all users. In that case the user can log in to the machine with the desired cpu and cd to /tmp, then create a new subdirectory there and copy the reflection and sequence files to it. Launching BnP from that subdirectory will ensure that all data and working files are local, and the best performance should be obtained. Upon BnP completion, however, the entire contents of that directory should be copied back to the user's "normal" working directory (preserving the subdirectory structure), as the /tmp area may be emptied periodically by the system manager (or by the system itself, through scheduled cleanup processes).
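The staging procedure described above can be scripted. The following is only a minimal sketch in Python, not part of BnP: the function names, the "bnp_" directory prefix, and the choice to stage only files with the reflection and sequence extensions named in this tutorial are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def stage_to_local(source_dir, scratch_root="/tmp", prefix="bnp_"):
    """Copy reflection and sequence files into a fresh local scratch directory."""
    source = Path(source_dir)
    # mkdtemp creates a new, uniquely named subdirectory under scratch_root.
    scratch = Path(tempfile.mkdtemp(prefix=prefix, dir=scratch_root))
    for path in sorted(source.iterdir()):
        # Only the data file types named in this tutorial are staged.
        if path.is_file() and path.suffix in {".sca", ".dtk", ".mtz", ".seq"}:
            shutil.copy2(path, scratch / path.name)
    return scratch


def copy_results_back(scratch_dir, destination_dir):
    """Copy the whole scratch tree back, preserving the subdirectory structure."""
    shutil.copytree(scratch_dir, destination_dir, dirs_exist_ok=True)
```

One would call stage_to_local() before launching BnP from the returned directory, and copy_results_back() once the run has finished. The dirs_exist_ok argument requires Python 3.8 or later.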
In general, to carry out the automatic structure determination process you need to follow the instructions below, and it is advised that each new user reads through this document at least once. HOWEVER, TO SIMPLY AND QUICKLY RUN THROUGH THE TEST DATA one needs only carry out the explicit instructions given in the boxed TEST DATA sections that follow, in the order that they appear. The test structure to be solved is Methylmalonyl-CoA Epimerase, with the PDB code 1JC4. In this example, there are four molecules in the asymmetric unit and, based on the sequence, there are 28 selenium sites expected. To run the test case, you will need four files - three containing the reflection data corresponding to the leading edge (inflection point), peak, and remote (high energy) wavelengths, respectively, and the fourth containing the amino acid sequence. These files can be obtained as described below.
1) Create a directory (generally with a SHORT name) and place in it only the available reflection data files along with the sequence file. The data files can be in SCALEPACK, D*trek, or MTZ format, and should contain only UNIQUE reflections. If anomalous scattering data are to be used, then the files must, of course, contain Bijvoet pair information. The "extensions" (i.e. last few characters of the file names) are important since they indicate the "type" of file. Make sure the reflection files end in ".sca", ".dtk", or ".mtz" for SCALEPACK, D*trek, or CCP4 MTZ files, respectively. The sequence file is a simple text file and should end in ".seq". See the 1jc4.seq file as an example. ONLY ONE UNIQUE COPY OF EACH SEQUENCE should be provided, even if multiple copies of that sequence are present in the asymmetric unit. Separate chains of differing sequence by an intervening line containing only ">>>" (without the quotes).
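The file naming rules above can be checked before launching the program. The sketch below is illustrative Python only and is not how BnP itself reads its input; the function names are hypothetical, and the handling of blank lines and whitespace in the sequence file is an assumption.

```python
from pathlib import Path

# Extension-to-format table taken from the text above.
FORMATS = {".sca": "SCALEPACK", ".dtk": "D*trek", ".mtz": "CCP4 MTZ", ".seq": "sequence"}


def file_type(name):
    """Return the file type implied by the extension, or None if unrecognized."""
    return FORMATS.get(Path(name).suffix)


def split_chains(seq_text):
    """Split sequence-file text into one string per chemically distinct chain."""
    chains, current = [], []
    for line in seq_text.splitlines():
        line = line.strip()
        if line == ">>>":
            # A line containing only ">>>" ends the current chain.
            chains.append("".join(current))
            current = []
        elif line:
            current.append(line)
    chains.append("".join(current))
    # Drop empty entries caused by leading, trailing, or repeated separators.
    return [c for c in chains if c]
```

For example, file_type("peak.sca") gives "SCALEPACK", and a sequence file with one ">>>" line yields two chain strings.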
TEST DATA: If you are already running BnP just select "Copy test data" from the "Tutorials" menu at the top of the screen. Then click on the "General Information" tab at the top of the screen and continue starting with the boxed area in step 3 below. Otherwise, create a new directory called 1jc4, go to it, and copy to it the edge.sca, peak.sca, remote.sca and 1jc4.seq files supplied with the BnP distribution (they can be found in the subdirectory "examples/1jc4" under the BnP home directory). Then continue starting with the boxed area in step 2 below.
2) From the directory containing the data files, type BnP (case sensitive) to launch the program. When the first screen ("About BnP") appears, click on either the "General Information" tab at the top or the "Continue" button at the bottom to bring up the "General Information" screen.
TEST DATA: Type BnP from the directory containing the data. Once launched, click on the "General Information" tab at the top of the screen.
3) Click on the "Import" button and follow the on-screen instructions. You will see a listing of all files in the directory where you launched the program, and you should "select" (i.e. click to highlight) any data and sequence files to be used in the structure solution. For a three-wavelength MAD data set, one would normally select the three data files plus a sequence file. To select more than one file, hold down the "CTRL" key while clicking on the file name. Once all of the desired files are highlighted, click on the "Import" button at the bottom of the window. A new window should then open.
TEST DATA: Click on the "Import" button near the top of the screen. In the next window, just hold down the "CTRL" key, and click on the edge.sca, peak.sca, remote.sca and 1jc4.seq files. Then, release the "CTRL" key, and click on the "Import" button at the bottom of the window.
4) In the "Import Data" window that will pop up, certain information will have to be provided although most fields will have been filled automatically using information in the input files.
In the "Title" field, enter any text you would like identifying the project.
In the "Structure ID" field, enter a SHORT (5 characters or less suggested) string identifying the protein you are working on.
TEST DATA: In the new window, enter 1jc4 in both the "Title" and "Structure ID" fields.
5) Verify that the space group has been picked up properly. If not, click on the button next to the space group to open a pulldown menu, and select from it the correct space group. NOTE! The space group is usually obtained from the reflection file header records. Sometimes, however, during data reduction when these files are created, a space group having LOWER symmetry is used when the true space group has yet to be determined. For example, data reduction sometimes is done in P222 or P2 when the real space group is in doubt, or one has yet to determine whether screw axes are present. Be sure that the correct space group is set! If in doubt, you will have to try different space groups in separate BnP runs.
Check that the "Type of experiment" is set correctly and that the "Heavy element" fields are also set correctly for each data set shown in the table. If not, correct them via the pulldown menus. For MAD data, the default heavy element is "Se", but it can be changed for ALL data sets by using the "If MAD, heavy element" pulldown menu. For all other types of experiments, the heavy element must be supplied explicitly for each data set by clicking in the "Heavy Element" field associated with that data set and then selecting from the resulting pulldown menu. Note that there is a "blank" entry at the beginning of the table for native data sets not containing anomalous scatterers.
One MUST next associate a "TYPE" with each data set. For each data set shown in the table, click in the corresponding "Type" field, and select from the pulldown menu to identify the way the corresponding data set is to be treated in the phasing process. Use the key on the screen to make the association (i.e. choose IP, PK, HR, LR, NAT, or DER for inflection point (leading edge), peak, high-energy-remote, low-energy-remote, native, or derivative data sets, respectively).
TEST DATA: Click in the "Type" field corresponding to each of the three data sets that appear in the table. From the pulldown menu select "IP" for the edge.sca, "PK" for the peak.sca, and "HR" for the remote.sca files, respectively.
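As a convenience when keeping notes on a project, the type codes and the test-data assignments can be written down as a small lookup table. This is only an illustrative Python sketch; the names TYPE_CODES, TEST_DATA_TYPES and unknown_types are hypothetical and are not part of BnP.

```python
# Type codes and their meanings, as listed in the on-screen key described above.
TYPE_CODES = {
    "IP": "inflection point (leading edge)",
    "PK": "peak",
    "HR": "high-energy remote",
    "LR": "low-energy remote",
    "NAT": "native",
    "DER": "derivative",
}

# Assignments used for the 1jc4 test data.
TEST_DATA_TYPES = {"edge.sca": "IP", "peak.sca": "PK", "remote.sca": "HR"}


def unknown_types(assignments):
    """Return the files whose assigned type is not one of the known codes."""
    return sorted(name for name, code in assignments.items() if code not in TYPE_CODES)
```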
6) If a sequence file was input, verify that the file has been correctly read. You should see a table with a line for each chemically distinct chain that includes a chain number, a guess at how many of these chains are present in the asymmetric unit, and the beginning of the sequence. Below the table you will see the corresponding Matthews coefficient and an estimate of the solvent fraction in the unit cell. If you want to change the number of chains of each type in the asymmetric unit, click in the corresponding "No. per ASU" field, and adjust via the up and down arrows. If you change the number, then be sure to click on the "Recalculate Vm & Solvent Fraction" button to update those parameters. You can also use this as a tool to investigate likely NCS possibilities. Just be sure to return to the desired setting and click on "Recalculate Vm & Solvent Fraction" before continuing.
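For readers who want to see how such numbers arise, the sketch below uses the standard definitions: the Matthews coefficient Vm is the unit cell volume divided by the total protein mass in the cell, and the solvent fraction is commonly approximated as 1 - 1.23/Vm. This is illustrative Python, not BnP's code; the function names are hypothetical, and BnP's own estimate may use a different protein density, so its reported solvent fraction need not match this approximation.

```python
def matthews_coefficient(cell_volume, asu_per_cell, copies_per_asu, chain_mass):
    """Vm in cubic Angstroms per Dalton: cell volume over total protein mass in the cell."""
    return cell_volume / (asu_per_cell * copies_per_asu * chain_mass)


def solvent_fraction(vm, protein_term=1.23):
    """Matthews' approximation to the fraction of the cell not occupied by protein."""
    return 1.0 - protein_term / vm


def scan_copies(cell_volume, asu_per_cell, chain_mass, max_copies=8):
    """List (copies, Vm, solvent fraction) for each candidate number of copies."""
    rows = []
    for n in range(1, max_copies + 1):
        vm = matthews_coefficient(cell_volume, asu_per_cell, n, chain_mass)
        rows.append((n, vm, solvent_fraction(vm)))
    return rows
```

Scanning over the number of copies mirrors what the "No. per ASU" arrows do on screen: candidates that give a negative or implausibly small solvent fraction can be ruled out.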
TEST DATA: You should see the space group as P21, the experiment type as MAD, the heavy element types all as Se, the "Types" as IP for inflection point (edge.sca), PK for peak (peak.sca) and HR for high energy remote (remote.sca), one chain type with an estimate of 4 chains in the ASU (which is correct for 1JC4), a Matthews coefficient Vm of 2.01, and a solvent fraction of 0.36.
7) Once everything is set, click on the "Continue" button at the bottom of the window. If all necessary parameters were established, the window will disappear returning to the main BnP "General Information" screen. If a necessary parameter could not be determined, then a window will pop up telling you that the general information is incomplete. This just means that one or more items were either not supplied or couldn't be deduced from the supplied data files. Just click on the "OK" button to continue since you can (and MUST) supply the missing information explicitly on the next screen.
TEST DATA: Click on the "Continue" button.
8) You will now be back to the "General Information" screen, and you should check that ALL FIELDS have been appropriately filled in. Check the Title, Structure ID, space group, and cell dimensions, inserting information or correcting any if needed.
If a sequence file was input, in the "Native Asymmetric Unit Contents" section you should see the total number of residues in the asymmetric unit (including multiple copies if NCS is present), and the corresponding numbers of C, H, N, O and S atoms. If no sequence file was used, you MUST enter the TOTAL number of residues in the asymmetric unit in the optional "No. Residues" field, and hit return. This will estimate the number of C, H, N and O atoms for you. You should then manually add the expected number of S atoms based on what you do know about the sequence. You can add the number of S atoms by leaving a space following the last number in the field, entering S, and then entering the number of sulfur atoms (with no spaces between the S and the number).
You should see below a table containing one column for each data set input. You should check to see that all fields are correctly filled. It is crucial that ONE AND ONLY ONE data set have the "Data set type" field set to "native" or "MAD native" since this is the data set that will actually be phased. The wavelengths should all be supplied, but they are utilized only to determine whether or not Cu radiation was used. If a data set contains Bijvoet pairs but they are considered unreliable and should not be used, then the appropriate entry in the "Anomalous dispersion" row should be set from the pulldown menu in the corresponding data set column. The number of heavy atom/anomalous scatterers expected for each data set should be set in each "No. expected sites" field. For MAD data with Se as the anomalous scattering type and a sequence available, the entries should all be essentially correct by default (obtained by counting the number of Met residues in the sequence). However, if some other anomalous scattering element is used (including non-MAD cases), you will have to specify the "No. expected sites" explicitly since they can't be deduced from the sequence. You might also want to adjust the anomalous scattering factors f' and f" if you believe you have better numbers, but the defaults should lead to good, though not necessarily optimal, results for MAD experiments. For non-MAD cases, the f' and f" values MUST be explicitly entered unless the wavelength is 1.54 +/- 0.02 (i.e. Cu K alpha radiation).
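Two of the defaults described above are simple enough to check by hand. The Python sketch below is an illustration under stated assumptions, not BnP's implementation: the function names are hypothetical, every Met in the supplied sequence is assumed to count as one Se site, and the wavelength window is taken directly from the 1.54 +/- 0.02 rule quoted above.

```python
def expected_se_sites(chains, copies_per_asu):
    """Count Met residues over all chains, weighted by copies in the asymmetric unit."""
    return sum(seq.upper().count("M") * n for seq, n in zip(chains, copies_per_asu))


def is_cu_k_alpha(wavelength, target=1.54, tolerance=0.02):
    """True when the wavelength (in Angstroms) falls within the Cu K alpha window."""
    return abs(wavelength - target) <= tolerance
```

Under this counting, the 28 sites expected for the test structure with four copies in the asymmetric unit correspond to seven Met residues per chain.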
When all necessary parameters are defined, click on the "Auto Run" tab at the top of the screen.
TEST DATA: Click on the "Auto Run" tab at the top of the screen.
9) You should now see the "Auto Run" screen. In general, the defaults are all sufficient and one can just click on the "Submit job" button. There are, however, some options you may occasionally want to change.
In the "Structure Determination (SnB)" section, the default "Difference data set(s) to use" appears. Depending on the number and nature of the supplied data sets, there may be several choices. In the pulldown menu, a list of iso (isomorphous or dispersive) differences and ano (anomalous) differences appears for each appropriate data set as well as a combined "MAD ano/iso" option for MAD data sets. The user may select any of these choices of difference data to use FOR THE DETERMINATION OF HEAVY ATOM/ANOMALOUS SCATTERER SITES. For the "MAD ano/iso" case, the anomalous differences corresponding to the data set having the largest f" will be tried first, and if no substructure solution is obtained, then the data sets yielding the largest dispersive differences will be tried next. The high resolution cutoff, minimum E/sig(E) or signal-to-noise ratio, number of trials, and random seed can be modified if desired. Note that these values all pertain ONLY TO THE SUBSTRUCTURE DETERMINATION PROCESS, and NOT to the final protein phasing.
In the "Job Submission" area the defaults are also generally used, but they can be changed if desired. The default is to carry out complete parameter and protein phase refinement once the substructure and enantiomorph have been determined, but one can turn refinement off by selecting "No" in the "Refine substructure etc." field. In that case, the substructure parameters are held fixed at the values determined from direct methods, and the protein phases are computed only once. This will result in much faster execution, but with less than optimal results.
A "File ID" must be present which will be used as a filename prefix, and defaults to "auto". The "job type" defaults to "Quick", but it can be changed to "Thorough" if desired. In "Quick" mode, the map grid spacing is increased for all but the last Shake-and-Bake iteration, and half as many iterations are performed for each trial. This leads to much faster execution per trial, but may occasionally require that more trials be run to find a solution. In "Thorough" mode, the original SnB protocol is used.
Once all of the options are set as desired, click on the "Submit job" button to launch the auto structure determination process.
TEST DATA: Click on the "Submit job" button.
The remainder of the "Auto Run" screen is used to follow progress of the structure determination, to examine intermediate and final results, and to output data for (and in some cases, run) external programs for autobuilding, map fitting, or other data processing.
The "Review Results" section is split into three parts, with each pertaining to certain tasks in the structure determination process. When a task (job) is run, it will usually create an entry in one of the three areas, thereby telling the user which job is running and which jobs have completed. Entries without a "*" (asterisk) next to them correspond to completed jobs, while entries with a "*" are currently running. One can always examine (or make use of) files created by a job once it has completed, but in some cases intermediate results can also be examined while the job is still running. This section is automatically updated every 20 seconds, but an update can be forced by clicking on the "Update List" button to the right of the section. Progress on the structure determination proceeds from left to right through the table. Each of the three sections is now described, in the order that results will usually appear in them.
1) Data Preparation. This section provides information about the tasks used (i) to scale and create anomalous and isomorphous (dispersive) differences and (ii) to generate the normalized structure factors (E values) and structure invariants needed for direct methods. Results for any of the tasks listed in this section can be viewed by clicking once on the task name to select (highlight) it, and then clicking on the "View Results" button to the right of the section. For example, viewing the IP_iso task will reveal information, including R factors, for the isomorphous (dispersive) differences formed from the "inflection point" and "mad native (generally high energy remote)" data sets. The IP_ano results will show similar information for the "inflection point" anomalous differences. If the selected task also corresponds to the differences used for substructure determination, then two additional popup windows appear; one will show the results from E generation and the other will show the results from invariant generation.
TEST DATA: When the entry "PK_iso" appears without a "*" next to it, click to highlight it. Then click on the "View Results" button. Click on "OK" to close the window when you are done. Do the same for "PK_ano" and any other data sets you wish to examine.
2) Substructure/Phasing. This section provides information about the SnB job used to locate the heavy atom/anomalous scatterer substructure. If an entry appears in this field, the current results of a job can be checked even while it is still running (i.e. while it has a "*" next to it). However, it is a good idea to let the job run for a minute or so before you try checking it so that some results can accumulate. All of the buttons in the lower left panel under "Substructure Phasing" can be used to check progress and to assess the current results. NOTE! If a substructure solution is automatically detected, a small window will pop up indicating so, further processing of trial substructures will be terminated, and the automatic structure determination will proceed to the next step. One can use the tools below (from the "Substructure Phasing" panel) to examine the current results either after a solution has been detected or during the run to monitor progress.
The "View Histogram" button will bring up a histogram of Rmin (minimal function) values showing all trials (each starting from randomly positioned atoms) processed so far. Viable solutions will stand out at the left separated from non-solutions at the right.
The "View Sorted Trials" button will list trials in ascending order of Rmin value (i.e. the best solutions will occur at the beginning of the list.)
The "Trace Rmin" button will show a plot of the minimal function value (Rmin) during each iteration of the Shake-and-Bake algorithm for the trial that is currently deemed "best." Good solutions usually show an abrupt drop followed by a levelling off near a low Rmin value.
The "View Coordinates" button will bring up a list of fractional coordinates and peak heights corresponding to the largest peaks in the map produced from the trial phases currently deemed "best."
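The idea behind the histogram and the sorted trial list can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, any Rmin values fed to it are the user's own, and the "largest gap" test shown here is an assumption used for illustration, not the criterion BnP uses to detect a solution automatically.

```python
def sort_trials(rmin_by_trial):
    """Return (trial, Rmin) pairs in ascending order of Rmin, best first."""
    return sorted(rmin_by_trial.items(), key=lambda item: item[1])


def largest_gap(rmin_by_trial):
    """Return (gap, threshold) for the widest gap between adjacent sorted Rmin values."""
    values = sorted(rmin_by_trial.values())
    if len(values) < 2:
        return 0.0, None
    # Compare each sorted value with its neighbour and keep the widest separation.
    gap, low, high = max((b - a, a, b) for a, b in zip(values, values[1:]))
    return gap, (low + high) / 2.0
```

A wide gap with only a few trials below the threshold is the numerical counterpart of solutions standing out at the left of the histogram.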
TEST DATA: When the "PK_ano/auto" entry appears in the "Substructure Phasing" section and has been there for a minute or so, try clicking on the buttons in the "Substructure Phasing" panel in the lower left part of the screen to see what each shows. After a while, when the "Probable solution has been found" window pops up, click "OK" to close it, and try all the buttons from the lower left panel again to see the results for the final (best) solution.
3) Protein Phasing. This section deals with validating the substructure sites, determining the proper enantiomorph, refining parameters (scaling, substructure, and lack-of-closure), computing protein phases, and improving these phases by solvent flattening/negative density truncation. In general, one should wait for each task in this section to finish before examining the results. Results should be examined in the order indicated below, once the appropriate entry appears in the table without a "*" next to it. The results are generally examined by using the tools in the "Protein Phasing" panel in the middle at the bottom of the screen.
3a) Validate. The validate task refines the occupancy parameters of the highest 1.5N peaks (where N is the number of expected sites) on the SnB direct-methods map. Occupancies are refined against the data with the largest anomalous differences or against the largest isomorphous differences if no anomalous differences are present. Once the "validate" entry appears without a "*" next to it, click on the "View Site Validation Summary" button in the "Protein Phasing" panel. A window will pop up listing the site number (position in original peak list), a "Select" flag, and the occupancy after a very quick refinement process. Sites are automatically accepted as correct if they refined to appreciable occupancies (> 0.2 if refined against ano data; > 0.35 if refined against iso data). Accepted sites appear with a "Yes" in the "select" column. Thus, counting up the number of "Yes" entries reveals the number of substructure sites reliably determined.
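The acceptance rule described above amounts to a simple threshold test. The following Python sketch is illustrative only: the function names are hypothetical, the occupancies passed in would come from the user's own validation summary, and rounding 1.5N up to a whole number of peaks is an assumption.

```python
import math

# Occupancy cutoffs quoted in the text above.
CUTOFFS = {"ano": 0.2, "iso": 0.35}


def peaks_to_refine(n_expected):
    """Number of top peaks passed to validation: 1.5N, rounded up (an assumption)."""
    return math.ceil(1.5 * n_expected)


def select_sites(occupancies, difference_type):
    """Return (site number, occupancy, selected) rows using the quoted cutoff."""
    cutoff = CUTOFFS[difference_type]
    return [(i, occ, occ > cutoff) for i, occ in enumerate(occupancies, start=1)]
```

Counting the rows whose last element is True corresponds to counting the "Yes" entries in the "Select" column.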
TEST DATA: When the "validate" entry appears in the protein phasing table without a "*" next to it, click on the "View Site Validation Summary" button in the "Protein Phasing" panel. You should see about 24 sites that refined to high occupancy and were selected for use in subsequent protein phasing.
3b) Test_hand. The test_hand task quickly computes protein maps (using unrefined parameters) based on both the set of validated sites and on the set of sites enantiomorphous to the validated set. Various procedures are then used to determine the correct enantiomorph, depending on the data sets used in phasing. (i) If both anomalous and isomorphous/dispersive differences are used, a measure of the density variation in the protein and solvent regions of each map is computed, and the map showing the greater difference in density variation (i.e. greater contrast) between the two regions is deemed to have the correct hand. (ii) If only anomalous differences are used, then solvent flattening/negative density truncation is carried out on both maps, and final Figures of Merit, Correlation Coefficients, and R factors are used to determine the correct hand. (iii) If only isomorphous/dispersive differences are used, the hand cannot be determined at this stage, and full processing is carried out on both hands independently. If any anomalous differences are used, once BOTH the "test_hand" and "test_hand_en" entries appear without a "*" next to them, click on the "View Hand Determination Summary" button in the "Protein Phasing" panel. A window will pop up containing the results of the "hand" (enantiomorph) determination procedure. The table will indicate which hand is correct, either the original (substructure coordinates as determined by direct methods), or alternate (coordinate signs opposite those determined by direct methods). If the hand was deemed to be reliably determined, the correct choice will be used in all subsequent calculations. If the hand could not be reliably determined, then both possible choices will be fully processed independently, and the user can decide which is correct by examining the final maps.
Note that if only isomorphous/dispersive differences are used, then no "test_hand" or "test_hand_en" entries will appear, but both possible choices will be tried independently with full processing on each.
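The relationship between the "original" and "alternate" site sets can be made concrete. The Python sketch below simply changes the sign of every fractional coordinate, as described above; the function name is hypothetical, and the sketch does not address space groups in which the space group itself must also be changed to its enantiomorphic partner when the hand is switched.

```python
def invert_sites(sites):
    """Return the alternate-hand sites: each fractional (x, y, z) mapped to (-x, -y, -z)."""
    # The modulo keeps each coordinate in the range [0, 1).
    return [tuple((-c) % 1.0 for c in site) for site in sites]
```

Applying the function twice returns the original coordinates, which is a quick sanity check on any hand-switching step.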
Note! Although not usually necessary, one can confirm the automatic hand assignment (if both isomorphous and anomalous differences were used in the phasing) by viewing projections over a few sections of the electron density maps computed assuming each hand possibility and comparing them. Just click on the "test_hand" table entry to select (highlight) it, and then click on the "View Electron Density Map" button in the "Protein Phasing" menu. After a few seconds an electron density map will be displayed assuming the original hand. From the on-screen menu click (left mouse) on the "Add Next Section" button 4 or 5 times to accumulate a projection. For the correct hand there should be discernible contrast between some high density regions occupied by protein and low density regions occupied by solvent. Look at the map for a few seconds, then use the menu to "exit", and type "n" (without the quotes) followed by a return. Now select the "test_hand_en" (for enantiomorph or alternate hand) entry from the table, and repeat the process to see the map corresponding to the opposite hand. These maps will be noisy since no refinement has been done, but they should still be good enough to show which map has better protein/solvent contrast.
TEST DATA: When the "test_hand" and "test_hand_en" entries appear without "*"'s next to them, click on the "View Hand Determination Summary" button from the "Protein Phasing" panel. Examine the results that will be displayed to see what hand was chosen.
3c) Auto or Auto_en. Once the automatic hand determination procedure completes, protein phasing (usually with refinement) begins starting with the validated substructure sites and the correct hand. This procedure can take considerable time depending on the substructure size and number of reflections. Thus, its progress can be checked at any time including when the job is still running. If the default was used, the job entry will either be "auto" or "auto_en", depending on whether the "original" or "alternate" hand (enantiomorph) was deemed to be correct, respectively. In cases where the hand could not be reliably determined or hand determination is deferred until final phase sets are available for both enantiomorphs, both entries will appear. Two log files are available to follow the phasing/refinement progress. First click on the "auto" or "auto_en" entry in the "Protein Phasing" table section to select (highlight) it, depending on which results are to be examined. Then, click on either the "View Refinement Summary" button in the "Protein Phasing" panel (to see a condensed version of the log file), or on the "View Results" button in the menu to the right of the table (to see the complete log file). Five passes of refinement are automatically carried out, thus checking the pass number will indicate how far along the process is. One can also click on the "Graph Phasing Statistics" button in the "Protein Phasing" panel to quickly see the current results. This will create a popup window with a menu allowing different types of graphs to be visualized, based on the current phases at the time the button was pressed. Be sure to do this again after the job completes to see graphs representing the final statistics. When the "auto" and/or "auto_en" jobs complete (no "*" by the corresponding entry), then the initial, refined, protein phases are available.
TEST DATA: After the "auto" or "auto_en" entry appears in the "Protein Phasing" section of the table and the job has been running for a couple of minutes, click on it to select (highlight) it, and then click on the "View Refinement Summary" button in the "Protein Phasing" panel to see the condensed log file. Also click on the "Graph Phasing Statistics" button and select from the displayed menu to see plots of various phasing statistics. Close the windows and repeat the process periodically (or just wait) until the job is finished, and view the final statistics.
3d) Auto_sf or Auto_en_sf. After the refinement completes, the solvent flattening/negative density truncation procedure is automatically invoked. This will produce an entry of "auto_sf" or "auto_en_sf" in the "Protein Phasing" section of the table, again depending on which hand was deemed correct. As before, if the hand was not reliably determined, then both entries will be present. The solvent flattening/negative density truncation process is rather fast, and the results can't be used until the job completes. Once the job finishes (no "*" next to its entry(s)), the auto structure determination process is complete and no further processing takes place by default. Solvent flattening statistics can be obtained from the log file, which can be viewed by selecting (highlighting) the appropriate (auto_sf or auto_en_sf) entry, and then clicking on the "View Results" button in the menu to the right of the table.
The overall packing scheme and map quality can be easily visualized by selecting the appropriate entry (either auto_sf or auto_en_sf) from the table, and then clicking on the "View Electron Density Map" button in the "Protein Phasing" panel. After a few seconds, a contoured electron density map covering multiple unit cells will appear. You can scroll through the map using the displayed menu buttons (left mouse button), or accumulate projections by clicking on the "Add Next Section" button a few times. You can quit from the map viewing program by clicking on the "Exit" menu item, and then typing "n" (without the quotes) at the prompt. Note that you can also view the original, non-solvent flattened map in the same fashion by selecting the "auto" or "auto_en" table entries prior to clicking on the "View Electron Density Map" button, thus observing the effects of solvent flattening/negative density truncation by comparing the two maps.
TEST DATA: Wait for the "auto_sf" or "auto_en_sf" entry to appear in the "Protein Phasing" section of the table without a "*" next to it. Then click on the entry to select (highlight) it. Once it's selected, click on the "View Electron Density Map" button in the "Protein Phasing" panel. In a few seconds a contoured map will appear. When it does, click on the "Add Next Section" button a few times to accumulate a projection. Note the protein/solvent contrast and packing pattern. Then click on "Exit" from the menu, and type "n" (without the quotes) in response to the prompt. Now select "auto" or "auto_en" from the table and repeat the map viewing process to see the map prior to solvent flattening. The impact of solvent flattening should be evident. Go back and select the solvent flattened entry again (the one with _sf appended) from the table, and then click on the "View Results" button to the right of the table to view the solvent flattening statistics. When finished, click on "OK" to close the window.
At this point the auto procedure has completed: ALL RESULTS ARE AVAILABLE
In this case there are various options, depending on what one wants to do with the results and what external programs are available (and desired to be used). All of these options are invoked by first choosing the set of phases and coordinates to be exported. To do this one must first select (highlight) an entry in the "Protein Phasing" section of the table. One can select either the refined but NON-solvent flattened phases ("auto" or "auto_en" entries), or the refined AND solvent flattened phases ("auto_sf" or "auto_en_sf" entries). Each of these entries will have associated with it a coordinate file and a phase file. Once the desired entry has been selected, simply click on the desired option from the "Export Files/Access External Software" panel at the lower right side of the screen. The options available are now described.
1) "Export Files to O Format" - After selecting the "auto_sf" or "auto_en_sf" table entry, click on this button. A window will pop up displaying the parameters to be used when creating a map and skeleton file for use in "O". One can change these parameters if desired, but the defaults are nearly always appropriate, so one normally just clicks on the "Continue" button. When the window closes, five files will have been created for use with "O", and they will reside in the directory from which BnP was launched. The files will be as follows:
If "auto_en_sf" was selected rather than "auto_sf", then in the above file names each "auto_sf" will be replaced with "auto_en_sf".
USING THE FILES WITH "O": If "O" has been properly installed, then from a directory containing the files above, one can just start up "O" and keep pressing return until the graphics window appears. Then from the controlling terminal prompt, just type "@auto_sf.omac" or "@auto_en_sf.omac" (without the quotes), whichever is appropriate. After a few seconds a contoured map, possible main chain skeleton, and spheres representing the substructure sites will appear. At this point, one really needs to determine only whether the map has any protein features and whether the hand is correct. It is a good idea to toggle off the contours and look only at the skeleton. Examine the skeleton to check for any beta sheet regions and/or helices. Try to find a helical segment, center on it, and adjust the slab for good depth cueing. Then verify that the helix is right-handed. If any sheets and/or helices are found, the phasing was essentially correct, and the map should serve as a good starting point for automated chain tracing/model building. If LEFT-HANDED helices are found and anomalous scattering data was used in the phasing, this DOES NOT mean that the hand determination algorithm failed. In fact it did the correct thing, but THE INPUT DATA WAS INDEXED IN A LEFT-HANDED COORDINATE SYSTEM! There are three ways to remedy this: (i) go back to the data reduction stage and reprocess the data, (ii) edit the input files to interchange each F(hkl) and F(-h,-k,-l) value along with their sigmas, or (iii) edit the input files to change the signs of h, k, and l for each reflection. Once the files have been corrected, rerunning BnP should give the proper results.
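Remedy (iii) above can be scripted. The following is only a minimal sketch, and it ASSUMES a free-format reflection file in which each record starts with whitespace-separated integer h, k, l values followed by the data columns; that layout, and the function names, are illustrative assumptions and not a description of any particular BnP input format. Lines that do not start with three integers (headers, comments) are passed through unchanged.

```python
def invert_hkl_line(line: str) -> str:
    """Negate the first three fields (h, k, l) of one reflection record.

    ASSUMPTION: whitespace-separated, free-format records beginning with
    integer h, k, l. Column alignment is NOT preserved.
    """
    fields = line.split()
    if len(fields) < 3:
        return line.rstrip("\n")  # too short to be a reflection; keep as-is
    try:
        h, k, l = (int(f) for f in fields[:3])
    except ValueError:
        return line.rstrip("\n")  # header or comment line; keep as-is
    return " ".join([str(-h), str(-k), str(-l)] + fields[3:])


def invert_hkl_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path with the sign of every h, k, l changed."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(invert_hkl_line(line) + "\n")
```

Because the output is rejoined with single spaces, this sketch is not suitable for fixed-width formats, which would need column-aware handling. Always write to a new file and keep the original.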
As a final quality check (for selenomethionine MAD structures), center the display on one of the spheres and toggle the map back on. You should see the sphere in density at the end of a long side chain, appropriate for the SD position in methionine.
TEST DATA: Click to highlight "auto_sf" or "auto_en_sf" in the "Protein Phasing" table section. Click on the "Export Files to O Format" button. From the popup window, click "Continue". Then from the terminal window that launched BnP, run "O" in the normal manner. Keep pressing returns until the graphical display window pops up. From the terminal window, type "@auto_sf.omac" or "@auto_en_sf.omac", whichever is appropriate. When contoured density, main chain skeleton, and substructure sites appear, toggle off the contours. You should see lots of beta structure in the skeleton. Find a helical segment, center on it, and verify that the helix is right handed. Then center on one of the spheres and toggle the contours back on. You should see the sphere in electron density near the end of a long side chain where the SD atom in methionine should be. Exit "O".
2) "Export Heavy Atoms (PDB Format)" - After selecting the "auto" or "auto_en" entries from the "Protein Phasing" part of the table, click on this button. Follow the instructions in the popup window to create a PDB file containing the substructure sites. Normally the defaults suffice so one can simply click on the "Continue" button. The file is written to the directory from which BnP was originally launched.
TEST DATA: Select (highlight) the "auto" or "auto_en" entry from the "Protein Phasing" section of the table and then click on the "Export Heavy Atoms (PDB Format)" button. Click on the "Continue" button in the popup window.
3) "Export Phases" - Select the "auto_sf" or "auto_en_sf" entry if solvent flattened phases are desired, or the "auto" or "auto_en" entry if NON-solvent-flattened phases are desired. Then click on this button. Follow the instructions in the popup window to create the desired phase file. Use the pulldown menu to select the output file format to be MTZ, CNS, or FREE-FORMAT. After selecting the format, click on the "Help" button for an explanation of that format. When the format and file name are set, click on the "Continue" button to create the file. The file is written to the directory from which BnP was originally launched.
TEST DATA: Select (highlight) the "auto_sf" or "auto_en_sf" entry from the "Protein Phasing" section of the table and then click on the "Export Phases" button. Choose "CNS format" from the pulldown menu. Then click on the "Continue" button in the popup window.
4) "Run ARPwARP" - Select the "auto_sf" or "auto_en_sf" entry from the "Protein Phasing" section of the table. Then click on this button. From the popup window, verify that the information supplied is OK (make sure a sequence file is specified; if a sequence file was given in the "IMPORT" option earlier, then the corresponding ".pir" sequence file needed will have automatically been created, and should appear as the default). Usually all defaults suffice. Then click on the "Create Input & Run ARPwARP" button. This will create appropriate files and scripts for ARPwARP to automatically build and refine a model. When the files are ready a new window will pop up asking if you want to submit the ARPwARP job now. If so, click on the "Yes" button and close the window. If not, click on the "No" button. In either case, a subdirectory "ARPwARP" will be created under the directory originally used to launch BnP, and the files needed for ARPwARP to run will be placed there. If the job was submitted, results will start accumulating there immediately. If the job was NOT submitted, then at any future time one can simply go to that directory and submit the "arpwarp_launch.sh" file with a "sh arpwarp_launch.sh" or "csh arpwarp_launch.sh" command, and the results will then start accumulating in that directory.
TEST DATA: Select (highlight) the "auto_sf" or "auto_en_sf" entry in the "Protein Phasing" section of the table. Then click on the "Run ARPwARP" button. In the popup window, click on the "Create Input & Run ARPwARP" button. Click on "Yes" in the small popup window to submit the ARPwARP job. Results will start accumulating in the ARPwARP subdirectory present in the directory from which BnP was originally launched.
5) "Run RESOLVE" - Select the "auto_sf" or "auto_en_sf" entry if solvent flattened phases are desired, or the "auto" or "auto_en" entry if NON-solvent flattened phases are desired. Then click on this button. In the popup window, verify that the options are all set appropriately (make sure a sequence file is specified; it will be if one was given in the "IMPORT" option earlier). Usually the defaults suffice for a very rapid model build against the input electron density map if solvent flattened phases are used, but the results here are often far from optimal. The best results are obtained (at the expense of MUCH, MUCH longer cpu time!) if NON-solvent flattened phases are used and "Iterative Model Building, Density Modification and Refinement" is selected instead of the default. Note that for either of the selectable "RESOLVE job types" involving density modification, one should select a NON-solvent-flattened phase set as input since RESOLVE will do its own solvent flattening. For the "Model Building Only" job type, RESOLVE will NOT do any solvent flattening, so it is best to use a solvent-flattened phase set as input to provide the best possible map for fitting. If the user ignores this strategy, a warning message will pop up.
Once the options are set, click on the "Create Input & Run RESOLVE" button. This will create appropriate files and scripts for RESOLVE to automatically build a model. When the files are ready, a new window will pop up asking if you want to submit the RESOLVE job now. If so, click on the "Yes" button and close the window. If not, click on the "No" button. In either case, a subdirectory "RESOLVE" will be created under the directory originally used to launch BnP, and the files needed for RESOLVE to run will be placed there. If the job was submitted, results will start accumulating there immediately. If the job was NOT submitted, then at any future time one can simply go to that directory and submit the "auto_resolve.csh" file with a "csh auto_resolve.csh" command, and the results will then start accumulating in that directory.
TEST DATA: Select (highlight) the "auto_sf" or "auto_en_sf" entry in the "Protein Phasing" section of the table. Then click on the "Run RESOLVE" button. In the popup window, click on the "Create Input & Run RESOLVE" button. Click on "Yes" in the small popup window to submit the RESOLVE job. Results will start accumulating in the RESOLVE subdirectory present in the directory from which BnP was originally launched. This will do a quick and dirty build against the solvent-flattened map.
As an alternative, one can select the "auto" or "auto_en" entry, click on the "Run RESOLVE" button, change the default "RESOLVE job type" to "Iterative Model Building, Density Modification and Refinement", click the "Create Input & Run RESOLVE" button and proceed as before. This job will take much, much longer, but it will build a much better, and more complete, model by iteratively modifying the map as the model improves and using the new maps to further extend and improve the model.
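If "No" was chosen when asked to submit an ARPwARP or RESOLVE job, the deferred launch described in options 4) and 5) can be wrapped in a small helper. This is a sketch only: the subdirectory names, script names, and shells are the ones quoted in the text above, while the helper names (launch_plan, launch) are hypothetical and not part of BnP. It assumes that "sh" and "csh" are available on the system.

```python
import subprocess
from pathlib import Path

# Subdirectory and launch command for each job type, as quoted in the text.
JOBS = {
    "ARPwARP": ("ARPwARP", ["sh", "arpwarp_launch.sh"]),
    "RESOLVE": ("RESOLVE", ["csh", "auto_resolve.csh"]),
}


def launch_plan(job, launch_dir="."):
    """Return (working directory, command) for a deferred job; runs nothing."""
    subdir, argv = JOBS[job]
    return Path(launch_dir) / subdir, list(argv)


def launch(job, launch_dir="."):
    """Start the saved launch script from inside its own subdirectory."""
    workdir, argv = launch_plan(job, launch_dir)
    script = workdir / argv[1]
    if not script.is_file():
        raise FileNotFoundError(f"{script} not found; create the input from BnP first")
    # Popen returns immediately, so the job keeps running in the background
    # and results accumulate in workdir.
    return subprocess.Popen(argv, cwd=workdir)
```

For example, launch("RESOLVE", "/path/to/bnp/run") would be the equivalent of changing to the RESOLVE subdirectory and typing "csh auto_resolve.csh"; the path shown is a placeholder for the directory from which BnP was originally launched.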
6) "Check RESOLVE/ARPwARP Status" - Clicking this button informs the user whether any ARPwARP, RESOLVE or REFMAC jobs are running on the computer.
When processing within BnP is completed (and possibly once RESOLVE or ARPwARP jobs have been started from the BnP screen), one can exit BnP by clicking on the "File" menu in the upper left screen corner. From the menu, click on the "Save" option to save the current status in a "configuration" file. In auto mode, by default the file will be called "backup.config", but the user can save it under any name by using the "save as" menu item.
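Since the default "backup.config" name is reused each time the configuration is saved in auto mode, it can be convenient to keep a dated copy outside of BnP as well. The sketch below simply copies the saved file under a timestamped name in the same directory. The helper name snapshot_config and the naming scheme are hypothetical, and it is an assumption that a plain copy of a configuration file can be opened by BnP in the same way as one written with the "save as" menu item.

```python
import shutil
import time
from pathlib import Path


def snapshot_config(launch_dir=".", name="backup.config", stamp=None):
    """Copy a saved configuration file to '<stem>-<stamp>.config'.

    ASSUMPTION: the configuration was already saved from the BnP File menu
    and sits in the directory from which BnP was launched.
    """
    src = Path(launch_dir) / name
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; save the configuration first")
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.stem}-{stamp}{src.suffix}")
    shutil.copy2(src, dst)  # copy2 also preserves the modification time
    return dst
```

Calling snapshot_config() from the launch directory would leave "backup.config" untouched and add a copy such as "backup-<date>-<time>.config" next to it.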
In the future, to continue where you left off, just go to the directory from which BnP was originally launched, type "BnP" (without the quotes), and the "About BnP" screen will appear. Then click on the "File" menu in the upper left corner of the screen, and select the "Open" option. From the new popup window, click to highlight the appropriate "config" file and then click on the "Open" button. This will restore the information that was present in all the fields of the BnP GUI screens at the time the configuration file was last saved. One can then click on the "General Information", "Auto Run", or other tabs to get to the desired screen where further processing can take place. Existing configuration files can also be opened by choosing option A at the top of the "General Information" screen.