CHARMM Version 22

CONTENTS:

  Commands       (doc/command.doc)
  Usage          (doc/usage.doc)
  Charmm         (doc/charmm.doc)
* Analysis:      (doc/analys.doc)
* Block:         (doc/block.doc)
* Cons:          (doc/cons.doc)
* Coordinates:   (doc/corman.doc)
* Correl:        (doc/correl.doc)
* Crystl:        (doc/crystl.doc)
* Dynamics:      (doc/dynamc.doc)
* Energy:        (doc/energy.doc)
* Ewald:         (doc/ewald.doc)
* Graphx:        (doc/graphx.doc)
* H-bond:        (doc/hbonds.doc)
* H-build:       (doc/hbuild.doc)
* Images:        (doc/images.doc)
* Internal:      (doc/intcor.doc)
* I/O :          (doc/io.doc)
* Minimiz:       (doc/minimiz.doc)
* Miscellany:    (doc/miscom.doc)
* Molvib:        (doc/molvib.doc)
* Monitor:       (doc/monitor.doc)
* Non-bonded:    (doc/nbonds.doc)
* Parameters:    (doc/parmfile.doc)
  Pdetail        (doc/pdetail.doc)
* Pert:          (doc/pert.doc)
  Perturb:       (doc/perturb.doc)
  Piplem:        (doc/piplem.doc)
* Pressure:      (doc/pressure.doc)
  Rtop           (doc/rtop.doc)
* Sbound:        (doc/sbound.doc)
* Scalar:        (doc/scalar.doc)
* Select:        (doc/select.doc)
* Structure:     (doc/struct.doc)
  Support        (doc/support.doc)
* Test:          (doc/test.doc)
* Travel:        (doc/travel.doc)
* Umbrella:      (doc/umbrel.doc)
* Vibration:     (doc/vibran.doc)
  Recentmods     (doc/recentmods.doc)
  Install        (doc/install.doc)
  Developer      (doc/developer.doc)
  Testcase       (doc/testcase.doc)

C DEC/CMS REPLACEMENT HISTORY, Element COMMANDS.DOC
C *5     5-FEB-1992 23:37:59 WON "info directive modification"
C *4     7-DEC-1991 04:21:24 WON "To include EWALD and PATH documentation"
C *3    12-SEP-1991 19:13:28 WON "put analysis, pressure, molvib and travel in the list"
C *2     6-MAY-1991 16:41:42 WON "Info directive fixed"
C *1     8-APR-1990 19:49:47 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element COMMANDS.DOC

File: Commands, Node: Top, Up: (doc/charmm.doc), Previous: (doc/developer.doc), Next: (doc/install.doc)

                        CHARMM commands

The commands available for use in CHARMM are classified in several
groups.

* Menu:

* Analysis:    (doc/analys.doc).    Analysis facility
* Block:       (doc/block.doc).     BLOCK free energy simulation
* Cons:        (doc/cons.doc).      Harmonic and other constraints or SHAKE.
* Coordinates: (doc/corman.doc).    Commands to manipulate coordinates
* Correl:      (doc/correl.doc).    Time series and correlation functions.
* Crystl:      (doc/crystl.doc).    Crystal facility
* Dynamics:    (doc/dynamc.doc).    Dynamics commands
* Energy:      (doc/energy.doc).    Energy evaluation
* Ewald:       (doc/ewald.doc).     Ewald summation
* Graphx:      (doc/graphx.doc).    The graphics subsection for workstations
* H-bond:      (doc/hbonds.doc).    Generation of hydrogen bonds
* H-build:     (doc/hbuild.doc).    Construction of hydrogen positions
* Images:      (doc/images.doc).    Use of periodic or crystal environment.
* Internal:    (doc/intcor.doc).    Manipulation of internal coordinates
* I/O :        (doc/io.doc).        I/O of data structures and files
* Minimiz:     (doc/minimiz.doc).   Description of the minimization methods
* Miscellany:  (doc/miscom.doc).    Miscellaneous commands
* Molvib:      (doc/molvib.doc).    Molecular vibrational analysis facility
* Non-bonded:  (doc/nbonds.doc).    Generation of the non-bonded interaction
* Parameters:  (doc/parmfile.doc).  CHARMM energy parameters
* Path:        (doc/path.doc).      Reaction path calculations
* Perturb:     (doc/pert.doc).      Free energy perturbation simulations.
* Pressure:    (doc/pressure.doc).  Pressure calculation and usage.
* Sbound:      (doc/sbound.doc).    Stochastic boundary
* Scalar:      (doc/scalar.doc).    Scalar command for atom properties
* Select:      (doc/select.doc).    Use of the atom selection facility
* Structure:   (doc/struct.doc).    Structure manipulation (PSF generation)
* Test:        (doc/test.doc).      Commands to test various things.
* Topology:    (doc/rtop.doc).      Residue Topology File
* Travel:      (doc/travel.doc).    Reaction coordinate refinement command
* TSM:         (doc/perturb.doc).   Thermodynamic Simulation Method
* Umbrella:    (doc/umbrel.doc).    Umbrella Sampling
* Vibration:   (doc/vibran.doc).    Vibrational analysis facility

C DEC/CMS REPLACEMENT HISTORY, Element USAGE.DOC
C *4     5-FEB-1992 23:39:41 WON "info directive modification"
C *3    10-MAY-1991 13:38:39 WON "Info directive fix"
C *2     6-MAY-1991 17:55:54 WON "CHARMM 22.0.b Usage"
C *1     8-APR-1990 19:51:02 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element USAGE.DOC

File: Usage, Node: Top, Up: (doc/charmm.doc), Previous: (doc/install.doc), Next: (doc/support.doc)

                        How to use CHARMM

The user of CHARMM controls its execution by executing commands
sequentially from a command file or interactively. In general, the
ordering of commands is limited only by the data required by the
command. For example, the energy cannot be calculated unless the
arrays holding the coordinates, the parameters, etc., have already
been filled.

This section deals with overall usage, as opposed to the detailed
description of any given command. This is a good place to start when
first learning CHARMM.

* Menu:

* Meta-Syntax::         Describing the Syntax of Commands
* Command Syntax::      Rules for composing command input files.
* Run Control::         Ways to modify control flow and stream switching.
* I/O Units::           Correspondence between files and unit numbers
                        used by CHARMM.
* AKMA::                Units of Measurement used in CHARMM
* Data Structures::     Data Structures used by CHARMM
* Standard Files::      Descriptions of parameters, topologies, and
                        coordinates available.
* Examples::            Sample runs
* Interface::           How to make your own private version of CHARMM
* Syntactic Glossary::  Glossary of syntactic terms
* Glossary::            Glossary of non-syntactic terms.

File: Usage, Node: Meta-Syntax, Up: Top, Next: Command Syntax, Previous: Top

        Rules for Describing the Syntax (The Meta-Syntax)

The syntax of commands is described using the following rules:

Capitalized words are keywords that must be specified as is. However,
if the word is partially capitalized, it may be abbreviated to the
capitalized part.

Lower case words are to be replaced by a corresponding data entry.

The symbol "::=" means "has the following syntactic form:".

Anything enclosed in square brackets, "[]", is optional. If several
things are stacked in square brackets, one may choose one optionally.

Anything enclosed in curly brackets, "{}", specifies that a selection
must be made of the choices stacked vertically inside.

The syntactic entities which appear as an argument to "repeat" may be
repeated any number of times (including zero).

Defaults for optional parameters may be enclosed in apostrophes and
placed under the entity they stand for. However, defaults are not
specified in this manner if the rules for the default are complex.

The syntactic glossary, see *note glossary: Syntactic Glossary,
contains further syntactic entities which are used in the command
descriptions.

Finally, the options and operands in each command can usually be
specified in any order unless otherwise noted.

File: Usage, Node: Command Syntax, Up: Top, Next: Run Control, Previous: Meta-Syntax

                Command language rules and lore

A CHARMM run is controlled by a command file (or files). This section
of the documentation describes the basic rules for the command file.
Details of command level run control are described in the next node.

A command file for CHARMM should begin with a specification of the
title of the run. (See the syntactic glossary, *note syn: syntactic
glossary, for the syntax of a title.) Then, any number of commands may
be specified. Each command consists of a command line possibly
followed by other data.

The command line is scanned free field. A command line may be longer
than one line in the file; to do this, place a hyphen at the end of
each line that is to be continued on the next line. Comments may be
placed on a command line by preceding them with an exclamation point.
All lower case characters are converted to upper case. This format is
identical to that used by the VAX command language interpreter.
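
The command line rules above (hyphen continuation, "!" comments, and
conversion to upper case) can be illustrated with a short Python
sketch. This is not CHARMM's own reader; the function name preprocess
is hypothetical, and the order in which the rules are applied (comment
removal first, then the continuation check) is an assumption made for
illustration.

```python
def preprocess(lines):
    """Join hyphen-continued lines, drop "!" comments, and upper-case the result.

    Illustrative only: the order of these steps is an assumption, not
    something the CHARMM documentation specifies.
    """
    commands = []
    pending = ""
    for raw in lines:
        # Treat everything from the first "!" onward as a comment.
        text = raw.split("!", 1)[0].rstrip()
        if text.endswith("-"):
            # A trailing hyphen continues the command on the next line.
            pending += text[:-1]
            continue
        # Free-field scanning: runs of blanks collapse to a single blank.
        full = " ".join((pending + text).split())
        pending = ""
        if full:
            commands.append(full.upper())
    return commands


sample = [
    "incr 1 -                 ! continued on the next line",
    "     by 30.0",
    "if 1 lt 170. goto loop   ! a trailing comment",
]
for command in preprocess(sample):
    print(command)
```

A trailing period, as in "-180.", is not mistaken for a continuation
mark, because only a hyphen in the last non-blank position before any
comment is treated as one.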

In addition, blank lines are permitted to separate blocks of commands
for increased readability.

The first word of every command line specifies the command. Generally,
required operands of a command must follow in order. On the other
hand, options may generally be specified in any order. Further, any
number is always preceded by a key word, so numeric operands can be
placed in arbitrary order.

The command line is scanned in units of words and delimited strings.
A word is defined by a sequence of non-blank characters. A delimited
string consists of a keyword followed by a string of characters of
variable length followed by a delimiter string. One example of where a
delimiter string is used is in atom selection, where the syntax is:
SELE ...... END. Note that the "END" is required and delimits the atom
selection.

Abbreviations are permitted in various contexts. The first word may be
abbreviated to four characters, and numerous options and operands may
also be abbreviated to four characters. However, some key words which
are used to mark numbers may not be abbreviated. See the processing
for individual commands to see what can and cannot be abbreviated.

Many of the various options and numeric values are maintained from one
invocation of a command to the next. Once a value is specified, it is
maintained until it is changed in any command. Therefore, if CUTNB is
specified in a NBON command, that value will be used in the DYNA
command unless it is changed therein.

Usually, when a free field command line is read in, it is echoed onto
standard output. Each such echo is preceded by a short marker, e.g.
"CHARMM>", which identifies the line of input as well as the command
processor which is interpreting it.

In general, as each element of the command is interpreted, it is
deleted from the command line. When command processing is finished, a
check is made to see that nothing is left over. The presence of
extraneous junk indicates that something was mistyped.
For some commands, such as DYNAmics, where a mistake may be costly,
extraneous characters result in a fatal error.

File: Usage, Node: Run Control, Up: Top, Next: I/O Units, Previous: Command Syntax

                Controlling a CHARMM Run

    IF command-parameter test-spec comparison-string command-spec
    GOTO label-string
    LABEL label-string
    STREAM [UNIT integer] [file-specification]
    RETURN
    SET command-parameter string
    INCRement command-parameter [BY real]
    DECRement command-parameter [BY real]

This node describes commands that are used to modify the usual
sequential interpretation of commands from the command file. Three
methods are available to accomplish this:

    IF tests to conditionally execute a single command
    GOTO and LABEL transfers within a file
    STREAM and RETURN transfers to different command files.

In addition, commands can be modified by the use of command
parameters. The command line reader scans input lines for parameters
(specified by @n, where n is an alphanumeric character) and will
substitute the appropriate parameter string.

Command parameters are defined using the SET command to set one of the
36 command parameters, and their values (if numeric) can be modified
by the INCRement command, which decodes the parameter string, does
real arithmetic, and encodes the result. The command parameters are
identified by alphanumeric characters (0-9, A(a)-Z(z); not
case-sensitive).

IF compares the string in the specified parameter to the
comparison-string using the test-spec (GT GE EQ NE LE LT). If the
comparison is true, the rest of the command line is executed
(otherwise it is ignored). The EQ and NE comparisons are done as
string comparisons, but the others require decoding of the two strings
and comparison by real arithmetic. The command-spec can be any valid
command line (including another IF test or a GOTO or STREAM
specification).

GOTO causes the current command file to be rewound and searched for a
line containing the correct LABEL and label-string.
The label-string is a single word. If multiple occurrences of a label
are present, the first will be used. Command interpretation begins on
the line following the LABEL (any information after the LABEL keyword
and label-string is ignored).

STREAM iunit begins reading commands from the specified Fortran
logical unit or from the stream file. The stream file is treated
exactly as the main command file. It begins with a title and ends with
a STOP or RETURN, the latter causing control to return to the
previously active command file at the point where the stream switch
occurred. The OPEN, CLOSE, and REWIND commands, which operate on
logical units, are useful in working with streams; see
*note MISCOM:(doc/miscom.doc).

EXAMPLE:

    * This is a sample command file for CHARMM which calls a stream file
    * to build a structure and then maps out an adiabatic potential
    * surface defined by a pair of dihedrals
    *
    OPEN UNIT 10 READ FORM NAME makestruc.inp
    STREAM UNIT 10
    SET 1 -180.
    SET 2 -180.
    LABEL LOOP
    CONS CLDH
    CONS DIHE first-dihedral-angle-spec FORCE 100.0 MIN @1
    CONS DIHE second-dihedral-angle-spec FORCE 100.0 MIN @2
    MINI minimization-spec
    INCR 1 BY 30.0
    IF 1 LT 170. GOTO LOOP
    SET 1 -180.
    INCR 2 BY 30.0
    IF 2 LT 170. GOTO LOOP
    STOP

File: Usage, Node: I/O Units, Up: Top, Next: AKMA, Previous: Run Control

                Fortran I/O Units Usage by CHARMM

In order to keep CHARMM as machine independent as possible, all
specification of files is done through Fortran unit numbers. Two unit
numbers have special significance, 5 and 6. Unit 5 is the command file
interpreted by CHARMM. Unit 6 is the output file for all printed
messages. As commands are read from unit 5, they are echoed on unit 6.
All other unit numbers have no predefined meaning. The CHARMM OPEN
command may be used to assign files to units.

The stream file in "STREAM file-specification" may be assigned to a
logical unit between 100 and 119 (80 and 99 on Cray machines). Logical
units 0 through 9 may be used for CHARMM internal file handling.
We recommend logical units 10 through 79 for user data files.

File: Usage, Node: AKMA, Up: Top, Next: Data Structures, Previous: I/O Units

                The CHARMM system of units: AKMA

CHARMM uses a distinct system of units, the AKMA system, i.e.
Angstroms, Kilocalories/Mole, Atomic mass units. All distances are
measured in Angstroms, energies in kcal/mole, mass in atomic mass
units, and charge in units of electron charge. Using this system, the
AKMA unit of time is 4.888821E-14 seconds (based on the constants
tabulated in Abramowitz and Stegun (1970)); however, for all input and
output, the time is listed in picoseconds (20 AKMA time units is 0.978
picoseconds). In some places, users may specify values in AKMA time
units, and in some places both picosecond and AKMA time are output.

Angles are given in degrees for the analysis and constraint sections.
In parameter files, the minimum positions of angles are specified in
degrees, but the force constants for angles, dihedrals, and dihedral
constraints are specified in kcal/mole/radian/radian.

Any numbers used in the documentation may be assumed to be in AKMA
units unless otherwise noted.

File: Usage, Node: Data Structures, Up: Top, Next: Standard Files, Previous: AKMA

                Data Structures You Should Understand

There are a number of data structures that CHARMM manipulates. Many of
these data structures are important for most operations; others, which
are less important, are described with the commands that use them.
Much more specific information is available in the various common
blocks whose extension is .fcm in the source directory,
~/charmm/source/fcm ([...CHARMM.SOURCE.FCM] on VAX).

The important data structures are given below. Each data structure
name is followed by its abbreviation, which is used as its name in
commands.

1) Residue Topology File (RTF)

The residue topology file stores the definitions of all residues.
The atoms, atomic properties, bonds, bond angles, torsion angles,
improper torsion angles, hydrogen bond donors and acceptors and
antecedents, and non-bonded exclusions are all specified on a per
residue basis. The term "residue" is somewhat historical, but can be
any basic unit.

2) The Parameters (PARA or PARM)

The parameters specify the force constants, equilibrium geometries,
van der Waals radii, and other such data needed for calculating the
energy.

3) Structure File (PSF)

The structure file is the concatenation of information in the RTF. It
specifies the information for the entire structure. It has a
hierarchical organization wherein atoms are grouped into residues
which are grouped into segments which comprise the structure. Each
atom is uniquely identified within a residue by its IUPAC name,
residue identifier, and its segment identifier. Identifiers may be up
to 4 characters in length.

4) The Internal Coordinates (IC)

The internal coordinates data structure contains information
concerning the relative positions of atoms within a structure. This
data structure is most commonly used to build or modify cartesian
coordinates from known or desired internal coordinate values. It is
also used in conjunction with the analysis of normal modes. Since
there are complete editing facilities, it can be used as a simple but
powerful method of examining or analyzing structures.

5) The Coordinates (COOR)

The coordinates are the Cartesian coordinates for all the atoms in the
PSF. There are two sets of coordinates provided. The main set is the
default used for all operations involving the positions of the atoms.
A comparison set (also called the reference set) is provided for a
variety of purposes, such as a reference for rotation or operations
which involve differences between coordinates for a particular
molecule. Associated with each coordinate set is a general purpose
weighting array (one element for each atom).

6) The Non-bonded List (NBON)

The non-bonded list contains the list of non-bonded interactions to be
used in calculating the energies as well as optional information about
the charge, dipole moment, and quadrupole moments of the residues.
This data structure depends on the coordinates for its construction
and must be periodically updated if the coordinates are being
modified.

7) The Hydrogen Bond List (HBON)

The hydrogen bond list contains the list of hydrogen bonds. Like the
non-bonded list, this data structure depends on the coordinates and
must be periodically updated.

8) The Constraints (CONS)

There is a variety of available constraints. All data pertaining to
constraints reside in this data structure.

9) The Images data structure (IMAGES)

The images data structure determines and defines the relative
positions and orientations of any symmetric image of the primary
molecule(s). The purpose of this data structure is to allow the
simulation of crystal symmetry or the use of periodic boundary
conditions. Also contained in this data structure is information
concerning all nonbonded, H-bond, and bonded interactions between
primary and image atoms.

File: Usage, Node: Standard Files, Up: Top, Previous: Data Structures, Next: Examples

                Files available for general use

There are a number of residue topology files, parameter files,
coordinate files, and files of other data structures available.

The most important files generally available are residue topology and
parameter files. Both such classes of files are stored for general use
in the CnnPT: directories. The file names used for both these files
consist of an alphabetic part followed by a number, e.g. PARAM7. There
are two copies of each file: one with extension .INP, which is a
character file used as a command file to generate the binary file,
which has extension .MOD. The .INP file is meant for human eyes; the
.MOD file is meant for CHARMM to read efficiently. The numeric part of
each name is its version number.
In general, one should use the highest version number of a file.

Although parameter files and topology files are separate, they are
usually associated, and they must be taken together when generating a
structure (PSF). For example, a parameter set for proteins will not
work with a DNA topology file.

For information on the general use of directories, and the files they
contain, see the following sections.

* Menu:

* Parameters: (doc/parmfile.doc).  Description of all the parameter files
* Residue:    (doc/rtop.doc).      Description of the topology files (RTF)

File: Usage, Node: Examples, Up: Top, Previous: Standard Files, Next: Interface

                Sample CHARMM Runs

For an example of specification of a CHARMM run, examine a test case
in ~/charmm/test. The file TEST.INP is an input to CHARMM which
performs the test and contains examples of many commands. The file
TEST.OUT contains the output from CHARMM produced on Fortran unit 6.
Other test cases are found in the test directory.

File: Usage, Node: Interface, Up: Top, Next: Syntactic Glossary, Previous: Examples

                Interfacing to CHARMM

A mechanism has been provided to allow users of CHARMM to write their
own special purpose subroutines which can be incorporated into the
system without threatening its integrity. There are six "hooks" into
CHARMM which have been specially provided for casual modifiers. For
detailed descriptions of each of these hooks, consult the routine in
~/charmm/source/main/usersb.src on UNIX machines or
[...CHARMM.SOURCE.MAIN]USERSB.SRC under VAX/VMS.

1) USERSB

The USER command invokes the subroutine USERSB and performs no other
action. USERSB is a subroutine with no arguments. However, parameters
may be passed to this subroutine via the COMMON blocks. These COMMON
blocks store nearly all of the system's data. These common blocks may
be obtained by including them from the directory containing the
sources for the version of the program you are using.

2) USERE

A user supplied energy routine may be provided that will be invoked on
every energy evaluation. The force arrays should be modified
accordingly.

3) USRSEL

If one needs to be able to select atoms in a manner not possible with
the existing options, a user selection routine may be specified. One
such example would be selecting atoms within a given rectangular
solid, or other (nonspherical) solid.

4) USERNM

Within VIBRAN, a user specified vector or mode may be generated with
this routine. One command that appends this motion onto the existing
set of vectors is "EDIT INCL USER integer".

5) USERF

A user specified parameter fitting routine may be specified.

6) USRTIM

A user specified time series routine may be provided for use in
computing correlation functions.

To simplify the use of these hooks and to allow users to replace
subprograms in CHARMM with their own versions of said subprograms, the
command procedure BUILD has been provided. BUILD will produce a
private version of CHARMM in your default USER directory using your
versions of USERSB and USERE. The procedure looks in your directory
for USERSB.SRC and USERE.SRC. If either file (or both) is found, it is
used in the make procedure of CHARMM. The BUILD command should always
be used to generate a private version of CHARMM, as it will always use
the correct files for linking.

Before attempting to write your own USER functions, you should
familiarize yourself with the information available on the
implementation of CHARMM.

This interface procedure is designed for short, one time programs. If
a user written subroutine is of general use, the routine should be
rewritten to conform to the parameter passing standards used in the
system, and it will then be incorporated into the central CHARMM.

There are several utility routines available to a user routine. Some
of them are listed below.

CALL GETE(X,Y,Z,...) will cause the energy and forces to be computed,
and the values are saved in the appropriate common blocks.
For this to work properly, NBONDS, HBONDS, and CODES must have been
called. This can be done by executing both the NBONds and HBONds
commands, by the use of the UPDAte command, or by having previously
found the energy (minimization, dynamics, etc.).

CALL PRINTE(...) will write the current energy values (from common
block values) to the specified unit (IUNIT). It will also write out
the cycle or iteration number and optionally write out the standard
header.

File: Usage, Node: Syntactic Glossary, Up: Top, Next: Glossary, Previous: Interface

                Glossary of Syntactic Terms

char            A character

del             The delimiter - a single character which is used to
                mark the end of a portion of a command. Initially, it
                is a dollar sign but can be changed using the DELIM
                command, see *note delim:(doc/miscom.doc). It should
                be noted that the delimiter cannot be a character
                within any string it is supposed to delimit.

deldel          Two delimiters concatenated together with no space in
                between.

int or integer  An integer

iupac           IUPAC name for an atom. Initially specified in the
                residue topology file.

keyword         A word, see below, serving to identify some option

range           Equivalent to real real integer. The first real is the
                minimum value in the range, the second real is the
                maximum value in the range, and the integer gives the
                number of intervals, i.e. lines or columns.

real            A real number. No decimal point is required for the
                number to be interpreted correctly.

resid           Residue identifier (a string of up to 4 characters)

resname         Residue name (type of residue, e.g. GUA)

segid           Segment identifier (a string of up to 4 characters)

string          An ordered set of characters

tag             A string which is a tag, i.e. no embedded spaces.

title           A series of 1 to 32 lines of text (max 80 characters
                per line), each starting with a "*". The title is
                terminated by a line containing only an asterisk "*"
                as its first character. Used for commenting files.

word            A string with no blanks

unit-number     An integer which is a Fortran unit number.
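
The title entry in the glossary above can be made concrete with a
short Python sketch. This is an illustration, not CHARMM's own title
reader: the function name read_title is hypothetical, and the sketch
assumes that the terminating line holds nothing but the asterisk, as
in the sample command file of the Run Control node.

```python
MAX_TITLE_LINES = 32   # upper bound stated in the glossary
MAX_LINE_LENGTH = 80   # upper bound stated in the glossary


def read_title(lines):
    """Split input lines into (title text lines, remaining command lines)."""
    title = []
    for index, line in enumerate(lines):
        if not line.startswith("*"):
            raise ValueError(f"line {index + 1}: title lines must start with '*'")
        if len(line) > MAX_LINE_LENGTH:
            raise ValueError(f"line {index + 1}: longer than {MAX_LINE_LENGTH} characters")
        if line.strip() == "*":
            # Assumed terminator: a line holding nothing but the asterisk.
            if not title:
                raise ValueError("a title needs at least one line of text")
            return title, lines[index + 1:]
        if len(title) == MAX_TITLE_LINES:
            raise ValueError(f"more than {MAX_TITLE_LINES} title lines")
        title.append(line[1:].strip())
    raise ValueError("title is not terminated by a lone '*' line")


sample = [
    "* This is a sample command file for CHARMM which calls a stream file",
    "* to build a structure",
    "*",
    "STREAM UNIT 10",
]
title, commands = read_title(sample)
print(title)
print(commands)
```

Whether the terminating line counts toward the limit of 32 is not
stated in the glossary; the sketch counts only lines that carry text.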

File: Usage, Node: Glossary, Up: Top, Previous: Syntactic Glossary, Next: Top

                General Glossary

data structure  A collection of arrays, scalars, and possibly other
                data structures which are related as part of a larger
                entity. For example, a coordinate set is a data
                structure which holds the three dimensional positions
                of atoms. This data structure consists of one scalar
                and three arrays. The scalar is the number of
                coordinates; the three arrays are the X, Y, and Z
                components of the coordinates.

Internal        Bonds, angles, torsions, improper torsions. Also, a
coordinates     data structure used for constructing coordinates.

Iupac Name for  The name of an atom within a residue. This name should
an atom         be unique within a residue and should conform to the
                IUPAC nomenclature, Biochemistry 9:3471 (1970)

Hbonds          Hydrogen bonds

Parameters      Constants in the energy expression (force constants,
                minima of energy surfaces, charges, Lennard-Jones
                parameters, van der Waals radii, etc.)

PSF             Structure file (protein structure file): a list of the
                internal coordinates and related information

Residue         A string of four characters or less which uniquely
Identifier      specifies a residue within a segment. This value is
                currently set by CHARMM to be the character
                representation of the residue number in the segment,
                starting from the first real monomer unit in it.

RTF             Residue topology file: a list of standard internal
                coordinates, atom charges, atom types, excluded
                non-bonded interactions, etc.

Segment         A string of up to four characters uniquely designating
Identifier      a segment. Specified in the GENErate command, see
                *note gener: (doc/struct.doc) Generate.

Sequence        List of residues

C DEC/CMS REPLACEMENT HISTORY, Element DIR.DOC
C *1     8-APR-1990 19:50:00 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element DIR.DOC

File: CHARMM, Node: Top

        Chemistry at HARvard Macromolecular Mechanics
        -            ---     -              -

                Version 22 - January 1, 1992

        Copyright(c) 1984,1987,1991
        President and Fellows of Harvard College
        All rights reserved

You are now using the INFO facility to view CHARMM 22 documentation.

The paper, CHARMM: A Program for Macromolecular Energy, Minimization,
and Dynamics Calculations, J. Comp. Chem., Vol. 4, p187 (1983), is
considered to be an integral part of this documentation. In places,
this documentation and the paper will conflict. In all such cases, the
documentation presented here should take precedence.

* Menu:

* Commands: (doc/commands.doc).     Description and syntax of CHARMM commands
* Install:  (doc/install.doc).      Release notes; how to install CHARMM
                                    on a user site
* Usage:    (doc/usage.doc).        How to use CHARMM
* Support:  (doc/support.doc).      Supporting data files and utilities
* Testcase: (doc/testcase.doc).     CHARMM testcases
* Develop:  (doc/developer.doc).    Notes for CHARMM developers
* News:     (doc/recent_mods.doc).  New features introduced recently
* Info:     (Info).                 A description of the INFO facility.

C DEC/CMS REPLACEMENT HISTORY, Element ANALYS.DOC
C *3     5-JAN-1992 14:44:40 WON "Info directive fixed"
C *2    24-OCT-1991 01:26:36 WON "17-OCT-91 NIH update"
C *1    12-SEP-1991 19:05:02 WON "Bernie Brooks new analysis facility"
C DEC/CMS REPLACEMENT HISTORY, Element ANALYS.DOC

File: analys, Node: Top, Up: (doc/commands.doc), Next: Description

                Analysis Commands

In CHARMM22, new analysis commands are under development, but some
features are currently functional.

* Menu:

* Description::  Description of analysis facility
* Energy::       Energy partitioning

File: analys, Node: Description, Up: Top, Previous: Top, Next: Energy

                Description of the ANALysis Command

ANALys  {ON }   Enable analysis and disable FAST routines.
        {OFF}   Disable analysis and restore FAST option defaults.

The ANALysis command is a new energy and structure table facility that
is being developed to examine both static and dynamic properties. The
current code only allows energy partition analysis and energy
contribution analysis from free energy simulations.

File: analys, Node: Energy, Up: Top, Previous: Description, Next: Top

                Energy option of the ANALysis Command

ANALys  {ON }
        {OFF}

The ANALysis ON command enables energy partition analysis and disables
the FAST routines. This will slow the calculation (especially on
vector machines), but allows a detailed, atom by atom, energy
analysis. Every time the energy routine is invoked, the energy for
each atom is stored in the ECONT array. During PERT dynamics, the
EPCONT array is filled with the time average energy difference on an
atom by atom basis, including every step of dynamics. This allows the
free energy differences to be analyzed based on atom contributions.

The ANALys OFF command enables the FAST routines and disables the
resetting of the ECONT array (i.e. the ECONT array will not change,
but may still be accessed).

The energy partition array can be accessed with the SCALar ECONt
commands; see *note Econt:(doc/scalar.doc). The sum of all of the
elements of the ECONT array is usually the total energy, but some
energy terms, such as extended electrostatics, will not be included.
The command:

    SCALar ECONT STATistics

can be used to check the total energy, and the command

    SCALar EPCONT ....

can be used to examine atom contributions to energy differences for
PERT.
C DEC/CMS REPLACEMENT HISTORY, Element BLOCK.DOC
C *2 6-MAY-1991 16:40:44 WON "Info directive fixed"
C *1 10-APR-1991 12:13:23 WON "From Bruce Tidor"
C DEC/CMS REPLACEMENT HISTORY, Element BLOCK.DOC

File: BLOCK, Node: Top, Up: (doc/commands.doc), Next: Syntax

The commands described in this section are used to partition the molecular system into blocks and allow for the use of coefficients that scale the interaction energies between the blocks. This has a number of applications, and specific commands to carry out free energy simulations with a component analysis scheme have been implemented.

* Menu:

* Syntax:: Syntax of the block commands
* Function:: Purpose of each of the commands

File: BLOCK, Node: Syntax, Up: Top, Next: Function

Syntax of BLOCK commands

BLOCk [int]

Subcommands:

miscellaneous-command-spec ! see *note miscom:(doc/miscom.doc).
CALL int atom-selection
LAMBda real
COEFficient int int real
NOFOrce
FORCe
FREE_energy_evaluation [OLDLambda real] [NEWLambda real] -
     FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int] -
     [TEMPerature real] [CONTinuous int]
INITialize
CLEAr
Energy_AVeraGe [OLDLambda real] [NEWLambda real] -
     FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int] -
     [CONTinuous int]
COMPonent_analysis DELL real NDEL int [TEMPerature real] -
     FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int]
AVERage {DISTance int int} {STRUcture} [PERT] [TEMPerature real] [OLDLambda real] [NEWLambda real] -
     FIRSt int [NUNIT int] [BEGIn int] [STOP int] [SKIP int]
END

File: BLOCK, Node: Function, Up: Top, Previous: Syntax

1) BLOCk [int] enters the block facility. The optional integer is only read when the block structure is initialized (usually the first block call of a run) to specify the number of blocks for space allocation. If not specified, the default of three is assumed.

2) END exits the block facility. The assignment of blocks, the coefficient weighting of the energy function, the force/noforce option, etc. remain in place.
For the terms of the energy function that are supported, each call to ENERGY (either directly or through MINIMIZE, DYNAMICS, etc. commands) results in an energy and force weighted as specified. Currently the fast energy routines must be used and images are not fully supported. The matrix of interaction coefficients is printed upon exiting.

3) CALL removes the atoms specified by "atom-selection" from their current block and assigns them to the block number specified by the integer. Initially all atoms are assigned to block 1. If atoms are removed from any block other than block 1, a warning message is issued. If blocks are assigned such that some energy terms (theta, phi, or imphi) are interactions between more than two blocks, a warning is issued when the END command is encountered. This is a severe error and indicates that something is wrong.

4) LAMBda sets the value of lambda to "real". This command is only valid when there are three blocks active. Otherwise multiple COEF commands may be used to set the interaction coefficients manually. LAMBda x is equivalent to (let y=1.0-x)
     COEF 1 1 1.0
     COEF 1 2 y
     COEF 1 3 x
     COEF 2 2 y
     COEF 2 3 0.0
     COEF 3 3 x

5) COEF sets the interaction coefficient between two blocks (represented by the integers) to a value (the real number). When the block facility is invoked, all of the atoms are initially assigned to block 1 and all interaction coefficients are set to one (BEWARE: This is subject to change!).

6) NOFOrce specifies that in subsequent energy calculations, the forces are not required. This is especially economical when using the FREE command. Forces may be turned back on with the FORCe command, which is necessary for running minimizations and dynamics.

7) FREE calculates a free energy change using simple exponential averaging. If the old and new lambdas are specified (can only be done when three blocks are active), the perturbation energy is calculated from these values. If not, the current coefficient matrix is used.
FIRSt_unit, NUNIt, BEGIn, STOP, and SKIP specify the trajectory that is to be read. TEMPerature defaults to 300 K and gives the temperature value to be used in k_B*T. CONTinuous specifies the interval for writing cumulative free energies. A negative value causes binned (rather than cumulative average) values to be written. 8) INITialize is called automatically when the BLOCK facility is first entered and may also be called manually at some other point. All atoms are assigned to block one and all interaction coefficients are set to their initial value. 9) CLEAr removes all trace of the use of the BLOCK facility. The next command should generally be END, and then CHARMM will operate as if BLOCK had not ever been called, whether the slow or fast energy routines are used. 10) [EAVG] The average value of the potential energy during a simulation can be calculated with the EAVG (Energy_AVeraGe) command. The parsing is very much like the FREE command above. The most frequent use of this command is to calculate the average value of the perturbation energy during the course of a simulation for use in thermodynamic integration, although in actual practice the COMPonent_analysis command is somewhat more useful. 11) [COMP] The contribution of some set of unperturbed atoms (generally stored in block 1 during the course of a simulation) to the overall free energy change is calculated by using the COMPonent_analysis command to post-process the simulation. After reading in the topology, parameter, psf, and coordinate files, two tricks should be applied. First, use "cons fix sele ... end" to "fix" all atoms not involved in the actual perturbation (generally these are the atoms not in blocks 2 or 3). Second, use "update ihbfrq 0 inbfrq 1 - cutnb 999.9 ...rest.of.nonbond.function.used.to.run.simulation" to create a "long" nonbond list that will be valid for the whole simulation. Then enter BLOCK, requesting 4 blocks. Put the usual WT in block 2 and MUT in block 3. 
Put the portion of the environment whose contribution to the free energy change is desired into block 4 (this can be everything else, or just a subset), open the trajectory file, and use the COMPonent analysis command. Much of the parsing is like the free command. Two special subcommands are DELL and NDEL. The normal output of COMP is evaluated at the lambda of the simulation. One can evaluate the same average for ensembles perturbed to various lambdas= lambda +/- {0,1,2,...NDEL}*DELL. This helps the quadrature in thermodynamic integration. 12) [AVER] The AVERage command is used to extract ensemble average structural properties from a dynamics simulation. Features in this implementation allow averages taken over ensembles that are perturbed from that which the simulation corresponds to. This is particularly useful for calculating the average structure expected at lambda=0.0 from a simulation run at lambda=0.1, for example. One may calculate average structures [STRUcture] and average distances [DISTance int int; where the two integers are the atom numbers between which the average distance is requested], currently. The PERT keyword indicates that a perturbed ensemble from the dynamics trajectory is desired, with TEMPerature giving the temperature to use in the exponential for the perturbation (defaults to 300 K), OLDLambda and NEWLambda are the lambdas for which the simulation was run and for which the ensemble is requested, respectively (only valid if three blocks are active; if these are not specified, the perturbation energy is calculated with the current coefficient matrix), and the remaining keywords are used to specify the trajectory. C DEC/CMS REPLACEMENT HISTORY, Element CONS.DOC C *6 18-NOV-1991 14:54:27 WON "B. Brooks and S. 
Fleischman update" C *5 24-OCT-1991 01:27:56 WON "17-OCT-91 NIH update" C *4 12-SEP-1991 19:15:33 WON "Update by Bernie Brooks" C *3 6-MAY-1991 16:42:34 WON "Info directive fixed" C *2 4-FEB-1991 17:04:39 WON "from NIH, 02-Feb-91" C *1 8-APR-1990 19:49:49 KOTTALAM "charmm documentation" C DEC/CMS REPLACEMENT HISTORY, Element CONS.DOC  File: Cons, Node: Top, Up: (doc/commands.doc), Next: Harmonic Atom CONSTRAINTS The following forms of constraints are available in CHARMM: * Menu: command * Harmonic Atom:: "CONS HARM" Hold atoms in place * Dihedral:: "CONS DIHE" Hold dihedrals near selected values * Internal Coord:: "CONS IC" Holds bonds, angles and dihedrals near table values * Quartic Droplet:: "CONS DROP" Puts the entire molecule in a cage about the center of mass * Fixed Atom:: "CONS FIX" Fix atoms rigidly (sets the IMOVE array) * SHAKE:: "SHAKE" Fix bond lengths during dynamics. * NOE:: "NOE" Impose distance constraints from NOE data * Sbound: (doc/sbound.doc). Solvent boundary potential  File: Cons, Node: Harmonic Atom, Up: Top, Next: Dihedral, Previous: Top Holding atoms in place [SYNTAX CONS HARMonic] Syntax: CONStraint HARMonic [FORCE real] atom-selection [MASS] [EXPO int] [COMP] [WEIGhting ] The potential energy has a harmonic constraint term which allows one to prevent large motions of individual atoms. The form for this potential is as follows for coordinates: EC = sum over all atoms of k(i)* [mass(i)] * (x(i)-refx(i))**2 where refx is a reference set of coordinates. If MASS is specified in the command line, then k is multiplied by the mass of the atom resulting in a natural frequency of oscillation for the constraint of sqrt(k) in AKMA units. An atom constrained with MASS FORCE 1.0 will oscillate at 8 cycles/picosecond if free of other interactions. For most operations involving harmonic constraints, mass weighting is recommended. There are three reasons for this. 
First, the results obtained will be similar regardless of what atom representation is used (extended vs. explicit) for hydrogen atoms. Second, hydrogen atoms are allowed greater relative freedom if present. And third, the character of the normal modes of a molecule is unperturbed with mass weighting (essential if normal modes or low frequency motions are of interest).

Note, there is no longer a prefactor of 0.5 on the force constant specification. This is appropriate in that exponent values other than "2" are allowed. This differs from the earlier versions of CHARMM (up to version 16).

CHARMM supports a number of operations on the coordinate constraints. The force constant for any atom can be set to any positive value (specified by the FORCE keyword followed by the desired value). The reference coordinates can be the current set at the point when constraints are specified (the default) or the comparison set (COMP keyword). The force constants may also be obtained from the weight array, in which case the FORCe keyword is not read.

It is important to understand some aspects of how the constraints are set in order to get the most flexibility out of this command. When CHARMM is loaded, each atom has associated with it a harmonic force constant initially set to zero. Each call to the CONS HARM command changes the value of this constant for only those atoms specified. When this command is invoked with an atom selection, only the reference coordinates (XREF,YREF,ZREF) for selected atoms are modified. If the CONS HARM command is invoked several times using different atom selections, different reference coordinates may be used.

Other commands:

The harmonic constraints may be read and written to files. The file name to be specified in the READ and WRITE command is CONS. The files may be read or written only in binary. The PRINT command will also work for constraints. See *note io:(doc/io.doc), for more details.
In addition, one may look at the contributions to the energy in detail using the analysis facility, see *note anal:(doc/analys.doc).

PRINT specifies that a listing of all the atoms currently constrained should be printed out. This is done by segments of constrained atoms, which is concise in most cases. Unfortunately, in the case of IUPAC specified constraints it is quite verbose.

File: Cons, Node: Dihedral, Up: Top, Next: Internal Coord, Previous: Harmonic Atom

Holding dihedrals near selected values

Using this form of the CONS command, one may put constraints on the dihedral angles formed by sets of any four atoms. The improper torsion potential is used to maintain said angles. The command for setting the dihedral constraints is as follows:

Syntax: [SYNTAX CONS DIHEdral]

CONStraint DIHEdral [BYNUM int int int int] [FORCE real] [MIN real] [ 4X(atom-spec) ]
CONS CLDH

Syntactic ordering: DIHE or CLDH must follow CONS, and FORCE and MIN must follow DIHE.

where: atom-spec ::= { segid resid iupac } { resnumber iupac }

DIHEdral adds a torsion angle to the list of constrained angles using the specified atoms, force constant, and minimum. CLDH clears the list of constrained dihedrals so that different angles or new constraint parameters can be specified.

Other commands:

The PRINT CONS command, see *note print:(doc/io.doc)print, will work for constraints.

File: Cons, Node: Internal Coord, Up: Top, Next: Quartic Droplet, Previous: Dihedral

Holding Internal Coordinates near selected values

[SYNTAX CONS IC]

Syntax: CONStraint IC [BOND real [EXPOnent integer] [UPPEr]] [ANGLe real] [DIHEdral real]

Using this form of the CONS command, one may put constraints on any internal coordinate. For this energy term, the IC table is used. All nonzero bond entries are constrained with the bond constant, using the optional EXPOnent (default 2) in the potential K*(S-S0)**EXPOnent. Second derivatives are currently supported only with EXPOnent=2.
If UPPEr is specified, the reference bond length is taken as an upper limit and the constraint potential is applied only if S>S0; this is intended for use with distance constraints from NMR NOE data. All nonzero angle entries are constrained with the angle constant. All dihedrals are constrained with the dihedral constant using the improper dihedral energy potential. If any IC entry contains an undefined atom (zeroes), then the associated bonds, angles, and dihedral will not be constrained.

This constraint term is very flexible in that the user may choose which bonds, angles, and dihedrals to constrain by editing an IC table. The major drawback is that all bonds must have the same force constant. The same is true for angles and dihedrals. By listing some ICs several times, the effective force constant is increased. Also, if only angle constraints are desired, then the bond and dihedral constants can be set to zero, eliminating their contribution.

File: Cons, Node: Quartic Droplet, Up: Top, Next: Fixed Atom, Previous: Internal Coord

The Quartic Droplet Potential

[SYNTAX CONS DROPlet]

Syntax: CONStraint DROPlet [FORCe real] [EXPOnent integer] [NOMAss]

This constraint term is designed to put the entire molecule in a cage. It is based on the center of mass (or center of geometry if NOMAss is specified) so that no net force or torque is introduced by this constraint term. The potential function is:

     Edroplet = FORC * sum over atoms ( ( r-rcm )**EXPO * mass(i) )

File: Cons, Node: Fixed Atom, Up: Top, Next: SHAKE, Previous: Quartic Droplet

How to fix atoms rigidly in place

[SYNTAX CONS FIX]

Syntax: CONS FIX atom-selection-spec { [PURG] } { [BOND] [THET] [PHI] [IMPH] }

This command fixes atoms in place by setting flags in an array (IMOVE) which tells the minimization and dynamics algorithms which atoms are free to move. If atoms are fixed, it is possible to save computer time by not calculating energy terms which involve only fixed atoms.
The nonbond and hydrogen bond algorithms in CHARMM check IMOVE and delete pairs of atoms that are fixed in place from the nbond and hbond lists respectively. In addition, the PURG or individual energy term options specified with the CONS FIX command allow all or some of the internal coordinate energies associated with fixed atoms to be deleted. Interactions between fixed and moving atoms are maintained.

*** NOTE *** Because some energy terms are deleted from fixed systems, the total energy calculated with fixed atoms will be different from the total energy of the same system with all atoms free. The forces on the movable atoms will, however, be identical. The purpose of this feature is to remove the computational cost of energy terms that do not change for simulations where a large fraction of the atoms are fixed. It is not recommended for any other purpose.

The way CHARMM keeps track of fixed atoms is by the IMOVE array in the PSF. The IMOVE array is 0 if the atom is free to move, and has some other value if the atom is fixed.

WARNING: the use of IMOVE is not yet universal in CHARMM. It is supported for dynamics and for all forms of minimization except Newton-Raphson. The vibrational analysis does not support it. The fixing of atoms is also not respected by internal coordinate manipulations (IC BUILD) or the coordinate manipulation commands.

***** WARNING ***** The purge options modify the PSF. The effects of this command cannot be undone by the subsequent releasing of atoms.

File: Cons, Node: SHAKE, Up: Top, Next: NOE, Previous: Fixed Atom

Fixing bond lengths or angles during dynamics.

SHAKE is a method of fixing bond lengths and, optionally, bond angles during dynamics, minimization (not ABNR and Newton-Raphson methods), coordinate modification (COOR SHAKe command), and vibrational analysis (explore command). The method was brought to CHARMM by Wilfred Van Gunsteren (WFVG), and is referenced in J. Comp. Phys. 23:327 (1977).
When hydrogens are present in a structure, it will allow a two-fold increase in the dynamics step size if SHAKE is used on the bonds.

To use SHAKE, one issues the SHAKE command before any command that uses the SHAKE constraints. The SHAKE command has the following syntax:

[SYNTAX SHAKe constraints]

SHAKE [BONH] [BOND] [ANGH] [ANGL] { [MAIN] } [TOL real] [MXITer integer] { COMP } { PARAmeters } 2x(atom-selection) [SHKScale real]

BONH specifies that all bonds involving hydrogens are to be fixed. BOND specifies all bonds. ANGH specifies that all angles involving hydrogen must be fixed. ANGL specifies that all angles must be shaken. BOND must be specified if angles are fixed, otherwise, only the 1-3 distances will be fixed. Coordinates must be read in before the SHAKE command is issued, unless the PARAmeter option is specified.

SHAKE constraints are applied only for atom pairs where one atom is in the first atom selection and one atom is in the second atom selection. The default atom selection is ALL for both sets.

TOL specifies the allowed relative deviations from the reference values (default: 10**-10). MXITer is the maximum number of iterations SHAKE tries before giving up (default: 500).

When the SHAKE command is used, it will check that there are degrees of freedom available for all atoms to satisfy all their constraints. Angles cannot be fixed with SHAKE if one has explicit hydrogen arginines in the structure, as the CZ carbon has too many constraints. This is a general problem for any structure which has too many branches close together.

SHAKE is not recommended for fixing angles. The algorithm converges very slowly in the case where one has three angles centered on a tetravalent atom and the constraints are satisfiable only using out of plane motions.

The use of SHAKE modifies the output of the dynamics command. The number appearing to the right of the step number is the number of iterations SHAKE required to satisfy all the constraints. This number should generally be small.
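To make the roles of TOL and MXITer concrete, here is a minimal Python sketch of the iterative distance-constraint scheme described in the cited SHAKE reference. It is an illustration under stated assumptions, not CHARMM's implementation: the function name `shake`, the argument layout, the convergence test on squared distances, and the water-like test geometry with rounded masses are all hypothetical.

```python
import math


def shake(pos, ref, masses, constraints, tol=1e-10, mxiter=500):
    """Iteratively correct pos so every constrained distance is restored.

    pos         -- coordinates after an unconstrained step (modified in place)
    ref         -- coordinates before the step (these satisfy the constraints)
    constraints -- list of (i, j, d): target distance d between atoms i and j
    Returns the number of sweeps over the constraint list that were needed.
    """
    for iteration in range(1, mxiter + 1):
        converged = True
        for i, j, d in constraints:
            rij = [a - b for a, b in zip(pos[i], pos[j])]
            diff = d * d - sum(c * c for c in rij)
            # Relative test on the squared distance, about tol on the distance.
            if abs(diff) > 2.0 * tol * d * d:
                converged = False
                sij = [a - b for a, b in zip(ref[i], ref[j])]
                dot = sum(a * b for a, b in zip(rij, sij))
                g = diff / (2.0 * dot * (1.0 / masses[i] + 1.0 / masses[j]))
                # Move both atoms along the reference bond, weighted by 1/mass.
                for k in range(3):
                    pos[i][k] += g * sij[k] / masses[i]
                    pos[j][k] -= g * sij[k] / masses[j]
        if converged:
            return iteration
    raise RuntimeError("SHAKE did not converge in %d iterations" % mxiter)


# Hypothetical water-like molecule: O, H, H with rounded masses.
ref = [[0.0, 0.0, 0.0], [0.9572, 0.0, 0.0], [-0.24, 0.9266, 0.0]]
pos = [[0.01, -0.02, 0.0], [1.0, 0.03, 0.02], [-0.26, 0.95, -0.03]]
masses = [16.0, 1.0, 1.0]
constraints = [(0, 1, math.dist(ref[0], ref[1])),
               (0, 2, math.dist(ref[0], ref[2]))]
n_iter = shake(pos, ref, masses, constraints)
```

The returned sweep count plays the role of the iteration number reported next to the step number in the dynamics output. Because both bonds share the oxygen, correcting one bond slightly disturbs the other, which is why more than one sweep is normally needed.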
When ST2's are present, SHAKE constraints are automatically applied for the O-H bonds and H-O-H angles.

There is a PARAmeter option to the SHAKe command. This option causes the shake bond distances to be found from the parameter table rather than from the current set of coordinates. This option is NOT compatible with the use of angle SHAKE constraints, and it will give an error if this is tried. With these commands, the bond energy may be zeroed without any minimization with the command sequence:
     SHAKE BOND PARA
     COOR SHAKE [MASS]

[SYNTAX SHAKe FAST constraints]

SHAKe FAST [WATEr SELEct water_selection END] [OLDWatershake] [ MXITer TOL ] [PARAmeter] [COMP]

This command specifies the use of the new vector/parallel SHAKE constraint routines. Certain assumptions are made when this command is issued: the only bonds involved are between heavy atoms and hydrogens, except for water molecules included in the WATEr selection ... end sub-command. This selection is used to indicate the water molecules that have an H-H bond. It is assumed that the selection will include all atoms in the water molecule and that said molecule contains exactly two X-H bonds and one H-H bond, where X is any heavy atom. Testing for "hydrogen-ness" is done via the CHARMm hydrog() function, which makes its choice based on atomic mass. The preferred selection is through the use of the RESNAME selection specifier, e.g.:
     ... WATEr SELEct RESNAME TIP3 END

By default, water molecules selected with the WATEr sub-command will be constrained via the use of a special water-SHAKE routine which uses the direct inversion method. This algorithm is from 25 to 30 % faster than the normal iterative, scalar SHAKE routine. For the rest of the heavy atom-hydrogen bonds, a vector/parallel version of the original SHAKE routine is used. This is about 5 times the speed of the scalar SHAKE. If the optional keyword OLDWatershake is used, the vector/parallel (not the watershake) routines are used.
The rest of the keywords are the same as in the original SHAKE command. Note that FAST has to be the second word in the command line.

File: Cons, Node: NOE, Up: Top, Previous: SHAKE, Next: Top

[SYNTAX NOE constraints]

NOE
     Invoke the module.

RESEt
     Reset all NOE constraint lists. This command clears all existing NOE constraints and resets the scale factor to 1.0.

ASSIgn [KMIN real] [RMIN real] [KMAX real] [RMAX real] [FMAX real] [TCON real] 2X(atom_selection)
     Assign a constraining potential between the last atom of the first selection and the last atom of the second selection.

            0.5*KMIN*(R-RMIN)**2          R < RMIN
     E(R) = 0.0                           RMIN < R < RMAX
            0.5*KMAX*(RAVE-RMAX)**2       RMAX < RAVE < RLIM
            FMAX*(RAVE-(RLIM+RMAX)/2)     RAVE > RLIM

     and
            RAVE=R                  TCON=0
            RAVE=RRAVE**(-1/3)      TCON>0
            RRAVE=RRAVE*(1-DELTA/TCON)+R**(-3)*DELTA/TCON
            for initial conditions, RRAVE=RMAX**(-3)

     DELTA is the integration time step. For minimization, the value is either 0.001ps or the previous simulation value.

     Where: RLIM = RMAX+FMAX/KMAX (the value of RAVE where the force equals FMAX)

     Defaults for each entry: KMIN=0.0, RMIN=0.0, KMAX=0.0, RMAX=9999.0, FMAX=9999.0, TCON=0.0

READ UNIT
     Reads constraint data structure from card file previously written.

WRITe UNIT [ANAL]
     Writes out the constraint data in card format to a file on the specified unit. A CHARMM title should follow the command. SCALE is saved together with the lists in the NOE common block. The ANAL option will print out the distances and energy data computed with the current main coordinates.

PRINT [ANAL [CUT real]]
     Same as the WRITe command, except that the data go to the output file in a slightly more user-friendly form. A positive CUT value will list only those constraints whose distance exceeds RMAX by more than CUT.

SCALe [real]
     Set the scale factor for the NOE energy and forces. Default value: 1.0

END
     Return to main command parser.

No other commands (I/O or loops) are supported inside the NOE module. Looping can be performed outside if necessary.

EXAMPLE. Set up some NOE constraints for one strand of a DNA-hexamer in a file to be streamed to from CHARMM.

* SOME NOE CONSTRAINTS FOR DNA.
ASSUME PSF, COORD ETC ARE ALREADY PRESENT
*
! First clear the lists
NOE
RESET
END
! Since there are many identical atom pairs we use a loop
set 1 1
label loop
NOE
! Sugar protons, same in all six sugars (don't pay any attention to
! the numeric values)
ASSIgn SELE ATOM A @1 H1' END SELE ATOM A @1 H2'' END -
     KMIN 1.0 RMIN 2.7 KMAX 1.0 RMAX 3.0 FMAX 2.0
ASSIgn SELE ATOM A @1 H3' END SELE ATOM A @1 H2'' END -
     KMIN 1.0 RMIN 2.7 KMAX 1.0 RMAX 3.0 FMAX 2.0
END
incr 1 by 1
if 1 le 6 goto loop
! Now do some more specific things
OPEN WRITE UNIT 10 CARD NAME NOE.DAT
NOE
SCALE 3.0 ! Multiply all energies and forces by 3
WRITE UNIT 10
* NOE CONSTRAINT DATA FROM DOCUMENTATION EXAMPLE
*
PRINT ANAL ! See what we have so far
PRINT ANAL CUT 2.0 ! list
END
RETURN

C DEC/CMS REPLACEMENT HISTORY, Element CORMAN.DOC
C *7 7-DEC-1991 04:26:55 WON "Distance matrix command added"
C *6 18-NOV-1991 14:58:54 WON "Updated by B. Brooks"
C *5 12-SEP-1991 19:18:14 WON "Update by Bernie Brooks"
C *4 6-MAY-1991 16:43:28 WON "Info directive fixed"
C *3 4-FEB-1991 17:05:58 WON "from NIH, 02-Feb-91"
C *2 8-JUL-1990 17:53:32 KOTTALAM "Charlie added solanl and covariance documentation"
C *1 8-APR-1990 19:49:52 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element CORMAN.DOC

File: Corman, Node: Top, Up: (doc/commands.doc), Next: Syntax

The Coordinate Manipulation Commands

The commands in this section are primarily used for moving some or all of the atoms. There is a wide range of commands and options. All of the commands may be used on either the main coordinate set, or the comparison set. Some commands require both sets of coordinates.
* Menu: * Syntax:: Syntax of the coordinate manipulations commands * Simple:: Descriptions of the simple commands * Function:: Descriptions of the remaining commands * Substitutions:: Description and usage of substitution values  File: Corman, Node: Syntax, Up: Top, Next: Simple Syntax of Coordinate Manipulation commands [SYNTAX COORdinate manipulation] COORdinates { INITialize } [COMP] [atom-selection] { COPY } [WEIGhting_array] { SWAP } [IMAGes] { AVERage [ FACT real ] } { SCALe [ FACT real ] } { MASS_weighting } { ADD } { SET vector-spec } { TRANslate vector-spec } { ROTAte vector-spec PHI real } { ORIEnt [MASS] [RMS] [NOROtation] } { RMS [MASS] } { DIFFerence } { FORCe [MASS] } { SHAKe [MASS] } { DRAW draw-spec } { DISTance distance-spec [DIFF] } { MINDist distance-spec } { READ io-specification } { WRITe io-specification } { PRINt io-specification } { RGYR [MASS] [FACT ] } { LSQP [MASS] [VERBose] } { OPERate image_name } { STATistics [MASS] } { VOLUme {SPACe integer} } { } { DUPLicate { 2X(atom-selection) } } { { PREVious } } COORdinates DYNAmics [COMParison] [PAX] [atom-selection] [NOPRint] [FIRSt int] [NUNIts int] [NSKIp int] [BEGIn int] [STOP int] COORdinates PAXAnalysis [COMParison] [atom-selection] [NOPRint] [FIRSt int] [NUNIts int] [NSKIp int] [BEGIn int] [STOP int] COORdinates SEARch [output-spec] [atom-selection] [COMP] [IMAGe] [XMIN real] [XMAX real] [XGRId integer] [YMIN real] [YMAX real] [YGRId integer] [ZMIN real] [ZMAX real] [ZGRId integer] [RCUT real] [RBUFf real] output-spec ::= { PRINt } [UNIT int] { [VACUum] } { [NOPRint] } { FILLed } COORdinates SURFace [atom-selection] [WEIGhting] { CONTact-area } [ACCUracy real] { ACCEssible-area } [RPRObe real] COORdinates CONVert-from-unit-cell [atom-selection] [COMP] [IMAGe] followed by; a b c and; alpha beta gamma (in degrees) COORdinates AXIS atom-selection [atom-selection] [MASS] [COMP] [IMAGEs] COORdinates COVAriance - [FIRStunit int] [NUNIt int] [BEGIn int] [SKIP int] [STOP int] 2x(atom_selection) 
[UNIT_for_output int] [DISTance_matrix] [RESIdue_average_nsets integer]

COORdinates PUCKer SEGId segid RESId resid1 TO resid2

COORdinates HELIx atom-selection [atom-selection]

COORdinate ANALysis {SOLVent} {WATer} SPEC FINIish -
     {XREF YREF ZREF } - !syntax to set-up arbitrary analysis point
     {SITE } - !syntax to set-up solute analysis site
     NFIRst NSTEp NSKIp - !syntax for reading trajectories
     NCORs RSPIn RSPOut - !correlation function set-up
     DTCOordinates DTVElocity - !timestep information
     RDSP DR RRSP MGN - !more analysis info
     IMSD IVAC IGDISt ISDISt IKIRkg - !set-up for logical compute flags
     IFMIn XBOX YBOX ZBOX - !PBC info for analysis
     IFDBF RCUT ZP0 NZP !analysis info for DBF analysis

atom-selection:== (see *note select:(doc/select.doc).)

distance-spec::= { WEIGhting vector-spec atom-selection } { [UNIT int] [CUT real] [ENERGy [CLOSe]] 2X(atom-selection) - } { [Nonbonds] } { [NO14exclusions] } { [NOEXclusions] } { NONOnbonds } { 14EXclusions } { EXCLusions } [TRIAngle]

vector-spec::= { [XDIR real] [YDIR real] [ZDIR real] } [DISTance real] [XCEN real] [YCEN real] [ZCEN real] [FACTor real] { AXIS }

draw-spec::= [DFACt real] [NOMO] UNIT integer

io-specification:== (see *note io:(doc/io.doc).)

File: Corman, Node: Simple, Up: Top, Previous: Syntax, Next: Function

Descriptions of the simple coordinate manipulation commands

All of these commands allow either the main coordinate set (default), or the comparison set (COMP keyword) to be modified. The other coordinate set is only changed by the SWAP command and the ORIEnt RMS command when the specified atoms are not centered about the origin.

Each of these commands may also operate on a subset of the full atom space. The selection specification should be at the end of the command. The default atom selection includes all atoms. If the IMAGes keyword is specified, then the operation will be performed on the image atoms as well (if images are present).
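The model described above (two coordinate sets, one of which is modified, with an optional atom selection restricting the operation) can be pictured with a short Python sketch. The function name `translate`, the list-of-lists coordinate layout, and the index-based selection are assumptions made for illustration and do not reflect CHARMM's internal data structures.

```python
def translate(coords, vector, selection=None):
    """Translate the selected atoms of one coordinate set in place.

    coords    -- list of [x, y, z] entries, one per atom
    vector    -- (dx, dy, dz) displacement
    selection -- iterable of atom indices; None selects all atoms (the default)
    """
    if selection is None:
        selection = range(len(coords))
    for i in selection:
        for k in range(3):
            coords[i][k] += vector[k]


main = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
comp = [atom[:] for atom in main]  # independent comparison set

# Modify a subset of the main set; the comparison set is left untouched.
translate(main, (0.0, 0.0, 2.0), selection=[1, 2])
# Modify all atoms of the comparison set (analogous to giving COMP).
translate(comp, (1.0, 0.0, 0.0))
```

Only the set passed in and only the selected atoms change, which is the behavior the paragraph above attributes to the simple commands (with SWAP and ORIEnt RMS as the stated exceptions).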
1) The INITialize command
The INITialize command returns the coordinate values of the specified atoms to their start up values (9999.0). The main use of this command is in connection with the IC BUILD command, which may only find coordinates for atoms with the initial value.

2) The COPY command
The COPY command will copy the coordinate values into the specified set FROM the other coordinate set.

3) The SWAP command
The SWAP command will cause the coordinate values of the specified atoms to be swapped with the comparison set.

4) The AVERage command
The AVERage command will generate a new coordinate set at a point along the displacement vector between the present coordinate set and the other set. The FACTor value determines the relative step along this vector. Its default value is 0.5 (a true average). A FACTor value of 1.0 is equivalent to the COPY command. Negative values and positive values greater than one are also allowed.

5) The SCALe command
The SCALe command will cause the coordinate values for all selected atoms to be scaled by the specified scale factor. This option is designed to work with coordinate displacement vectors. A scale factor of zero will set the selected coordinate values to zero. This option may also be useful in plotting.

6) The MASS_weighting command
The MASS_weighting command will cause all selected coordinates to be scaled by the MASS of each atom. If the WEIGht option is specified, the weighting array will be scaled.

7) The ADD command
The ADD command will add the main and the comparison coordinate values and store the results in the selected coordinate set. As with other commands, only selected atoms will be modified. If an atom in either set is undefined, then the sum will also be undefined. This option is designed for use in cases where one or both coordinate sets contain coordinate displacement vectors.

8) The SET command
The SET command will set all coordinate values of selected atoms to a specified value determined by the vector specified.
This is a simple way to zero a coordinate set, with the command:
     COOR SET XDIR 1.0 DIST 0.0
Note, the XDIR keyword value was included so that the vector has a nonzero norm (required for all vector specifications).

9) The TRANslate command
The TRANslate command will cause the coordinate values of the specified atoms to be translated. The translation step may be specified by either X, Y, and Z displacements, or by a distance along the specified vector. When no distance is specified, the XDIR, YDIR, and ZDIR values will be the step vector. If the AXIS keyword is used, then the translation will be along the axis defined by the previous COOR AXIS command. For this option, a distance may be specified, but if it isn't, then the translation distance will be the COOR AXIS vector length.

10) The ROTAte command
The ROTAte command will cause the specified atoms to be rotated about the specified axis vector through the specified center. The vector need not be normalized, but it must have a nonzero length. If the AXIS keyword is used, then the axis and center information from the last COORdinates AXIS command will be used. The PHI value gives the amount of rotation about this axis in degrees (in the right handed sense). Only the atoms specified will be rotated.

11) The ORIEnt command
The ORIEnt command will modify the coordinate values of ALL of the atoms. The selected set of atoms is first centered about the origin, and then rotated to either align with the axis, or the other coordinate set. The RMS keyword will use the other coordinate set as a rotation reference. The MASS keyword causes a mass weighting to be done. This will align the specified atoms along their moments of inertia. When the RMS keyword is not used, the structure is rotated so that its principal geometric axis coincides with the X-axis and the next largest coincides with the Y-axis. This command is primarily used for preparing a structure for graphics and viewing.
It can also be used for finding RMS differences, and in conjunction with
the vibrational analysis. The NOROtation keyword will suppress rotations.
In this case, only one coordinate set will be modified.

12) The RMS command

The RMS command will compute the RMS or mass weighted RMS coordinate
differences between the selected set of atoms just as they lie. This
differs from the COOR ORIENT RMS command in that no coordinate
modifications are made and no translation is done.

13) The DIFF command

The DIFF command will compute the differences between the main and
comparison set (or the reverse) and store this difference in the modified
coordinate set. Undefined or unselected atoms result in a zero.

14) The FORCe command

The FORCe command will copy the current forces (DX,DY,DZ) of the selected
atoms to the specified coordinate set. Atoms not selected are given a
value of zero. If the MASS keyword is specified, then the forces will be
divided by the mass. This would correspond to an acceleration in dynamics.

15) The SHAKe command

This command will SHAKE the selected coordinate set with respect to the
other (as a reference). A mass weighting may be used. Any atoms that are
not selected are considered to be fixed (infinite mass). In order to use
this command, the SHAKe command, which sets up the shake constraints, must
first be invoked.

File: Corman, Node: Function, Up: Top, Previous: Simple, Next: Substitutions

        Descriptions of the remaining corman commands

See the descriptions of the simple commands for some background
information on these commands.

1) The DISTance command

The DISTance command will find all atom distances between two atom
selections. A unit number may be specified (default=6) and a cutoff
distance may be included as well (default=8999.0). If no selection is
specified, all atoms will be included! The delimiter ENDselection must
separate the two sets of atom selections.
The van der Waals energy may be requested with the "ENERgy" keyword, and
if this option is used, the list of pairs with a positive van der Waals
energy may be selected with the "CLOSe" keyword (i.e. only close contacts
will be listed). The COOR DISTance command does not give distances between
excluded atoms unless the "EXCLusions" keyword is specified. This makes it
much easier to search for bad contacts. Likewise, 1-4 interactions and
other interactions may be requested or omitted. The command;

        COOR DISTance ENERgy CLOSe CUT 5.0 SELE ALL END SELE ALL END -
                14EXclusions NONBonds

will list all atom pairs that have a positive van der Waals energy.
The command;

        COOR DISTance ENERGY CUT 5.0 NONONbonds NOEXclusions 14EXCLusions -
                SELE ALL END SELE ALL END

will list all 1-4 interactions and energies (and nothing else).
The command;

        COOR DISTance ENERgy CUT 4.5 SELE RESID 23 END SELE ALL END

will list all contacts shorter than 4.5 A that residue 23 has with the
rest of the system, without considering 1-4 interactions or excluded
pairs. The 1-4 vdw terms, E14FAC, and EPS values other than 1.0 are
recognized.

The WEIGht option puts the distance of each selected atom from a specified
point into the weighting array. If no point is specified, then the origin
is used. This is most useful in computing magnitudes of forces or
coordinate differences. For example, the sequence;

        ENERGY ...
        COOR FORCE COMP       ! copy forces to the comparison coordinates
        COOR DIST WEIGH COMP  ! put magnitudes in the weighting array.
        PRINT COOR COMP SELE PROP 1 .GT. 5.0 END ! print atoms with large forces.

Note that all operations were done on the comparison set.

The DIFF keyword causes the selection to work on different coordinate
sets, where the first selection corresponds to the set specified (MAIN or
COMP), and the second atom selection uses the other coordinate set.

2) The RGYR command

The RGYR command computes the Radius of GYRation, center-of-mass and total
mass of the specified atoms.
By default the electronic RGYR, where the number of electrons per atom is
used as the weighting factor, is computed. The current keywords are:

        MASS    use mass weighting instead of electrons/atom
        WEIG    use a weight array (WMAIN or WCOMP) for the weighting
        FACT    constant (electrons) to be subtracted from each weight

The weight arrays can be filled, by using COOR or SCALAR commands, before
invoking the RGYR routine. In this way almost any RGYR can be computed.

NB! The electronic RGYR is only correct if the nonbonded parameters have
been specified with the number of electrons per atom! If this is not the
case, you will get a warning, and the MASS weighted RGYR will be
calculated. In this case use the WEIG option.

3) The LSQP command

The LSQP command computes the least-squares-plane through the selected
atoms. Weighting can be done by the atom masses [MASS], by the weighting
array [WEIG], or not at all (default). Output is the equation for the
plane, the sum-of-squared distances (weighted) from the plane (SSQ), and
the center-of-mass of the selected atoms. The keyword VERBose causes some
additional output, most useful of which is the distance from the plane for
each atom.

4) The OPERate command.

The OPERate command processes the selected coordinates through the image
transformation specified by name. This command may only be used if an
image file has been read.

5) The MINDistance command.

The MINDistance command computes the minimum distance between selected
coordinates. Usually this command is executed with a double selection. If
only one selection is given, then it will give the minimum distance of the
selected coordinates between the MAIN and COMPARISON set.

6) The STATistics command

The STATistics command will print some simple statistics regarding the
selected atoms. The values XMIN,XMAX,XAVE,YMIN,YMAX,YAVE,
ZMIN,ZMAX,ZAVE,WMIN,WMAX,WAVE are set when this command is executed. These
variable values may then be used in subsequent commands with the "?"
symbol.
For example, the following command sequence may be used to shift a
structure so that a single atom is in the X-Y plane (e.g. shift in the
z-direction);

        COOR STATistics SELE desired-atom END
        COOR TRANS ZDIR ?ZAVE FACT -1.0

The MASS option will place the average values at the center of mass.

7) The AXIS command.

The AXIS command generates a vector and saves it for subsequent use,
either in command parsing or as input to the COOR SET, COOR ROTAte,
COOR TRANslate, or COOR DISTance WEIGhting commands through the AXIS
keyword. There are two modes for the AXIS command. With a single atom
selection, the stored vector is defined from the origin to the center of
geometry/mass of all selected atoms. With two atom selections, the vector
spans from the center of the first set of selected atoms to the center of
the second. The MASS keyword invokes the usage of the center of mass.

The AXIS command sets the variables XAXIs, YAXIs, ZAXIs, RAXIs, XCEN,
YCEN, and ZCEN, which may be accessed with the "?" symbol. These values
define the actual vector, the length of the vector, and the center of the
vector (midpoint). For example, to use the distance between two atoms as a
criterion for terminating a run, the following command sequence could be
used;

        SET 1 10.0
        COOR AXIS SELE first-atom END  SELE second-atom END
        IF 1 GT ?RAXIs STOP

As another example, to rotate the chi-1 torsion of a specified residue BY
30 degrees, the following command sequence would be appropriate;

        DEFINE BACK SELE TYPE O .OR. TYPE N .OR. TYPE H .OR. TYPE CA .OR. TYPE C END
        COOR AXIS SELE ATOM MAIN 23 CA END SELE ATOM MAIN 23 CB END
        COOR ROTATE AXIS PHI 30.0 SELE RESID 23 .AND. .NOT. BACK END

8) The DUPLicate command.

The DUPLicate command copies coordinates between atoms within a structure.
The coordinates are copied FROM the first selection TO the second
selection. If the selections overlap, watch out! The matching is done by
number within the selected coordinate sets.
If the two selections have a different number of atoms, a warning will be
issued, and the smaller number will be used. For example, if one needs to
compute the relative orientation between two alpha helices, the following
input might be used;

        COOR COPY COMP
        COOR DUPL COMP SELE backbone of first END SELE backbone of second END
        COOR ORIE RMS MASS COMP SELE backbone of second END

This will give the RMS shift between these helices as well as the
coordinate transformation required to map one into the other.

The PREVious option may be used with a single atom selection. This assigns
the coordinate position of selected atoms to the value of the previous
atom (by number). This has been used with the command;

        COOR DUPLicate PREVious SELE TYPE H* END

to assign hydrogen atom positions to that of the associated heavy atom.

9) The DYNAmics command

The COOR DYNAmics command will read a (set of) dynamics trajectory files
and compute the average coordinates (stored in the selected coordinate
set) and the isotropic fluctuations (stored in the weighting array). The
first unit number (FIRSt) (default 51), number of units (NUNIts)
(default 1), frequency of accepted coordinate sets (NSKIp) (default 1),
starting set (BEGIn) (default first set), and last set (STOP) (default
last set) may be specified. Option values are not remembered with
subsequent COOR DYNA commands. The NOPRint keyword suppresses much of the
output.

The PAX keyword causes the Principal AXis of the motion of each atom to be
computed and saved. The print out gives the direction and magnitude of the
fluctuation as well as the anisotropies. The PAX data is saved for a
subsequent COOR PAXAnal command if further analysis is desired.

10) The PAXAnal command

The COOR PAXAnal command computes additional data regarding the Principal
AXis data (computed by the most recent COOR DYNA PAX command). The
trajectory must be reopened and reread, or a different trajectory may be
substituted.
This command prints data for each selected atom and averages over the
selected atoms. The printout includes the skew and kurtosis, anisotropies,
as well as all of the low moments of the motion.

11) The SEARch command

The SEARch command will search through a set of grid points for vacuum
space points (i.e. points outside the van der Waals radius of any atom).
In the default mode (NOPRint), only the relative volumes of filled and
vacuum points are printed for the selected atoms. The grid specifiers must
be input (min, max, and grid) for each dimension. (Grid implies number of
grid points. Hence XMIN -13.0 XMAX 13.0 XGRID 52.0 implies a half Angstrom
sampling along the x direction.) The FILLed option will cause non-vacuum
points to be listed or plotted. The PRINt option will cause all found grid
points to be listed on the output unit specified (default 6). The plot
option will make a line printer plot of all found grid points. For this
option, a plot order may be specified (XYZ,XZY,YXZ,...), where the first
index is the horizontal (maximum grid 64), the second is the vertical (no
maximum), and the last is obtained by paging (no maximum).

For this command, the atom sizes are taken from the weighting array. To
get van der Waals radii into the weighting array, the command;

        SCALar WMAIn = RADIus

may be used. If a hole big enough to stuff a water into is to be found,
then the command sequence;

        SCALar WMAIn = RADIus
        SCALAR WMAIN ADD 1.6
        SCALAR WMAIN MULT 0.85

would probably be the best to use.

If the RCUT or RBUFf value is set to a nonzero value, then the accessible
volume command is enabled. When RCUT is set, this is the maximum radius.
When RBUFf is set, then the maximum radius is the weighting array plus the
RBUFf value. The weighting array is returned with the fraction of free
volume in the shell from the atom radius to the maximum radius.

12) The VOLUme command

The VOLUme command will compute the volume of a selected set of atoms.
Its operation is the same as that of the SEARch command, except that only
the volume is printed and the degree of exposure for each atom is returned
in the weighting array. The SCALAR storage arrays must be filled before
using this command. The first storage array [1] must contain the radii of
each atom (RMIN) and the second storage array must contain the outer probe
distance (RMAX) for each atom. The free volume within the RMIN to RMAX
range and not within RMIN of any other atom will be returned in the
weighting array as a ratio of the maximum possible value. For example, a
completely exposed atom will return a value of 1.0 and an atom in the
interior of a protein would return a value of 0.0.

13) The SURFace command

The COOR SURFace command computes the Lee and Richards surface for
selected atoms and stores the result in the appropriate weighting array.
If the "WEIGhting" keyword is used, the radii are obtained from the
weighting array (and then written over), otherwise the radii are obtained
from the parameter file values. The radius of the probe may be specified
(default 1.6) and the accuracy may be specified (default 0.05). Either
ACCEssible surface (default) or CONTact surface may be specified. Contact
surface is equivalent to Accessible surface if a zero probe radius is
used.

14) The CONVert command

The COOR CONVert command will cause the coordinates of all defined and
selected atoms to be transformed from the unit cell to orthogonal
coordinates. There are no other keywords, but this command requires
further input in formatted, free field form. The first following line must
contain the crystallographic a, b, and c values. The following line must
contain alpha, beta and gamma. The angle values are specified in degrees.
See the routine CONCOR for details concerning the transformation.
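
As an illustration of the kind of transformation COOR CONVert performs,
the following Python sketch maps fractional (unit cell) coordinates to
orthogonal coordinates. It is not taken from the CONCOR routine: the axis
convention used here (a along x, b in the x-y plane) is a common
crystallographic choice and is an assumption, and the function names are
hypothetical. Consult CONCOR for the convention CHARMM actually uses.

```python
import math


def cell_to_orthogonal_matrix(a, b, c, alpha, beta, gamma):
    """Return a 3x3 matrix (list of rows) mapping fractional to orthogonal coordinates.

    ASSUMED convention (not from CONCOR): a lies along x, b lies in the x-y plane.
    Angles are given in degrees, as in the COOR CONVert input lines.
    """
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    sg = math.sin(math.radians(gamma))
    # Unit cell volume divided by a*b*c
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, c * v / sg],
    ]


def fractional_to_orthogonal(frac, cell):
    """Convert one fractional coordinate triple using cell = (a, b, c, alpha, beta, gamma)."""
    m = cell_to_orthogonal_matrix(*cell)
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))
```

For an orthorhombic cell (all angles 90 degrees) the matrix reduces to a
simple scaling of each fractional coordinate by a, b, and c.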
15) The COVAriance command

The covariance command under coordinate manipulations computes covariances
of the spatial atom displacements of a dynamics trajectory for selected
pairs of atoms. The covariance is defined by

        mu_JK = E[ (R_J - E[R_J]) (R_K - E[R_K]) ]
              = E[R_J R_K] - E[R_J] E[R_K]

and the normalized covariance matrix is given by

        CO_JK = mu_JK / SQRT( mu_JJ mu_KK )

The command syntax and variables are as in the COOR DYNAmics command. The
exceptions are the keywords:

  SET1:            specifies the selection for the "J" groups in covariance
  SET2:            specifies the selection for the "K" groups in covariance
  UNIT_for_output: specifies unit for output of covariance matrix (ascii)
  DISTance_matrix: keyword specifying that trajectory of distance matrix
                   will be output in binary format to unit specified by UNIT
  RESIdue_average: is a logical for computing the average over residues in
                   SET2 specification. When followed by NSETS: equal to 2
                   the average is over both SET1 and SET2 giving a
                   NRES1 x NRES2 covariance matrix.

16) The ANALysis command

A "new" analysis module for computing solvent averaged properties has been
added to CHARMM. It is accessed from the coordinate manipulation part
(CORMAN) of CHARMM and is used with the following syntax. This piece of
documentation is still under development. CLBIII 1/1/1990

Keywords:

  SOLVent:  specifies analysis is to be of pure solvent, which means xref,
            yref and zref, or site keywords are inappropriate, i.e.,
            analyze all configurations of solvent using all solvent
            molecules.
  WATEr:    specifies the solvent is water, and forces all distinct g(r)'s
            to be computed, i.e., g_oo, g_oh and g_hh.
  SPECies:  specifies the solvent species. If SOLVent is active then all
            solvent molecules to be analyzed should be specified here,
            e.g., all of them present in the simulations. This keyword is
            followed by the standard selection syntax and is terminated
            with the FINIsh_solvent_specification keyword.
  SITE:     specifies the collection of atoms around which you would like
            to compute solvent properties, e.g., if you would like to
            analyze the solvent distribution and velocity correlation
            function around the center of geometry of a trp residue, this
            keyword would be followed by the selection syntax which selects
            that residue.
  XREF, YREF, ZREF:
            specifies that solvent analysis around a specific spatial
            position, (xref, yref, zref), is to be carried out. This is the
            same as the SITE keyword as far as the analysis of solvent
            configurations it invokes; however, this site is static,
            whereas the SITE keyword permits selection of a dynamically
            evolving site.

Other keywords controlling calculations:

C    NFIRST = number of first dynamics step to be read
C    NSTEP = number of last dynamics step to be read
C    NSKIP = number of dynamics steps to skip between calculations
C    NCORS = number of steps to compute vac or msd
C    RSPIN = inner radius for vac,msd, analysis around REF
C    RSOUT = outer radius for vac,msd, analysis around REF
C    DTVE = timestep for velocities (NSAVV*dynamics timestep)
C    DTCO = timestep for coordinates (NSAVC*dynamics timestep)
C    RDSP = radius of dynamics sphere, used for densities and dbf
C    DR = grid spacing for analysis of rdf's
C    RRSPHER = radius for rdf analysis
C    MGN = number of points in g(r) curve
C    IFDBF = 1 (0) do (don't do) deformable boundary force calculation
C    RCUT = radius of interaction sphere in dbf calculation
C    ZP0 = initial reference site - dynamics sphere origin separation
C    NZP = number of separations to compute dbf
C    for the following flags the presence of the keyword indicates
C    that the flag is on (1). I.e., just put the flag keyword on the
     command line.
C    IVAC = 1 (0) do (don't do) vac analysis
C    IGDIST = 1 (0) do (don't do) solvent-solvent rdf analysis
C    ISDIST = 1 (0) do (don't do) solvent-site rdf analysis
C    IMSD = 1 (0) do (don't do) msd analysis
C    IKIRKG = 1 (0) do (don't do) dipole analysis for water solvent
C    IFMIN = 1 (0) periodic boundaries are (aren't) in effect
C    XBOX = dimension of simulation box in x direction
C    YBOX = dimension of simulation box in y direction
C    ZBOX = dimension of simulation box in z direction
C ***********Current restrictions************************************
C ** Note IVAC and IMSD are mutually exclusive flags **
C ** Note IGDIST and ISDIST are mutually exclusive flags **
C    The coordinates are read from CHARMM dynamics output on file 20-29
C    The velocities are read from CHARMM dynamics output on file 30-39
C    The output is:
C    fortran unit 7 is the vac in plt2 format
C    fortran unit 8-10 solvent-solvent g(r)'s in plt2 format
C    fortran unit 11 is the msd in plt2 format
C    fortran unit 12 is the density profile in plt2 format
C    fortran unit 13 is the dipole profile in plt2 format
C    fortran unit 14 is the solvent-site g(r) in plt2 format
C    fortran unit 15 is the time dependent dbf (see SUBROUTINE FBOUND
C    for details).
C    fortran unit 16 is the averaged deformable boundary force in plt2
C    format
C    functions are also printed onto fortran unit 6 as output.
C

17) The DRAW command

The DRAW command (called directly from CORMAN, not to be confused with the
DRAW command found under the ANALysis command) is useful for displaying
molecules. The output is a command file that can be read by various
displaying and plotting programs. This command file can be edited for
different types of displaying. In addition to atom positions and bonds,
velocities and forces may also be displayed.
The current keywords are:

        NOMO   - No molecule option (only velocities or derivatives)
        DFACt  - Derivative factor (default 0.0)
        DASH   - Spacing of dashed line used for Hbonds (default .01)
        FRAMe  - Specifies that a frame tag will be written first
                 (default - don't specify frame)
        RETUrn - Specifies which stream the plotting program will return
                 to after plotting this section (default none)

An atom selection is also looked for. Any atom not selected will not be
considered. The default is to include all atoms.

File: Corman, Node: Substitution, Up: Top, Previous: Function, Next: Top

        Coordinate Manipulation Values

There are several different variables that can be used in titles or CHARMM
commands that are set by some of the coordinate manipulation commands.
Here is a summary and description of each variable.

----------------------------------------------------------------------------
'XAXI','YAXI','ZAXI','RAXI','XCEN','YCEN','ZCEN'

        A rotation axis vector and its length and the center of rotation.
        This data is set by the COOR AXIS, COOR ORIE, and COOR ORIE RMS
        commands. These values may be used by any of the commands that use
        the vector-spec with the AXIS keyword.
----------------------------------------------------------------------------
'XMIN','YMIN','ZMIN','WMIN','XMAX','YMAX',
'ZMAX','WMAX','XAVE','YAVE','ZAVE','WAVE'

        Statistics set by the COOR STAT command.
----------------------------------------------------------------------------
'THET'

        Angle of rotation set by the COOR ORIEnt command.
----------------------------------------------------------------------------
'XMOV','YMOV','ZMOV'

        Displacement of centers set by the COOR ORIEnt command.
----------------------------------------------------------------------------
'RMS'

        Resulting RMS value set by the COOR RMS, COOR ORIEnt, or COOR RGYR
        commands.

C DEC/CMS REPLACEMENT HISTORY, Element CORREL.DOC
C *7 18-NOV-1991 15:02:15 WON "Updated by B. Brooks"
C *6 24-OCT-1991 01:29:14 WON "17-OCT-91 NIH update"
C *5 12-SEP-1991 19:19:28 WON "Update by Bernie Brooks"
C *4 6-MAY-1991 16:44:43 WON "Info directive fixed"
C *3 4-FEB-1991 17:07:03 WON "from NIH, 02-Feb-91"
C *2 18-OCT-1990 15:54:16 KOTTALAM "LENNART DOCUMENTATION INCLUDED"
C *1 8-APR-1990 19:49:55 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element CORREL.DOC

File: Correl, Node: Top, Up: (doc/commands.doc), Next: Syntax

        Correlation Functions

The CORREL commands may be used to obtain a set of time series for a given
property from a trajectory. Once obtained, the time series may be
manipulated as required, saved or plotted, or used to generate correlation
functions ( C(tau) = <A(t) B(t+tau)> ). The correlation functions may be
manipulated, saved, plotted, and transformed to find the spectral density
(Fourier transform of C(tau)), etc., and to determine the correlation
times.

Alternately, a covariance matrix may be computed for a collection of time
series. This option will compute the full matrix for use in entropy
calculations or for other applications.

Reorienting a coordinate trajectory is possible using the COMPARE command.
For details see *note reorient:(doc/dynamc)Merge.

* Menu:

* Syntax::      The syntax of the correlation command
* General::     General information regarding the correlation section
* Enter::       How to specify time series
* Trajectory::  How to reference trajectory files
* Edit::        How to edit the time series specifications
* Mantime::     How to manipulate time series
* Corfun::      How to generate correlation functions.
* Spectrum::    How to get a spectrum from a correlation function
* IO::          Input/output guide to correlation functions and series
* Examples::    Just what it says

File: Correl, Node: Syntax, Up: Top, Next: General

        Syntax for the CORREL command and subcommands

[SYNTAX CORRelation functions]

Syntax:

CORREL  [ MAXTimesteps int ]  [ MAXSeries int ]  [ MAXAtoms ]  [ COVAriance]
             default 512          default 2       default 100

Subcommands:

   miscellaneous-commands

   COOR coordinate-manipulation-command

              { DUPLicate time-series-name                      }
              {                                                 }
   ENTEr name { [ BONDs repeat(2x(atom-spec)) ]  [ GEOMetry ]   } c
              { [ ANGLe repeat(3x(atom-spec)) ]  [ ENERgy   ]   } c
              { [ DIHEd repeat(4x(atom-spec)) ]                 } c
              { [ IMPRo repeat(4x(atom-spec)) ]                 } c
              {                                                 }
              { [ ATOM ]  [ X   ] repeat(atom-spec)  [ MASS ]   } e
              { [ FLUC ]  [ Y   ]                               } c
              {           [ Z   ]                               }
              {           [ R   ]                               }
              {           [ XYZ ]                               }
              {                                                 }
              { VECT      [ X   ] repeat(2x(atom-spec))         } e
              {           [ Y   ]                               }
              {           [ Z   ]                               }
              {           [ R   ]                               }
              {           [ XYZ ]                               }
              {                                                 }
              { ATOM DOTProduct repeat(2x(atom-spec)) [NORMal]  } e
              { FLUC DOTProduct repeat(2x(atom-spec)) [NORMal]  } e
              { VECT DOTProduct repeat(4x(atom-spec)) [NORMal]  } e
              {                                                 }
              { ATOM CROSsproduct repeat(2x(atom-spec)) [NORMal]} e
              { FLUC CROSsproduct repeat(2x(atom-spec)) [NORMal]} e
              { VECT CROSsproduct repeat(4x(atom-spec)) [NORMal]} e
              {                                                 }
              { HBONd [4x(atom-spec)]*++  [ ENERgy  ]           } c
              {                           [ DISTance]           }
              {                           [ HANGle  ]           }
              {                           [ AANGle  ]           }
              {                                                 }
              { DISTance repeat(2x(atom-spec))                  } c
              {                                                 }
              { [ GYRAtion ]  [ CUT real ]                      } c
              { [ DENSity  ]                                    } c
              {                                                 }
              { RMS [ MASS ] [ ORIEnt ]                         } c
              { MODE mode-number                                } c**
              { TEMPerature                                     } v
              { ENERGY                                          } c
              { HELIx [atom-selection]                          } c
              { PUCK RESI [SEGI ]                               } c
              { USER user-value [ repeat(atom-spec) ]           } e
              { TIME [ AKMA ]                                   }
              { ZERO                                            }
              {                                                 }

              ( code: c-coordinates, v-velocities, e-either )

c** MODE time series is allowed only if CORREL is invoked from VIBRAN.
*++ Hydrogen bond atom order is one of:
        Donor,Hydrogen,Acceptor,Acceptor-antecedent
        Donor,Hydrogen,Acceptor
        Donor,Acceptor

atom-spec::= {residue-number atom-name}
             { segid  resid atom-name }
             { BYNUm  atom-number     }

atom-selection::= see *note Selection:(doc/select.doc)

   TRAJectory [ FIRStu int ] [ NUNIt int ] [ BEGIn int ] [ STOP int ]
              [ SKIP int ] [ VELOcity ] [atom-selection]

        { ALL                  }  [P2] [UNIT int]
   SHOW { time-series-name     }
        { CORRelation-function }  (defines ?P2, ?AVER, ?FLUC)

        { ALL                  }
   EDIT { time-series-name     }  edit-spec
        { CORRelation-function }

   edit-spec::= [INDEx int] [VECCod int] [CLASs int] [SECOnd int]
                [TOTAl int] [SKIP int] [DELTa real] [VALUe real]
                [NAME new-name]

   READ { time-series-name  }  unit-spec  edit-spec  { [FILE]            }
        { CORRelation-funct }                        { CARD              }
                                                     { DUMB [COLUmn int] }

         { ALL                  }             { [FILE]        }
   WRITe { time-series-name     }  unit-spec  { CARD          }
         { CORRelation-function }             { PLOT          }
                                              { DUMB [ TIME ] }

   MANTIME time-series-name
        { DAVErage      } ! Q(T) = Q(T) - <Q(T)>
        { NORMal        } ! Q(T) = Q(T) / |Q(T)|
        { SQUAre        } ! Q(T) = Q(T) ** 2
        { COS           } ! Q(T) = COS(Q(T))  (in degrees)
        { ACOS          } ! Q(T) = ACOS(Q(T)) (in degrees)
        { COS2          } ! Q(T) = 3*COS(Q(T))**2 - 1 (in degrees)
        { AVERage integer } ! Q(T) = < Q(T) >(T=T-NUTIL+1,T)
        { SQRT          } ! Q(T) = SQRT(Q(T))
        { FLUCt name    } ! print zero time fluctuations
        { DINItial      } ! Q(T) = Q(T) - Q(1)
        { DELN integer  } ! Q(T) = Q(T) - <Q(T)>(T=T-NUTIL+1,T)
        { OSC           } ! print oscillations
        { COPY name     } ! Q(T) = Q2(T)
        { ADD name      } ! Q(T) = Q(T) + Q2(T)
        { RATIo name    } ! Q(T) = Q(T) / Q2(T)
        { PROB integer  } ! Q(T) = PROB(Q(T))
        { LOG           } ! Q(T) = LOG(Q(T))
        { EXP           } ! Q(T) = EXP(Q(T))
        { IPOWer integer } ! Q(T) = Q(T) ** integer
        { MULT real     } ! Q(T) = real * Q(T)
        { DIVIde real   } ! Q(T) = Q(T) / real
        { SHIFt real    } ! Q(T) = Q(T) + real
        { DMIN          } ! Q(T) = Q(T) - QMIN
        { ABS           } ! Q(T) = ABS(Q(T))
        { DIVFirst      } ! Q(T) = Q(T) / Q(1)
        { DIVMaximum    } ! Q(T) = Q(T) / ABS(Q(MAX))
        { INTEgrate     } ! Q(T) = Integral(0 to T) (Q(T)dT)
        { TEST real     } ! Q(T) = COS(2*PI*T*real/TTOT)
        { ZERO          } ! Q(T) = 0.0

   CORFUN 2x(time-series-name)  [ [ FFT  ]  [LTC ]  [P0]  [NONOrm] }  [TOTAl int]
                                [ [DIREct]  [NLTC]  [P1]           }
                                [                   [P2]           }

   SPECtrum [FOLD] [RAMP] [SWITch] [SIZE integer]

   END    ! return to main command parser

File: Correl, Node: General, Up: Top, Next: Enter, Previous: Syntax

        General discussion regarding time series and correlation functions

Discussion:

The CORREL command invokes the CORREL subcommand parser. The keyword
values MAXTimesteps, MAXSeries, and MAXAtoms may be specified for space
allocation greater than the default options. If there is insufficient
virtual address memory for the space request, it may be possible to
achieve the desired results by removing the nonbond lists before running
the CORREL command. The MAXTimesteps value is the largest number of steps
any time series will contain. The MAXSeries keyword is the largest number
of time series that will be contained at any time within CORREL. A vector
time series will count as 3 time series in allocating space. The MAXAtoms
keyword allocates space for the atoms that are specified in the DEFINE
commands. For bonds, angles, dihedrals, and improper dihedral
specifications, one extra value is needed for each entry to hold the CODES
value (so each bond uses 3 atom entries, 4 for angles...).

If the COVAriance keyword is given, no time series will be computed, but
instead, a complete equal time covariance matrix will be computed. For
this option, only one TRAJectory command is allowed. The covariance matrix
is then obtained by writing the time series, where the elements are
covariant with other time series.

The ENTER command defines a time series. Many time series may be
specified. A time series is defined by the following items;

        Name            - Each time series must have a unique
                          (4 character) name.
        Class code      - The type of time series (BOND, USER, ATOM,...)
        Number of steps - The number of time steps currently valid
        Velocity code   - Was the time series read from velocities?
        Skip value      - What multiple of delta do the time steps
                          represent?
        Delta           - Integration time step
        Offset          - Time of first element
        Secondary code  - Depends on Class code (Geometry/Energy)(X/Y/Z...)
        Vector code     - 1=simple time series, 3=vector,
                          0=Y or Z part of vector
        Value           - Utility series value, depends on Class code
        Mass weighting  - Are the elements to be mass weighted
                          (only for ATOM)
        Average         - Time series average
        Fluctuation     - Time series fluctuation about the average
        Atom pointer    - Pointer into first specified atom in atom list
        Atom count      - Number of atom entries given in the ENTER command
        Time series     - Series values from (1,NTOT)

The TRAJectory command processes all of the time series which have an NTOT
(number of steps) count of zero. For this process, the main coordinates
are used for reading the trajectory. If fluctuations are requested, the
comparison coordinates MUST be filled with the reference (or average)
coordinates before invoking the TRAJectory command. Allowing multiple
TRAJectory commands separated by ENTER commands makes it possible to
compute correlation functions between positions and velocities, or even
for different trajectories.

The EDIT command allows the user to directly modify the time series
specifications.

The MANTIME command allows the user to manipulate the time series values
(and sometimes some of the specifications).

The SHOW command will display the specification data for all of the time
series.

File: Correl, Node: Enter, Up: Top, Next: Trajectory, Previous: General

        Specifying time series

The ENTER command defines a new time series. Each time series specified by
different ENTER commands must have a unique name (up to 4 characters).
With this command, a time series may be defined and then must be later
filled with a TRAJectory command (or a MANTIME COPY, or a READ time-series
command). Alternatively, a time series may be retrieved from an existing
file, or duplicated from another time series that currently exists.
The time series names "ALL" and "CORR" may not be used, and are reserved
for selecting all of the time series or the correlation function
respectively.

The ENTER options are;

-----------------------------------------------------------------------------
        DUPLicate time-series-name

This causes an exact copy of an existing time series to be created (except
with a different name). This may be useful where several different types
of manipulations are required on a single time series.
-----------------------------------------------------------------------------
        READ unit-number [CARD] [edit-spec]

This causes a time series to be created and all data then read in from an
existing time series file. All time series (up to the maximum allowed)
will be read with this command.
-----------------------------------------------------------------------------
        [ BONDS repeat(2x(atom-spec)) ]  [ GEOMETRY ]
        [ ANGLE repeat(3x(atom-spec)) ]  [ ENERGY   ]
        [ DIHEd repeat(4x(atom-spec)) ]
        [ IMPRo repeat(4x(atom-spec)) ]

These specifications cause a particular internal coordinate (or an average
of several) to define the time series. It is not necessary that the
specified atoms have a corresponding PSF entry, but if ENERGY is
requested, the specified atoms must be able to produce a valid parameter
code. The default is GEOMETRY. With geometry, any 4 atoms may be
specified. A velocity trajectory should not be used to fill these types of
time series.
-----------------------------------------------------------------------------
        [ ATOM ]  [ X   ] repeat(atom-spec)  [ MASS ]
        [ FLUC ]  [ Y   ]
                  [ Z   ]
                  [ R   ]
                  [ XYZ ]

These ENTER commands define a time series based on atom positions or
velocities. The ATOM option uses the values directly. The FLUCtuation
option subtracts off the reference values (contained in the comparison
coordinates). If the average structure is desired, then the command;

        COOR DYNA COMP trajectory-spec

would be required before invoking the TRAJECTORY command.
If more than one atom is specified, the values are averaged. If MASS is
specified, then the mass weighting is used in the averaging. The
properties X, Y, Z, and R cause a scalar time series to be created with
the requested property. The XYZ option causes a vector time series to be
created.

        ATOM:   Q(t) = X(t)
        FLUC:   Q(t) = X(t) - X(ref)
-----------------------------------------------------------------------------
        VECT  [ X   ] repeat(2x(atom-spec))
              [ Y   ]
              [ Z   ]
              [ R   ]
              [ XYZ ]

The VECTor command is similar to the ATOM and FLUCtuation commands listed
above, except the values are given by the difference in position or
velocity of 2 atoms. If more than one pair of atoms is specified, then the
values for each vector are averaged.

        Q(t) = X1(t) - X2(t)
-----------------------------------------------------------------------------
        ATOM DOTProduct repeat(2x(atom-spec))
        FLUC DOTProduct repeat(2x(atom-spec))
        VECT DOTProduct repeat(4x(atom-spec))

These ENTER commands produce a scalar time series for velocities or
positions with the following definitions;

        ATOM:   Q(t) = ( r1(t) | r2(t) )
        FLUC:   Q(t) = ( (r1(t)-r1(ref)) | (r2(t)-r2(ref)) )
        VECT:   Q(t) = ( (r1(t)-r2(t)) | (r3(t)-r4(t)) )

If more than one set of atoms is specified, then the values are averaged.
For the FLUC option, the reference coordinates MUST be in the comparison
coordinate set.
-----------------------------------------------------------------------------
        [ GYRAtion ]  [ CUT real ]
        [ DENSity  ]

These commands define a scalar time series for a coordinate trajectory.
The density calculation is based about the origin on all atoms within the
CUT value; the radius of gyration is for all atoms within distance CUT of
the geometric center of the molecule, and no mass weighting is applied.
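
The three DOTProduct definitions given earlier in this node can be written
out for a single trajectory frame as follows. This is only an illustrative
Python sketch, not CHARMM code: the function names and the representation
of a frame as a list of (x, y, z) tuples indexed by atom number are
assumptions, and the NORMal option is not covered.

```python
def dot(u, v):
    """Scalar product ( u | v ) of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Component-wise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


def atom_dot(frame, pairs):
    """ATOM: Q(t) = ( r1(t) | r2(t) ), averaged over all atom pairs."""
    return sum(dot(frame[i], frame[j]) for i, j in pairs) / len(pairs)


def fluc_dot(frame, ref, pairs):
    """FLUC: Q(t) = ( (r1(t)-r1(ref)) | (r2(t)-r2(ref)) ), averaged over pairs.

    ref plays the role of the comparison coordinate set.
    """
    total = sum(
        dot(sub(frame[i], ref[i]), sub(frame[j], ref[j])) for i, j in pairs
    )
    return total / len(pairs)


def vect_dot(frame, quads):
    """VECT: Q(t) = ( (r1(t)-r2(t)) | (r3(t)-r4(t)) ), averaged over atom quadruples."""
    total = sum(
        dot(sub(frame[i], frame[j]), sub(frame[k], frame[l]))
        for i, j, k, l in quads
    )
    return total / len(quads)
```

Evaluating one of these functions on every frame of a trajectory would
give the corresponding scalar series Q(t).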
-----------------------------------------------------------------------------
MODE mode-number

This option generates a scalar time series obtained by projecting either the
velocities or the coordinate displacement from the reference structure onto
the specified normal mode. The result is given by:

     velocity:  Q(t) = < root(mass)*v(t) | q >
     position:  Q(t) = < root(mass(i))*(r(t)-r(ref)) | q >
-----------------------------------------------------------------------------
TEMPerature

The time series is the temperature at each point.
---------------------------------------------------------------------
HELIx atom-selection

The time series holds the x, y and z components of the normalized vector
defining the axis of the cylindrical surface that best fits the selected
atoms, so the result is a three-dimensional vector series. It is intended
for, say, alpha helices, where the selection would be something like
SELE ATOM * * CA .AND. RESID 23:36 END, to give the axis of an alpha helix
running from residue 23 to residue 36.
---------------------------------------------------------------------
RMS [ORIE]

The RMS deviation from the COMPARISON coordinate set is computed, with a
rotation to obtain a best fit if ORIEnt is specified.
---------------------------------------------------------------------
PUCK RESI resid [SEGI segid]

The sugar pucker phase and amplitude are calculated for the (deoxy)ribose of
the specified residue; the first segment is the default. This gives a
two-dimensional vector, with component 1 being the phase (degrees) and
component 2 the pucker amplitude (Angstroms), as defined by Cremer & Pople
(JACS 1975).
-----------------------------------------------------------------------------
USER user-value [ repeat(atom-spec) ]

The USRTIM routine is called for each coordinate or velocity set. The user
value and atom list are also passed along. See the description in
(USERSB.FLX)USRTIM for more details.

     Q(t) = Whatever you want!
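As a rough illustration of the MODE position projection given earlier, Q(t) = < root(mass(i))*(r(t)-r(ref)) | q >, the sketch below evaluates the sum for one frame. The function name and argument layout are hypothetical, and the mode vector q is assumed to be supplied per atom in mass-weighted form; the sketch makes no claim about how CHARMM stores its modes.

```python
import math


def mode_projection(masses, coords, ref, mode):
    """Project a mass-weighted displacement onto a normal mode vector.

    masses: list of atomic masses.
    coords, ref, mode: lists of (x, y, z) tuples, one per atom.
    Returns sum_i sqrt(m_i) * (r_i - r_i(ref)) . q_i for a single frame.
    """
    total = 0.0
    for m, r, r0, q in zip(masses, coords, ref, mode):
        weight = math.sqrt(m)
        total += sum(weight * (r[k] - r0[k]) * q[k] for k in range(3))
    return total


masses = [4.0, 1.0]
ref = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
coords = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
mode = [(0.6, 0.0, 0.0), (0.0, 0.8, 0.0)]   # illustrative unit-length mode
q_t = mode_projection(masses, coords, ref, mode)
```

For the velocity form, the displacement r(t)-r(ref) would be replaced by v(t), which corresponds to passing velocities as `coords` and zeros as `ref`.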
-----------------------------------------------------------------------------
TIME [ AKMA ]

The time is returned in picoseconds unless AKMA is specified.

     Q(t) = t
-----------------------------------------------------------------------------
ZERO

A zero time series is specified ( Q(t)=0 ). This option is useful for cases
where time series will be read with the DUMB option. For these cases, the
EDIT command may also be needed to get the desired results.

File: Correl, Node: Trajectory, Up: Top, Next: Edit, Previous: Enter

                Specification of the Trajectory Files

The TRAJectory command reads a number of trajectory files whose Fortran unit
numbers are specified sequentially. The first unit is given by the FIRSTU
keyword and must be specified. NUNIT gives the number of units to be scanned,
and defaults to 1. BEGIN, STOP, and SKIP are used to specify which steps in
the trajectory are actually used. BEGIN specifies the first step number to be
used. STOP specifies the last. SKIP is used to select steps periodically as
follows: only those steps whose step number is evenly divisible by SKIP are
selected. The default value for BEGIN is the first step in the trajectory;
for STOP, it is the last step in the trajectory; and for SKIP, the default
is 1.

Reorienting a coordinate trajectory is possible using the COMPARE command.
For details see *note reorient:(doc/dynamc)Merge.

If VELOcity is specified, a velocity trajectory will be looked for.
Otherwise, a coordinate trajectory is expected.

Any time series that has a zero count (NTOT=0) will be filled by this
command. The time series count will then be set to the total number of steps
processed for each of these series. Any time series with a nonzero count
(NTOT>0) will not be affected by this command. The count may be set to zero
for a time series with the EDIT command.

Upon conclusion, the average and fluctuation, as well as some other data, are
presented for each of the processed time series.
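The BEGIN, STOP, and SKIP selection rule can be sketched as a small filter over the step numbers stored in a trajectory. This is only an illustration under stated assumptions: the function name is hypothetical, the divisibility test is taken to use the SKIP value, and the defaults follow the description above (first step, last step, and 1).

```python
def select_steps(steps, begin=None, stop=None, skip=1):
    """Return the step numbers that a BEGIN/STOP/SKIP selection would keep.

    steps: ascending list of step numbers present in the trajectory.
    begin/stop default to the first/last step; skip defaults to 1.
    A step is kept when begin <= step <= stop and step is divisible by skip.
    """
    if not steps:
        return []
    begin = steps[0] if begin is None else begin
    stop = steps[-1] if stop is None else stop
    return [s for s in steps if begin <= s <= stop and s % skip == 0]


# A trajectory saved every 5 steps, read with BEGIN 26000 STOP 26020 SKIP 10.
saved = list(range(25995, 26031, 5))
used = select_steps(saved, begin=26000, stop=26020, skip=10)
```

With these illustrative numbers, `used` holds the steps 26000, 26010, and 26020; the number of selected steps is what would end up as the count (NTOT) of each filled series.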
If any of the time series to be filled require a reference coordinate set,
then the comparison coordinates MUST be filled with the reference (or
average) coordinates before invoking the TRAJectory command. Upon
completion, the main coordinates contain the last coordinate set read from
the trajectory, and the comparison coordinates are unaffected.

File: Correl, Node: Edit, Up: Top, Next: Mantime, Previous: Trajectory

                Editing a time series

The EDIT command allows the time series specifications to be modified
directly.

WARNING:: This command gives the user direct access to most time series
specifications. There is NO checking to see if what is being done makes
sense. As such, this command is versatile and dangerous.

The EDIT command must be followed by a valid time series name. All subsequent
keywords will be based on that time series. The series name "ALL" will cause
the edit spec to operate on all the time series. The name "CORR" will cause
the edit to occur on the correlation function.

The following may be specified for a time series:

INDEx integer  - May be specified to modify X, Y, or Z (1, 2, 3 resp.) of a
                 vector time series. Otherwise, all are modified. The index
                 number is in fact an offset from the specified time series,
                 where a value of 1 represents the selected time series. A
                 value of 5 will cause the edit operation to modify the
                 fourth time series after the specified one.
CLASs integer  - May be used to specify a class code (consult source).
TOTAl integer  - The total number of valid steps may be altered, but none of
                 the values are modified. By setting this value to zero, the
                 time series is then ready again for the next TRAJectory
                 command.
SKIP integer   - May be specified to reset the SKIP value. This may be useful
                 after reading an external time series.
DELTa real     - May be specified to modify the basic time step. The actual
                 time step for a series is (SKIP*DELTA).
VECCod integer - User may specify a vector code.
                 This may be useful in merging 3 separate time series into a
                 vector time series (or the reverse). In fact any number of
                 time series may be grouped together with this option. For
                 example, if a table with 5 time series is desired, setting
                 VECCOD to 5 for the first one and then writing this time
                 series will output all 5.
VALUe real     - This allows the user to modify the series utility value. Its
                 function depends on the class code. This value is currently
                 used for USER, GYRAtion, DENSity, MODE, and TIME.
SECOndary int  - The secondary class code may be modified (consult source).

File: Correl, Node: Mantime, Up: Top, Next: Corfun, Previous: Edit

                Manipulating the Time Series

The MANTIME command manipulates the selected time series, Q(t): it performs
the operation requested by the option and leaves the resultant time series
as the active time series. Chaining these manipulations widens the set of
available properties without increasing the number of ENTER commands. The
keyword ordering must be followed exactly.

DAVErage          subtracts the average of the time series from all elements.
NORMal            normalises the vectorial time series (i.e. creates the unit
                  vector by dividing all elements for a given value of t by
                  r(t) = sqrt(x**2 + y**2 + z**2) ).
SQUAre            squares all the elements.
COS               obtains the cosine of all elements.
ACOS              obtains the arc-cosine of all elements.
COS2              calculates 3*cos**2 - 1 for all elements.
AVERage integer   calculates the average over every N consecutive points (N
                  being the integer given) and increases the time interval by
                  a factor of N. Note: NTOT is divided by N.
SQRT              obtains the square root of all elements. Negative elements
                  are set to -SQRT(-q(t)).
FLUCt name        The Q(t) remains unchanged. A second (b) time series must
                  be specified. The zero time fluctuations are computed and
                  printed out. The following variables are computed:
                       A = < Qa(t)*Qb(t) >
                       B = sqrt< Qa(t)**2 >
                       C = sqrt< Qb(t)**2 >
                       D = A/(B*C)
DINItial          subtracts the value of the first element from all elements.
                  Q(T) = Q(T) - Q(1)
DELN integer      Q(I) = Q(I) - I FROM 1 TO N, FROM N+1 TO N+N ETC.
                  (untested).
OSC               counts the number of oscillations in Q(T) / unit time step.
                  The Q(t) remains unchanged.
COPY name         This copies the second time series to the first. NTOT of
                  the first is set to that of the second.
ADD name          adds the time series specified by name to the existing time
                  series.
RATIO             Q(T) = Q(T) / Q2(T)
PROB integer      gives the probability of finding a specific value of the
                  time series. The given number of subdivisions of the time
                  series is considered, so that there are integer+1 values.
LOG               Q(T) = LOG(Q(T))
EXP               Q(T) = EXP(Q(T))
IPOWer integer    Q(T) = Q(T) ** integer
MULT real         Q(T) = Q(T) * real
DIVI real         Q(T) = Q(T) / real
SHIFt real        Q(T) = Q(T) + real
DMIN              Q(T) = Q(T) - QMIN, QMIN being the minimum of the time
                  series.
ABS               Q(T) = ABS(Q(T))
DIVFirst          Q(T) = Q(T) / Q(1)
DIVMax            Q(T) = Q(T) / ABS(Q(t) with max norm)
INTEgrate         Q(T) = Integral(0-T) [ Q(t) dt ]
TEST real         Q(T) = COS ( 2 * PI * / NTOT )
ZERO              Q(T) = 0  This option zeroes the specified time series.

File: Correl, Node: Corfun, Up: Top, Next: Spectrum, Previous: Mantime

                Calculating a Correlation Function

CORFUN: This option takes the specified time series and calculates the
desired correlation function from it. The correlation functions are
normalized unless NONORM is specified. In the following, Qa and Qb refer to
the time series that were extracted using the CORREL command.

FFT           This option calculates the correlation function using the FFT
              method. There are certain limitations on the prime factors in
              the total number of points.

DIRECT        This option calculates the correlation function using the
              direct multiplication method.

P1            This option gives the direct correlation function,
              < Qa(0).Qb(t) >. If Qa and Qb are unit vectors, then this is
              also the first order Legendre Polynomial.

P2            This obtains the correlation function of the second order
              Legendre Polynomial, (3 <[Qa(0).Qb(t)]**2> - 1)/2. For all
              applications that I can think of, Qa and Qb will be unit
              vectors.
              For P2, LTC = 0 and NORM = 1

NLTC          no long tail correction.

LTC           long tail correction (subtracts < Qa >**2 if autocorrelation,
              < Qa >*< Qb > if cross correlation. There is no LTC for P2, so
              NLTC and LTC give the same result.)
              This feature is to be used with care. If Qa and Qb are
              fluctuations from the mean (i.e. FLUC or MANTIME DAVErage),
              then this can serve as a correction for roundoff error.
              Otherwise, when they are not centered about the mean, this
              correction turns the C.F. into a less accurate calculation of
              the correlation of the fluctuations from the mean, i.e.
              < Qa(0)*Qb(t) > - LTC = < Qa(0)*Qb(t) > - < Qa >*< Qb >
                                    = < dQa(0)*dQb(t) >

NONORM        Correlations are not normalized. This is useful for adding
              correlations computed in different trajectories. (P2 is not
              normalized.)

TOTAL integer The TOTAL value determines the number of points to keep in the
              correlation function. The number of points may not be greater
              than the number of points in the time series. A reasonable
              value is about 1/4 to 1/3 the length of the time series.
              Correlation function values near the end have little weight.
              The default value is the nearest power of two less than half
              of the time series length.

The defaults are FFT, P0, NLTC.

Note: The correlation time given by the program is calculated by an
exponential fit to the first NTOT/8 points, or up to the first crossing of
the time axis. This value should be considered a (poor) estimate; it is
meaningful only for correlation functions which decay exponentially to zero
with no oscillations.

For P0, C(t) = (c(t) - ltc)/N

ltc and the normalization factors, N, are:

   LTC, autocorrelation:      ltc = < Qa >**2 for P0
                                  = 0 for P2
                              N   = C(0) - ltc = < Qa**2 > - ltc
   LTC, cross-correlations:   ltc = < Qa >*< Qb >
                              N   = sqrt[ (< Qa**2 > - < Qa >**2) *
                                          (< Qb**2 > - < Qb >**2) ]
   NLTC, autocorrelation:     ltc = 0
                              N   = C(0)
   NLTC, cross-correlations:  ltc = 0
                              N   = sqrt[ < Qa**2 >*< Qb**2 > ]

File: Correl, Node: Spectrum, Up: Top, Next: IO, Previous: Corfun

                Generating a Spectrum from Correlation Functions

There is a command, SPECtral-density, which may be used to generate a
spectrum from a correlation function.
The syntax is:

     SPECtrum [SIZE integer] [FOLD] [RAMP] [SWITch]

File: Correl, Node: IO, Up: Top, Next: Examples, Previous: Spectrum

                Input/Output of time series and correlation functions.

1) The SHOW command

          { ALL                  }
     SHOW { time-series-name     }
          { CORRelation-function }

The SHOW command displays, on the print unit, various data regarding the
specified time series. This command is automatically run after the ENTER and
EDIT commands as a verification of the last action.

2) The READ command

     READ { time-series-name  } unit-spec edit-spec { [FILE]             }
          { CORRelation-funct }                     {  CARD              }
                                                    {  DUMB [COLUmn int] }

The READ command allows a time series or correlation function to be read
directly. The file formats for time series and correlation functions are
identical. There are three basic methods by which time series may be read:
FILE (default), CARD, and DUMB. The FILE and CARD options expect a file of
the specific type generated by the corresponding WRITE command. The DUMB
option will read a free field card file with NO title or other header. The
COLUmn option (default 1) may be specified to start reading the time series
from any specified column. The DUMB option will usually include some edit
specifications to properly set the time steps (etc.).

3) The WRITe command

           { ALL                  }           { [FILE]         }
     WRITe { time-series-name     } unit-spec {  CARD          }
           { CORRelation-function }           {  PLOT          }
                                              {  DUMB [ TIME ] }

The WRITe command will write out time series or a correlation function. All
of the write options expect a title to follow this command. There are several
file formats: FILE (default), CARD, PLOT, and DUMB. The FILE and CARD options
will write out all data regarding the specified time series with the
expectation of later retrieval by CHARMM or another program. The PLOT option
will create a BINARY file for plotting by PLT2. The first line of the title
is used as the plot title, but this may be reset in PLT2. The DUMB option
will simply write out the values with no title or header to a card file, one
value to a line.
If the TIME option is specified, the time value will precede the time series
values (as needed for an X-Y plot). If the time series is a vector type, then
3 values will be given on each line. The DUMB option is useful for making
plot files, or for feeding the data to other programs.

File: Correl, Node: Examples, Up: Top, Previous: IO

                Examples

These examples are meant to be a partial guide in setting up input files for
CORREL. The test cases may be examined for a wider set of applications.

Example (1)

CORREL MAXSERIES 1 MAXTIMESTEPS 500 MAXATOMS 5
ENTER AAAA TORSION MAIN 28 N MAIN 28 CA MAIN 28 C MAIN 29 N GEOMETRY
TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10
MANTIME AAAA DAVER
WRITE AAAA UNIT 20 DUMB TIME
* title
*
WRITE AAAA CARD UNIT 10
* title for card
* file containing the time series
*
CORFUN AAAA AAAA FFT NLTC P0
WRITE CORREL UNIT 21 DUMB TIME
* title
*
WRITE CORREL FILE UNIT 11
* title for binary correlation function
*

     Extracts the time series, PHI(t), for phi dihedral of residue 28.
     Makes the time series the fluctuation from the mean, delta PHI(t).
     Makes a plot file of delta PHI(t) vs. time.
     Makes binary file of delta PHI(t).
     Calculates C(t) = < dPHI(0)*dPHI(t) > / < dPHI(0)**2 > by FFT with no
     long tail correction.
     Makes a plot file of C(t) vs. t.
     Makes a binary file of C(t).

Example (2)

CORREL MAXSERIES 2 MAXTIMESTEPS 500 MAXATOMS 10
ENTER PHI TORSION MAIN 27 C MAIN 28 N MAIN 28 CA MAIN 28 C GEOMETRY
ENTER PSI TORSION MAIN 28 N MAIN 28 CA MAIN 28 C MAIN 29 N GEOMETRY
TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10
MANTIME PHI DAVER
MANTIME PSI DAVER
CORFUN PHI PSI FFT NLTC P0 NONORM
WRITE CORREL FILE UNIT 11
* title for cross correlation binary file
*
WRITE CORREL PLOT UNIT 12
* plot title
*

     Extracts the time series PHI(t), for phi dihedral, and PSI(t), for the
     psi dihedral, of residue 28.
     Makes the time series the fluctuation from the mean.
     Calculates C(t) = < dPHI(0)*dPSI(t) > by FFT with no long tail
     correction.
     Makes a binary file of C(t).
Makes a binary PLT2 file for plotting Example (3) Fluorescence Depolarization, for example CORREL MAXSERIES 6 MAXTIMESTEPS 500 MAXATOMS 8 ENTER V1 VECTOR XYZ MAIN 28 NE1 MAIN 28 CZ3 MAIN 28 NE1 MAIN 28 CE3 ENTER V2 VECTOR XYZ MAIN 28 CD1 MAIN 28 CH2 MAIN 28 CD1 MAIN 28 CZ3 TRAJECTORY FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10 MANTIME V1 NORMAL MANTIME V2 NORMAL SHOW ALL CORFUN V1 V2 FFT P2 WRITE CORREL PLOT UNIT 21 * title for plot * Extracts the time series, consisting of the average of the vectors NE1 - CZ3 and NE1 - CE3 == V1(t) and of the average of CD1 - CH2 and CD1 - CZ3 == V2(t). Makes V1(t) and V2(t) unit vectors. Displays data regarding both time series Calculates P2(t) = (3< (V1(0)*V2(t))**2 > - 1) / 2 Makes a binary plot file for PLT2  File: Correl, Node: Oldexam, Up: Top, Previous: Examples OLD Examples This section shows the input examples for the old analysis based correlation functions. This section is provided for the benefit of users of this code that may want to bring old input files up to date. The examples here are roughly equivalent to those given in the previous section. (1) CORREL FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10 - GEOMETRY TORSION MAIN 28 N - CA - C - 29 N $$ - MANTIME DELTA $$ - PLOT $$ - MANTIME SAVE 10 $$ - CORFUN FFT NLTC P0 $$ - PLOT $$ - MANCOR SAVE 11 $$ Extracts the time series, PHI(t), for phi dihedral of residue 28. Makes the time series the fluctuation from the mean, delta PHI(t). Makes a line printer plot delta PHI(t) vs. time. Makes binary file of delta PHI(t). Calculates C(t) = / by FFT with no long tail correction. Makes a line printer of C(t) vs. t. Makes a binary file of C(t). (2) CORREL FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10 - GEOMETRY TORSION MAIN 28 N - CA - C - 29 N : CA - CB - 29 N - CA $$ - MANTIME DELTA $$ - CORFUN FFT NLTC P0 NONORM $$ - MANCOR SAVE 11 $$ Extracts the time series PHI(t), for phi dihedral, and PSI(t), for the psi dihedral, of residue 28. 
Makes the time series the fluctuation from the mean. Calculates C(t) = by FFT with no long tail correction. Makes a binary file of C(t). (3) Fluorescence Depolarization, for example CORREL FIRSTU 51 NUNIT 5 BEGIN 26000 STOP 31000 SKIP 10 - GEOMETRY VECTOR MAIN 28 NE1 - CZ3 : CD1 - CH2 $ NE1 - CE3 : CD1 - CZ3 $$ - MANTIME NORMAL $$ - CORFUN FFT P2 $$ - PLOT $$ Extracts the time series, consisting of the average of the vectors NE1 - CZ3 and NE1 - CE3 == V1(t) and of the average of CD1 - CH2 and CD1 - CZ3 == V2(t). Makes V1(t) and V2(t) unit vectors. Calculates P2(t) = (3< (V1(0)*V2(t))**2 > - 1) / 2 Makes a line printer plot. OLD OLD OLD C DEC/CMS REPLACEMENT HISTORY, Element CRYSTL.DOC C *3 6-MAY-1991 16:46:30 WON "Info directive fixed" C *2 4-FEB-1991 17:10:58 WON "from NIH, 02-Feb-91" C *1 8-APR-1990 19:49:58 KOTTALAM "charmm documentation" C DEC/CMS REPLACEMENT HISTORY, Element CRYSTL.DOC  File: Crystl, Node: Top, Up: (doc/commands.doc), Next: Syntax Calculations on Crystals using CHARMM The crystal section within CHARMM allows calculations on crystals to be performed. It is possible to build a crystal with any space group symmetry, to optimise its lattice parameters and molecular coordinates and to carry out a vibrational analysis using the options. * Menu: * Syntax:: Syntax of the CRYSTAL command * Function:: A brief description of each command * Examples:: Sample testcases * Implementation:: Background and implementation  File: Crystl, Node: Syntax, Up: Top, Next: Function [Syntax CRYStal command] CRYStal [BUILd_crystal] [CUTOff real] [NOPErations int] [DEFIne xtltyp a b c alpha beta gamma] [PHONon] [NKPOints int] [KVECtor real real real TO real real real] [VIBRation] [READ] [CARD UNIT int] [PHONons UNIT int] [PRINt] [PRINt] [PHONons] [FACT real] [MODE int THRU int] [KPTS int TO int] [WRITe] [CARD UNIT int] [PHONons UNIT int] [VIBRations] [MODE int THRU int] [UNIT int] The crystal module is an extension of the image facility within the CHARMM program. 
All crystal commands are invoked by the keyword CRYStal. The next word on the
command line can be one of the following:

     Build     - builds a crystal.
     Define    - defines the lattice type and constants of the crystal to be
                 studied.
     Phonon    - calculates the crystal frequencies for a single value or a
                 range of values of the wave vector, KVEC.
     Print     - prints various crystal information.
     Read      - reads the crystal image file.
     Vibration - calculates the harmonic crystal frequencies when the wave
                 vector is the zero vector.
     Write     - writes out to file various crystal information.

File: Crystl, Node: Function, Previous: Syntax, Up: Top, Next: Examples

A brief description of each command follows.

1. Crystal Build.

A crystal of any desired symmetry can be constructed by repeatedly applying a
small number of transformations to an asymmetric collection of atoms (called
here the primary atoms). The transformations include the primitive lattice
translations A, B and C, which are common to all crystals, and a set of
additional transformations, {T}, which determines the space group symmetry.
The Build command will generate, given {T}, a data structure of all those
transformations which produce images lying within a user-specified cutoff
distance of the primary atoms. The data structure can then be used by CHARMM
to represent the complete crystal of the system in subsequent calculations.
The symmetry operations, {T}, are read from the lines following the Crystal
Build command. The syntax of the command is:

     Crystal Build Cutoff real Noperations int
     ... lines defining the symmetry operations.

The Cutoff parameter is used to determine the images which are included in
the transformation list. All those images which are within the cutoff
distance are included in the list. There is no limit to the number of
transformations included in the lists as they are allocated dynamically.

The crystal symmetry operations are input in standard crystallographic
notation.
The identity is assumed to be present so that (X,Y,Z) need not be specified (in fact, it is an error to do so). For example, a P1 crystal is defined by the identity operation and so the input would be Crystal Build .... Noper 0 whilst a P21 crystal would need the following input lines : Crystal Build .... Noper 1 (-X,-Y,Z+1/2) It should be noted that in those cases where the atoms in the asymmetric unit have internal symmetry or in which a molecule is sited upon a symmetry point within the unit cell not all symmetry transformations for the crystal need to be input. Some will be redundant. It is up to the user to check for these cases and modify the input accordingly. 2. Crystal Define. The define command defines the crystal-type on which calculations are to be performed. It is usually the first crystal command that is specified in any job using the crystal facility. It has the format : Define lattice-type a b c alpha beta gamma The input lattice parameters are checked against the lattice-type to ensure that they are compatible. Six lattice types are permitted. They are listed below along with any restrictions on the lattice parameters : Triclinic - no restrictions on a, b, c, alpha, beta or gamma. Monoclinic - alpha = gamma = 90.0 degrees. Orthorhombic - alpha = beta = gamma = 90.0 degrees. Tetragonal - a = b and alpha = beta = gamma = 90.0 degrees. Hexagonal - a = b, alpha = beta = 90.0 degrees and gamma = 120.0 degrees. Cubic - a = b = c and alpha = beta = gamma = 90.0 degrees. It is up to the user to ensure that the lattice parameters have the desired values for the system at all times. The values are stored by the program but, at present, there is no way of transmitting this information between jobs. For example, if the lattice parameters have been changed during a lattice optimisation then the new parameters, which are printed out at the end of the minimisation, must be input here at the beginning of the next CHARMM run. 3. Crystal Phonon. 
Phonon calculates the dispersion curves for a crystal. Any value of the
wavevector can be used (although, in practice, each component of KVEC is
normally limited to the range -0.5 to +0.5). The dynamical matrix and normal
mode eigenvectors determined in the phonon calculation are complex, although
the eigenvalues remain real. The syntax for the command is:

     Crystal Phonon Nkpoints int Kvector real real real To real real real

Nkpoints tells the program the number of points at which the derivative
matrices must be built and diagonalised, whilst the Kvector ... To ... clause
determines the values of KVEC for each calculation. Thus,

     Kvector 0.0 0.0 0.0 To 0.5 0.5 0.5 Nkpoints 3

would solve for the crystal frequencies at the points KVEC=(0.0,0.0,0.0),
(0.25,0.25,0.25) and (0.5,0.5,0.5). If desired, point calculations can be
carried out by omitting the To statement and putting Nkpoints 1. For single
calculations at KVEC=(0.0,0.0,0.0) the Crystal Vibration command is faster.

The eigenvalues and eigenvectors at each value of the wave vector from the
phonon calculation are saved, and they can be written out to a file using the
Crystal Write Phonon command. No analysis facilities exist within CHARMM for
the phonon data structure as the eigenvectors are complex.

It is to be noted that phonon and vibration calculations can only be
performed on crystals of P1 symmetry. No information about the symmetry
operations is used when generating the dynamical matrix.

4. Crystal Print.

Two options exist with the Print command. If no keyword is given then the
crystal image file is printed out. The Crystal Print Phonon command performs
a similar function to the Print Normal_Modes command in the vibrational
analysis facility. Selected frequencies and eigenvectors for a range of
values of the wave vector can be printed out. The syntax is:

     Crystal Print Phonon Kpoints int To int Modes int Thru int Factor real

The Kpoints .. To .. clause determines the wave-vectors at which the modes
are to be printed, the Modes .. Thru ..
gives the range of the eigenvectors and the Factor command gives the scale factor to multiply each normal mode by. 5. Crystal Read. The Crystal Read command reads in a crystal image file. The file has the same output as produced by the Crystal Print or Crystal Write commands. The command is useful if a crystal image file was produced using the Crystal Build command and saved using the Crystal Write command in a previous job and it is desired to reuse the same transformation file for analysis or comparison purposes. The command can also be used to read in limited sets of transformations if specific crystal interactions need to be investigated. The transformation file is formatted so the Card keyword needs to be specified and the unit number must be given after the Unit keyword. 6. Crystal Vibration. For a free molecule with N atoms the dynamical equations have 3N-6 non-zero eigenvalues. This is no longer so for a crystal. If a crystal is made up of L unit cells each containing Z molecules with N atoms, the dynamical equations would have a dimension of 3NZL. However, using the symmetry properties of the lattice it is possible to factor the equations into L sets each with a dimension of 3NZ and each depending upon a vector, KVEC, which labels the irreducible representation of the translation group to which the set belongs. The force constant matrix is complex. Its form may be found in the references given at the end of the documentation. Vibration solves the dynamical equations for the case where the wave-vector is zero, i.e. when the equations are real. The procedure is invoked by the Crystal Vibration command. The syntax is : Crystal Vibration 7. Crystal Write. There are three Crystal Write options. If no keyword is given the crystal image file is written out, in card format, to the specified unit. The CARD and UNIT keywords are required. The Crystal Write Phonon command writes out the phonons from a phonon calculation. 
All the eigenvalues and eigenvectors for all values of the wavevector that
are stored are written automatically. The Crystal Write Vibration command
writes out the eigenvalues and eigenvectors from a vibration calculation. The
modes to be written are given by the Mode .. Thru .. clause. All Write
commands require that the Fortran stream number be given after the Unit
keyword, and a CHARMM title may be specified on the following lines.

The structure of the phonon and vibration files for a crystal may be found by
looking at the routines WRITDC and XFRQW2 respectively in the file
[.IMAGE]XTLFRQ.SRC. The vibration modes are written in the same form as a
VIBRAN normal mode file and may be read in using the appropriate VIBRAN
commands. Unfortunately no analysis facilities exist for complex
eigenvectors within CHARMM, and so users will have to write their own if
they want to perform phonon calculations.

8. Crystal Minimisation.

It is possible to perform a lattice minimisation using the normal CHARMM
MINImise command and the ABNR minimiser. Two extra keywords have been
introduced. If neither of them is present then a coordinate minimisation is
performed as usual. If LATTice is specified then the lattice parameters and
the atomic coordinates are minimised together. If NOCOordinates is given with
the keyword LATTice then only the lattice parameters are optimised.
Specifying NOCOordinates by itself is an error. It should be noted that when
the lattice is being optimised the crystal symmetry is maintained. A cubic
crystal will remain cubic, etc.

File: Crystl, Node: Examples, Previous: Function, Up: Top, Next:Implementation

Examples of input may be found in the test directory. All crystal files are
prefixed by the string "xtl_". All the jobs involve L-Alanine. Briefly the
jobs are:

1. XTL_ALA1.INP.
The crystallographic fractional coordinates are read in and converted to real space coordinates using the CHARMM COORdinate CONVert command and the experimental values for the lattice parameters. 2. XTL_ALA2.INP. A crystal image file is generated for the crystal using a value of 10.0 Angstroms for the crystal cutoff. 3. XTL_ALA3.INP. A coordinate and lattice minimisation are performed for the crystal. The crystal image file from the previous job is used and the optimised coordinates are saved. The main point to note is that before using the crystal package for energy calculations and other manipulations that involve the image non-bond lists an image update must be performed. For safety always do an update after building or reading in the crystal. Note too that the new, optimised lattice parameters are used in the all the subsequent input files. 4. XTL_ALA4.INP. For subsequent calculations a coordinate file that contains the coordinates of all atoms (four molecules of L-alanine) is generated. A crystal image file suitable to do this is read in directly from the input stream. It contains 6 transformations (not 3 as might be expected) because the CHARMM image facility requires that the inverses of all transformations be present. The first three are the ones needed and the last three are their inverses. An update is needed after reading the file to make known to the program the coordinates of the atoms in the first transformation of all the inverse pairs in the image list. The Print Coor Image file will then print out the coordinates of the atoms in the original asymmetric unit and the first three of the images. If the coordinates of the atoms in all the images are required then the keyword NOINV in the UPDATE command must be used (check IMAGE.DOC). 5. XTL_ALA5.INP. The same job as the second except that the crystal is generated for a whole unit cell (i.e. the system generated in the fourth job). The same value of the crystal cutoff is used. 
An energy is calculated too. The energy and its RMS coordinate derivative
should be exactly four times (apart from a small round-off error) the value
obtained for an energy calculation on a single asymmetric unit with the same
lattice parameters and crystal cutoff (see job 3).

6. XTL_ALA6.INP. Perform a crystal vibration and phonon calculation for the
optimised structure of the L-alanine crystal. The vibrational and phonon
modes are written out to files, and components of the first 24 phonon normal
modes for the three values of the wavevector that were calculated are
printed. To do the same for the vibrations it would be necessary to use the
appropriate VIBRAN commands in another job.

File: Crystl, Node: Implementation, Previous: Examples, Up: Top

Background and Implementation.

The Crystal options and their commands were described above. The present
section discusses relevant background material and briefly reviews the
methods used in the implementation. Some technical points are also made.

The crystal option is an extension to the CHARMM program. The source code is
in the directory [.IMAGE] whilst the crystal data structure is in the file
IMAGE.FCM. Two additional source code files have been added - CRYSTL.SRC and
XTLFRQ.SRC. Small modifications have been made to the files ENERGY.SRC and
EIMAGE.SRC.

CHARMM Images and the Crystal Image Data Structure.

As outlined above, a crystal structure can be specified entirely by the
action of the primitive translations A, B and C, and a small set of
transformations, {T} (which themselves are functions of A, B and C), on an
asymmetric group of atoms. In CHARMM the calculation of the energy assumes
that there exists a cutoff distance beyond which all interactions between
particles are neglected, so that when performing calculations on supposedly
infinite crystals only a limited portion of that crystal, i.e. that portion
containing those atoms within the cutoff distance of the primary atoms, need
be considered.
The CHARMM image option, of course, already enables the energies of crystals to be calculated but the input required to use it to do so is cumbersome and time consuming. It is a great simplification to include an extra data structure that defines the crystal in terms of A, B and C and {T}. There are a number of advantages:

1. A crystal is regular so that its generation can be automated. All that needs to be done is to systematically transform the primary atoms by one of the set {T} and a linear combination of A, B and C. The result is obviously best stored in terms of A, B, and C rather than as absolute numerical values of the transformations.

2. It is essential to define a CHARMM crystal by A, B and C and {T} if the lattice parameters a, b, c, alpha, beta and gamma are to be varied, because the coordinates of all the image atoms within the crystal will change during successive cycles of the optimisation as a, b, c, alpha, beta and gamma themselves change.

3. When constructing the dynamical matrix for a non-zero wave-vector it is necessary to know the unit cell to which a particular atom belongs in order to evaluate the exponential factor in the expression.

Although the crystal data structure and the values of the lattice parameters define the crystal, the individual transformations have to be worked out explicitly in order to determine energies, harmonic frequencies and so on. In the present version of the program the IMAGE facility is used, so that a new set of IMAGE transformations is calculated from the crystal data structure as soon as a crystal is built and every time the lattice parameters are changed. The use of the IMAGE facility means that the number of transformations that can be used is determined by the dimension of the IMAGE arrays (MAXTRN in DIMENS.FCM).

Crystal and Image Patching.

Crystal image patching is unavailable in the present version of the program so that bonds between images are not permitted.
Similarly hydrogen-bond interactions described by an explicit hydrogen-bond function are also forbidden. The only forces that can be calculated between primary and image atoms are non-bonded ones.

The Lattice Coordinate System.

The convention used by CHARMM for orientating the crystal in real space may be found in the routine CONCOR in [.MANIP]CORMAN2.SRC.

The Structure of the Crystal File.

The crystal file is divided into three parts.

1. A standard CHARMM title.

2. A symmetry operation declaration section headed by the word Symmetry and terminated by an End. The transformations are written in the same way as for the Crystal Build command except that the identity transformation has to be explicitly listed.

3. An image section headed by Images and terminated by an End. Here the images are defined in terms of the symmetry transformations and the lattice translations A, B and C. The comment line shows the column labelling.

Sometimes it is useful to write one's own crystal files without recourse to the Crystal Build option. In this case the symmetry and image blocks can be put in any order (although only one of each is allowed) and there is no restriction on the positioning of blank and comment lines.

Two examples of a crystal file are:

    * Crystal file for a P1bar crystal.
    *
    Symmetry
    (X,Y,Z)
    (-X,-Y,-Z)
    End

    Images
    ! Operation    a    b    c
          1        0    0   -1
          1        0    0    1
          2        0    0    0
    End

    * Crystal file for a P212121 crystal.
    *
    Symmetry
    (X,Y,Z)
    (X+1/2,-Y+1/2,-Z)
    (-X,Y+1/2,-Z+1/2)
    (-X+1/2,-Y,Z+1/2)
    End

    Images
    ! Operation    a    b    c
          2        0    0    0
          3        0    0    0
          4        0    0    0
          2       -1    0    0
          3        0   -1    0
          4        0    0   -1
    End

Second Derivative Calculations and the Use of Symmetry.

Consider a crystal with a unit cell in which there is more than one asymmetric unit (i.e. all space groups other than P1). The dynamical matrix then takes a blocked form, with Z**2 blocks if Z is the number of asymmetric units.
Each block is of dimension 3N x 3N and contains the sum over all unit cells of the second derivative interaction elements between the Mth and Nth asymmetric units. It is possible to calculate only the Z blocks (11), (12), ..., (1M), ..., (1Z) and then transform them to produce the full matrix. In the present program, however, it is necessary to perform vibration calculations on entire unit cells. It should be emphasised that while this symmetry transformation can be used for calculations of the normal mode eigenvectors and frequencies for the zero wavevector, it does not hold at other values of the wavevector. Therefore, simple symmetry arguments such as these do not hold for phonon calculations.

Symmetry can also be used to block the dynamical matrix into several smaller matrices each corresponding to a different symmetry species, thereby greatly reducing the time needed for diagonalisation and automatically helping to identify the normal modes. Symmetry blocking is not coded at the moment.

References.

1. "Lattice Dynamics of Molecular Crystals", Lecture Notes in Chemistry 26, S.Califano, V.Schettino and N.Neto (1981), Springer-Verlag, Berlin, Heidelberg and New York. A comprehensive monograph with good sections on the theory of lattice vibrations and normal mode symmetries.

2. A.Warshel and S.Lifson, J.Chem.Phys. (1970), 53, 582. The original CFF paper on crystal calculations. It describes the theory behind crystal optimisations and vibrational calculations.

3. E.Huler and A.Warshel, Acta Cryst. (1974), B30, 1822. An extension of the work in reference 2.

4. "Infrared and Raman Spectra of Crystals", G.Turrell (1972), Academic Press, London and New York. A nice clear introduction to the subject.
C DEC/CMS REPLACEMENT HISTORY, Element DYNAMC.DOC
C *5 24-OCT-1991 01:30:43 WON "17-OCT-91 NIH update"
C *4 12-SEP-1991 19:14:37 WON "Update by Bernie Brooks"
C *3  6-MAY-1991 16:47:24 WON "Info directive fixed"
C *2  4-FEB-1991 17:14:37 WON "from NIH, 02-Feb-91"
C *1  8-APR-1990 19:50:01 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element DYNAMC.DOC

File: Dynamc, Node: Top, Up: (doc/commands.doc), Next: Syntax

                Dynamics: Description and Discussion

In order to generate a dynamics trajectory, all requirements for evaluating the energy must be met. See *note Energy:(doc/energy.doc)Needs.

* Menu:

* Syntax::        Syntax of the dynamics command
* Description::   Description of the keywords and options
* Recommended::   Recommended input options and values
* Discussion::    Running dynamics
* Output::        Output from a dynamics run
* Trajectory::    Trajectory manipulation and I/O
* Merge::         Merging or breaking up trajectory files into different
                  size pieces. Resampling at a larger interval. Least
                  squares fit to a reference.
* Reorient::      Reorienting a coordinate trajectory
* RMSDyn::        Computes the RMS difference between two trajectories
* Format::        Formatting and unformatting a dynamics trajectory
* Monitor:(doc/monitor.doc).            Monitor dihedral transitions
* CPT dynamics:(doc/pressure.doc).      CPT dynamics
* Pressure:(doc/pressure.doc)Pressure.  The pressure command

File: Dynamc, Node: Syntax, Up: Top, Previous: Top, Next: Description

                Syntax for the Dynamics Command

DYNAmics {[LEAPfrog]} {[VERLet]} {STRT   } {[TIMEstp real]} [NSTEp integer] -
         { ORIG     } {LANGevin} {STARt  } { AKMA real    }
                      { CPT    } {RESTart}

         nonbond-spec hbond-spec frequency-spec -
         unit-spec temperature-spec options-spec -
         cpt-spec

hbond-spec::=   see *note Hbonds:(doc/hbonds.doc).

nonbond-spec::= see *note Nbonds:(doc/nbonds.doc).
frequency-spec::= [INBFrq integer] [IEQFrq integer] [IHBFrq integer]
                  [IHTFrq integer] [IPRFrq integer] [NPRInt integer]
                  [NSAVC integer] [NSAVV integer] [NTRFrq integer]
                  [ILBFrq integer] [ISVFRQ integer]

unit-spec::=      [IUNCrd integer] [IUNRea integer] [IUNVel integer]
                  [IUNWri integer] [KUNIt integer] [CRAShu integer]
                  [BACKup integer]

temperature-spec::= [FINAlt real] [FIRStt real] [TEMInc real]
                  [TSTRuc real] [TWINDH real] [TWINDL real] [TBATh real]

options-spec::=   [IASOrs integer] [IASVel integer] [ICHEcw integer]
                  [ISCAle integer] [ISCVel integer] [ISEEd integer]
                  [SCALe real] [NDEGf integer] [RBUFfer real]
                  [AVERage] [ECHEck real] [TOL real]

cpt-spec::=       [ PCONst {[PINTernal]} [COMPressibility real]
                           { PEXTernal } [PCOUpling real] [PREFerence real]
                                         [VOLUme real] ]
                  [ TCONst [TCOUpling real] [TREFerence real] ]

File: Dynamc, Node: Description, Previous: Syntax, Up: Top, Next: Recommended

                Options common to minimization and dynamics

The following table describes the keywords which apply to both minimization and dynamics.

Keyword       Default   Purpose

NSTEP         100       The number of steps to be taken. This is the number
                        of dynamics steps, which is also equal to the number
                        of energy evaluations.

INBFRQ        50        The frequency of regenerating the non-bonded list.
                        The list is regenerated if the current step number
                        modulo INBFRQ is zero and if INBFRQ is non-zero.
                        Specifying zero prevents the non-bonded list from
                        being regenerated at all.

IHBFRQ        50        The frequency of regenerating the hydrogen bond list.
                        Analogous to INBFRQ.

ILBFRQ        50        The frequency for checking whether an atom is in the
                        Langevin region, defined by RBUF, or not.

nonbond-spec            The specifications for generating the non-bonded
                        list. See *note Nbonds:(doc/nbonds.doc).

hbond-spec              The specifications for generating the hydrogen bond
                        list. See *note Hbonds:(doc/hbonds.doc).

[ STRT ]      STRT      The dynamics is assumed to start from the input
                        coordinates using an assignment of velocities given
                        by IASVEL. No restart file is read.

[ REST ]                The dynamics is restarted by reading the restart file
                        from unit IUNREA.

TIMESTP       0.001     Time step for dynamics in picoseconds.

IUNREA        -1        Fortran unit from which the dynamics restart file
                        should be read. A value of -1 means don't read any
                        file.

IUNWRI        -1        Fortran unit on which the dynamics restart file for
                        the present run is to be written. A value of -1 means
                        don't write any file. Formatted output.

IUNCRD        -1        Fortran unit on which the coordinates of the dynamics
                        run are to be saved. A value of -1 means no
                        coordinates should be written. Unformatted output.

IUNVEL        -1        Fortran unit on which the velocities of the dynamics
                        run are to be saved. -1 means don't write.
                        Unformatted output.

KUNIT         -1        Fortran unit on which the total energy and some of
                        its components along with the temperature during the
                        run are written using formatted output.

CRASHU        -1        Fortran unit where a single DCL command file will be
                        written. If the machine crashes before a restart file
                        is written, this file won't be touched. If the crash
                        occurs after a restart is written but before the run
                        completes, this file will contain the line,
                        "$ @CRASH". If the run completes, the file will
                        contain the line, "$ @COMPLET". This allows for an
                        automatic recovery system after crashes.

NSAVC         10        The step frequency for writing coordinates.

NSAVV         10        The step frequency for writing velocities.

NPRINT        10        The step frequency for storing on KUNIT, as well as
                        printing on unit 6, the energy data of the dynamics
                        run.

IPRFRQ        100       The step frequency for calculating averages and rms
                        fluctuations of the major energy values. If this
                        number is less than NTRFRQ and NTRFRQ is not equal to
                        0, square root of negative number errors will occur.

ISVFRQ        NSTEP     The step frequency for writing a restart file.

IHTFRQ        0         The step frequency for heating the molecule in
                        increments of TEMINC degrees in the heating portion
                        of a dynamics run. Zero means do no heating.

IEQFRQ        0         The step frequency for assigning or scaling
                        velocities to FINALT temperature during the
                        equilibration stage of the dynamics run.

NTRFRQ        0         The step frequency for stopping the rotation and
                        translation of the molecule during dynamics. This
                        operation is done automatically after any heating.

FIRSTT        0.0       The initial temperature at which the velocities are
                        assigned to begin the dynamics run. Important only
                        for the initial stage of a dynamics run.

FINALT        300.0     The desired final (equilibrium) temperature for the
                        system. Important for all stages except initiation.

TEMINC        5.0       The temperature increment to be given to the system
                        every IHTFRQ steps. Important in the heating stage.

TSTRUC        -999.     The temperature at which the starting structure has
                        been equilibrated. Used to assign velocities so that
                        equal partition of energy will yield the correct
                        equilibrated temperature. -999. is a default which
                        causes the program to assign velocities at
                        T=1.25*FIRSTT.

TWINDH        10.0      The temperature deviation from FINALT to be allowed
                        on the high temperature side (+ve), i.e. the high
                        side of the temperature window. Useful during
                        equilibration.

TWINDL        -10.0     The temperature deviation from FINALT to be allowed
                        on the low temperature side (-ve), i.e. the low side
                        of the temperature window. Useful during
                        equilibration.

TBATH         0.0       The temperature of the heatbath in Langevin dynamics.
                        When set to zero it allows one to do purely
                        dissipative (quenched) dynamics.

RBUF          0.0       Inner radius of the buffer, or Langevin, region
                        sphere. All atoms with radial positions greater than
                        RBUF angstroms are propagated by Langevin dynamics,
                        if the dynamics keyword LANGevin has been specified.

IASORS        0         The option for scaling or assigning of velocities
                        during heating (every IHTFRQ steps) or equilibration
                        (every IEQFRQ steps).
                        .eq. 0 - scale velocities. (use ISCVEL option)
                        .ne. 0 - assign velocities. (use IASVEL option)

IASVEL        1         The option for different assignments of velocities.
                        .eq. 0 - Use the comparison coordinate values as
                                 velocities (in AKMA units, sorry) with the
                                 STRT option. If NTRFRQ is positive, then net
                                 trans/rot will be removed first. This option
                                 suppresses other assignments of velocity.
                        .gt. 0 - Gaussian distribution of velocity. (+ve)
                        .lt. 0 - uniform distribution of velocity. (-ve)
                                 The kinetic energy of all 3N velocity
                                 components is the same.

ISEED         314159    The seed for the random number generator used for
                        assigning velocities.

ISCVEL        0         The option for two ways of scaling velocities.
                        .eq. 0 - single scale factor for all atoms.
                        .ne. 0 - a scale factor for each atom, based on the
                                 ratio between the average kinetic energy of
                                 the system and that along every degree of
                                 freedom for that atom.

ICHECW        1         The option for checking to see if the average
                        temperature of the system lies within the allotted
                        temperature window (between FINALT+TWINDH and
                        FINALT+TWINDL) every IEQFRQ steps.
                        .eq. 0 - do not check, i.e. always assign or scale
                                 velocities.
                        .ne. 0 - check window, i.e. assign or scale
                                 velocities only if the average temperature
                                 lies outside the window.

ISCALE        0         This option allows the user to scale the velocities
                        by a factor SCALE at the beginning of a restart run.
                        This may be useful in changing the desired
                        temperature.
                        .eq. 0 - no scaling done (usual input value)
                        .ne. 0 - scale velocities by SCALE.
                        WARNING: Please use this option only when you are
                        changing the temperature of the run.

SCALE         1.        Scale factor for the previous option.

NDEGF         computed  Number of degrees of freedom to use in computing the
                        temperature. If not specified on any call, the value
                        is computed. This specification is not remembered
                        between successive calls to dynamics.

AVERAGE       no        When saving coordinates every NSAVC steps, this
                        option will cause the average structure of the last
                        NSAVC dynamics steps to be written instead of the
                        final snapshot coordinate set. This option is
                        primarily used for making smooth movies.

ECHECK        20.0      The maximum amount the total energy may change on any
                        step.

TOL           1.0E-10   The shake tolerance (if SHAKE is in use).

PCONst        false     Flag to indicate that constant pressure code will be
                        used.

PINTernal     true      Flag to indicate that the internal pressure will be
                        coupled to the reference pressure.

PEXTernal     false     Flag to indicate that the external pressure will be
                        coupled to the reference pressure.

PCOUpling     0.0       The coupling decay time in picoseconds for the
                        pressure. A good value for this is 5 ps.

COMPress      0.0       The compressibility in atm**-1. A good value for
                        proteins is 4.63e-5.

PREFerence    1.0       The reference pressure in atmospheres.

VOLUme        computed  The volume in Angstroms**3 to use for the pressure
                        calculation denominator. This value is calculated if
                        the CRYStal feature is used.

TCONst        false     Flag to indicate that constant temperature code will
                        be used.

TCOUpling     0.0       The coupling decay time in picoseconds for the
                        temperature. A good value for this is 5 ps.

TREFerence    298.0     The reference temperature for constant temperature
                        simulations.

File: Dynamc, Node: Recommended, Previous: Description, Up: Top, Next: Discussion

                Recommended CHARMM input for dynamics.

This section is intended only as a guide in setting up a dynamics simulation input file. Changes should be made as necessary according to personal tastes and project requirements.
1) For heating and early equilibration:

    DYNAMICS LEAP VERLET RESTART(*) NSTEP 20000 TIMESTEP 0.001(+) -
        IPRFRQ 1000 IHTFRQ 1000 IEQFRQ 5000 NTRFRQ 5000 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 100 NSAVC 100 NSAVV 0 INBFRQ 25 -
        hbond-spec nonbond-spec -
        FIRSTT 100.0 FINALT 300.0 TEMINC 100.0 -
        IASORS 1 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 10.0 TWINDL -10.0

    (*) For the first run, use STRT in place of RESTART.
    (+) If bonds to hydrogen atoms are SHAKEd.

2) For late equilibration and analysis runs:

    DYNAMICS LEAP VERLET RESTART NSTEP 20000 TIMESTEP 0.001 -
        IPRFRQ 1000 IHTFRQ 2000 IEQFRQ 5000(*) NTRFRQ 5000 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
        hbond-spec nonbond-spec -
        FIRSTT 100.0 FINALT 300.0 TEMINC 100.0 -
        IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 1 TWINDH 10.0 TWINDL -10.0

    (*) Window checking should be disabled for the analysis run (i.e. IEQFRQ=0) if you want a real microcanonical ensemble.

3) For heating, equilibration and analysis runs using Langevin dynamics:(+)

    DYNA LEAP LANGEVIN STRT(*) NSTEP 20000 TIMESTEP 0.001 -
        IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 0 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
        ILBFRQ 1000 RBUFFER 0.0 TBATH 300.0 -
        hbond-spec nonbond-spec -
        FIRSTT 300.0 FINALT 300.0 -
        IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0 TWINDL 0.0

    (+) Note that the friction coefficients, in units of 1/ps, must first be initialized by filling the array FBETA with the SCALAR command
            SCALAR FBETA SET
    (*) Except for the first run, use RESTART in place of STRT.

4) For quenched molecular dynamics:

    For the first run (STRT), read velocities into the comparison coordinate set, or this should directly follow a former dynamics command.
    DYNA VERLET STRT(*) NSTEP 10000 TIMESTEP 0.001 -
        IPRFRQ 1000 IHTFRQ 200 IEQFRQ 200 NTRFRQ 400 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 50 NSAVC 50 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
        hbond-spec nonbond-spec -
        TSTRUC 300.0 FIRSTT 300.0 FINALT 0.0 TEMINC -30.0 -
        IASORS 0 IASVEL 0 ISCVEL 0 ICHECW 1 TWINDH 0.0

    or equivalently with Langevin (dissipative) dynamics

    DYNA LANGEVIN STRT(*) NSTEP 10000 TIMESTEP 0.001 -
        IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 4000 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 50 NSAVC 50 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
        hbond-spec nonbond-spec -
        TSTRUC 300.0 FIRSTT 300.0 FINALT 300.0 -
        ILBFRQ 1000 RBUFFER 0.0 TBATH 0.0 -
        IASORS 1 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0

    (*) For the first run; use RESTART otherwise.

    The IASVEL 0 option causes the comparison coordinates to be used for the initial velocities (AKMA units). For subsequent runs the options IASORS 1 and IASVEL 1 may be used if random velocities are to be periodically assigned.

5) For constant temperature and/or pressure dynamics:

    DYNA LEAP VERLET STRT(*) NSTEP 20000 TIMESTEP 0.001 -
        IPRFRQ 1000 IHTFRQ 0 IEQFRQ 0 NTRFRQ 0 -
        IUNREA 30 IUNWRI 31 IUNCRD 50 IUNVEL -1 KUNIT 70 -
        NPRINT 100 NSAVC 100 NSAVV 0 IHBFRQ 0 INBFRQ 25 -
        PCONst PINTernal COMPress 4.63e-5 PCOUpling 5.0 PREFerence 1.0 -
        TCONst TCOUpling 5.0 TREFerence 300.0 -
        hbond-spec nonbond-spec -
        FIRSTT 300.0 FINALT 300.0 -
        IASORS 0 IASVEL 1 ISCVEL 0 ICHECW 0 TWINDH 0.0 TWINDL 0.0

File: Dynamc, Node: Discussion, Previous: Recommended, Up: Top, Next: Output

                Running Molecular Dynamics

The theoretical basis for dynamical simulations is elementary physics. The force on a particle is equal to the negative gradient of the potential energy of the particle. CHARMM can solve this equation numerically for all atoms in the molecule. A simple second order predictor two step method due to Verlet is used for integration. The choice of the integration step size is very important.
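
The sensitivity to the step size can be illustrated outside CHARMM. The following Python sketch is not CHARMM code and makes no claim about how the program is implemented; it applies the textbook two-step Verlet scheme to a one-dimensional harmonic oscillator (force constant, mass and starting values are arbitrary illustrative choices) and reports how far the total energy wanders for a given step size.

```python
def verlet_energy_spread(dt, n_steps=2000, k=1.0, m=1.0, x0=1.0, v0=0.0):
    """Integrate m*x'' = -k*x with two-step Verlet.

    Returns max(E) - min(E) of the total energy sampled along the run.
    All parameters are illustrative, in arbitrary consistent units.
    """
    def force(x):
        return -k * x

    # Bootstrap the position at t = -dt from a Taylor expansion about t = 0.
    x_prev = x0 - v0 * dt + 0.5 * force(x0) / m * dt * dt
    x = x0
    energies = []
    for _ in range(n_steps):
        # Verlet update: x(t+dt) = 2 x(t) - x(t-dt) + a(t) dt^2
        x_next = 2.0 * x - x_prev + force(x) / m * dt * dt
        # Central-difference velocity at the current step.
        v = (x_next - x_prev) / (2.0 * dt)
        energies.append(0.5 * m * v * v + 0.5 * k * x * x)
        x_prev, x = x, x_next
    return max(energies) - min(energies)


small = verlet_energy_spread(0.01)
large = verlet_energy_spread(0.1)
print("energy spread, dt=0.01:", small)
print("energy spread, dt=0.10:", large)
```

In this toy system a tenfold increase of the step size increases the energy fluctuation by roughly a factor of one hundred, which is the second order behaviour expected of the scheme.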
One must weigh the increased accuracy of using a small step size against the longer real time that can be simulated with a given amount of execution time when a larger step size is used. The time step may be entered in picoseconds (using the TIMESTP keyword).

CHARMM provides information on the accuracy of the numerical solution. Since the system has no external forces, the total energy should be conserved. Numerical errors will result in some fluctuations in the total energy, so a good test is to compare the fluctuations in total energy to the fluctuations in kinetic energy, as these fluctuations are proportional to the heat capacity of the system. See the next node for a description of dynamics output.

Because the force constants for the bonds and bond angles are fairly large, it is reasonable under certain circumstances to constrain their values during dynamics. Such constraints are applicable if the harmonic motions are weakly coupled to other motions. The advantage of such constraints is that the step size of the numerical integration may be increased without sacrificing accuracy, as these terms have the largest gradients in macromolecules simulated at physiological temperatures. We use the SHAKE algorithm for applying the constraints, see *note shake:(doc/cons.doc)SHAKE. SHAKE can be applied to just the bonds involved with hydrogens, all bonds, all bonds and the angles involving hydrogens, or all bonds and angles.

A dynamics run has basically four parts: initialization, heating, equilibration, and the simulation itself. Initialization means providing an initial position and velocity for all the atoms. Heating is the process of increasing the kinetic energy of the system up to a final temperature at which the simulation will be conducted. Equilibration is the process where the kinetic energy and the potential energy of the system evenly distribute themselves throughout the system.
Only when the average temperature of the system stabilizes can one collect the trajectory information for analysis.

The initial coordinates of a simulation are obtained after applying the minimization algorithm to a complete coordinate set. One cannot start with a system with a large potential energy as it will quickly heat up to unreasonable temperatures. For initializing the velocities, the user can specify velocities from the comparison coordinates (IASVEL 0), a uniform distribution of kinetic energy along each coordinate with random sign of the motion along each axis (IASVEL -1) or a Gaussian distribution of velocities (IASVEL 1, the default). The temperature at which velocities are assigned is determined by FIRSTT and TSTRUC by the algorithm:

    Tassign = 2*(FIRSTT-TSTRUC) + TSTRUC

For a harmonic system equilibrated to TSTRUC, equal partition of the energy will result in an equilibrated temperature of roughly FIRSTT. If TSTRUC is not specified, 1.25*FIRSTT will be used for assignment. Velocities may also be passed to dynamics in the comparison coordinate set (as opposed to assignment). This allows the user considerable flexibility in setting up the initial conditions.

The heating of the system is performed gently by increasing the kinetic energy by a small amount periodically. The number of integration steps between heating applications, the final temperature, and the kinetic energy increment are all user specified. In addition, there is a choice in the method of increasing the kinetic energy of the system. One may scale existing velocities or reassign them. The velocities can be scaled either by one scale factor calculated for the kinetic energy of the system averaged over many time steps, or by scale factors established for each atom based on the ratio of its time averaged kinetic energy with that of the system. If reassignment is chosen, the velocities can have either a uniform or Gaussian distribution.
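
As a worked illustration of the velocity assignment temperature given above, the following Python sketch evaluates Tassign for a few cases. It is only arithmetic on the formula quoted in the text, not CHARMM code; the function name is hypothetical, and passing None for TSTRUC is a convention of this sketch standing in for "TSTRUC not specified" (the program itself uses the default value -999. for that purpose).

```python
def assignment_temperature(firstt, tstruc=None):
    """Temperature at which initial velocities are assigned.

    firstt : FIRSTT, the requested starting temperature (K)
    tstruc : TSTRUC, the temperature the structure was equilibrated at (K),
             or None if it was not specified (assumption of this sketch)
    """
    if tstruc is None:
        # Documented fallback when TSTRUC is not given.
        return 1.25 * firstt
    # Tassign = 2*(FIRSTT - TSTRUC) + TSTRUC
    return 2.0 * (firstt - tstruc) + tstruc


# Structure already equilibrated at the target temperature: assign at FIRSTT.
t_equilibrated = assignment_temperature(300.0, 300.0)
# Minimized structure (TSTRUC = 0): assign at twice FIRSTT, since about half
# of the kinetic energy of a harmonic system flows into potential energy.
t_minimized = assignment_temperature(300.0, 0.0)
# TSTRUC left unspecified.
t_default = assignment_temperature(300.0)
print(t_equilibrated, t_minimized, t_default)
```

With FIRSTT 300.0 these three cases give assignment temperatures of 300, 600 and 375 degrees respectively.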
To equilibrate the structure, one can specify a window around the final temperature where velocity adjustments will be made. The choice of velocity adjustments is the same as described above for heating.

For the actual run, CHARMM will output the position and velocities of all atoms at intervals specified by the user. The temperature window can be set larger so that any gross conformational changes which result in a different potential energy will cause the temperature to be maintained.

Any time energy is added to the system, the angular momentum of the system will be reduced to zero and translational motion will be stopped. One can also request that these operations be performed at any time during the dynamics run.

The use of a restart file is essential for running dynamics. The restart file contains information about the most recent coordinate sets necessary for the VERLET algorithm. In addition the values of the energy accumulators are stored. All other information (such as SHAKE, fixed atoms, harmonic constraints, friction coefficients) has to be regenerated before invoking a dynamics restart. When the run is initiated, a restart file must be written using the IUNWRI keyword. As the dynamics routine completes NCYCLE steps of dynamics (see *note Output::), the Fortran unit specified by IUNWRI will be rewound and a restart file will be written. In case of crashes, one has restart files corresponding to various points in the run. The CRASHU variable may prove valuable. Successive runs of CHARMM to continue the dynamics run must read the previous restart file using the IUNREA keyword and write it out for the next part of the run. Restarts may be done to reset various options, or to break up a long run into several shorter runs. Restart files will only run with the version of CHARMM they are created with.

There are many numbers giving the frequency of actions to be taken during dynamics, such as updating the non-bonded list, heating the molecule etc.
Some of these numbers are adjusted along with the number of steps to run so that the numbers all have a common divisor. At the present time, there are combinations which result in errors. At some point an attempt may be made to catalog all the actions, and check for erroneous processing.

If one is interested in simulating the motion of part of the system with the rest of the system remaining fixed, it is possible to fix atoms in place, see *note fix:(doc/cons.doc)fixed atom. If this is done, there are several effects on the dynamics. First, since the system is now anchored in space, the center of mass motion and total angular velocity are never stopped. Second, the number of degrees of freedom used for calculating the temperature is set to the number of free atoms times 3 minus 6. Third, the coordinate and velocity trajectory files will contain the position of the fixed atoms only once, and all other records will hold just the moving atoms. This saves a great deal of disk space.

Trajectory files can be merged, broken into smaller pieces, and sampled at different intervals. Likewise, said operations can be performed on coordinate trajectories while rotating the coordinates to match a given coordinate set.

When the DYNAmics command exits, the main coordinate set contains the final coordinate positions from the last energy evaluation and the comparison coordinates will contain the final velocities in AKMA units.

Finally, a brief discussion of the Langevin dynamics algorithm is presented. The Langevin dynamics algorithm presently in CHARMM was intended to be used primarily with the "Stochastic Boundary Molecular Dynamics" method and consequently has been restricted to an algorithm which is valid only for the case FBETA*TIMESTEP<<1.0, that is, for cases where relatively small friction coefficients are used. Typically, values of FBETA*TIMESTEP up to about 0.3 still produce stable dynamics which also satisfy the fluctuation-dissipation theorem.
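
The validity condition on FBETA*TIMESTEP is easy to check before a run. The Python sketch below is a hypothetical helper, not part of CHARMM; it assumes the friction coefficient is given in 1/ps and the time step in ps, and it uses the value 0.3 quoted above as the practical limit.

```python
def langevin_product(fbeta_per_ps, timestep_ps, limit=0.3):
    """Return (FBETA*TIMESTEP, within_limit) for a Langevin setup.

    fbeta_per_ps : friction coefficient in 1/ps (assumed units)
    timestep_ps  : integration time step in ps
    limit        : practical upper bound quoted in the text (about 0.3)
    """
    product = fbeta_per_ps * timestep_ps
    return product, product <= limit


# Illustrative values only: 50 /ps and 500 /ps with a 0.001 ps time step.
ok_product, ok_flag = langevin_product(50.0, 0.001)
bad_product, bad_flag = langevin_product(500.0, 0.001)
print(ok_product, ok_flag)
print(bad_product, bad_flag)
```

A friction coefficient of 50 /ps with a 1 fs step gives a product of 0.05, well inside the limit, whereas 500 /ps gives 0.5 and falls outside the range for which the algorithm is described as valid.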
The algorithm itself reduces to the Verlet algorithm when FBETA is zero and consequently may be used to do regular dynamics; in fact the same routine does both kinds of dynamics. In using Langevin dynamics care must be taken to first initialize the array FBETA by using the scalar commands, e.g.,

    CHARMM >SCALAR FBETA SET

Failure to do this just means you are doing regular dynamics, so no warning is given by CHARMM.

File: Dynamc, Node: Output, Previous: Discussion, Up: Top, Next: Trajectory

                Contents of a dynamics output

Note: This description of the output of a command is not normally going to be part of the documentation of commands. The dynamics output is sufficiently confusing that this description is necessary.

The first part of CHARMM's output after a dynamics command lists all of the options that apply to that part of the run. Then, any information about velocity assignments (temperature changes) follows. Any time the velocities are changed in an anisotropic way, the motion of and about the center of mass will be stopped. This results in a printout both before and after this operation of the "DETAILS ABOUT CENTRE OF MASS". Its position and velocity are output followed by the components of the angular momentum. The last line gives the translation kinetic energy of the system, and thus one should expect a drop in the total energy and temperature of the system afterwards.

Non-bonded interaction and hydrogen bond updates will appear intermittently and are clearly labeled.

Every NPRINT steps, the total energy and various contributions will be printed. This output is preceded by a title which gives the correspondence of numbers to energy names.

After IPRFRQ steps the averages and RMS fluctuations will appear. After the second such printout of averages and RMS fluctuations, the averages and RMS fluctuations for the run up to the last turning of the molecule will be given. This gives you longer range statistics.
Such a calculation will not be done if IPRFRQ equals NTRFRQ. The ratio of total energy to kinetic energy fluctuations is an excellent measure of the accuracy of the run.

After the averages are printed, a least squares fit of the total energy against the step number will be made to look for drift in the energy. Two such values are printed, one for the last IPRFRQ steps, and one to the previous turn. Next, the initial energy for the statistics, both short range and long, are printed. Finally, the correlation coefficient of the energy versus step is given for both ranges. A value close to zero indicates no systematic drift; a magnitude near 1 means you have a real problem with the dynamics.

This process of printout continues until the end of the run is reached. Just before the last energy is printed, a message will appear about the writing of coordinates and velocities to their respective files.

File: Dynamc, Node: Trajectory, Previous: Output, Up: Top, Next: Merge

                Reading and writing trajectory frames with direct commands

This facility allows the creation or manipulation of trajectory files. The main uses of this facility are:

1) creating artificial trajectory files from coordinate frames

2) reading an existing trajectory frame by frame for analysis that requires access to a variety of CHARMM commands

3) modifying an existing trajectory (copy with changes) such as minimizing each frame or other operations.

[Syntax TRAJectory command]
===================================================================
There are three commands that comprise this facility.
1) Initializing trajectory I/O

    TRAJectory {read-spec} {write-spec}

    read-spec:==  [IREAd unit] [NREAd int] [SKIP int] [BEGIN int] [STOP int]

    write-spec:== [IWRIte unit] [NWRIte int] [NFILE int] [EXPAnd]
                  [NOTHer int] [DELTa real] [SKIP int]

    IREAd  - first unit to read from (default: do not read)
    NREAd  - number of units to read from (default: 1)
    SKIP   - skip value for both reading and writing (default: 1)
    IWRIte - first unit to write to (default: do not write)
    NWRIte - number of units to write to (default: 1)
    NFILe  - number of frames on each output file (default: total)
    EXPAnd - flag to free fixed atoms in copying (only if reading)
    NOTHer - number of frames in previous files (if not reading) (default: 0)
    DELTa  - output delta value (if not reading) (default: 0.001)

2) Reading a frame

    TRAJectory READ [COMP]

3) Writing a frame

    TRAJectory WRITe [COMP]

The reading and writing commands do not have any specifiers other than whether the comparison or main coordinates will be used.
===================================================================
There are three modes of operation:

1) Create a new trajectory.

The IWRIte and NFILe keywords must be used. The default values for the others are listed above. If several files will be made in different CHARMM runs that will be linked together later, the NOTHer keyword value should be increased by NFILe on each subsequent run.

EXAMPLE: Create a "movie" trajectory that involves the rotation of a single sidechain (residue 21).

    COOR AXIS SELE ATOM * 21 CA END SELE ATOM * 21 CB END
    OPEN WRITE UNIT 22 FILE NAME TYR21.ROT
    TRAJECTORY IWRITE 22 NWRITE 1 NFILE 360 SKIP 1
    * trajectory showing the rotation of sidechain 21
    *
    SET 1 1
    LABEL LOOP
    COOR ROTATE AXIS PHI 1.0 SELE ATOM * 21 * .AND. .NOT. ( TYPE C -
         .OR. TYPE N .OR. TYPE H ) END
    TRAJ WRITE
    INCR 1 BY 1
    IF 1 LT 360.5 GOTO LOOP
    STOP

===================================================================
2) Reading an existing trajectory

The IREAD keyword must be used.
The default NFILe value is 1 and the remaining values if not specified will be read from the trajectory file. EXAMPLE: find the structure with the lowest energy and save it. UPDATE ... OPEN READ UNIT 22 FILE NAME DYN1.TRJ OPEN READ UNIT 23 FILE NAME DYN2.TRJ TRAJECTORY IREAD 22 NREAD 2 SKIP 100 SET 1 1 SET 9 9999.0 LABEL LOOP TRAJ READ GETE IF 9 LT ?ENER GOTO NEXT SET 8 @1 COOR COPY SET 9 ?ENER LABEL NEXT INCR 1 BY 1 IF 1 LT 1000.5 GOTO LOOP OPEN WRITE CARD UNIT 12 NAME LOWE.CRD WRITE COOR COMP CARD UNIT 12 * structure with the lowest energy * frame number @8 with energy @9 * STOP =================================================================== 3) Copying from one trajectory to another. The operation of this command works in the same mode as the MERGE command, except a variety of CHARMM commands can be executed between reading and writing of frames. EXAMPLES: Create a new trajectory where every frame is minimized for 200 steps. OPEN READ UNIT 22 FILE NAME DYN.TRJ OPEN WRITE UNIT 32 FILE NAME DYN.MIN TRAJECTORY IREAD 22 SKIP 100 IWRITE 32 * minimized trajectory * SET 1 1 LABEL LOOP TRAJ READ MINI ABNR NSTEP 200 TRAJ WRITE INCR 1 BY 1 IF 1 LT 1000.5 GOTO LOOP STOP  File: Dynamc, Node: Merge, Previous: Trajectory, Up: Top, Next: Reorient Merges or breaks up a trajectory into different numbers of files Frequently, one generates a trajectory into small files to minimize the CPU time of one job. However, so many files are usually hard to manage so it is desirable to merge said files into larger units. This command provides that capacity. In addition, it is possible to break up the trajectory into smaller pieces and to sample the trajectory less frequently than originally generated. Another option is to optionally rotate the structure at each frame to least squares fix a reference structure. 
[Syntax MERGE dynamics trajectories]

MERGE [ COOR ] [FIRSTU unit-number] [NUNIT integer] [SKIP integer]
      [ VEL  ] [OUTPutu unit-number] [NFILE integer]
      [ DRAW ] [BEGIN] [STOP] [PRINT] [first-atom-selection]
      [ ORIEnt [MASS] [WEIGht] [NOROt] [PRINT] second-atom-selection ]

Keyword table

Option   Default  Purpose

[COOR]   COOR     Specification of the type of trajectory file. COOR is
[VEL ]            coordinates; VEL is velocities.
FIRSTU   51       The first unit of the trajectory to be read.
NUNIT    1        The number of units to be read starting with FIRSTU.
SKIP     1        Only those coordinate sets whose dynamics step number
                  modulo SKIP is zero will be reoriented and written out.
OUTPUTU  61       The first unit number of the output trajectory.
NFILE             The number of coordinate sets written to each output
                  file. If left out, this will be set to the number of
                  coordinate sets in the first input file times the number
                  of input files.
                  WARNING: This default will generate a bad trajectory
                  file if SKIP is not set to the interval actually present
                  in the trajectories. Further, if you set its value to be
                  larger than the number of coordinate sets that are
                  actually written in any output file, you will have
                  problems. The error arises because the control array at
                  the beginning of the file specifies more coordinate sets
                  than actually exist in the file. EOF errors will result
                  when the trajectory is read.

The title of the output trajectory will be copied from the input
trajectory.


File: Dynamc, Node: Reorient, Previous: Merge, Up: Top, Next: RMSDyn

                Reorienting a coordinate trajectory

If one is interested in reorienting every set of coordinates found in a
dynamics trajectory with respect to some reference structure, one can use
the ORIEnt option in conjunction with the MERGe command.

The process of reorienting a coordinate trajectory works as follows: A
series of files containing the trajectory are assigned to successive units
prior to a CHARMM run. The coordinates stored therein are presumed to have
been written every NSAVC steps.
CHARMM will read each coordinate set, select some periodically, reorient
them, and write them to successive units, where each output file will have
a user-specified number of coordinate sets. The following table lists the
options involved:

Option          Default  Purpose

ORIE            .false.  Specify that a least squares RMS fit will be done.
MASS            .false.  Use a mass weighting in the fit.
WEIGH           .false.  Use the weighting array (wmain) in the fit.
NOROt           .false.  Just shift the centers to best fit.
PRINt           .false.  Print what happened to each coordinate set.
atom-selection  all      Select which atoms to use in the fit.

If atoms were fixed during the dynamics, the new trajectory produced will
not have fixed atoms, because the rotations applied to each coordinate set
will be different, thereby yielding different coordinates for the fixed
atoms. Fixing the coordinates leads to a large reduction in file space, so
the reorientation process can result in much larger trajectory files.
See *note fix: (doc/cons.doc)Fixed Atom.


File: Dynamc, Node: RMSDyn, Previous: Reorient, Up: Top, Next: Format

Computes the RMS difference between two trajectory files and makes a
matrix of results. Large files should be reduced with the MERGe command
before processing this command.

RMSDynamics [IREAd unit-number] [JREAd unit-number] [IWRIte unit-number]
            [BEGIn integer] [STOP integer] [IMAGes]
            ORIEnt [MASS] [WEIGht] [NOROt] [RMS] atom-selection

IREAd int      - Unit number of the first trajectory file.
JREAd int      - Unit number of the second trajectory file.
IWRIte int     - Unit for the output matrix.
BEGIn int      - Starting step number (default: first)
STOP int       - Ending step number (default: last)
IMAGes         - Use image atoms for the analysis.
ORIEnt         - Do best fit of structures.
MASS           - Use a mass weighting in best fit.
WEIGht         - Use the weighting array in best fit.
NOROt          - Best fit without letting the structures rotate.
RMS            - Do RMS fit between structures; otherwise, align
                 structures with the axis.
atom-selection - Atoms to use in the fitting procedure.


File: Dynamc, Node: Format, Previous: RMSDyn, Up: Top

             Format or unformat a dynamics trajectory

DYNAmics FORMat   FIRStunit NUNIt BEGIn SKIP STOP OUTPut OFFSet SCALe MODE
DYNAmics UNFOrmat INPUt OUTPut

These commands convert binary trajectory files into a machine-independent
yet compact format and convert them back into binary files. The defaults
for OFFSet, SCALe and MODE are: OFFSet=600, SCALE=10000, and MODE=12Z6.
The trajectory is converted into positive integers according to the
formula I = INT((X+OFFSET)*SCALE), where X is a coordinate value and I is
the integer that is written. The user has to make sure that all
coordinates of the trajectory are within OFFSET angstroms of the origin.
The precision may be increased by choosing a larger SCALE and
FORTRAN-format, e.g. MODE=11Z7 OFFSET=100000. ("Z" is the hexadecimal
format and is available on most machines.)

C DEC/CMS REPLACEMENT HISTORY, Element ENERGY.DOC
C *6 10-FEB-1992 14:11:29 WON "RCZ, documenting EXSG keyword"
C *5 18-NOV-1991 15:07:21 WON "Updated by B. Brooks"
C *4 12-SEP-1991 19:21:20 WON "FAST option updated"
C *3 6-MAY-1991 16:49:57 WON "Info directive fixed"
C *2 4-FEB-1991 17:17:52 WON "from NIH, 02-Feb-91"
C *1 8-APR-1990 19:50:05 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element ENERGY.DOC

File: Energy, Node: Top, Up: (doc/commands.doc), Next: Description

          Energy Manipulations: Minimization and Dynamics

The main purpose of CHARMM is the evaluation and manipulation of the
potential energy of a macromolecular system. In order to compute the
energy, several conditions must be met. There are also several support
commands which directly relate to energy evaluation.

* Menu:

* Description:: Description of the energy commands
* Skipe:: Selection of particular energy terms
* Interaction:: Computation of interaction energies and forces.
* Fast:: Requirements for using the fast routines
* Needs:: Requirements for all energy evaluations
* Optional:: Optional actions to be taken beforehand


File: Energy, Node: Description, Up: Top, Next: Skipe, Previous: Top

                    Syntax for Energy Commands

There are two direct energy evaluation commands. One is parsed through
the minimization parser and the other involves a direct call to GETE.
See *note Minimiz:(doc/minimiz.doc), and *note Gete:(doc/usage.doc)interface.
In addition to getting the energy, the forces are also obtained.

The ENERgy command. (processed through the minimization parser)

[SYNTAX ENERgy]

ENERgy [ nonbond-spec ] [ hbond-spec ] [ image-spec ] [ print-spec ] [ COMP ]
       [ INBFrq 0 ] [ IHBFrq 0 ] [ IMGFrq 0 ] [NOUPdate]

hbond-spec    *note Hbonds:(doc/hbonds.doc).
nonbond-spec  *note Nbonds:(doc/nbonds.doc).
image-spec    *note Images:(doc/images.doc)Update.

If the COMP keyword is specified, then the comparison coordinate set is
used, but this disables the use of the fast routines. The keyword NOUPdate
turns off all update routines, and thus requires all lists to be present
already.

The GETE command. (a direct call to GETE)

[SYNTAX GETEnergy]

GETE [ COMP ] [ PRINt [ UNIT int ] ]

For this command to work, all lists must be set up. This is best done
through the UPDAte command. The COMP keyword will cause the comparison
coordinate set to be used. The PRINt keyword will result in a subsequent
call to PRINTE in order to print the energy. If the PRINt keyword is not
specified, then NO indication that the energy routine has been called
will be given.

The UPDAte command (sets up required lists for GETE)

[SYNTAX UPDAte lists]

UPDAte [ nonbond-spec ] [ hbond-spec ] [ image-spec ] [ COMP ]
       [ INBFrq 0 ] [ IHBFrq 0 ] [ IMGfrq 0 ]
       [ EXSG {list-of-segment-names} | EXOF ]

The update command will set up the codes lists and also create a nonbond
list (unless INBFrq is 0) and a new hbond list (unless IHBFrq is 0).
If the COMP keyword is specified, then the comparison coordinates will be
used in setting up the nonbond and hbond lists.

The EXSG keyword, with an optional following list of segment names,
excludes some nonbonded interactions (ELEC & VDW). If the list of names is
empty, ALL INTERsegment nonbonded interactions will be excluded. If the
list is not empty, all INTER and INTRA segment nonbonded interactions for
the listed segments will be excluded. EXOF turns off this option. H-bond
energies (HBON) are not affected at the moment (Dec 3, 1991).


File: Energy, Node: Skipe, Up: Top, Next: Interaction, Previous: Description

                  Skipping selected energy terms

There is a facility to skip any desired energy terms during energy
evaluation. Each energy term has an associated logical flag that
determines whether that energy term is to be computed. Specifications are
processed sequentially. The default operation is INCLude, which means that
subsequent energy terms are included in the skip list, i.e. removed from
the energy calculation. NOTE: EXCLude means that the energy term is taken
off the skip list, i.e. it is to be computed.

If for some reason the list presented here is out of date, the data in
SKIPE(energy.src) and in ENER.FCM of the source should be consulted.
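
The sequential INCLude/EXCLude processing described above can be pictured
with a small model. This is only an illustrative sketch in Python, not
CHARMM source code: the function names apply_skip and computed_terms are
hypothetical, only a subset of the energy terms is listed, and the NONE
keyword is deliberately left out because its exact behavior is not
described here.

```python
# Illustrative model of SKIPe flag processing (hypothetical, not CHARMM code).
# Only a subset of energy terms is used to keep the example short.
TERMS = ("BOND", "ANGL", "DIHE", "VDW", "ELEC", "HBON")


def apply_skip(words, skipped=None):
    """Process SKIPe arguments in order and return the set of skipped terms."""
    skipped = set() if skipped is None else set(skipped)
    include = True  # default operation is INCLude: terms are added to the skip set
    for word in words:
        key = word.upper()
        if key.startswith("INCL"):
            include = True
        elif key.startswith("EXCL"):
            include = False
        elif key == "ALL":
            if include:
                skipped.update(TERMS)   # skip every term
            else:
                skipped.clear()         # compute every term again
        elif key in TERMS:
            if include:
                skipped.add(key)        # this term is no longer computed
            else:
                skipped.discard(key)    # this term is computed again
        else:
            raise ValueError("keyword not handled in this sketch: " + word)
    return skipped


def computed_terms(skipped):
    """Return the terms that would still be evaluated."""
    return set(TERMS) - set(skipped)


if __name__ == "__main__":
    # Skip everything, then take BOND off the skip list: only BOND is computed.
    print(sorted(computed_terms(apply_skip(["ALL", "EXCL", "BOND"]))))
    # Skip just the nonbonded terms.
    print(sorted(apply_skip(["ELEC", "VDW"])))
```

Because the mode switch persists until the next INCLude or EXCLude, the
order of the words on the command line matters in this model.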
Syntax:

[SYNTAX SKIP energy terms]

               [ INCLude ]
               [ EXCLude ]
SKIPe  repeat( [ ALL     ] )
               [ NONE    ]
               [ item    ]

item::= [ BOND ] [ ANGL ] [ UREY ] [ DIHE ] [ IMPR ]
        [ VDW  ] [ ELEC ] [ HBON ] [ USER ] [ HARM ]
        [ CDIH ] [ CIC  ] [ CDRO ] [ NOE  ] [ SBOU ]
        [ IMNB ] [ IMEL ] [ IMHB ] [ XTLV ] [ XTLE ]
        [ EXTE ] [ RXNF ] [ ST2  ] [ IMST ] [ TSM  ]

description:
   BOND - bond energy
   ANGL - angle energy
   UREY - Urey-Bradley energy term
   DIHE - dihedral energy
   IMPR - improper dihedral energy
   VDW  - van der Waals energy
   ELEC - electrostatic energy
   HBON - hydrogen bond energy
   USER - user supplied energy (USERLINK)
   HARM - harmonic positional constraint energy
   CDIH - constrained dihedral energy
   CIC  - internal coordinate constraint energy
   CDRO - quartic droplet potential energy
   NOE  - NOE general distance restraints
   SBOU - solvent boundary energy
   IMNB - image van der Waals energy
   IMEL - image electrostatic energy
   IMHB - image hydrogen bond energy
   XTLV - crystal van der Waals energy
   XTLE - crystal electrostatic energy
   EXTE - extended electrostatic energy
   RXNF - reaction field energy
   ST2  - ST2 water-water energy
   IMST - image ST2 water-water energy
   TSM  - TMS free energy term.

Examples;

   SKIP ALL EXCL BOND  - do just bond energy
   SKIP EXCL ALL       - return flags to default state
   SKIP ELEC VDW       - throw out electrostatics and van der Waals energy


File: Energy, Node: Interaction, Up: Top, Next: Fast, Previous: Skipe

                 Interaction energies and forces

The INTEraction command computes the energy and forces between any two
selections of atoms.

[SYNTAX INTEraction energy]

INTEraction [ COMP ] [ NOPRint ] 2x(atom-selection) [UNIT int]

If only one atom selection is given, then a self energy will be computed.
This routine is quite efficient and may be used within a CHARMM loop
without too much overhead, though there are some restrictions. The COMP
keyword causes the comparison coordinates to be used. The NOPRint keyword
will prevent the results from being printed.
This routine works in the same manner as the GETE command in that all of
the lists (CODES, nonbond, and Hbond) must be specified before invoking
this command. One difference is that SHAKE will not be respected with this
command (i.e. if the coordinates don't satisfy the constraints, neither
will the energy).

The following energy terms may be computed by this routine (unless
suppressed with the SKIP command);

   Bond          - Energy defined by the two atoms involved.
   Angles        - Energy allocated to the central atom (auto energy only).
   Dihedral      - Energy defined between central two atoms.
   Improper      - Energy defined by first atom (auto energy only).
   van der Waals - ATOM option only. Energy defined by two atoms involved.
   Electrostatic - ATOM option only. Energy defined by two atoms involved.
   Hbond         - Energy defined by heavy atom donor and acceptor atom.
   Harmonic cons - Energy allocated to central atom (auto energy only).
   Dihedral cons - Energy defined by central two atoms.
   User energy   - Atom selections may be passed to USERE in the selection
                   common (DEFIne command). Fill forces and energies as
                   desired.

All other energy terms will be zeroed. For terms listed "auto energy
only", the corresponding atom must be present in both atom selections. For
the remaining terms, one atom of the pair must be present in each of the
atom selections. The energy division matches the method used in the
analysis facility.

This command will not work with the selection of image atoms, or the
selection of ST2 waters. Energy terms not listed above will not be
computed. The nonbond list must be generated with the ATOM and VATOM
options.

The individual energy terms are stored in the energy common and are
available in commands and titles via the "?energy-term" substitution.

The forces for all kept energy terms will be returned in the force arrays.
Note that it is possible for atoms that were not selected in either
selection specification to have a force.
This may happen for angle or dihedral terms on the first and last atoms.
It may also happen in a similar manner for improper dihedrals, hydrogen
bonding terms, and dihedral constraints.


File: Energy, Node: Fast, Up: Top, Next: Needs, Previous: Interaction

[SYNTAX FASTer ]

FASTer  {integer}
        {OFF    }
        {ON     }
        {DEFAult}
        {SCALar }    ! for testing only
        {VECTor }    ! for testing only

Instead of using an integer value, the FASTer command can be issued with
one of the following keywords.

        Keyword             Equivalent integer
        ----------------    ----------
        FASTer OFF              -1
               DEFAult           0
               ON                1
               SCALar            2
               VECTor            3

The FASTer keyword or integer defines which versions of the energy
routines are to be used.

   FASTer -1 : Always use slow routines
   FASTer  0 : Use fast routine if possible, no error if cannot (default)
   FASTer  1 : Use best optimized routine for the current machine
               (Error message if cannot)
   FASTer  2 : Use fast scalar routine (Error message if cannot)
   FASTer  3 : Use fast vector routine (Error message if cannot)

There exist a general and a fast version of the internal energy routines
(bond, angle, dihedral, and improper dihedral). There is also a fast
version of the nonbond energy evaluation (roughly 30-50% faster). These
routines were designed for long minimization or dynamics calculations.

To request the FAST routine, the FASTer command should be used with a
positive integer or an appropriate keyword. A negative integer will
disable the fast energy routines. If the fast routines are requested and
it is not possible to use the fast routines, a warning will be issued, and
the general routines will be used in their place.
The fast routines are more efficient in several ways;
   (1) arrays are included in common files rather than passed
   (2) second derivatives have been removed
   (3) analysis and print options have been removed

The restrictions are that;
   (1) the MAIN coordinate set must be used in the energy evaluations
   (2) second derivatives may not be requested
   (3) The PSF, parameter, and codes arrays must be used (from the
       common files)
   (4) a limited set of nonbond options must be used.

The current nonbond options supported by the fast nonbond routine are as
follows.

   ATOM   [CDIE] [SHIFt  ]   VATOM  [VSHIft  ]
          [RDIE] [SWITch ]          [VSWItch ]
                 [FSWItch]          [VFSWitch]
                 [FSHIft ]

   GROUP  [CDIE] [SWITch ]   VGROUP [VSWItch ]
          [RDIE] [FSWItch]


File: Energy, Node: Needs, Up: Top, Next: Optional, Previous: Fast

     Requirements before energy manipulations can take place

Before the energy of a system can be evaluated and manipulated, a number
of data structures must be present.

First, a PSF must be present.

Second, a parameter set must be present. It must contain all parameters
which are required by the PSF being used.

Third, coordinates must be defined for every atom in the system. An
undefined coordinate has a particular value, and if two coordinates have
the same value, division by zero will occur in the evaluation of the
energy. If the positions of hydrogens are required, the hydrogen bond
generation routine, see *note Hbond: (doc/hbonds.doc), must be called
before the energy is evaluated.

Fourth, provisions must be made for having a hydrogen bond list and a
non-bonded interaction list. Having non-zero frequencies for updating
these lists is one way; one can also read these lists in, see
*note read:(doc/io.doc)read, or generate them with separate commands, see
*note HBgen:(doc/hbonds.doc), or *note NBgen:(doc/nbonds.doc).

File: Energy, Node: Optional, Up: Top, Previous: Needs, Next: Substitution

   Optional actions you can take to modify the energy manipulations

There exist several commands which can modify the way the potential energy
is calculated or can affect the way energy manipulations are performed.

The Constraint command, see *note Cons:(doc/cons.doc), can be used to set
constraints of various kinds. First, it can be used to set flags for
particular atoms which will prevent them from being moved during
minimization or dynamics. Second, it can be used to add a positional
constraint term to the potential energy. This term will be harmonic about
some reference position. The user is free to set the force constant.
Third, the user can place a harmonic constraint on the value of particular
torsion angles in an attempt to force the geometry of a molecule. Other
constraints are also available.

The SHAKe command, see *note shake:(doc/cons.doc)SHAKE, is used to set
constraints on bond lengths and also bond angles during dynamics. It is
very valuable in that it permits a larger step size to be used during
dynamics. This is vital for dynamics where hydrogens are explicitly
represented, as the low mass and high force constant of bonds involving
hydrogen require a very small step size.

The user interface commands can be used to modify the calculation of the
potential and to add another term to the potential energy.
See *note Modify:(doc/usage.doc)interface for details.

File: Energy, Node: Substitution, Up: Top, Previous: Optional, Next: Top

The following command line substitution values may be included in any
command or title. To get the total energy, the syntax;

        ...... ?TOTE .....

should be used.
Energy related properties: 'TOTE' - total energy 'TOTK' - total kinetic energy 'ENER' - total potential energy 'TEMP' - temperature (from KE) 'GRMS' - rms gradient 'BPRE' - boundary pressure applied 'VTOT' - total verlet energy (no HFC) 'VKIN' - total verlet kinetic energy (no HFC) 'EHFC' - high frequency correction energy 'EHYS' - slow growth hysteresis energy correction 'VOLU' - the volume of the system. 'PRSE' - the pressure calculated from the external virial. 'PRSI' - the pressure calculated from the internal virial. 'VIRE' - the external virial. 'VIRI' - the internal virial. 'VIRK' - the virial "kinetic energy". Energy term names: 'BOND' - bond (1-2) energy 'ANGL' - angle (1-3) energy 'UREY' - additional 1-3 urey bradley energy 'DIHE' - dihedral 1-4 energy 'IMPR' - improper planar of chiral energy 'VDW ' - van der waal energy 'ELEC' - electrostatic energy 'HBON' - hydrogen bonding energy 'HARM' - harmonic positional restraint energy 'CDIH' - dihedral restraint energy 'CIC ' - internal coordinate restraint energy 'CDRO' - droplet restraint energy (approx const press) 'USER' - user supplied energy term 'IMNB' - primary-image van der waal energy 'IMEL' - primary-image electrostatic energy 'IMHB' - primary-image hydrogen bond energy 'SBOU' - solvent boundary lookup table energy 'NOE' - general distance restraint energy (for NOE) 'XTLV' - crystal van der waal energy 'XTLE' - crystal electrostatic energy 'EXTE' - extended electrostatic energy 'RXNF' - reaction field electrostatic energy 'ST2' - ST2 water-water energy 'IMST' - primary-image ST2 water-water energy 'TSM' - TMS free energy term 'QMEL' - QM/MM electrostatic energy term 'QMVD' - QM/MM van der Waals energy term Energy Pressure/Virial Terms: 'VEXX' - External Virial 'VEXY' - 'VEXZ' - 'VEYX' - 'VEYY' - 'VEYZ' - 'VEZX' - 'VEZY' - 'VEZZ' - 'VIXX' - Internal Virial 'VIXY' - 'VIXZ' - 'VIYX' - 'VIYY' - 'VIYZ' - 'VIZX' - 'VIZY' - 'VIZZ' - 'PEXX' - External Pressure 'PEXY' - 'PEXZ' - 'PEYX' - 'PEYY' - 'PEYZ' - 
'PEZX' - 'PEZY' - 'PEZZ' - 'PIXX' - Internal Pressure 'PIXY' - 'PIXZ' - 'PIYX' - 'PIYY' - 'PIYZ' - 'PIZX' - 'PIZY' - 'PIZZ' - Examples: 1. Save the structure with a lower NOE restraint energy. READ COOR CARD UNIT 1 ! Read the first structure READ COOR CARD COMP UNIT 2 ! Read the second structure ENERGY ! Compute energy of first structure SET 1 ?NOE ! save the NOE energy value ENERGY COMP ! Compute the energy of the second structure IF ?NOE LT @1 COOR COPY ! replace first structure if second has ! a lower energy. 2. Write some energy values when saving coordinates .... COOR ORIE RMS MASS ENERGY OPEN WRITE CARD UNIT 22 NAME RESULT.CRD WRITE COOR CARD UNIT 22 * Final coordinates * energy=?ENER and electrostatic energy=?ELEC * mass weighted rms deviation from xray structure is ?RMS * C DEC/CMS REPLACEMENT HISTORY, Element EWALD.DOC C *2 5-JAN-1992 14:46:57 WON "Info directive fixed" C *1 7-DEC-1991 04:16:34 WON "Ewald summation program documentation" C DEC/CMS REPLACEMENT HISTORY, Element EWALD.DOC  File: Ewald, Node: Top, Up: (doc/commands.doc), Next: Syntax The Ewald Summation method Invoking the Ewald summation for calculating the electrostatic interactions can be specified any time the nbond specification parser is invoked. See the syntax section for a list of all commands that invoke this parser. 
Prerequisite reading: nbonds.doc

* Menu:

* Syntax:: Syntax of the Ewald summation specification
* Defaults:: Defaults used in the specification
* Function:: Description of the options
* Discussion:: More general discussion of the algorithm


File: Ewald, Node: Syntax, Up: Top, Next: Defaults, Previous: Top

[SYNTAX EWALD]

{ NBONds   }   { nonbond-spec }
{ UPDAte   }   {              }
{ ENERgy   }   {              }
{ MINImize }   {              }
{ DYNAmics }   {              }

The keywords are:

nonbond-spec::= [ method-spec ] [ ewald-spec ]

method-spec::=  [ EWALD ] [ NOEWald ]

ewald-spec::=   [ KAPPa real ]


File: Ewald, Node: Defaults, Up: Top, Next: Function, Previous: Syntax

The defaults for the Ewald summation are set internally in CHARMM22 and
are currently set to NOEWald and KAPPa = 0.


File: Ewald, Node: Function, Up: Top, Previous: Defaults, Next: Discussion

i)   The EWALD keyword invokes the Ewald summation for calculation of
     electrostatic interactions in periodic, neutral systems. The
     formulation of the Ewald summation dictates that the primary system
     must be neutral. Otherwise, the summation is not formally correct
     and convergence problems may result. The NOEWald keyword (default)
     suppresses the Ewald method for calculating electrostatic
     interactions. The Ewald summation is invoked as a fast nonbond
     option, so FASTER must be greater than 0. The algorithm currently
     supports the atom nonbond list, and the image facility must be used.

ii)  The KAPPa keyword, followed by a real number, governs the width of
     the Gaussian distribution central to the Ewald method. See the
     discussion section for details on choosing an optimum value of KAPPa.

iii) The K space summation is currently set to accommodate 2000 atoms.


File: Ewald, Node: Discussion, Up: Top, Previous: Function, Next: Top

       The Ewald Summation in Molecular Dynamics Simulation

The electrostatic energy of a periodic system can be expressed by a
lattice sum over all pair interactions and over all lattice vectors
excluding the i=j term in the primary box.
Summations carried out in this simple way have been shown to be
conditionally convergent. The method developed by Ewald, in essence,
mathematically transforms this fairly straightforward summation into two
more complicated but rapidly convergent sums. One summation is carried out
in reciprocal space while the other is carried out in real space. Based on
the formulation by Ewald, the simple lattice sum can be reformulated to
give absolutely convergent summations which define the principal value of
the electrostatic potential, called the intrinsic potential. Given the
periodicity present in both crystal calculations and in dynamics
simulations using periodic boundary conditions, the Ewald formulation
becomes well suited for the calculation of the electrostatic energy and
force.

If we consider a system of point charges in the unit or primary cell, we
can specify its charge density by

     ro(r) = sum_i [ q_i * delta(r-r_i) ]

In the Ewald method this distribution is replaced by two other
distributions

     ro_1(r) = sum_i [ q_i * ( delta(r-r_i) - f(r-r_i) ) ]

and

     ro_2(r) = sum_i [ q_i * f(r-r_i) ]

such that the sum of the two recovers the original. The distribution,
f(r), is a spherical distribution generally taken to be Gaussian, the
width of the Gaussian being dictated by the parameter kappa. The charge
distributions are situated on the ion lattice positions, but integrate to
zero.

The potential from the distribution ro_1(r) is a short range potential and
is evaluated in a direct real space summation. Consequently, one can use
standard cutoffs or truncation in this phase of the calculation. The
potential of the diffuse charge distribution placed on the lattice sites
reduces to that of the corresponding point charge at large r. Within the
minimum image convention, one can choose kappa such that the real space
summation converges within a distance of the nearest neighbor images.
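
The effect of the splitting on a single charge can be checked numerically.
The sketch below is written in Python purely for illustration and is not
part of CHARMM. It assumes that f(r) is the normalized Gaussian
(kappa**3 / pi**1.5) * exp(-(kappa*r)**2) and uses units in which the
Coulomb constant is 1; both are assumptions, as are the function names.
Under that assumption the screened point charge (the ro_1 part) has the
potential q*erfc(kappa*r)/r and the Gaussian cloud (the ro_2 part) has the
potential q*erf(kappa*r)/r.

```python
import math


def bare_potential(q, r):
    """Coulomb potential of a point charge q at distance r (units with k = 1)."""
    return q / r


def screened_potential(q, r, kappa):
    """Potential of the point charge plus its neutralizing Gaussian cloud (ro_1 part)."""
    return q * math.erfc(kappa * r) / r


def gaussian_potential(q, r, kappa):
    """Potential of the compensating Gaussian cloud alone (ro_2 part)."""
    return q * math.erf(kappa * r) / r


if __name__ == "__main__":
    q, kappa = 1.0, 0.5  # illustrative values only
    for r in (1.0, 2.0, 4.0, 8.0):
        short = screened_potential(q, r, kappa)
        smooth = gaussian_potential(q, r, kappa)
        # The two parts always add up to the bare 1/r potential.
        print(r, short, smooth, short + smooth, bare_potential(q, r))
```

In this model the screened part falls off much faster than 1/r, which is
why a real space cutoff is acceptable for it, while the Gaussian part
approaches the bare point charge potential at large r.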
ro_2(r), being a continuous distribution of Gaussians situated on the
periodic lattice positions, is a smoothly varying function of r and thus
is well approximated by a superposition of continuous functions. This
distribution is, therefore, expanded in a Fourier series and the potential
is obtained by solving the Poisson equation.

The point of splitting the problem into two parts is that, by a suitable
choice of the parameter determining the width of the Gaussian
distribution, we can get very good convergence of both parts of the
summation. For the real space part of the energy, we choose kappa so that
the complementary error function term, erfc(kappa*r), decreases rapidly
enough with r to make it a good approximation to take only nearest images
in the sum and neglect the terms for which r > rcut. The reciprocal space
sums are rapidly convergent, and a spherical cutoff in k space is applied
so that the sum over k becomes a sum over {l,m,n} with

     (l**2 + m**2 + n**2) <= kmax**2

A large value of kappa means that the real space sum converges more
rapidly but the reciprocal space sum converges less rapidly. In practice
one chooses kappa to give good convergence at the cutoff radius, Rcut.
Kmax is then chosen such that the reciprocal space calculation converges.

C DEC/CMS REPLACEMENT HISTORY, Element GRAPHX.DOC
C *4 12-SEP-1991 19:22:08 WON "Update by Bernie Brooks"
C *3 6-MAY-1991 16:51:28 WON "Info directive fixed"
C *2 4-FEB-1991 17:18:51 WON "from NIH, 02-Feb-91"
C *1 8-APR-1990 19:50:07 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element GRAPHX.DOC

CHARMM documentation file <*> graphics <*> latest update: august 1990

File: Graphx, Node: Top, Up: (doc/commands.doc)

Graphics is a subparser of CHARMM, invoked via the GRAPH command. All of
the miscellaneous commands ( miscom.doc ), coordinate commands
( corman.doc ), and internal coordinate commands ( intcor.doc ) are
available from the GRAPHX> prompt.
Only the 1st three characters are used for primary graphics commands, but many of the options require the 1st four characters. The current implementation is for the Apollo network at NIH; a generic PHIGS+ version is planned. Items marked ** developmental ** are only available in test versions, and may be a little buggy.... NOTE: to run test version, create a link (1st time only) and use run, i.e. crl ~/cdg //ishtar/rvenable/progs/charmm run ~/cdg/charmm Option keywords are indicated by the use of upper case; lower case terms are variable values, generally real numbers, but decimal points are not required. Triplets ( x y z ) are position dependent; omitted values are assumed to be zero. =================================================================== Initializing: GRAPhx [NODBuff] These commands affect what is viewed: DISplay [ON] [MAIN] [COMP] [BOND] [VECT] [ATOM] [BOND] [TEXT] [HBONds] [LABEls] [OFF] COLor color-name [brightfactor] [COMP] atom-selection NONE RED BLACk YELLow GREEn WHITe BLUE CYAN MAGEnta GRAY ORANge BROWn PURPle TURQuoise CHARtreuse DKBLue (or an integer from 0 to 15) LBL label-type label-atoms SIZE label-size COLOr label-color label-type = INIT SEGId RESN RESId TYPE CHEM CHARge WEIGht USER user-label label-atoms = FIRSt and/or atom-selection label-size = VSMAll SMALl MEDIum LARGe label-color = color-name ( see COLOR command above ... ) default: YELLOW user-label = up to 8 characters LINe iwidth :: bonds or vectors; pixels HBStyle [COLOR color-name] [WIDTH iwidth] [DASH idash] :: HBOND style RADii [DEFaults] [PARam] atom-scale [bond-scale] atom-sel DEFault-colors STEreo [ON/OFF] [dist] [angle] DRAw [atom-selection] GPR :: use graphics primitives for each draw FULl :: use full screen graphics INTer :: interactive mode; E to exit, ? for help TEXt [text-body] :: display a title FONT [ VSMAll | SMALl | MEDIum | LARGe ] :: text font; default MEDIUM OFF :: disable all graphics and exit. 
These commands change the view only: RESet SCAle factor [MOL/LAB] [REP int] BOXsize size [MOL/LAB] CENter [atom-selection] MAXwindow POInt x y z ROTate rx ry rz [MOL/LAB] [REP int] TRAnslate x y z [MOL/LAB] [REP int] ZCLip [low] high ZCUe [ [low] high ]/[AUTO] These commands do not affect the display: AUTo [ON/OFF] :: redraw after every command BITmap [file] :: save screen to a bitmap file ERAse [ON/OFF] :: screen clear prior to next drawing HELp :: provides a command listing PRImary :: forces primary buffer use EXEcute pathname :: execute a program; no arguments are passed END :: exit from command parser only. NOTE: the UNIT must be OPENed first... PLOt UNIT n :: HPGL plot file PLUto UNIT n :: Cambridge V.1 PLUTO file XMOle UNIT n :: xmole (Stellar) file. MAKE UNIT n :: LIGHT file. =================================================================== DETAILED COMMAND DESCRIPTION ------------------------------------------------------------------- GRAPhx [NODBuff] Invoked from the main CHARMM command parser; if already initialized (i.e. GRAPHX ... END) the previous graphic states are retained. The NODBuff option disables hardware double-buffering, and may only be used either the 1st time or after graphics termination with the OFF subcommand. The default is whatever the workstation will allow. Display updates will be slower, since a software double-buffer is employed. Intended for debugging only. ------------------------------------------------------------------- DISplay [ON] [MAIN] [COMP] [VECT] [ATOM] [BOND] [TEXT] [HBONds] [LABEls] [OFF] Turns the display of various graphic features on or off; the default is DISPLAY ON MAIN BOND which will show the connectivity of the atoms in main coordinate set. 
The options are:

ON/OFF  enable or disable the display of one of the features
MAIN    the main coordinate set
COMP    the comparison coordinate set; both may be displayed
BOND    atom connectivity as atom-colored half-bonds
ATOM    filled circles using current radius (see RADII)
VECT    display atom terminated vectors
TEXT    title display
HBONDS  current HBOND list, using double width lines
LABELS  residue names, atom types, user labels ...

Examples:

display atom on         ! enable atom display; MAIN assumed
display text            ! toggles title display
display hbonds off      ! disable H-bond display
-------------------------------------------------------------------
COLor color-name [brightfactor] [COMP] atom-selection

Sets the color of individual atoms according to the atom selection,
using one of the color names below:

        NONE RED BLACk YELLow GREEn WHITe BLUE CYAN MAGEnta GRAY ORANge
        BROWn PURPle TURQuoise CHARtreuse DKBLue

The color applies to the main coordinate set unless COMP is specified;
brightfactor is a relative intensity, 0.0 -- 1.0

Examples:

All carbons for segment s are colored cyan:

        color cyan sele type c* .and. segid s end

Colored based on weighting array:

        color green .1 sele prop 1 .gt. 1.0 end
        color green .2 sele prop 1 .gt. 2.0 end
            :      :      :
-------------------------------------------------------------------
LBL label-type label-atoms SIZE label-size COLOR label-color
        label-type  = INIT SEGID RESN[*] RESID[*] TYPE CHEM CHARGE WEIGHT
                      USER user-label
        label-atoms = FIRST[*] and/or atom-selection
        label-size  = VSMALL SMALL[*] MEDIUM LARGE
        label-color = color-name ( see COLOR command above; default: YELLOW )
        user-label  = up to 8 characters

The LBL command identifies which atoms are to be labeled, what atom
attributes are to be included in the label, and the relative size of the
labels; the defaults are marked with an asterisk [*].  The INIT option
clears all labels, and any other options are ignored.
One or more of the following attributes may be included in the label, by
simply including the keyword(s) in the LBL statement:

SEGId   segment name (from GENErate; A4)
RESN    residue name (from the RTF; A4)
RESId   residue ID, a numeral (A4)
TYPE    atom type, e.g. N, CA, CB ... (A4)
CHEM    atom parameter type code (A4)
CHARge  atomic charge (G12.4)
WEIGht  value stored in the weight vector (G12.4)
USER    arbitrary user-specified text (A8)

The label length (24 bytes) is such that all attributes may NOT be
displayed simultaneously; in particular, CHARge and WEIGht may not be
displayed at the same time for the same atom.

SIZE is specified by one of the keywords VSMALL, SMALL, MEDIUM, LARGE,
with a default of SMALL.  The COLOR keyword allows setting the label
color, using the same color names as the COLor command; the default label
color is yellow.  Each use of the LBL command can create a group of labels
of a different size and color, for atoms which don't overlap with any
previous label atom selections.

The blank delimited word following the keyword USER, up to 8 characters,
allows the use of any text string as a label for the selected atoms.

The default is to label the first atom of each residue; an atom selection
overrides this, unless the FIRSt keyword is present; in this case, the
first atom of each selected residue is labeled.

Examples:

! the first atom of each residue is labeled by name with normal text
LBL RESN COLOR CYAN

! the first atom of each residue in the segment MAIN is labeled
! by name and number with very small text
LBL RESN RESID FIRST SELE SEGID MAIN END SIZE VSMALL

! all oxygen atoms are labeled by charge with small text
LBL CHARGE SELE TYPE O* END COLOR CHAR

! all alpha-carbons are labeled by the weight vector with medium text
LBL WEIGHT SELE TYPE CA END SIZE MEDIUM

! enter a null label; may be used to selectively "blank" labels
! in this case, all alpha-carbon labels are set to a string of blanks
! for display efficiency, LBL INIT is preferable
LBL SELE TYPE CA END USER

! show the location of formal charges on amino acid side chains
LBL USER - SELE RESN ASP .AND. TYPE CG END SIZE LARGE COLOR GREEN
LBL USER - SELE RESN GLU .AND. TYPE CD END SIZE LARGE COLOR GREEN
LBL USER + SELE RESN ARG .AND. TYPE CZ END SIZE LARGE COLOR GREEN
LBL USER + SELE RESN LYS .AND. TYPE NZ END SIZE LARGE COLOR GREEN

** developmental **
-------------------------------------------------------------------
LINe iwidth  (bonds or vectors; pixels)

Set the line width for bonds & vectors, in pixels (integer).

Example:

line 2
-------------------------------------------------------------------
HBStyle [COLOR color-name] [WIDTH iwidth] [DASH idash]

Set the style for representing HBONDS; color-name is as for the COLOR
command, and iwidth and idash are integers in pixel units.  Specifying
HBSTYLE alone resets to the default style, which is equivalent to

        HBSTYLE COLOR ORANGE WIDTH 4 DASH 4     (N.B. DASH 0 = solid line)

If at least one option (COLOR, WIDTH, or DASH) is specified, the
remaining options are unchanged; thus

        HBSTYLE COLOR WHITE

will not reset the WIDTH to 4 pixels, but leave it as it was.
-------------------------------------------------------------------
RADii [DEFaults] [PARam] scale [bond] atom-sel

Sets the radius for displaying atoms, and for output files produced by
the PLOT, XMOLE, and MAKE commands.  The options are:

scale     required if no other options are specified, and assumed to be
          1.0 if omitted; performs a relative scale if used by itself,
          or scales the radii set by the DEFAULT or PARAM options
DEFAULTS  set radii to a convenient size for display, based on atom type
          ( C, N, O, ... )
PARAM     use VDW radii
bond      value for bond radii for LIGHT program (see MAKE)
atom-sel  atom selection to apply the radii command to

Examples:

rad 0.8                         ! reduces radii to 80% of current value
radii param .5                  ! set radii to 50% of VDW
radii param .5 .15              ! bond radii to 0.15 A
rad 1.5 sele type H* end        ! enlarge all H atoms by 50%
-------------------------------------------------------------------
DEFault-colors

Restore the default color assignments, based on element type.

Example:

default
-------------------------------------------------------------------
STEreo [ON/OFF] [dist] [angle]

Invoke side-by-side stereo mode for screen display and for output files
produced by the PLOT and MAKE commands, when the ON keyword is used, or
when stereo is off.  The dist option controls the separation between the
two images; the angle option specifies the parallax angle for left and
right eye views.  The OFF keyword is used to return to mono mode, and is
assumed if the command is used while in stereo mode.

Example:

stereo on 16.0 7        ! default stereo
-------------------------------------------------------------------
DRAw [atom-selection]

Forces a redraw when AUTO mode is off; also used to change which set of
atoms is currently being displayed.  All display modes and output files
from PLOT, PLUTO, XMOLE, and MAKE commands use this atom selection.  The
initial selection is all atoms and may be restored via:

draw sele all end
-------------------------------------------------------------------
GPR

Use Apollo GPR graphics; the only mode currently available.
-------------------------------------------------------------------
FULL

Use full screen graphics, without an input window; most useful for
photography, when the display is controlled from another workstation.
-------------------------------------------------------------------
INTer

Enter full screen interactive mode, which allows the keyboard to be used
to manipulate the structure; also allows DN10000 stereo mode.  Keystrokes
are case-sensitive, e.g. key [SHIFT]-[E] to exit.
        INTERACTIVE KEYSTROKE SUMMARY
===============================================
E            exit interactive mode
CTRL-A       toggle AUTO mode
a/A          atom display ON/OFF
b            save bitmap to file screen.img
c/C          center / MASS weighted center
d            forced redraw (AUTO off)
h/H          H-bond display ON/OFF
lN           line width N; N=[1-9] (2 keys)
m            maximum window
CTRL-R       graphics state reset
rNN          set rot inc in deg (3 keys)
tN           set tran inc in A,0=10 (2 keys)
vN           set radii to .N*VDW (2 keys)
z            auto zcue over mol coord
CTRL-Z       zcue off
[<--][-->]   X translation
[^][v]       Y translation
NEXT WNDW    +Z translation, SHIFT for -Z
<-- -->      X axis rotation
^ v          Y axis rotation
SHIFT ^ v    Z axis rotation
s            toggle side-by-side stereo
S            toggle hardware stereo (DN10000)
F1,F1S       scale by 1.05 / 0.95
F2,F2S       scale by 1.25 / 0.80
F3,F3S       scale by 2.00 / 0.50
===============================================
HELP or ?    display help screen

N.B.  <--, -->, ^, v  indicate cursor arrow keys

** developmental **
-------------------------------------------------------------------
TEXT [text-body]

Supply the text for the title display.
-------------------------------------------------------------------
FONt [ SMall | NOrmal | MEdium | LArge ]

Change the title font to one of four sizes; the initial setting is
NORMAL, as is the default if no name is specified.
-------------------------------------------------------------------
OFF

Disable all graphics and exit the graphics subcommand parser.
-------------------------------------------------------------------
RESet

Restores view settings to the program defaults ( scale, translation and
rotations).
-------------------------------------------------------------------
SCAle factor [MOL/LAB] [REP int]

Change the Angstrom/pixel scale factor; initially, 1 A = 32 pixels.  The
repeat factor can provide an ersatz zoom effect.
-------------------------------------------------------------------
BOXsize size [MOL/LAB]
-------------------------------------------------------------------
CENter [atom-selection]

Move the selected atoms to the center of display space.
-------------------------------------------------------------------
MAXwindow

Scales the molecule to fit in the display window.
-------------------------------------------------------------------
POInt x y z

The point specified by x y z becomes the center of display space.
-------------------------------------------------------------------
ROTate rx ry rz [MOL/LAB] [REP int]

Apply a rotation to the viewing transform; does not affect the
coordinates.

Examples:

rot 0 90        ! rotate by 90 deg around the y axis
rotate 180      ! rotate by 180 deg around the x axis
rot 90 0 90     ! rotate by 90 deg around x, then 90 deg around z
-------------------------------------------------------------------
TRAnslate x y z [MOL/LAB] [REP int]

Apply a translation to the viewing transform; does not affect the
coordinates.

Examples:

tran 2 9.5      ! translate +2 A along x axis, +9.5 A along y axis
tra 0 0 4       ! translate +4 A along the z axis
-------------------------------------------------------------------
ZCLip [low] high

Set hither and yon clip limits; atoms outside the limits are not
displayed, whether selected or not.

Examples:

zclip 10        ! atoms outside z = ( -10 .. +10 ) are not displayed
zclip -5 10
-------------------------------------------------------------------
ZCUe [[low] high ]/[AUTO]

Controls the z coordinate range over which depth cueing will be applied.

AUTO    use the atom coordinates to set the limits.

Examples:

zcue 10         ! zcue from -10 to +10
zcue -4 8
zcue auto
-------------------------------------------------------------------
AUTo [ON/OFF]

Enables or disables automatic redraw after every command; the initial
setting is ON, and the command functions as a toggle.  AUTO OFF is useful
for making multiple changes without the time required for a redraw.
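
The range convention shared by the ZCLip and ZCUe examples above (a single
value gives a symmetric range, two values give explicit limits) can be
illustrated with a short Python sketch.  This is not CHARMM code: the
function names are hypothetical, and treating the limits as inclusive is
an assumption made only for this illustration.

```python
def z_limits(*values):
    """Map ZCLip/ZCUe-style arguments to a (low, high) z range.

    One value 'high' is read as (-high, +high), as in 'zclip 10';
    two values are read as (low, high), as in 'zclip -5 10'.
    """
    if len(values) == 1:
        (high,) = values
        return (-high, high)
    if len(values) == 2:
        low, high = values
        return (low, high)
    raise ValueError("expected 'high' or 'low high'")


def is_displayed(z, limits):
    """True if z lies inside the limits (inclusive bounds are assumed)."""
    low, high = limits
    return low <= z <= high


print(z_limits(10))                           # symmetric range
print(z_limits(-5, 10))                       # explicit range
print(is_displayed(12.0, z_limits(10)))       # outside the clip range
```

The AUTO form of ZCUe is left out because it depends on the current atom
coordinates.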
-------------------------------------------------------------------
BITmap [file]

The current screen is saved to a bitmap file using the supplied file name
or a default name of screen.img if no name is supplied.  The bitmaps may
be printed on the Tektronix 4693DX color printer.
WARNING: files are ca. 1 Mb in size (1288 blocks).

** developmental **
-------------------------------------------------------------------
ERAse [ON/OFF]

Enables or disables a screen clear prior to the next drawing; the initial
setting is ON, and the command functions as a toggle.  ERASE OFF is useful
for overlaying trajectory frames or other related collections of
structures.  Possibly best used with AUTO OFF, and using DRAW for each
structure.  Also applies to PLOT command; the plot file is not closed
until one of the following:

(1) ERASE ON followed by a final PLOT command; correct method
(2) the UNIT is CLOSEd; no title or page eject, etc.
(3) the program terminates; no title or page eject, etc.

** developmental **
-------------------------------------------------------------------
HELp

List the available commands and syntax.
-------------------------------------------------------------------
PRImary

Forces use of the primary graphics buffer on hardware double-buffer
displays; needed to avoid blank images when using /com/cpscr to save
screen images for printing, e.g. with the Tektronix 4693DX color printer.
The Apollo screen dump /com/cpscr only saves bit planes 0-7 to the bitmap
file.
-------------------------------------------------------------------
EXEcute pathname

Execute a program w/o arguments.

Example:

exe /com/ld     ! list the current directory (aegis)
exe /bin/ls     ! list the current directory (unix)
-------------------------------------------------------------------
END

Exit from command parser only; allows other CHARMM commands to be
performed w/o losing the current display.  Re-invoking graphics does not
re-initialize the graphics settings.
-------------------------------------------------------------------
PLOt UNIT n

Writes out an HPGL plot file using the current atom selection and view
transform.  The UNIT must be OPENed first ( recommended file extension
.hp ).  Use of more than 6 colors is not recommended; the color
translation is:

plotter         screen
===============================================================
pen 1           black (light gray) gray
pen 2           red
pen 3           green chartreuse
pen 4           blue dkblue turquoise
pen 5           magenta purple
pen 6           yellow orange
pen 7           cyan brown
pen 8           white
===============================================================

Atom display on the plotter is also fairly primitive (filled circles) and
is not intended for space-filling radii, but for a simplistic ball and
stick type drawing with small atom radii.
-------------------------------------------------------------------
PLUto UNIT n

Writes out atom coordinates and connectivity based on the current atom
selection and view transform.  WARNING: 999 atom limit!  Stereo mode
settings are ignored, as are radii and color.  Also, atoms should be
renamed to their element types to get proper radii, etc. within pluto
(e.g. CA is not carbon alpha, it's calcium).  As with PLOT, the UNIT must
be opened first ( recommended file extension .fdat ).  See the ~/nihcom
help file for pluto for information on using the HP and APOLLO versions
of the pluto program.

Example:

rename atom C sele type C* end
rename atom N sele type N* end
rename atom O sele type O* end
rename atom H sele type H* end
open unit 50 write card name molecule.fdat
graphics
scale .5
rot 0 90
pluto unit 50
off
-------------------------------------------------------------------
XMOle UNIT n

Writes atom coordinates, radii and color to a file formatted for display
using the xmole (Stellar) demo program.  Uses the current atom selection;
stereo settings and view transforms are ignored.  The UNIT must be opened
first ( recommended file extension .xmo ).
-------------------------------------------------------------------
MAKE UNIT n

Writes atom coordinates, radii, and color to a file formatted for the
LIGHT program (see ~/nihcom help file) which produces nice ray-traced
images using current stereo settings and view transform.  The UNIT must
be opened first ( recommended file extension .atm ).
-------------------------------------------------------------------
C DEC/CMS REPLACEMENT HISTORY, Element HBONDS.DOC
C *4 18-NOV-1991 14:55:30 WON "B. Brooks update"
C *3 6-MAY-1991 16:57:33 WON "Info directive fixed"
C *2 4-FEB-1991 17:21:41 WON "from NIH, 02-Feb-91"
C *1 8-APR-1990 19:50:08 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element HBONDS.DOC

File: Hbonds, Node: Top, Up: (doc/commands.doc), Next: Syntax

                Generation of Hydrogen Bonds

The generation of hydrogen bonds is one of the major steps in evaluating
the energy of a system.  The process of hydrogen bond generation involves
looking at all possible pairs of hydrogen bond donors and acceptors and
selecting those which are "good".  The meaning of "good" is determined by
parameters to be described below.  In addition, the generation routine is
responsible for constructing the positions of all uncoordinated hydrogens
and adding them into the coordinate list.

The selection of hydrogen bonds involves three checks.  First, any good
hydrogen bond has a length less than some cutoff.  Second, the angle off
linearity has a value less than some cutoff.  This angle is
180 - D--H...A.  Finally, if a hydrogen donor has more than one acceptor
which satisfies the above constraints and BEST is specified, the routine
will select the one with the lowest energy (normally it will take ALL and
let the minimization or dynamics adjust their strengths).  To obtain a
more detailed description of the selection process and the process of
constructing hydrogen coordinates, the CHARMM paper should be consulted.
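
The first two checks can be illustrated with a short Python sketch.  This
is not the CHARMM routine: the function names are hypothetical, the
coordinates are plain (x, y, z) tuples, and the default cutoffs are the
CUTHB (4.5) and CUTHBA (90.0) defaults listed in the Function node.  The
length is measured between the heavy atoms, as stated there for CUTHB.

```python
import math


def off_linearity_deg(donor, hydrogen, acceptor):
    """Return 180 - (D--H...A angle) in degrees."""
    hd = [d - h for d, h in zip(donor, hydrogen)]
    ha = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(hd, ha))
    norm = math.dist(donor, hydrogen) * math.dist(acceptor, hydrogen)
    cos_angle = max(-1.0, min(1.0, dot / norm))   # guard against round-off
    return 180.0 - math.degrees(math.acos(cos_angle))


def is_good_hbond(donor, hydrogen, acceptor, cuthb=4.5, cuthba=90.0):
    """Apply the length check and the off-linearity check."""
    length_ok = math.dist(donor, acceptor) < cuthb
    angle_ok = off_linearity_deg(donor, hydrogen, acceptor) < cuthba
    return length_ok and angle_ok


# Linear D--H...A arrangement, heavy atoms 2.9 A apart: passes both checks.
print(is_good_hbond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
# Heavy atoms 5.0 A apart: fails the length check.
print(is_good_hbond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)))
```

The third check (BEST) would then keep, for each donor, only the passing
acceptor with the lowest hydrogen bond energy; it is omitted here because
it needs the energy function itself.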
Because there are cutoffs involved with the selection of hydrogen bonds,
and because the hydrogen bond list must be updated during dynamics, and
because energy must be conserved, switching functions are needed to
smooth the transition over a cutoff.  Therefore, the specification of
hydrogen bond generation also allows the specification of switching
function parameters.

One should note that particular choices for the selection process will
never conserve energy in a dynamics run.  First, one must fix the hydrogen
bond list if one uses the extended atom representation.  This is necessary
as one cannot apply a switching function to the hydrogen bond angle as it
is not calculated if the hydrogens are not present.  Second, the selection
of the best hydrogen bond for a given donor can't be used, because there
is no switching function to smooth the transition between two possible
and mutually exclusive hydrogen bonds.

The generation is performed by CHARMM at several different points.  One
can request the hydrogen bonds be generated explicitly using a hydrogen
bond command.  This is useful prior to analyzing the system.  The hydrogen
bonds can be generated during any energy manipulation, see
*note Energy:(doc/energy.doc).

* Menu:

* Syntax::      Syntax of the Hydrogen bond specification
* Function::    Purpose of each of the keywords

File: Hbonds, Node: Syntax, Up: Top, Next: Function, Previous: Top

                Syntax of the Hydrogen Bond Command

[SYNTAX HBONd]

{ HBONds       }  { [IHBFrq integer]  hbond-spec }
{ UPDAte   ... }  { [IHBFrq 0      ]             }
{ MINImize ... }  {                              }
{ DYNAmics ... }  {                              }
{ ENERgy   ... }  {                              }

hbond-spec ::= [BEST] [DUMMy] [CUTHB real] [CUTHBA real] [ACCE] [INIT]
               [ALL ]                                    [NOAC]
               [HBEXclude] [CTONHB real] [CTOFHB real]
               [HBNOexcl ] [CTONHA real] [CTOFHA real]

NOTE: The IHBFrq value is remembered.  If its value is zero,
interpretation of [hbond-spec] will be suppressed as well as any
modifications to the hbond list.
[SYNTAX HBTRim]

HBTRim real

File: Hbonds, Node: Function, Up: Top, Previous: Syntax, Next: HBTRim

        Purpose of the various hydrogen bond variables.

Variable    Default      Function

ACCE/NOAC   ACCE         ACCE specifies that acceptor antecedents will be
                         used in an (H-A-AA) angle factor where present in
                         the structure file (from the RTF).

HBEX/HBNOexclude         HBEXclude causes all hydrogen bonds between
                         excluded atoms to be removed in the hbond edit
                         facility.  This also includes 1-4 interactions if
                         appropriate as determined by the NBXMode nonbond
                         value.  This option is needed for systems where no
                         angle cutoff is applied (as in the AMBER
                         potential).

BEST/ALL    ALL          BEST turns on selection of the best hydrogen bond
                         for a given donor.  ALL takes all hydrogen bonds
                         for a given donor which satisfy the other
                         conditions.

DUMMy                    Sets CUTHB and CUTHBA to zero.  This will result
                         in no hydrogen bonds, which is desirable when one
                         is not interested in the hydrogen bond energy.
                         The selection will be done very quickly in this
                         case.

CUTHB       4.5          Maximum distance allowed for a hydrogen bond.
                         This distance is measured between the heavy atoms.
                         NOTE: a CUTHB value less than 1.0 will disable the
                         HBOND generation code (for efficiency).

CTOFHB      CUTHB-0.5    Distance where the distance switching function is
                         off.  Once specified, it will only change if
                         respecified.

CTONHB      CTOFHB-0.5   Distance where the distance switching function is
                         on.  Once specified, it will only change if
                         respecified.

CUTHBA      90.0         Maximum out of line angle allowed for a hydrogen
                         bond.  The angle is 180 - D--H...A angle.

CTOFHA      CUTHBA-20.0  Angle where the angle switching function is off.
                         Once specified, it will only change if
                         respecified.

CTONHA      CTOFHA-20.0  Angle where the angle switching function is on.
                         Once specified, it will only change if
                         respecified.

INIT        do not       INIT specifies that all values and conditions
                         return to the original defaults.
File: Hbonds, Node: HBTRim, Up: Top, Previous: Function, Next: Top

The HBTRim command deletes all hydrogen bonds that have an energy of
interaction that is higher than the specified cutoff.  This command is
used to reduce a list of all hydrogen bonds to that of important hydrogen
bonds.  The syntax is:

        HBTRim real

where the real value is the energy cutoff and should usually be negative.

C DEC/CMS REPLACEMENT HISTORY, Element HBUILD.DOC
C *3 6-MAY-1991 17:08:16 WON "Info directive fixed"
C *2 4-FEB-1991 17:22:34 WON "from NIH, 02-Feb-91"
C *1 8-APR-1990 19:50:10 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element HBUILD.DOC

File: Hbuild, Node: Top, Up: (doc/commands.doc), Next: Syntax

                Construction of hydrogen positions

                By Axel Brunger, December 1983

* Menu:

* Syntax::      Syntax of the HBUILD command
* Algorithm::   Description of the algorithm used

File: Hbuild, Node: Syntax, Up: Top, Next: Algorithm, Previous: Top

                Syntax of the HBUILD command

[SYNTAX HBUILD]

HBUILD [atom-selection] hbond-spec non-bond-spec [PHIStp real] [PRINt]
       [CUTWater real] [WARN] [DISTof real] [ANGLon real]

where

atom-selection  specifies the hydrogens to be (re-)constructed (see
        *note selection:(doc/select.doc).).  By default (if no selection
        is specified) these are all unknown hydrogens and lone pairs
        (this is equivalent to a selection
        "SELEction (LONE .OR. HYDRogen) .AND..NOT INITial").

hbond-spec  are hydrogen bond specifications, see
        (*note hbonds:(doc/hbonds.doc)Syntax.) for the detailed syntax, and

non-bond-spec  are non-bonded interaction specifications, see
        (*note nbonds:(doc/nbonds.doc)Syntax.) for the detailed syntax.

At present the use of the following options is not supported by HBUILD
and may lead to errors: BEST in hbond-spec, GROUP [...] in non-bond-spec.

PHIStp  (default: 10 degrees) determines the step size of the donor
        group rotation algorithm in HBUILD.
PRINt   (default: PRINt flag off) if specified, prints information about
        the electrostatic, Van der Waals, hydrogen bond, and dihedral
        energies as well as the ST2 energy while the algorithm runs.

If WARN is specified, routine ST2WRN is invoked after exiting HBUILD to
provide information about unlikely water-(non-polar group)
configurations.  See that routine for the purpose of DISTof and ANGLon.

Any bond between atoms, both of which are to be built, will be ignored.
If it is desired to build a chain of atoms with this method, it is
essential to build each level in this chain with a separate HBUILD
invocation.

File: Hbuild, Node: Algorithm, Up: Top, Previous: Syntax, Next: Top

                Algorithm of the hydrogen builder

1. Introduction

In most cases an X-ray diffraction structure contains no information
about the positions of the protons of a particular protein.  However, our
empirical hydrogen bond energy function in CHARMM requires the treatment
of explicit protons, at least for hydrogen bond forming protons.
Constructing proton positions starting from the X-ray structure of a
protein is the task of our method.  At present only hydrogen bonding
protons are constructed.  Due to the generality of the algorithm, the
positions of aliphatic protons could also be easily constructed.

Proton coordinates are constructed for the protein as well as for the
surrounding water.  The water requires special treatment and the
investigations for this part of the method are not yet complete.

The presented method was tested using the neutron diffraction structures
of two different protein systems, each including several water molecules.
One structure was ribonuclease A with 128 water molecules.  The other
structure was trypsin with 30 ordered water molecules.  The knowledge of
the proton positions using the neutron data allowed detailed comparisons
of spatial positions of the protons, hydrogen bond and energy differences.
The results indicate that the use of the presented method should yield a
good initial structure of the protons and is therefore a useful tool in
cases where no neutron structure is available.

2. Methods

In the first part of our method all proton positions of the protein are
constructed.  The protons are classified according to their environment.
At present the following classes are defined:

a) proton bound to a donor with at least two heavy donor antecedents
   (e.g. (C, CA)-N-H)
b) proton bound to a donor with one heavy donor antecedent and no other
   proton (e.g. -OH-HH of tyrosine)
c) proton bound to a donor with one heavy donor antecedent and one other
   proton (e.g. -NH2-(HH21, HH22) group of arginine)
d) proton bound to a donor with one heavy donor antecedent and two other
   protons (e.g. -NZ-(HZ1, HZ2, HZ3) group of lysine)

First, all protons of class a) are placed by using equilibrium bond
lengths, angles and dihedrals.  This problem is overdetermined if there
exists more than one heavy donor antecedent.  In these cases an averaging
over all possible ways to place the proton is performed.

In the next step the protons of all other classes are constructed.  All
these classes have in common that there is a degree of freedom to place
the protons (e.g. a spin around the CE-NZ bond of lysine).  To find an
optimum position, the dihedral angle with the symmetry axis
antecedent-donor is modified in small steps over a certain range
determined by the symmetry of the donor group.  For each dihedral angle
the protons of the donor are placed according to their equilibrium
geometry and the relative energy of the corresponding configuration is
evaluated.  The energy is determined by using the hydrogen bond potential,
the Van der Waals term, electrostatic term and the dihedral term derived
from the full energy expression of CHARMM.  The dihedral with the lowest
energy is taken and the protons of the donor group are placed with the
optimum dihedral angle.
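
The stepwise search over the dihedral angle can be sketched in Python as
follows.  This is only an illustration of a grid scan, not the HBUILD
source: the function names, the scan range argument and the toy energy
function are hypothetical, and the real energy is the sum of CHARMM terms
described above.  The 10 degree default step mirrors the PHIStp default.

```python
import math


def scan_dihedral(energy, phi_range_deg=360.0, phistp_deg=10.0):
    """Return (best_phi, best_energy) over a regular grid of dihedrals.

    energy        callable giving the relative energy at a dihedral (deg)
    phi_range_deg range to scan; in HBUILD it is set by the symmetry of
                  the donor group (passed in by hand here, an assumption)
    phistp_deg    step size, playing the role of PHIStp
    """
    n_steps = int(round(phi_range_deg / phistp_deg))
    best_phi, best_energy = None, math.inf
    for i in range(n_steps):
        phi = i * phistp_deg
        e = energy(phi)          # place protons at phi, evaluate energy
        if e < best_energy:
            best_phi, best_energy = phi, e
    return best_phi, best_energy


def toy_energy(phi_deg):
    """Hypothetical stand-in energy with a single minimum at 120 deg."""
    return 1.0 - math.cos(math.radians(phi_deg - 120.0))


print(scan_dihedral(toy_energy))
```

A smaller step gives a finer search at the cost of more energy
evaluations per donor group.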
This placement procedure is performed in the order given by the residue
sequence of the protein.  Protons not yet constructed have no influence on
the current energy evaluations.

After construction of all explicit protein protons the water protons are
constructed.  First, a sequence of water molecules is determined
independent of any input sequence (e.g. by the X-ray data).  The waters
are ordered with respect to the minimum distance of the water oxygen to
any protein atom.  The protons of waters near the protein are constructed
first.  At present there are three classes of water molecules treated in
our method:

a) water able to form two different hydrogen bonds to acceptor atoms
b) water able to form only one hydrogen bond to an acceptor atom
c) water forms no hydrogen bonds at all to acceptor atoms.

In case a) protons are placed by performing a rotation of the water
molecule in the plane defined by the two best hydrogen bonds and taking
the minimum energy configuration.  In case b) one proton is placed on the
(linear) hydrogen bond and the water is rotated around this hydrogen bond
axis, placing the other proton using the equilibrium geometry.  Again the
minimum energy configuration is taken.  The evaluated relative energy is
the sum of the Van der Waals, the electrostatic and the hydrogen bond
energy terms.  Finally, the water protons of case c) are placed in a
standard way (H1 on x-axis, H2 in x,y plane) after all other protons have
been placed.

ST2 water molecules are treated as regular waters for the proton
construction.  The position of the lone pairs is derived from the proton
positions.

C DEC/CMS REPLACEMENT HISTORY, Element IMAGES.DOC
C *3 6-MAY-1991 17:09:47 WON "Info directive fixed"
C *2 4-FEB-1991 17:23:32 WON "from NIH, 02-Feb-91"
C *1 8-APR-1990 19:50:15 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element IMAGES.DOC

File: Images, Node: Top, Up: (doc/commands.doc), Next: Read

                IMAGES

                By Bernard R. Brooks, 1983

CHARMM has a general image support system that allows the simulation of
almost any crystal and also finite point groups (such as dimers and
tetramers...).  There is also a facility to introduce bond linkages (with
additional energy terms including angles, dihedrals and improper
dihedrals) between the primary atoms and image atoms.  This allows
infinite polymers, such as DNA, to be studied.  For infinite systems, an
asymmetric unit may be studied because rotations and reflections are
allowed transformations.

The IMAGE facility is invoked by reading an image transformation file.
From this point, the images of the primary atoms will be included in any
energy and force determinations for the remainder of the calculation.  A
null image file with the INIT keyword will disable this facility.

* Menu:

* Read::        Description of the IMAGE data file.
* Write::       The write and print options regarding images.
* Update::      Options and description of the image update.
* Patching::    Specification of image patching.
* Centering::   Specification of image centering during updates.
* Operation::   Some details and requirements for operation

File: Images, Node: Read, Up: Top, Next: Write, Previous: Top

                Image Transformation File

The IMAGE file contains all of the information needed to define the
position and orientation of all symmetric images of the primary atoms.
The file is read in free field and may be passed parameters.

Syntax:

READ IMAGe [CARD] [UNIT integer] [INIT] [PRINT]

File Structure:

(title)
SCALE xfactor yfactor zfactor
IMAGE image_name                        ! start of a new transformation
DEFIne repeat( [INVErse] other_image )  ! define transformation from others
ROTAte xdir ydir zdir angle             ! specify an axis and angle
TRANslate xdir ydir zdir [distance]     ! specify a displacement
NEGAte                                  ! invert through origin
END                                     ! terminates transformation file

When an image file is read, any existing image transformations are
discarded, but not any information regarding image patching.
The INIT keyword on reading will remove ALL existing image data first.
Rereading an image transformation file without the INIT keyword is useful
when crystal parameters are to be modified, but the patching data is to
be retained.  The PRINT option prints all data as it is read.

The image file starts with a standard CHARMM title.  The remaining
commands are processed sequentially.

The SCALE command gives three values which multiply all subsequent
transformation specifications.  The default values are unity.

The IMAGE command initiates a new image transformation.  The
transformation matrix is set to unity (no rotation, no translation).
This transformation matrix is then modified by subsequent commands until
another IMAGE command or the END command is found.

The DEFIne command multiplies the current transformation matrix by any of
the previously defined images.  The INVErse keyword preceding the other
transformation name uses the inverse of that transformation.  Any number
of previous transformations may be listed with this command, but they are
processed sequentially (just in case they don't commute).

The ROTAte command causes the current transformation matrix to be
operated on with a rotation.  Four real numbers must follow which define
a rotation axis and an angle about this axis (in degrees).

The TRANslate command will translate the current transformation matrix.
If three values are specified, then this is used as the translation
vector.  If four values are given, then the first three define a
direction, and the fourth value defines a distance.  Before operating on
the transformation matrix, the elements of this vector are multiplied by
the current scale factors (from the last SCALe command).

The NEGAte command projects the current transformation through the
origin.  This operation changes the chirality of the system, and is not
appropriate for macromolecules.  This operation is required for
simulations with glide planes.

The END command is used to terminate the IMAGE file.
The END command is required if the file is read from the input stream.

____________________________________________________________________

        One restriction on the transformations is that every
transformation MUST have an inverse. A serious warning is issued if
this restriction is violated. The requirement comes from the way the
energy of the system is defined. When computing the energy, the
Hamiltonian is assumed to be symmetric, and only the lower half is
generated. The result of having an image without an inverse
transformation is to remove the symmetry of the Hamiltonian. The
considerations of program efficiency and memory requirements make this
necessary. There may be cases where a transformation without an inverse
is desired, such as when no energy calculations are needed or for
structural analysis. A transformation may be its own inverse, as is the
case when a transformation consists of only a 180 degree rotation.

        The maximum number of allowed transformations is 100. This limit
can easily be increased.

File: Images, Node: Write, Previous: Read, Up: Top, Next: Update

                Image Writing and Printing

        Several different types of image data may be written or printed.
These are used for analysis and to check the operation of the program.

Syntax:

WRITE IMAGes { TRANsformations }
             { PSF             }
             { FORCes          }

        The TRANsformation option will list all image transformation
matrices as well as what the inverse transformations are. For each
transformation, a 3x3 rotation matrix is given, followed by the
translation vector. In this program the translation is applied AFTER
the rotation has been made.

        The PSF option lists information about the image atoms and lists
all primary-image internal coordinates (bonds, angles, dihedrals, and
improper dihedrals).

        The FORCe option lists the total force and torque each image
transformation applies on the primary atoms. This data may be used to
estimate the pressure of a system, or to check if minimization is complete.
At the end, the total force (vector sum) and torques are listed.

File: Images, Node: Update, Previous: Write, Up: Top, Next: Patching

                Image Updating

        The image update procedure has several functions. This updating
is done prior to any nonbond or hydrogen bond updating, because its
results may affect those updates.

[Syntax IMAGE updating]

...... {                                                }  ! no change
       { IMGFrq int [CUTIm real] [IMALl  ] [INVErse  ]  }
       {                         [IMBRief] [NOINverse]  }
       {                                                }
       { IMGFrq 0                                       }  ! suppress image updating

        The absence of the IMGFrq keyword maintains the current status
of image updating. Specifying an IMGFrq value of zero suppresses all
image update functions, but does not modify the image lists in any way.

        The IMGFrq integer value gives the frequency of image updating
to use during dynamics or minimization. For setting up a single image
update, any positive value may be used.

        The CUTIm value gives the maximum allowable distance of any
group to be included in the image atom lists.

        Normally, a group is included only if it belongs to a
transformation whose inverse transformation is of a higher index than
its own. This is because only the lower triangle of the Hamiltonian is
computed and any image interaction between primary atoms and image
atoms of a higher inverse index will already be computed. This
efficiency consideration greatly reduces the required number of image
atoms and the size of the image nonbond lists. This reduction is
activated by the use of the IMBRief option. If, on the other hand, one
desires these groups for the purpose of analysis or for displays, the
IMALl keyword may be used to generate them as well.

        The sequence of events in this update is:

1) Save existing image atom lists (from the previous update).
2) Process image centering if requested to replace far off groups of
   atoms by a closer image.
3) Generate appropriate image atoms within the cutoff distance of the
   primary atoms.
4) Remap internal coordinate energy list if the new image atom list
   differs from the previous one.
   Also remap the IC table and image exclusion lists.

        The INVErse and NOINverse options are internal and neither should
be specified under normal circumstances.

File: Images, Node: Patching, Previous: Update, Up: Top, Next: Centering

                Image Structure File Patching

        This command introduces bonding linkages between primary atoms
and image atoms. This allows the simulation of infinite (or cyclic)
polymers.

[Syntax IMPAtch (image patching)]

IMPAtch  patch_residue  repeat( image_name segid resid )  [SETUp] [WARN]

        The patch_residue must be present in the topology file and the
syntax of this patch residue is identical to ordinary patching (see
*note patch:(doc/struct.doc)patch.), with the restriction that the
ATOM, DONOr, and ACCEptor specifications may not be used. Atom
characteristics may not be modified with this command. The donor and
acceptor status of any image atom must match that of the corresponding
primary atom. The patching specifications that are recognised are:
        BOND, ANGLe, DIHEdral, IMPHi, and IC (internal coordinates)

        A residue specification is required for each residue used in the
PRES. Each one is specified by three names: (1) the image name (for
primary atoms the name "PRIM" must be used), (2) the segid, and
(3) the resid.

        The SETUp keyword causes all PRES IC table entries to be added to
the current IC table. The WARN keyword makes all errors nonfatal and
lists them.

File: Images, Node: Centering, Previous: Patching, Up: Top, Next: Operation

                Image Centering

        There is a set of commands that allow for the centering of a
selected part of the PSF during an image update. This is primarily
designed for solvent, but may be used in many ways.

[Syntax IMAGE (image centering)]

IMAGE  { FIXEd      }  [ XCEN real ] [ YCEN real ] [ ZCEN real ]
       { BYSEgments }          atom-selection
       { BYREsidues }
       { BYGRoups   }
       { BYAToms    }

        During minimization, a particular water may become far from the
rest of the primary structure.
The centering feature allows one of its images (the one closest to the
primary space) to become the primary water. It is also useful when
setting up a crystal calculation. With a single update, the "best"
image choice of all solvent molecules may be made. One example of this
is the netropsin crystal where one of the published sulfate groups is
quite far from the primary netropsin. This command is required for a
pure solvent simulation where solvent can freely diffuse.

        The execution of this command only sets up data used during the
image update. There is only one value each for XCEN, YCEN, and ZCEN.
If these values are not specified in any IMAGE command, then they are
not modified (default 0.0).

        For each atom, there is a flag specifying the manner of image
centering to be used. Each invocation of the IMAGE command may modify
these flags. The default is FIXEd (don't center this atom). The
BYSEgment option will center an entire segment as a group (provided it
has no FIXEd atoms). The remaining options allow certain other groups
of atoms to be centered as a group. Centering only one part of a
molecule would not work well (there is no checking for this!).

        The commands:

    IMAGE FIXED SELE ALL END          - will turn off all centering
    IMAGE BYRES SELE RESNAME ST2 END  - will allow centering of all ST2's
    IMAGE BYATOM SELE ALL END         - will not work if there are any bonds

File: Images, Node: Operation, Up: Top, Previous: Centering, Next: Top

                Image Operation

        The IMAGE routines in CHARMM can be classified into five
sections. These categories are:

    1) Set up images        - IMREAD,REIMAG,INIMAG,IMPATC,IMATOM,IMSPEC
    2) Update image arrays  - UPIMAG,IMCENT,MKIMAT,IMMAP,MKIMNB
    3) Set up energy lists  - IMHBON,NEWHBL,IMHBFX,NBONDM
    4) Compute image energy - EIMAGE,TRANSO,TRANSI
    5) Print out            - IMWRIT,IMPSFW

        The first category involves reading the image file (IMREAD) and
setting up the data structure (REIMAG,INIMAG).
This section also contains the routines involving image patching and
setting up the centering options.

        The second category concerns itself with the selection of which
image groups are to be kept. This selection process is repeated at each
image update. The centering, the PSF remapping (if the atom list has
changed), and the generation of the image exclusion lists are also done
here.

        The third category, in addition to finding the energy term
codes, also generates the nonbond and hydrogen bond lists between
primary and image atoms.

        The fourth category is concerned with the computation of energy
terms. For the actual computation of energy, standard routines are used
(ENBOND,EHBOND,ENST2) with a modified calling sequence.

        The procedure used is:

    1) Compute coordinates for all image atoms
    2) Set up arrays for self energy terms (atom with its own image)
    3) Compute self terms, divide energy by 2, zero out image forces
    4) Compute remaining terms including forces on image atoms
    5) Transform forces on image atoms back into the primary space

        Using a procedure where the forces on image atoms are kept allows
for a substantial reduction in the number of necessary image atoms.
This results in the necessity that all transformations have an inverse.
This procedure has the drawback that the self energy terms must be
treated specially and that all Hbonds between image and primary atoms
must be computed and then trimmed of any repeats.

        Since there is no treatment of the second derivative of the
energy for image atoms, the procedures involving Newton-Raphson
minimizations and vibrational analysis should be avoided
(see *note Energy: (doc/energy.doc).).

C DEC/CMS REPLACEMENT HISTORY, Element INTCOR.DOC
C *5 18-NOV-1991 14:59:48 WON "Updated by B.
Brooks" C *4 12-SEP-1991 19:16:20 WON "Update by Bernie Brooks" C *3 6-MAY-1991 17:18:25 WON "Info directive fixed" C *2 4-FEB-1991 17:24:49 WON "from NIH, 02-Feb-91" C *1 8-APR-1990 19:50:21 KOTTALAM "charmm documentation" C DEC/CMS REPLACEMENT HISTORY, Element INTCOR.DOC  File: INTCOR, Node: Top, Up: (doc/commands.doc), Next: Syntax The Internal Coordinate Manipulation Commands The commands in this section can be used to construct cartesian coordinates from internal coordinate values. The internal coordinate data structure can also be used for analysis purposes. There are flexible editing commands for manipulating the data structure. When these commands are used in conjunction with the Coordinate Manipulation commands (see *note Corman:(doc/corman.doc).) and the I/O commands (see *note IO:(doc/io.doc).), a rather complete model building facility exists. * Menu: * Syntax:: Syntax of the internal coordinate commands * Function:: Purpose of each of the commands * Structure:: Description of the structure of internal coordinates  File: INTCOR, Node: Syntax, Up: Top, Next: Function, Previous: Top Syntax of Internal Coordinates commands [SYNTAX IC - internal coordinate tables] IC { PARAmeters [ALL] } { FILL [COMP] [APPEnd] [PREServe] } { DIFFerences [COMP] [APPEnd] } { DERIvatives [COMP] [APPEnd] } { DYNAmics dynamics-spec } { EDIT } { BUILd [COMP] } { SEED atom atom atom [COMP] } { PURGe } { SCALe scale-spec } { RANDom [ISEEd int] } { GAUSsian UNIT int atom atom atom } { PUCKer 5x(atom) ANGLe real AMPL real } { } { { DELete } { BYNUM int [int] } } { { KEEP } { ic-selection } } { } { SAVE [PREServe] } { RESTore [PREServe] } { } { READ [FILE] [APPEnd] UNIT int } { WRITe [FILE] [RESId] UNIT int } { PRINt } atom::= {residue-number atom-name} { segid resid atom-name } { BYNUm atom-number } dynamics-spec::= { [AVERages] } [FIRStunit int] [NUNIts int] { FLUCtuations } [BEGIn int] [STOP int] [NSKIp int] ic-selection::= { } atom-selection { [FIRSt] [SECOnd] [THIRd] [FOURth] } 
scale-spec ::= [ BOND real ] [ ANGLe real ] [ DIHEdral real ]

atom-selection ::= see *note select:(doc/select.doc).

The syntax of the EDIT subcommands is:

        { DISTance  atom atom real              }
        { ANGLe     atom atom atom real         }
        { DIHEdral  atom atom [*]atom atom real }
        { END                                   }

File: INTCOR, Node: Function, Up: Top, Next: Structure, Previous: Syntax

        Purpose of the various Internal Coordinate commands

Description :

        These commands are used to set up, modify and process the
internal coordinates of the molecule. This operation is very useful in
setting up atom coordinates whenever they are not known. This occurs
when a protein structure is built from scratch or when an existing
structure is modified. The modification can simply be a conformational
change, or a change in the residue sequence through replacement,
insertion, or deletion. Many of these modifications can be processed
within the program as it currently stands. Other more difficult
modifications can be facilitated by editing the internal coordinate
card file with external programs.

        This facility is also useful as an analysis tool. Several support
programs use the output from IC tables for conformational analysis
(phi-psi maps, ring pucker, pseudorotational angles, solvent
structure,...).

Command ordering :

        The Internal Coordinate commands (except EDIT and READ) can only
be used if internal coordinates exist (i.e. if the IC common is
filled). This can only be filled by reading an IC file, or by using the
SETUp keyword in the GENErate or PATCh commands. The information used
for the setup is obtained from the residue topology file used in the
generation process.

Subcommand interpretation :

1) PARAmeter - Fill table with parameter values

        Fill the internal coordinates using standard values from the
parameter file, unless otherwise specified in the residue topology file
(see RTF:(IO)Rtf File Formats.). A value of zero for any bond or angle
(not dihedral) indicates that this value should be obtained from the
parameters.
If the ALL keyword is specified, then all angle and bond values will be
filled from the parameter set regardless of the existing values.
Setting bond and angle values to zero with the IC EDIT command makes it
possible to use this command selectively.

2) FILL - Convert from cartesian to internal coordinates

        Fill the internal coordinate values wherever possible from the
known atomic coordinates. IC's for atoms that are not placed are zeroed
unless the PREServe keyword is specified, in which case the entries are
not modified. If the COMP keyword is used, then the alternate
coordinate set will be used to fill the IC data structure. The APPEnd
option will add the newly computed values to the existing values of the
table.

3) DIFFerence - Fill table with the difference of two structures

        The DIFF command will cause the IC table entries to be filled
with differences of internal coordinate values. Normally the values are
filled as (MAIN-COMP), but this is reversed if the COMP keyword is
used. The APPEnd keyword will cause the differences to be added to the
existing IC table values.

4) DERIvative - Fill table with internal derivatives

        The DERIvative command will fill the IC table entries with the
analytical internal derivatives associated with a particular vector
(velocities, forces, or a normal mode are typical examples). Normally,
it is assumed that the vector is stored in the main coordinate set and
the coordinates are stored in the comparison set. If the COMP keyword
is specified, then their roles are reversed. The APPEnd keyword will
cause the new values to be added to the existing table values.

5) DYNAmics - Fill table with dynamic averages or fluctuations.

        The IC DYNAmics command generates averages or fluctuations for
the IC table from a dynamics trajectory. The syntax is:

        IC DYNAmics  { [AVERages]   }  [FIRStunit int] [NUNIts int]
                     { FLUCtuations }  [BEGIn int] [STOP int] [NSKIp int]

Either the averages, or the fluctuations about the current table
values, can be computed.
The sequence:

        IC FILL
        IC DYNAmics AVERage ...
        PRINT IC
        IC DYNAMics FLUCtuations ...
        PRINT IC

will print out the averages and the fluctuations about the averages.
For dihedrals, whether computing fluctuations or averages, a reference
value is subtracted before summing (i.e. values are always within 180
degrees of the reference value). This is why the IC FILL command must
precede the first IC DYNAmics command.

6) EDIT - Add to or modify the IC table elements

        Edit the internal coordinate file. This command causes the input
stream to transfer to the IC edit mode. The edit mode commands are:

        DIST  atom atom real
        ANGLE atom atom atom real
        DIHE  atom atom [*]atom atom real
        END

        atom::= {residue-number atom-name}
                { segid  resid atom-name }
                { BYNUm  atom-number     }

        These commands will specify a particular internal coordinate
value. All occurrences of the specified item will be modified. If the
specified atoms have no corresponding IC table entry, then a new IC
entry will be added for these specified atoms.
        For the ANGLe option, when a new IC entry is added, the
corresponding 1-2 and 2-3 distances will be filled from other existing
values (or left as zeros).
        For the DIHEdral option, an optional '*' on the third atom
denotes that this is the central atom of an improper dihedral type.
When adding a new IC entry for dihedrals, the associated bond and angle
terms are filled from existing table values if possible; otherwise,
they are added with zeros.
        The END command is used to exit from the IC edit mode.

7) BUILd - Convert from internal to cartesian coordinates

        This command determines the cartesian coordinates for all
unspecified atoms from the data in the IC file (wherever possible). The
user is responsible for making sure that the designation for all atoms
is unique. In the case that the system is overspecified, an atom is
placed at the first opportunity (no checking is done for currently
placed atoms).
If it is desired to modify the position of atoms with known
coordinates, the coordinates for those atoms must be reinitialized
using the COOR INIT command.
        If an IC element contains a zero bond length or angle (not
dihedral), then it will not be used to place the terminal atom. This
option is useful in cases where the system is overspecified and
building is not desired for some IC's. For example:

IC:     2 O4'  2 C2'  2 C1'  2 H1'    0.0  0.0  120.0  109.5  1.0

can be used to place H1' but will not place atom O4'.
        Again, if the COMP keyword is used, then the alternate coordinate
set will be used and modified.

8) SEED - Place first three atoms for building reference

        When the cartesian coordinates are not specified for any atoms,
the BUILd command cannot be used to generate positions since all
positions are determined relative to known positions. The SEED command
specifies the positions of the first three atoms. It puts the first at
the origin, the second on the x-axis, and the third in the xy-plane.
The three atoms must have entries in the IC file corresponding to:
dist 1-2, angle 1-2-3, dist 2-3.
        The COMP keyword causes the alternate coordinate set to be used.

9) DELEte - Delete selected elements from the table

        This command deletes a specified set of IC's from the data file.
The deletion can be by number (using the BYNUM keyword and a range), or
by atom selection. Any IC that contains a selected atom will be
removed. By default, the atom can match in any position. However, a
specific match may be requested by specifying one or more of
(FIRSt,SECOnd,THIRd,FOURth). Specifying all of them is equivalent to
the default.

10) KEEP - Delete all non-selected elements from the table

        The KEEP command is the logical opposite of the DELEte command.
Its options are identical, except that the selected set of IC's is
kept, and the remaining ones are deleted. As in the IC DELEte command,
a positional match may be selected.
11) PURGe - Clean up the IC table

        The PURGe command will cause all IC's that contain undefined
atoms to be deleted. This is not automatic because sometimes it is
desirable to keep partial IC table entries (where fewer than 4 atoms
are defined).

12) SCALe - Scale all table elements by a factor.

        The SCALe command will multiply all elements of a table by a
constant factor. This is primarily used when the table contains IC
differences or derivatives, and new structures are to be generated by
following the SCALe command with an IC FILL APPEnd command.

13) RANDom - Randomize all dihedral values

        The RANDom command will randomize all dihedral values in the
table. It will use and modify the specified ISEED value.

14) SAVE - Save the current IC table

        The SAVE command will copy the current IC table to a second IC
table for later retrieval. If the PREServe keyword is specified, then
any IC elements already in the second table will be unmodified.

15) RESTore - Restore a previously saved IC table

        The RESTore command will copy the saved IC table (see the IC SAVE
command) to the current IC table. If the PREServe keyword is specified,
then any IC elements already in the IC table will be unmodified.

16) GAUSsian - Make a GAUSSIAN86 input file from CHARMM coordinates

        The GAUSsian command will make a GAUSSIAN86 coordinate file in
Z-matrix form for use with the popular ab initio program. The MAIN
coordinates will be used unless the COMP keyword is specified. The
first three atoms must be specified (in the IC SEED format) and an
output unit number must be specified for a write access file.

17) PUCKer - Set the ring pucker to a specified value.

        ....

File: INTCOR, Node: Structure, Up: Top, Previous: Function, Next: Top

Internal Coordinate concepts:

        Given the positions of any three atoms, the position of a fourth
atom can be defined in relative terms (internal coordinates) with three
values: a distance, an angle, and a dihedral specification.
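
        The placement of a fourth atom from three known positions can be
sketched in Python as follows. This is an illustrative construction, not
the code used by the BUILd command; the function names are hypothetical,
and the dihedral sign is assumed to follow the usual IUPAC convention
(0 degrees is cis, 180 degrees is trans).

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def place_atom(a, b, c, dist, angle_deg, dihedral_deg):
    """Return position d given |c-d|, angle b-c-d and dihedral a-b-c-d."""
    theta = math.radians(angle_deg)
    phi = math.radians(dihedral_deg)
    bc = unit(sub(c, b))                  # axis of the dihedral
    n = unit(cross(sub(b, a), bc))        # normal to the a-b-c plane
    m = cross(n, bc)                      # completes the local frame
    along = -dist * math.cos(theta)       # component along the b->c axis
    in_plane = dist * math.sin(theta) * math.cos(phi)
    out_of_plane = dist * math.sin(theta) * math.sin(phi)
    return [c[i] + along * bc[i] + in_plane * m[i] + out_of_plane * n[i]
            for i in range(3)]

def dihedral(a, b, c, d):
    """Signed dihedral a-b-c-d in degrees."""
    b2 = sub(c, b)
    n1 = cross(sub(b, a), b2)
    n2 = cross(b2, sub(d, c))
    y = dot(cross(n1, n2), unit(b2))
    return math.degrees(math.atan2(y, dot(n1, n2)))
```

        Measuring the dihedral of the four positions after calling
place_atom gives back the requested value, which is a convenient check
that the sign convention of the builder and of the analysis agree.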
Where many atoms are connected in a long sequence (as in proteins) it
is easiest to consider four atoms in a chain. If the positions of the
atoms at one end of the chain are known, it is possible to find the
positions of all of the remaining atoms with a series of internal
coordinate values. But in the more general case, where some central
portion of a molecule is known, it is necessary to be able to work in
both directions. This led to the present form of the internal
coordinate data structure (five values for four atoms) where, if either
endpoint is unknown and the other three atoms are determined, the
position of the end atom can be found.

        The improper type of internal coordinate data structure was
created for branching structures (as opposed to simple chains).

        Since there are roughly five values in the data structure for
every atom, it is clear that the positions are overspecified. Keep this
in mind when externally editing IC files. The program will use the
first acceptable value when building a structure and ignore any
redundancies. The EDIT commands will always modify all occurrences of
each edited parameter.

        Normal IC table entry:
                I
                 \
                  \
                   J----K
                         \
                          \
                           L
        values (Rij),(Tijk),(Pijkl),(Tjkl),(Rkl)

        Improper type of IC table entry
                I        L
                 \      /
                  \    /
                   *K
                    |
                    |
                    J
        values (Rik),(Tikj),(Pijkl),(Tjkl),(Rkl)

Internal Coordinate file structure:

        The internal coordinate file can be stored in either card or
binary form. For most purposes the card form will be used (since it
can be edited).

        There are two types of elements in the internal coordinate file,
those that correspond to normal dihedral angles and those that
correspond to improper dihedrals. They can be distinguished by the
presence of a '*' just before the IUPAC name of the third (K) atom (its
presence denotes an improper dihedral type).
        For each element there are four atoms (referred to as I,J,K,L)
and five values.
Elements of the IC file are symmetric with respect to inverting the order of the atoms except that for improper types only atoms I and L can be interchanged (also the sign of phi must be changed since phi(IJKL)=-phi(LJKI) ). C DEC/CMS REPLACEMENT HISTORY, Element IO.DOC C *7 18-NOV-1991 15:03:13 WON "Updated by B. Brooks" C *6 26-OCT-1991 01:04:40 WON "WRITE PSF XPLOR documented" C *5 24-OCT-1991 01:32:23 WON "17-OCT-91 NIH update" C *4 6-MAY-1991 17:20:20 WON "Info directive fixed" C *3 4-FEB-1991 17:26:03 WON "from NIH, 02-Feb-91" C *2 2-NOV-1990 07:04:34 KOTTALAM "Multiple term dihedral documented" C *1 8-APR-1990 19:50:24 KOTTALAM "charmm documentation" C DEC/CMS REPLACEMENT HISTORY, Element IO.DOC  File: IO, Node: Top, Up: (doc/commands.doc), Next: Read Input-Output Commands The commands described here are used for reading and writing data structures used in the main part of CHARMM. Some of data structures used in the analysis facility may also be read and written. * Menu: * Read:: Reading data from external sources * Write:: Writing data structures in machine readable form * Print:: Writing data structures in a human readable form on unit 6 * Titles:: Specifying and manipulating titles  File: IO, Node: Read, Up: Top, Next: Write, Previous: Top READ - Reads Data from External Sources This command reads data into the data structures from external sources. The external sources can be either card image files or binary files. The fortran unit number from which the information is read, is specified with the unit-spec. The precise format of all these files is described only in the source code as that serves as the only definitive, accurate, and up to date description of these formats. The description of the data structures provides pointers to the subroutines which should be consulted, see *note data: (doc/usage.doc)Data Structures. 
* Menu: * Read Syntax:: Syntax of the READ command * Sequence:: Reading a segment's sequence * Coordinate:: Reading coordinates * Universal:: Reading coordinates from nonstandard formats * Param files:: The formats used in parameter files * RTF file format:: The format used in topology files * Other files:: Reading all other file types  File: IO, Node: Read Syntax, Up: Read, Next: Sequence, Previous: Read Syntax of READ Command [SYNTAX READ] READ { RTF { CARD [APPEnd] [PRINt] } } [ UNIT integer ] { { [ FILE ] } } { PARAmeter { CARD [ PRINt ] [ NBON ] } } { { [ FILE ] } } { IC { [ CARD ] } [ APPEnd ] } { { FILE } } { SEQUence { [ CARD ] } } { { COOR [ RESId ] } } { { PDB [ RESId ] } } { { TIPS integer } } ! TIP3P water model { { ST2 integer } } ! ST2 water model { { WATEr integer } } ! OH2 residue model { { DUM integer } } ! Dummy atoms { { resname integer } } ! Any RESI in the RTF { HBONd { [ FILE ] } } { { CARD } } { PSF { [ FILE ] } } { { CARD } } { CONStraint { [ CARD ] } } { NBONd [ FILE ] } { TABLe [ FILE ] } { TRAJectory [ COMP ] } { IMAGes [ CARD ] [ INIT ] } { XRAY } { UNIVersal-coordinate-format } { COORdinate coor-spec [ COMP ] } coor-spec ::= { FILE [IFILE int] } coor-option { CONTinue } { CARD [OFFS int] [ RESI ] } { PDB [OFFS int] } { UNIVersal [OFFS int] [ RESI ] } { IGNOre } coor-option ::= [APPEnd] [INITial] [FREEfield] atom-selection Syntactic ordering: The second field must be specified as shown.  File: IO, Node: Sequence, Up: Read, Previous: Read Syntax, Next: Coordinate Specifying a sequence of residues for a segment The specification of SEQUence causes the program to accept a sequence of residue names to be used to generate the next segment in the molecule. Unless the WATEr, TIPS, or ST2 option is used, the sequence is specified as follows: title number of residues repeat(residue names) The form of the title is defined in the syntactic glossary, *note syn: (doc/usage.doc)Syntactic Glossary. 
The number of residues is specified on the line following the title in free field format. If the number of residues you specify is less than zero, CHARMM will read residues until it encounters a blank line or end of file. If the number is greater than zero, it will also stop once it has read at least as many residues as you've specified. If the number you specify is zero, you will get a warning message as one common error is to forget the number entirely. In this case, the first residue name will be consumed as the number and converted to zero. The residue names are specified as separate words, each no longer than 4 characters, on as many lines as are required for all the residues. This sequence may be placed immediately following the READ command if the unit number is the stream or may be placed in a separate file. When reading is complete, CHARMM will list all the residues it has read, and tell you which residues it thinks can be titrated. The WATEr option allows a sequence of water molecules to be specified. This will give the old 3-center water model (not recommended). The integer which follows the keyword gives the number of waters. The TIP3P water model may be specified with the TIPS option. Likewise, the ST2 option allows ST2 waters to be specified. Obviously, no sequence on separate lines need be given. The topology file must contain the residue named (OH2,TIP3,ST2); otherwise, the GENErate command invoked subsequently will fail. The COOR option will read the sequence from a CHARMM format card coordinate file. The residue numbers are ignored except that when a change occurs, a new residue is added. If the RESId keyword is also present, then the resid's are obtained from the resid field of the coordinate file. This is useful when one wants to specify residue names (rather than use the number representation). No other information is read from the coordinate file during this process.  
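
        The rules above for the residue count can be summarised in a
short Python sketch. It is a literal reading of this description, not a
transcription of the CHARMM reader: the function name is hypothetical,
the title is assumed to have been removed already, reading is assumed to
proceed one whole line at a time, and the sketch simply returns an empty
sequence after the zero-count warning.

```python
def read_sequence(lines):
    """Return (residues, warnings) for the lines that follow the title."""
    warnings = []
    first = lines[0].split() if lines else []
    try:
        count = int(first[0])
    except (IndexError, ValueError):
        # A missing or non-numeric count is treated as zero, mirroring the
        # case where the first residue name is consumed as the number.
        count = 0
    if count == 0:
        warnings.append("number of residues is zero; was the count omitted?")
        return [], warnings

    residues = []
    for line in lines[1:]:
        words = line.split()
        if not words:
            break                      # a blank line ends the sequence
        for word in words:
            if len(word) > 4:
                raise ValueError("residue name longer than 4 characters: " + word)
        residues.extend(words)
        if count > 0 and len(residues) >= count:
            break                      # at least `count` residues were read
    return residues, warnings
```

        A negative count therefore means "read until a blank line or the
end of the file", which is convenient when the number of residues is not
known in advance.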
File: IO, Node: Coordinate, Up:Read, Previous:Sequence, Next: Universal Reading coordinates The reading of coordinates is done with the READ COOR command, and there are several options (which may change over in future versions). Coordinates may be read into the main set or the comparison coordinate set using the COMP keyword. There are three possible file formats that can be used to read in coordinates. They are coordinate binary files, dynamics coordinate trajectories, and coordinate card images. Protein Data Bank (PDB) formatted files can also be read. They do however require some editing first. All the HEADER and other junk before the actual coordinate section has to be removed and optionally replaced by a standard CHARMM title. There should be no line with NATOM (= number of atoms) preceding the actual coordinates. CHARMM does no translation whatsoever of residue or atom names, so you would either have to rename some entries in the PSF or in the coordinate file in case there are differences. For all formats, a subset of the atoms in the PSF may be selected using the standard atom selection syntax. For binary files, This is a risky maneuver, and warning messages are given when this is attempted. Only coordinates of selected atoms may be modified. When reading binary files, or using the IGNOre keyword, coordinate values are mapped into the selected atoms sequentially (NO checking is done!). The reading of the first two file formats is specified with the FILE option. The program reads the file header to tell which format it is dealing with. The coordinate binary files have a file header of 'COOR' and contain only one set of coordinates. These are created with a WRIT COOR FILE command. The dynamics coordinate trajectories have a file header of 'CORD' and have multiple coordinate sets. These files are created by the dynamics function of the program. To specify which coordinate set in the trajectory to be read, the IFILE option is provided. 
One specifies the position of the coordinate set within the file. The
default value for this option will cause the first coordinate set to be
read. If the IFILE value is negative, then the next coordinate set
(other than the first one) will be read. This will only work if a set
has already been read from the file with a positive IFILE value.

        For binary files, the APPEnd option will 'deselect' all atoms up
to the highest one with a known position. This is done in addition to
the normal atom selection. This is useful for structures with several
distinct segments where it is desirable to keep separate coordinate
modules.

        The CARD file format is the standard means in CHARMM for
providing a human readable and writable coordinate file. The format is
as follows:

        title
        NATOM (I5)
        ATOMNO RESNO   RES  TYPE  X     Y     Z   SEGID RESID Weighting
          I5    I5  1X A4 1X A4 F10.5 F10.5 F10.5 1X A4 1X A4 F10.5

        The title is a title for the coordinates, see
*note syn: (doc/usage.doc)Syntactic Glossary, for details.
Next comes the number of coordinates. If this number is zero or too
large, the entire file will be read.

        Finally, there is one line for each coordinate. ATOMNO gives the
number of the atom in the file. It is ignored on reading. RESNO gives
the residue number of the atom. It must be specified relative to the
first residue in the PSF. The OFFSet option should be specified if one
wishes to read coordinates into other positions. The APPEnd option adds
an additional offset which points to the residue just beyond the
highest one with known positions. This option also 'deselects' all
atoms below this residue (inclusive). For example, if one is reading in
coordinates for the second segment of a two chain protein using two
card files, and the APPEnd option is used, RESNO must start at 1 in
both files for the file reading to work correctly. It should also be
remembered that for card images, residues are identified by RESIDUE
NUMBER.
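
        The fixed-column layout of one CARD coordinate line can be
illustrated with a short Python sketch. The column boundaries follow
the Fortran edit descriptors listed above; the function names and the
dictionary keys are hypothetical, and left-justifying the character
fields on output is an assumption of this example.

```python
def parse_card_line(line):
    """Split one CARD coordinate line into its fields.

    Layout: I5 I5 1X A4 1X A4 F10.5 F10.5 F10.5 1X A4 1X A4 F10.5
    """
    return {
        "atomno": int(line[0:5]),
        "resno": int(line[5:10]),
        "res": line[11:15].strip(),     # column 11 is the 1X blank
        "type": line[16:20].strip(),    # column 16 is the 1X blank
        "x": float(line[20:30]),
        "y": float(line[30:40]),
        "z": float(line[40:50]),
        "segid": line[51:55].strip(),
        "resid": line[56:60].strip(),
        "weight": float(line[60:70]),
    }

def format_card_line(atomno, resno, res, atype, x, y, z, segid, resid, weight):
    """Write the same fields back in the 70-column layout."""
    return (f"{atomno:5d}{resno:5d} {res:<4} {atype:<4}"
            f"{x:10.5f}{y:10.5f}{z:10.5f} {segid:<4} {resid:<4}{weight:10.5f}")
```

        Formatting a record and parsing it again returns the original
values, which gives a quick check of the column positions when editing
card files with external programs.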
This number can be modified by using the OFFSet feature, which allows coordinates to be read from a different PSF. Both positive and negative values are allowed. The RESId option will cause the residue number field to be ignored and map atoms from SEGID and RESID labels instead.

RES gives the residue type of the atom. RES is checked against the residue type in the PSF for consistency.

TYPE gives the IUPAC name of the atom. The coordinates of an atom within a residue need not be specified in any particular order. A search is made within each residue in the PSF for an atom whose IUPAC name is given in the coordinate file.

The RESId option overrides the residue number and fills coordinates based on the SEGID and RESID identifiers in the coordinate file. This is the recommended method where different PSF's are used.

The IGNORE option allows one to read in a card coordinate file while bypassing the normal tests of the residue name, number, and atom name. When IGNORE is specified in place of CARD, the identifying information is ignored completely. Starting from the first selected atom, the coordinates are copied sequentially from the file.

The PDB option works very much like the CARD option, but expects the actual file format to be according to Protein Data Bank standards:

        text  IATOM      TYPE  RES  IRES       X Y Z        W
         A6    I5    2X   A4   A4    I5   4X   3F8.3   6X  F6.2

Normally, the coordinates are not reinitialized before new values are read, but if this is desired, the INITialize keyword will cause the coordinate values for all selected atoms to be initialized. Note that only atoms that have been selected will be initialized. The COOR INIT command provides a more general way to initialize coordinates.

File: IO, Node: Universal, Up: Read, Previous: Coordinate, Next: Param Files

                Reading coordinates from nonstandard formats

The reading of coordinates is done with the READ COOR command, and there are several options.
One such option is the READ COOR UNIVersal command, which will read using a previously specified format. The Universal format is specified by the READ UNIVersal command. This reads the specification from the input stream or from a specified file.

READ UNIVersal

The following commands clear the translation table and set up default specifications for the file format.

        CHARMM   - setup standard CHARMM format (default)
        PDB      - setup Brookhaven format
        AMBER    - setup standard AMBER format
        UNKNown  - setup null format (everything must be specified)

The following commands specify the field locations of various items. When reading free field, the starting values are sorted to determine the ordering of parsing.

        SEGID start length
        RESID start length
        TYPE  start length
        RESN  start length
        IRES  start length
        ISEQ  start length
        X     start length
        Y     start length
        Z     start length
        W     start length

The following commands specify how input lines should be considered.

        PICK start length string - choose only lines that match one or more of these
        EXCL start length string - exclude any line that contains one of these
        TITL start length string - add any line containing one of these to the title

The following commands specify character translation upon reading the file.

        TRANslate { SEGID external-segid internal-segid                        }
                  { RESID external-resid internal-resid match-segid            }
                  { RESN  external-resn  internal-resn  match-segid            }
                  { TYPE  external-type  internal-type  match-resn match-segid }

        END - terminate reading universal file format

File: IO, Node: Param Files, Up: Read, Previous: Universal, Next: RTF File format

                The Format of Parameter Files

[SYNTAX Parameter file format]

Parameters can be read from cards or binary modules by the routine PARRDR.
After the title, card file data is divided into sections beginning with a keyword line and followed by data lines read free field:

        BOND
                atom atom force_constant distance
        ANGLe or THETA
                atom atom atom force_constant theta_min UB_force_constant UB_rmin
        DIHE or PHI
                atom atom atom atom force_constant periodicity phase
        IMPRoper or IMPHI
                atom atom atom atom force_constant periodicity phase
        NBONd or NONB [nonbond-defaults]
                atom* polarizability e vdW_radius -
                      [1-4 polarizability e vdW_radius]
        NBFIX
                atom_i* atom_j* emin rmin [ emin14 [ rmin14 ]]
        HBOND [AEXP ia] [REXP ir] [HAEX ih] [AAEX iaa] [hbond-defaults]
                donor-heavy-atom* acceptor-heavy-atom* well_depth distance

where '*' allows wildcard specifications:

        * matches any string of characters (including none),
        % matches any single character,
        # matches any string of digits (including none),
        + matches any single digit.

---------------------------------------------------------------------------
nonbond-defaults::= [NBXMod int] [CUTNB real] [CTOFNB real] [CTONNB real]
                    [WMIN real] [E14Fac real] [EPS real]
                    [ATOM ] [CDIElectric] [SHIFt ] [VATOm ] [VSWItch] [VDIStance] [BYGRoup]
                    [GROUp] [RDIElectric] [SWITch] [VGROup] [VSHIft ] [VSIGma   ] [BYCUbe ]

hbond-defaults::=   [ ACCEptor   ] [ HBEXclude   ] [ BEST ]
                    [ NOACceptor ] [ HBNOexclude ] [ ALL  ]
                    [CUTHB real] [CTOFHB real] [CTONHB real]
                    [CUTHA real] [CTOFHA real] [CTONHA real]
                    [REXP int(def12)] [AEXP int(def10)]
                    [HAEX int(def4)] [AAEX int(def2)]
---------------------------------------------------------------------------

Sections end with the occurrence of the next keyword line, or a line with the word END, the latter terminating parameter reading. Errors in the input file will result in warning messages but not termination of the run.

No wildcard usage is allowed for bonds and angles. For dihedrals, two types are allowed: A - B - C - D (all four atoms specified) and X - A - B - X (only middle two atoms specified).
Double dihedral specifications may be specified for the four atom type by listing a given set twice. When specifying this type in the topology file, specify a dihedral twice (with nothing intervening) and both forms will be used.

There are five choices for wildcard usage for improper dihedrals:

        1) A - B - C - D  (all four atoms, double specification allowed)
        2) A - X - X - B
        3) X - A - B - C
        4) X - A - B - X
        5) X - X - A - B

When classifying an improper dihedral, the first acceptable match (from the above order) is chosen. The match may be made in either direction ( A - B - C - D = D - C - B - A).

The periodicity value for dihedral and improper dihedral terms must be an integer. If it is positive, then a cosine functional form is used. Only positive values of 1, 2, 3, 4, 5 and 6 are allowed. Phase is either 0.0 or 180.0 for dihedrals with the minimum staggered or eclipsed respectively. When the periodicity is given as zero for OTHER THAN THE FIRST dihedral in a multiple dihedral set, then the amplitude is a constant added to the energy. This is needed to effect the Ryckaert-Bellemans potential for hydrocarbons (see below).

When the periodicity is given as zero for the first (or only) dihedral of a set, then a harmonic restoring potential in (phi - phi_min) is used. The phase value gives phi_min for this option.

This functional form is identical to that reported in the CHARMM paper, except that either functional form (referred to as proper and improper) may be used for dihedrals and improper dihedrals. The distinction between these terms is that separate lookup tables are kept and the default atom choices are still different. For dihedrals, the selection is usually based on the middle two atoms, and for improper dihedrals, the selection is based on the outer two atoms. For either term, all 4 atoms may be required.
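
The lookup order for improper dihedrals described above can be sketched in a few lines of Python. This is only an illustration of the stated rules (five wildcard classes tried in order, match allowed in either direction), not the code used by PARRDR. The names `find_improper`, `PRIORITY` and the example table are hypothetical, and the sketch assumes every wildcard pattern is written in the orientation shown in the list.

```python
# Hypothetical sketch of the improper-dihedral lookup order; not CHARMM code.
WILDCARD = "X"

# Wildcard masks in the priority order given in the text (True = wildcard).
PRIORITY = [
    (False, False, False, False),  # 1) A - B - C - D
    (False, True, True, False),    # 2) A - X - X - B
    (True, False, False, False),   # 3) X - A - B - C
    (True, False, False, True),    # 4) X - A - B - X
    (True, True, False, False),    # 5) X - X - A - B
]


def _mask(pattern):
    """Return which positions of a parameter pattern are wildcards."""
    return tuple(name == WILDCARD for name in pattern)


def _fits(pattern, atoms):
    """True if every non-wildcard position equals the atom type."""
    return all(p == WILDCARD or p == a for p, a in zip(pattern, atoms))


def find_improper(atoms, table):
    """Return the first (pattern, params) entry that matches, or None.

    Classes are tried in PRIORITY order; within a class an entry matches
    if it fits the atom types read forwards or backwards.
    """
    atoms = tuple(atoms)
    for mask in PRIORITY:
        for pattern, params in table:
            if _mask(pattern) != mask:
                continue
            if _fits(pattern, atoms) or _fits(pattern, atoms[::-1]):
                return pattern, params
    return None


# Made-up parameter table: (atom type pattern, label standing in for the data).
table = [
    (("X", "X", "C", "O"), "class 5 entry"),
    (("N", "X", "X", "O"), "class 2 entry"),
    (("C", "CA", "N", "O"), "class 1 entry"),
]
```

With this table, `find_improper(("N", "CA", "C", "O"), table)` returns the class 2 entry even though the class 5 entry is listed first and also fits, because class 2 comes earlier in the priority order.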
The HBOND line can be used to specify exponents for the hbond function, with ia and ir being the attractive and repulsive radial terms and ih and iaa the cosine exponents on the angular terms at the h and a respectively. The defaults are 4, 6, 4, and 2 respectively.

For atom types with no NBOND parameters given, no van der Waals interactions will be calculated. You will be warned, but be careful.

The nbond parameters for 1-4 interactions can be specified by placing the extra set of parameters after the first. By default the same parameters will be used for 1-4 and all other interactions.

NON-BOND parameter combination rules depend on how the parameters are listed. If the second number is negative, it is used as Emin, and Emin(ij)=-sqrt(Emin(i)*Emin(j)). If the second number is positive, it is used as Neff, and the Slater-Kirkwood formula is used to compute Emin(ij).

The PARRDR card field, NBFIX, allows individual atom type van der Waals pair interactions to be specified. Subsequent lines must have:

        atom_i atom_j emin rmin [ emin14 [ rmin14 ]]

If emin is positive, a severe warning is issued. The wildcard "X" may be given. In the case where both atoms are wildcards, the entire nbond parameter set will be modified. If emin14 and rmin14 are not specified, then the values of emin and rmin will be used. NOTE: The previous value will not be used.

NBFIXes are processed in order. For that reason, wildcard usage should come first. In the case of duplicate specifications, there is no check, and the last specification will be used. The maximum number of NBFIX entries is currently set at 150. The space for this is allocated in PARMIO.

PARAMETER I/O ADDENDUM:

In order to calculate the Ryckaert-Bellemans torsional potential for butane and other extended atom hydrocarbons, the following terms should be included in the parameter file:

        V = gamma[1.116 - 1.462cos(phi) - 1.578 cos**2(phi) + 0.368 cos**3(phi)
                  + 3.156 cos**4(phi) - 3.788 cos**5(phi)]

        and gamma = 1.987 kcal/mol

J. P.
Ryckaert and A. Bellemans, Chem. Phys. Lett. 30, 123 (1975).
J. P. Ryckaert and A. Bellemans, Disc. Farad. Soc. 66, 95 (1978).

PHI
! Ryckaert Bellemans has trans = 0.0
! since cos is an even function cos(-phi)=cos(phi), invert the
! sign of the coefficients with odd power of cos(phi)
CH3E CH2E CH2E CH3E    0.470467   5   0.0
CH3E CH2E CH2E CH3E    0.783947   4   0.0
CH3E CH2E CH2E CH3E    2.53516    3   0.0
CH3E CH2E CH2E CH3E    1.56789    2   0.0
CH3E CH2E CH2E CH3E    2.34787    1   0.0
CH3E CH2E CH2E CH3E   -4.70368    0   0.0

The potential should be used with SHAKE bonds and angles or bonds only as required. The zero periodicity (constant) term should NOT be the first in the set, otherwise it will be treated as an improper torsion.

File: IO, Node: RTF File format, Up: Read, Previous: Coordinate, Next: Other files

[SYNTAX RTF file format]

                The Format of a Residue Topology File

Here is a description of what is currently (24-May-1982) in residue topology files (as they are stored in ascii files). You may use this format if you specify the CARD option in the READ command. The format of binary files depends on the current implementation of the RTF data structure (see RTF.FCM).

The purpose of residue topology files is to store the information for generating a representation of a macromolecule from its sequence. These files are read by RTFRDR, a subroutine in RTFIO, which should be consulted for formats and the final word on what is actually done with these files.

The residue topology files are named RTOP... . There are two forms, binary module (.MOD) and card format (usually .INP). The card format files are used only for creating binary modules and therefore are structured as input files for CHARMM, beginning with a run title and the command READ RTF CARD, followed by the actual topology file.

The first section of the topology files is a title section in the usual format of up to ten lines delimited by a line containing only a * in column 1.
The remaining information is read in free field format as commands to define the RTF. The ordering of the commands is important in that some information is needed to define others (i.e. the atoms of a residue must be defined before the bonds between them). The recommended structure of this file is:

Initial setup:
        MASS specification for each atom type
        DECLarations of out of segment definitions
        DEFAults for patching on the first and last residues
        AUTOgenerate angles or dihedrals

For each residue:
        RESIdue name and total charge specification (or PRESidue if this is a patch)
        ATOM definitions within this residue
        GROUping dividers between atom definitions
        BOND specification
        ANGLe specifications
        DIHEdral angle specifications
        IMPRoper dihedral angle specifications
        DONOr specifications
        ACCEptor specifications
        IC information
        PATChing residues to use if defaults are not desired

Closing:
        END statement

Display control:
        PRINT option

The format above is not rigid. In particular, the 'out of residue declarations' may be augmented and redefined at any point. These declarations are checked against all 'out of segment' atom references. This is done to avoid potential problems where atom names are misspelled. The number following the declaration is ignored, and is for the user's own reference (or debugging).

The syntax of all subcommands is as follows:

        MASS atom-type-code atom-type-name mass

        DECLare name

        DEFAults [ FIRSt { name } ] [ LAST { name } ]
                         { NONE }          { NONE }

        AUTOgenerate [ ANGLes   ] [ DIHEdrals   ]
                     [ NOANgles ] [ NODIhedrals ]

        { RESIdue  } name [total-charge]
        { PRESidue }

Residues labeled PRES may only be used for patching. Residues defined with RESI may not be used as a patch.
        ATOM iupac atom-type-name charge repeat(exclusion-names)

        GROUp

        BOND repeat(iupac iupac)

        { ANGLe } repeat(iupac iupac iupac)
        { THETa }

        { DIHEdral } repeat(iupac iupac iupac iupac)
        { PHI      }

        { IMPRoper } repeat(iupac iupac iupac iupac)
        { IMPHi    }

        DONOr [ hydrogen ] [ heavy-atom ] [ antecedent-1 antecedent-2 ]
              [ BLNK     ] [ hydrogen   ]

The antecedents are not required unless hydrogen position generation is desired.

        ACCEptor iupac [iupac [iupac] ]

The first antecedent is required if an angle dependence about the acceptor atom is desired. The second antecedent is unused.

        { IC    }
        { BILD  } name name name name bond angle phi angle bond
        { BUILd }

        DELEte { ATOM                } iupac [COMBine iupac]
               { BOND                } (iupac iupac)
               { THETa | ANGLe       } (iupac iupac iupac)
               { DIHEdral | PHI      } (iupac iupac iupac iupac)
               { IMPHi | IMPRoper    } (iupac iupac iupac iupac)

Deletions are allowed only in patch residues (PRES); the optional COMBine keyword for ATOM deletions allows passing part of the IC data for the deleted atom to the "combine" atom, i.e. stereochemistry of atoms bonded to the deleted atom.

        PATChing [ FIRSt { name } ] [ LAST { name } ]
                         { NONE }          { NONE }

        PRINt { ON  }
              { OFF }

The PRINt command may be used to control the display of lines as they are read by the RTF reader. The initial setting for printing is controlled by the READ command itself. If PRINT is specified, then printing will initially be enabled; otherwise, the commands will not be echoed. PRINT ON turns on echoing of RTF specifications; PRINT OFF turns it off. This command is useful for debugging an addition to a previously tested topology file.
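
To make the free field syntax above concrete, here is a minimal Python sketch that splits a few RTF card lines into fields following the MASS, ATOM and BOND forms listed above. It is not the RTFRDR routine. The function name `parse_rtf_line` is hypothetical, and two things are assumptions rather than statements from this section: keywords are compared on their first four characters (suggested by the capitalization convention, e.g. RESIdue), and "!" starts a comment as in ordinary CHARMM command input.

```python
# Hypothetical sketch of free field RTF line splitting; not the RTFRDR code.
def parse_rtf_line(line):
    """Split one RTF card line into a tuple, or return None if it is empty."""
    text = line.split("!", 1)[0]      # assumption: "!" starts a comment
    words = text.split()              # free field: blanks separate items
    if not words:
        return None
    key = words[0].upper()[:4]        # assumption: 4-character abbreviation
    args = words[1:]
    if key == "MASS":
        # MASS atom-type-code atom-type-name mass
        return (key, int(args[0]), args[1], float(args[2]))
    if key == "ATOM":
        # ATOM iupac atom-type-name charge repeat(exclusion-names)
        return (key, args[0], args[1], float(args[2]), args[3:])
    if key == "BOND":
        # BOND repeat(iupac iupac)
        if len(args) % 2 != 0:
            raise ValueError("BOND needs an even number of atom names")
        pairs = [(args[i], args[i + 1]) for i in range(0, len(args), 2)]
        return (key, pairs)
    return (key, args)                # other commands: keyword plus raw words


example = parse_rtf_line("BOND N CA  CA C")
```

Here `example` holds the keyword and the two atom pairs (N, CA) and (CA, C); a line such as "GROUP" comes back as the four-character key "GROU" with an empty argument list.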
A small sample RTF card file follows:

* title for documentation example
*
   18    1
MASS     1 H      1.00800
MASS    11 C     12.01100
MASS    12 CH1E  13.01900
MASS    13 CH2E  14.02700
MASS    14 CH3E  15.03500
MASS    31 N     14.00670
MASS    38 NH1   14.00670
MASS    51 O     15.99940
MASS    56 OH2   15.99940

DECL -C
DECL -O
DECL +N
DECL +H
DECL +CA

DEFA FIRS NTER LAST CTER

RESI ALA     0.00000
GROU
ATOM N    NH1    -0.35
ATOM H    H       0.25
ATOM CA   CH1E    0.10
GROU
ATOM CB   CH3E    0.00
GROU
ATOM C    C       0.45
ATOM O    O      -0.45
BOND N    CA        CA   C         C    +N        C    O         N    H
BOND CA   CB
THET -C   N    CA             N    CA   C              CA   C    +N
THET CA   C    O              O    C    +N             -C   N    H
THET H    N    CA             N    CA   CB             C    CA   CB
DIHE -C   N    CA   C         N    CA   C    +N        CA   C    +N   +CA
IMPH N    -C   CA   H         C    CA   +N   O         CA   N    C    CB
DONO H    N    -C   CA
ACCE O    C
BILD -C   CA   *N   H      0.0000    0.00  180.00    0.00   0.0000
BILD -C   N    CA   C      0.0000    0.00  180.00    0.00   0.0000
BILD N    CA   C    +N     0.0000    0.00  180.00    0.00   0.0000
BILD +N   CA   *C   O      0.0000    0.00  180.00    0.00   0.0000
BILD CA   C    +N   +CA    0.0000    0.00  180.00    0.00   0.0000
BILD N    C    *CA  CB     0.0000    0.00  120.00    0.00   0.0000

RESI OH2     0.00000
GROUP
ATOM OH2  OH2    -0.40000  H1   H2
ATOM H1   H       0.20000  H2
ATOM H2   H       0.20000
BOND OH2  H1        OH2  H2
THET H1   OH2  H2
DONO H1   OH2  -O   -O
DONO H2   OH2  -O   -O
ACCE OH2
PATC FIRS NONE LAST NONE

END

File: IO, Node: Other files, Up: Read, Previous: RTF file format, Next: Read

        Reading data other than the sequence or coordinates

The parameter files (PARA), internal coordinate files (IC) and hydrogen bond (HBONd) data files can be read as card images or binary files. Specifying CARD signifies card image input; specifying FILE signifies binary file input. Please note that the topology file must be read in before the parameters can be read.

Protein structure files (PSF) and non-bonded lists (NBONd) can only be read as binary files.

The constraints (CONStraint), which include both dihedral and harmonic restraints, may only be read as a formatted file (card).

There are two types of IC card files (residue number vs. resid's). The residue number option is the default, and atom assignments are based on residue number.
This is the low precision form. The resid option is the high precision form and atom assignments are based on SEGID's and RESID's. This is also useful where different homologies are used.

The Image file (IMAGes) containing transformation information can only be read in card image format (see *note images:(doc/images.doc).). The INIT keyword will remove all existing image data. Without the INIT keyword, any existing image items (such as bonds) are kept. This allows one to modify the crystal geometry without the necessity of regenerating all image items.

The TABLe file contains the nonbond energy lookup information. Once it has been read in, the effects cannot be reversed. The nonbond energy evaluation is then under control of the table routines.

File: IO, Node: Write, Up: Top, Previous: Read, Next: Print

        WRITe - Writes Data Structures to External Files

[SYNTAX WRITe]

Syntax

        WRITe  { { PSF       } [FILE]            }   UNIT unit-number
               {               [CARD] [XPLOr]    }
               { { RTF       }                   }
               { { PARAmeter }                   }
               { { NBONd     }                   }
               { { TABLe     }                   }
               {                                 }
               { { COORdinate coor-spec } [CARD] }
               {                          [PDB ] }
               {                          [DUMB] }
               { { IC [RESId]   } [FILE]         }
               { { HBONd [ANAL] }                }
               {                                 }
               { { IMAGes imag-spec} [CARD]      }
               { { ENERgy     }                  }
               { { CONStraint }                  }
               { { TITLe      }                  }
        title

        coor-spec::= [COMP] [OFFS int] [IMAGes] atom-selection

        imag-spec::= [ TRANsformations ] [ FORCes ] [ PSF ]

Function

The primary purpose of this command is to save some of CHARMM's data structures. The coordinate and internal coordinate data structures can be written in formatted form so that they can be edited independently of CHARMM using a text editor.

The option FILE specifies that a file is to be written in unformatted form (binary). The option CARD specifies that a file is to be written in formatted form. For the coordinate and internal coordinate files, CARD is the default. The coordinate option PDB gives a file in Protein Data Bank format, with just the ATOM records.
The XPLOr option of WRITe PSF produces an XPLOR style PSF file (atom names are used instead of atom numbers).

A set of title lines must follow the WRIT command. This title will be written at the start of the file and serves to document the file. For your own protection, always make good use of this title, as it may be the only documentation for the file.

The UNIT keyword specifies which Fortran unit the output should be written to. It cannot be omitted.

File: IO, Node: Print, Up: Top, Previous: Write, Next: Titles

        PRINt - writes information to output file (unit 6)

[SYNTAX PRINt]

Syntax

        PRINt  { PSF [XPLOr]           }
               { RTF                   }
               { CONStraint            }
               { PARAmeter             }
               { RESIdue               }
               { COORdinate coor-spec  }
               { IC                    }
               { HBONd [ ANAL ]        }
               { IMAGes imag-spec      }
               { TITLe                 }
               { ENERgy                }

        coor-spec::= [COMP] [OFFS int] [IMAGes] atom-selection

        imag-spec::= [ TRANsformations ] [ FORCes ] [ PSF ]

Syntactic ordering: All commands must be typed in the order shown.

Function

This command is used to list information contained in data structures used by the program. The information must already have been created through use of a READ, GENE, HBON, etc., command. The printable output is sent to unit 6.

The XPLOr option of PRINt PSF produces an XPLOR type PSF listing. Atom names are printed instead of atom numbers.

For hydrogen bonds, ANAL gives a geometrical and energy analysis of the hydrogen bonds. Representing the hydrogen bond as A2-A1-X-H....Y-, the distances X-Y, H-Y, the angle (180 - specifies that CPUminutes from the time the command is given is to be one deadline. Keyword CLOCk sets the time HH.MM (in 24-hour format) as one deadline. The routine assumes that if the command is issued after the specified time, you mean the following day. (If at 6 pm you start a job containing the line

        DEAD CLOC 13.00 CPU 600.

your minimization will run until 600 CPU-minutes have been used, or until 1 pm the next day, whichever comes first.)

22) The ATLImit command can be given at any point in the input file.
CHARMM checks before reading each command if either of the DEADlines (CPU or CLOCk) has been reached. If this is the case the alternate_command of the most recent ATLImit command is executed. This would typically be a GOTO SHUTdown or some other simple thing, but could be any CHARMM command. Currently the alternate_command is limited to 80 characters. 23) Substitutions and punctuation in command input. "!" Ignore this and all subsequent characters on this line "-" If this is the last character of a line then the following line is a continuation "*" As a first character indicates a title line. Alone on a line indicates a title terminator. "$" The default delimiter "* % # +" Atom selection wildcards, alone or in a word "@" Command parameter substitution "?" Energy value substitution 24) File inquiry. The inquiry command (from CHARMM) may be used to get a list of currently open files. This is very useful in interactive sessions when one has forgotten which FORTRAN units are already assigned. The command won't work if the files are assigned outside of CHARMM. 25) Random number generation. The expression ?RAND will have a random number substituted for it during command line evaluation. The default is to provide a number from a uniform distribution, between 0.0 and 1.0; the RANDom command allows modification of the distribution type and specification of other factors. The only required keyword is the distribution type, which must be second; for a GAUSsian distribution, a value for sigma is required; the default mean is 0.0. RANDom UNIForm [SCALe scale] [OFFSet offset] [ASIN] [ISEEd iseed] GAUSsian sigma [ACOS] Additional keywords: SCALe scale multiply the number by scale OFFSet offset add offset to the number ACOS treat the number as a cosine and return the angle (deg) ASIN treat the number as a sine and return the angle (deg) ISEEd iseed specify a new random seed (integer) Examples: RANDOM GAUSS 0.2 SCALE 10.0 ! gaussian mean of 0.0 with a sigma of 2. 
        RANDOM UNIFORM SCALE 360.      ! uniform 0. to 360
        RANDOM UNIFORM ACOS SCALE .5   ! uniform angles with cosines from 0. to .5
        RAND GAUS 5. OFFS 60.          ! gaussian mean of 60. with a sigma of 5.
        RAND UNIF ISEED 7734           ! uniform new random seed

Subsequent use of ?RAND will substitute a number from the appropriate distribution.

File: MISCOM, Node: Substitute, Up: Top, Previous: Function, Next: Top

                Command Line Substitution Parameters

The following are substitution parameters available within CHARMM:

General:
        'PI  ' - Pi, 3.141592653589793
        'KBLZ' - The Boltzmann factor (0.001987191)
        'NSEG' - Number of segments
        'NRES' - Number of residues
        'NATO' - Number of atoms
        'NGRP' - Number of groups
        'NBON' - Number of bonds
        'NTHE' - Number of angles
        'NPHI' - Number of dihedrals
        'NIMP' - Number of improper dihedrals
        'NACC' - Number of acceptors
        'NDON' - Number of donors

Coordinate manipulation parameters:
        'XAXI' - vector and length of defined axis from the COOR AXIS command.
        'YAXI'
        'ZAXI'
        'RAXI'
        'XCEN' - origin of axis vector
        'YCEN'
        'ZCEN'
        'XMIN' - Extreme values from the COOR STAT command
        'YMIN'
        'ZMIN'
        'WMIN'
        'XMAX'
        'YMAX'
        'ZMAX'
        'WMAX'
        'XAVE' - Average values from the COOR STAT command
        'YAVE'
        'ZAVE'
        'WAVE'
        'MASS'
        'RMS ' - Root mean squared difference between two structures.
        'XMOV' - displacement of atoms from best fit.
        'YMOV' -
        'ZMOV' -
        'THET' - Angle of rotation from best fit
        'VOLU' - Volume from COOR VOLUme command
        'MIND' - minimum distance from the COOR MIND command.
        'RGYR' - Radius of gyration for the COOR RGYR command.
        'XCM ' - Center of mass
        'YCM '
        'ZCM '

SCALar STATistics command substitution parameters:
        'SMIN' - Minimum value
        'SMAX' - Maximum value
        'SAVE' - Average value
        'SWEI' - Total weight used in the averaging
        'STOT' - Total of selected atoms
        'NSEL' - Number of selected atoms

Quick command substitution parameters:
        'XVAL' - X position of group of atoms
        'YVAL' - Y position of group of atoms
        'ZVAL' - Z position of group of atoms
        'DIST' - Distance between two atom analysis
        'THET' - Angle for three atom analysis
        'PHI ' - Dihedral for four atom analysis

Atom selection parameters:
        'NSEL' - Number of selected atoms from the most recent atom selection.

Vibrational analysis of thermodynamic properties:
        'FTOT' - Vibrational free energy.
        'STOT' - Vibrational entropy.
        'HTOT' - Vibrational enthalpy.
        'CTOT' - Vibrational heat capacity.
        'ZTOT' - Zero point correction energy.
        'FCTO' - Classical vibrational free energy.
        'ETOT' - Total harmonic limit classical free energy
                 (to compare with free energy perturbation simulations).

See energy.doc for the energy related substitution parameters.

New commands:

I. SPECIfy specify-keywords

        specify-keywords ::= PARAllel [NCPU integer-number-of-cpus] |
                             FLUSh | NOFLush |
                             NBFActor real-nonbond-memory-factor |
                             FNBL { ON | OFF } |
                             EWEX { ON | OFF }

description:

1. PARAllel - Tells CHARMm to run parallel (where possible). The optional NCPU keyword specifies the maximum number of processors to use. If a number is specified that is greater than the maximum allowed for the particular machine, a warning message is printed and the number of cpu's is set to the maximum. Note that at startup CHARMm senses the number of cpu's and sets NCPU accordingly.

2. FLUSh - Specifies that trajectory, coordinate, dynamics restart and other output files should be flushed after each data set is written. See below. This is the default action. The command is provided to reset

4.
NBFActor - When the parallel non-bond list generators allocate memory for the temporary arrays used by each thread, the predicted size of the list array (MXJNB and the like) is divided by the number of cpu's and multiplied by NBFACT. The default is 1.5 and has worked well so far. If it does not, the SPECIfy NBFACT command is available to adjust it.

5. FNBL - FastNonBondListgeneration - Specifies whether or not to use the new non-bond list generation routines. Included only for testing and timing purposes.

6. EWEX - Ewald Exclusions - Specifies whether to generate the special exclusion pair lists needed for Ewald calculations where there are non-bond exclusions. Default is on. If Ewald is not used, the memory can be saved and a small amount of execution time skipped. See discussion of Ewald below.

II. SYSTem "unix bourne shell commands"

This command permits the user to issue Unix shell commands from the program. The command string must be enclosed in double quotes to prevent the CHARMm parser from converting the string to uppercase.

C DEC/CMS REPLACEMENT HISTORY, Element MOLVIB.DOC
C *1 12-SEP-1991 18:55:43 WON "Kuczera's MOLVIB documentation"
C DEC/CMS REPLACEMENT HISTORY, Element MOLVIB.DOC

File: Molvib, Node: Top, Up: (doc/commands.doc), Next: Syntax

                The MOLVIB Module of CHARMM

        By K.Kuczera & J.Wiorkiewicz-Kuczera, May 1991

MOLVIB is a general-purpose vibrational analysis program, suitable for small to medium sized molecules (say of less than 50 atoms). For larger systems the detail of description may be too great.
The main options are:

        - the vibrational problem in internal coordinates (GF)
        - the vibrational problem in cartesian coordinates (GFX)
        - analysis of GAUSSIAN program output (GFX,GAUS)
        - analysis of dependencies in internal coordinate sets (G)
        - canonic force field calculations (KANO)
        - crystal normal mode analysis for k=0 (CRYS)
        - generating cartesian displacements along some interesting directions (STEP)

The different options use mostly the same package of subroutines called in different order. New applications may thus be easily added when necessary. Of special interest is the symbolic PED analysis package, enabling a clear and condensed overview of the usually complex PED contributions.

* Menu:

* Syntax::      Syntax of the MOLVIB command
* Function::    Purpose of each of the keywords
* Input::       MOLVIB Input Description

File: Molvib, Node: Syntax, Up: Top, Previous: Top, Next: Function

[SYNTAX MOLVib command]

        MOLVib  NDI1 int NDI2 int NDI3 int [NATOm int]
                [MAXSymbol int] [NGMAx int] [NBLMax int] [IZMAx int]
                [NOTOpology] [SECOnd] [PRINt]

File: Molvib, Node: Function, Previous: Syntax, Up: Top, Next: Input

The following section describes the keywords of the MOLVib command.

NDI1,NDI2,NDI3 are the MOLVIB variables NQ, NIC, NIC0; their definition here effectively replaces the MOLVIB 'DIM ' card. Two cases:

a. Molecular vibrations
        NIC0 - number of primitive internal coordinates (PIC's); this must correspond to the number of entries following the 'IC' card
        NIC  - number of IC's left after transformation by first U matrix. If only one U matrix is used, this should be the same as NQ; if no U matrices are used, NQ=NIC=NIC0
        NQ   - number of vibrational degrees of freedom. Usually this is the famous number 3*Nat-6 (3*Nat-5), but also separate symmetry blocks of the vibrational Hamiltonian may be entered.

b. Crystal vibrations
        NIC0 - no. of primitive molecular coords (MC), i.e. external coords + primitive IC's
        NIC  - no. of vibrational degrees of freedom = 3*NAT, where NAT is the total no.
of atoms in unit cell NQ - here =NIC NATOm - defines the number of atoms in the system. To be used only in conjunction with the NOTO flag. If NOTO is not provided, the number of atoms from the PSF will be used in MOLVIB and will override any values provided here. IZMAx - needed for 'CRYS' option only - specifies maximum number of molecules in unit cell. Default is 10. MAXSymbol, NGMax, NBLMax - dimensions for PED analysis arrays. They specify the maximum number of symbols, coordinate groups and symmetry blocks, respectively. Defaults are NQ (NDI1) for all three. It is recommended not to modify these defaults. NOTOpology - if flag is present, CHARMM data structures will not be used, all information required is to be read in inside the module. If flag is absent, cartesian coordinates, atomic masses and cartesian force constants from CHARMM may be passed to MOLVIB, as needed. SECOnd - calculate second derivatives (force constants in cartesian coordinates) and pass them to MOLVIB. This is done through a call to the CHARMM routine ENERGY, so all preconditions for energy (and second derivative) calculations must be met. PRINT - flag for test printout of the CHARMM second derivatives being passed to MOLVIB.  File: Molvib, Node: Input, Up: Top, Previous: Function, Next: Top Input Description This data is processed by subroutine MOLINP. As the CHARMM command parser is not used, this input does not conform to CHARMM standards, e.g. - parameter substitution will not work - the STREAM command will not work, all commands will be read from the current input stream - OPEN, READ, WRITE, etc. commands will not work - most entries are not free format [SYNTAX MOLVIB input] The MOLVIB input consists of a series of blocks; each block consists of a command and an (optional) data structure; i.e. it has the form: command-spec [data-struc] command-spec ::== keyword [] [] [] [] format: A4,6X,4I5 data-struc ::== one of the MOLVIB input data structures; defined by the keyword. 
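
As an illustration of the fixed command card layout quoted above (format A4,6X,4I5), the following Python sketch reads one card by column position. It is not the MOLINP subroutine; the function name `parse_command_card` is hypothetical. Treating an all-blank integer field as zero follows the usual Fortran formatted-read behaviour and is an assumption about how MOLVIB handles omitted values.

```python
# Hypothetical sketch of reading a MOLVIB command card laid out as A4,6X,4I5.
def parse_command_card(line):
    """Return (keyword, [i1, i2, i3, i4]) from one fixed-format card."""
    padded = line.rstrip("\n").ljust(30)   # A4 + 6X + 4*I5 = 30 columns
    keyword = padded[0:4].rstrip()         # columns 1-4, trailing blanks dropped
    values = []
    for i in range(4):
        start = 10 + 5 * i                 # columns 11-15, 16-20, 21-25, 26-30
        field = padded[start:start + 5].strip()
        # Assumption: a blank I5 field is read as 0, as Fortran would.
        values.append(int(field) if field else 0)
    return keyword, values


# A made-up card: keyword in columns 1-4, then four right-justified integers.
card = "STEP" + " " * 6 + "%5d%5d%5d%5d" % (1, 7, 0, 1)
keyword, values = parse_command_card(card)
```

Because the fields are located by column rather than by blanks, an integer shifted by even one column is read into the wrong field or as a different number, which is why the input description warns that most entries are not free format.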
The list of currently supported keywords follows. One of the first group of keywords must be used first in order to define the type of calculation.

        Keyword   Interpretation

        G    - perform redundancy analysis
        GF   - solve standard Wilson GF problem
        GAUS - choose GAUSSIAN analysis option
        GFX  - vibrational problem in cartesian coordinates
        KANO - determine canonical force field
        CRYS - crystal vibrations for k=0;
        STEP - generate cartesian displacements in a given direction

For the remaining keywords, the order is arbitrary:

        Keyword   Interpretation

        CART - read in cartesian coordinates
        MASA - interpret fourth column of cartesian coord input as A numbers
        MASZ - interpret the above column as Z numbers
        UMAT - read in U Matrix for similarity transformation
        FMAT - read in F matrix
        LX   - read in cartesian eigenvectors
        IC   - read in internal/external coordinate definitions
        PRNT - set print level
        TEST - set print level
        NULL - control card for 'G ' option with IGLEV=2
        PED  - read in PED data structure
        SCAL - read in scale factor for F matrix
        TRRM - remove translational and rotational contributions to LX
        MNAT - read in the numbers of atoms for each molecule in unit cell
        IFTR - specifies the dimension (and type) of F matrix
        SYMM - read in symmetry blocking data
        EXPF - read in reference frequencies for the system
        END  - end input section, perform MOLVIB calculations and

This section gives a more detailed explanation of the keywords and the associated data structures.
  keyword          Interpretation

  G    - perform redundancy analysis
         == IGLEV
         IGLEV=1 - diagonalize G and write out eigenvalues and
                   eigenvectors
         IGLEV=2 - additionally generate a set of null and independent
                   coordinates orthogonal to the initially specified ones

  GF   - solve standard Wilson GF problem

  GAUS - choose GAUSSIAN analysis option

  GFX  - vibrational problem in cartesian coordinates

  KANO - determine canonical force field
         == ICANO
         ICANO=1 - preliminary run, just to output the FR matrix; one
                   of the other keywords (GF, GFX or GAUS) must follow,
                   so that FR is evaluated or just read in as part of
                   those processes.
         ICANO=2 - evaluation of the canonic force field FR*
         N.B. No U matrix allowed here.
              Give: DIMensions, CARTesian coords, IC's and FMAT.

  CRYS - crystal vibrations for k=0
         == IZMOL, == IFCRYS
         IZMOL - no. of molecules in unit cell
         IFCRYS=0 (default) - calculation analogous to GFX

  STEP - generate cartesian displacements in a given direction.
         == IFSTEP == ISTCOR == IFFMAT == IFLMAT
         Additionally, the card following the 'STEP' card contains the
         value of STPSIZ (real, free format)
         IFSTEP=1  - cartesian eigenvector no. ISTCOR
         (IFSTEP=2 - internal eigenvector no. ISTCOR, not implemented)
         (IFSTEP=3 - internal coordinate no. ISTCOR, not implemented)
         STPSIZ - step size, i.e. the transformation is
                      X(I)=X(I)+STPSIZ*LX(I,ISTCOR)
                  for cart. eigenvectors, where the columns of LX are
                  normalized.
         IFFMAT,IFLMAT - determine the starting point of the calculation:
            IFFMAT=0 and IFLMAT =1 - start from LX
                                =2 - start from LS
            IFLMAT=0 and IFFMAT =1 - start from FX
                                =2 - start from FS

  CART - cartesian coordinates for NAT atoms will follow
         == unused == IFC
         In MOLVIB usually used to define the number of atoms NAT.
         In the CHARMM version, NAT is specified on the MOLVIB command
         line (if NOTO flag is used) or is read from the PSF (if NOTO
         is absent).
         IFC - specifies format for cart. coords:
           IFC=0 free format, four real numbers per line
                 X, Y, Z, and MASS (see below).
           IFC=1 CHARMM format; only atom entry lines, no titles or
                 NATOM field, mass information in WMAIN field.
         N.B. For the 'GAUS' option use GAUSSIANxx CMS coordinates.
              For the 'GFX ' option use GAUSSIANxx coordinates in the
              Z-matrix orientation.
         Mass specification :
           (1) enter mass in amu as fourth real number in entry line
               for each atom.
           (2) instead of mass, place atomic number Z or mass number A
               as fourth real number and subsequently use a 'MASZ' or
               'MASA' control card.
         N.B. For 'CRYS' NAT should be equal to no. of atoms in unit
              cell.

  MASA - interpret fourth column of cartesian coord input as atomic
         mass numbers (A); useful for isotopes, e.g. a mass of 2.0
         will designate D, mass of 15.0 - 15N etc.

  MASZ - interpret the above column as atomic numbers (Z)

  UMAT - read in U Matrix for similarity transformation
         == IFU == INU == IUU == IZU
         IFU - defines format
               =0 Schachtschneider/Snyder format (the only one supported)
         INU = 1/0 - normalize/don't normalize rows of U
         IUU - defines FORTRAN unit for U read
               if left blank, the input stream will be used
               if >0 then the data should be provided in the correct
               FORTRAN file
         IZU - multiplicity; usually IZU=(no. of molecules in unit
               cell). IZU.GT.1 turns on autogeneration of U for whole
               unit cell from the provided values for the first
               asymmetric unit.
         (see SUBROUTINE RDMAT in MOLVIO)

         Details:
         Two, one or no U matrices may be supplied on input. These are
         (generally) rectangular matrices which perform linear
         transformations on internal coordinate sets, of the type
              S=U*R   ( or S(i) = {sum over j} U(i,j)*R(j) ),
         with S the final and R the initial coordinate set.
         The function of the U matrix is e.g. to transform from
         primitive ICs (of which there are NIC0>NQ) to a set of
         independent ICs, NQ in number, or to scale the ICs by a factor
         (useful when trying to reproduce vibrations reported in the
         literature, as different research groups use different
         definitions of angle or dihedral ICs).
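As an illustration of the transformation S=U*R described above, the following sketch applies a rectangular U matrix to a set of primitive coordinate values. It is not MOLVIB code: the function names and the numerical values are hypothetical, and the row normalization merely mimics what the INU=1 option is described as doing.

```python
import math


def normalize_rows(u):
    """Scale each row of U to unit length (what INU=1 is described as doing)."""
    out = []
    for row in u:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row])
    return out


def transform(u, r):
    """Return S = U*R, i.e. S[i] = sum over j of U[i][j] * R[j]."""
    return [sum(u_ij * r_j for u_ij, r_j in zip(row, r)) for row in u]


# Hypothetical example: three primitive coordinates reduced to two
# combinations (a symmetric and an antisymmetric sum of the first two).
u = [[1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0]]
r = [0.1, 0.3, 5.0]

print(transform(u, r))
print(transform(normalize_rows(u), r))
```

Note that U here has fewer rows than columns, matching the case where the number of primitive coordinates exceeds the number of independent ones.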
         If two U matrices are given, then the IC's (and the B and G
         matrices) are sequentially transformed using first U1, then
         U2. The F matrix is assumed to be expressed in the final IC's
         on input, and is not transformed (except for the 'KANO'
         option - see 'IFTR').

  FMAT - read in F matrix (the second derivatives of energy wrt
         coordinates)
         == IFF == ISF == IUF
         IFF - specifies format
               = 0 - Schachtschneider/Snyder format
               = 1 - GAUSSIANxx format
                     N.B. remember to use 'Z matrix' oriented cartesian
                     coords.
               = 2 - CHARMM formatted SECO file format
         ISF = 1/0 - symmetrize/don't symmetrize
               (upper triangle assumed on input)
         IUF - FORTRAN unit no., as for 'UMAT'
         (see RDMAT, RFORC, RDSECO in MOLVIO)

  LX   - read in cartesian eigenvectors
         == IFL == unused == IUL
         IFL - specifies format
               = 0 - GAUSSIANxx format (see SUBROUTINE REIGEN)
                     all 3*NAT eigenvectors read in
                     N.B. remember to use 'standard' (or 'CMS')
                     oriented cartesian coordinates
               = 1 - CHARMM binary format (see SUBROUTINE REIGCH)
                     only the NQ=3*NAT-6 "vibrational" eigenvectors are
                     expected by REIGCH; use the "WRITE NORM 7 THRU ..."
                     command to achieve this.
                     N.B. Binary files are machine specific.
         IUL - FORTRAN unit no. from which to read, as in 'UMAT'

  IC   - read in internal/external coordinate definitions
         == IZIC
         Five integers will be read from NIC0 lines in free format;
         each line contains:
           ITYP,I,J,K,L - specify type and four atom numbers as defined
                          in cartesian coordinates
         Note: it is necessary to add zeros in unused fields.
         IZIC - multiplicity, usually = no. of molecules in unit cell.
                IZIC.GT.1 turns on autogeneration of internal/external
                coordinates for unit cell from the ones provided for
                the first asymmetric unit.
         ITYP=1,2,3,4 - internal coordinates

         ITYP = 1 - I-J bond stretch

                       I --- J

         ITYP = 2 - I-J-K angle bend

                          J
                         / \
                        /   \
                       I     K

         ITYP = 3 - I-L bond angle with J-K-L plane (Wilson wag)

                                K
                               /
                              /
                       I --- L
                              \
                               \
                                J

         ITYP = 4 - angle between IJK and JKL planes

                       I
                        \
                         \
                          J --- K
                                 \
                                  \
                                   L

         For details :
           a) see SUBROUTINE BMAT
           b) see Wilson,Decius,Cross section 4.1, substituting their
              atom numbers with:
                ITYP=1  (12)   ->  (IJ)
                     2  (123)  ->  (IKJ)    ! that's right !
                     3  (1234) ->  (IJKL)
                     4  (1234) ->  (IJKL)

         A good reference for standard definitions of independent
         internal coordinates for a wide selection of chemical groups
         is: P.Pulay,G.Fogarasi,F.Pang & J.E.Boggs, JACS 101, 2550
         (1979)

         For the 'CRYS' option, the external coordinates are defined
         here; their codes:
           ITYP=11 - x translation
           ITYP=12 - y translation
           ITYP=13 - z translation
           ITYP=14 - x rotation
           ITYP=15 - y rotation
           ITYP=16 - z rotation
         In this case the I field should hold the consecutive number of
         the molecule in the unit cell (consistent with MNAT data).

  PRNT - set print level
         == IPRNT
         IPRNT=0 - minimal printout
         IPRNT=5 - maximum printout
         [default is 2]

  TEST - equivalent to 'PRNT' with IPRNT=4

  NULL - control card for 'G   ' option with IGLEV=2
         == NULL == NSTRT
         NULL  = the number of orthonormal vectors for the null space
                 to be read from the U2 matrix
         NSTRT = the number of starting vectors for the Gram-Schmidt
                 procedure in the vibrational space
         Note: If any null coordinates are known, they should be
         orthonormalized and placed in the first NULL rows of U2. The
         program will then write out the complete set of orthonormal
         coordinates spanning the null space, starting with the ones
         provided.
         If NSTRT.GT.0 a completely independent calculation will be
         performed in the vibrational space. In that case, the
         NULL+1,...NULL+NSTRT rows of U2 should contain the known
         coordinates of the vibrational space orthogonal to each other
         and the redundancies (null space vectors).
         The program will construct an orthonormal basis of the
         vibrational space which is orthogonal to the redundancies,
         starting with the provided vectors.

  PED  - symbolic PED analysis will be performed
         == NGRUP == NCUTP
         NGRUP - number of coordinate groups to be defined
         NCUTP - cutoff level; PED contributions below NCUTP % will
                 not be printed, for clarity (default is 3%).
         The following cards must contain:
         1. for each group I=1,NGRUP:
            LGRUP(I),IGRUP(I,J), J=1,LGRUP(I) - the number of coords
            in group and their consecutive numbers (these are the
            final numbers, i.e. after all U matrix operations)  (20I3)
         2. for each coordinate :
            IS,SS - its consecutive number (after all U matrix
            operations) and the assigned symbol.
            4(I3,2X,A8,2X) - zero to four entries per line; blank
            fields skipped, negative IS value to end this input
            section.
            Only the first coord of each group needs a symbol
            definition, the rest are set to this string; contributions
            from the whole group are added up and printed beside the
            group symbol.

  SCAL - scale the F matrix Fij' = FACT*Fij; the real value of FACT
         will be read from next line (F10.6).

  TRRM - remove translational and rotational contributions from
         cartesian coordinate vibrational eigenvectors.
         (currently used only for GAUS)

  MNAT - lines following this card will contain the numbers of atoms
         of the individual molecules comprising the unit cell (or
         molecular aggregate) in 20I3 format.
         Application - makes possible external coordinate use in
         vibrational analysis of mixed crystals or molecular
         aggregates (use CRYS option in both cases).
         The value of IZMOL should already be defined for this card.

  IFTR - specifies the dimension (and type) of F matrix
         == IFTRAN
            = 0 - F is in primitive ICs R,       NIC0xNIC0
            = 1 - F is in S1=U1*R,               NICxNIC
            = 2 - F is in S2=U2*U1*R coords,     NQxNQ
         If card is not given, default IFTRAN=NUMAT is assumed
         (works only for 'KANO' option)

  SYMM - use symmetry (in symbolic PED analysis only)
         == NBLOCK
         It is assumed that by use of similarity transformations (the
         U matrices), the vibrational problem has been transformed to
         such coordinates that the Hamiltonian (G and F) is
         block-diagonal. This usually happens if the coordinates form
         a basis for the irreducible representations of the molecular
         point group.
         The following cards should contain the data:
           IBLOCK(I),I=1,NBLOCK - sizes of consecutive blocks
             (coordinate numbering is as for PED analysis, i.e. after
             all U matrix transformations)
           SBLOCK(I),I=1,NBLOCK - block symbols (e.g. representation
             names)

  EXPF - read in reference frequencies for the system
         Frequencies should be in ascending order (if 'SYMM' is
         present, the ordering should be separate within each block).
         The frequencies from MOLVIB will be printed out side-by-side
         with the reference set; differences and an rms deviation will
         be computed. (If 'SYMM' is present, a separate analysis will
         be performed for each block).
         Format: free, 1 real value per line.

  END  - end input section, perform MOLVIB calculations and return to
         CHARMM.

Note: the Schachtschneider/Snyder format

This format is very useful for i/o of sparse matrices (or small and not
so sparse ones). The basic format is:   4(2I3,F12.6)
The two integer fields specify the row and column number, the real
field - the value of the array element. Any elements not explicitly
specified are set to zero. Each line of input may contain 0-4 entries,
blank lines are ignored, a negative value for the column number
terminates input. See subroutine RDMAT in MOLVIO.
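The reading rules of the Schachtschneider/Snyder format can be sketched as follows. This is not the RDMAT routine; `read_ss_matrix` is a hypothetical name, row and column numbers are assumed to be 1-based (the Fortran convention), and values are assumed to carry an explicit decimal point.

```python
def read_ss_matrix(lines, nrow, ncol):
    """Read entries laid out as 4(2I3,F12.6) into a dense nrow x ncol matrix."""
    matrix = [[0.0] * ncol for _ in range(nrow)]  # unspecified elements stay zero
    for line in lines:
        if not line.strip():
            continue                      # blank lines are ignored
        for k in range(4):                # up to four 18-character entries
            entry = line[18 * k: 18 * (k + 1)]
            if not entry.strip():
                continue                  # empty entry slot
            row = int(entry[0:3]) if entry[0:3].strip() else 0
            col = int(entry[3:6]) if entry[3:6].strip() else 0
            if col < 0:
                return matrix             # negative column number ends input
            value = float(entry[6:18]) if entry[6:18].strip() else 0.0
            matrix[row - 1][col - 1] = value
    return matrix


# Two entries on one line, a blank line, then the terminator card.
sample = [
    f"{1:3d}{1:3d}{2.5:12.6f}{1:3d}{2:3d}{-0.3:12.6f}",
    "",
    f"{0:3d}{-1:3d}",
    f"{2:3d}{2:3d}{9.0:12.6f}",   # never read: input already terminated
]
for row in read_ss_matrix(sample, 2, 2):
    print(row)
```

Only the non-zero elements need to be listed, which is what makes the format convenient for sparse U and F matrices.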
C DEC/CMS REPLACEMENT HISTORY, Element MONITOR.DOC C *3 6-MAY-1991 17:39:05 WON "Info directive fixed" C *2 4-FEB-1991 17:30:50 WON "from NIH, 02-Feb-91" C *1 8-APR-1990 19:50:32 KOTTALAM "charmm documentation" C DEC/CMS REPLACEMENT HISTORY, Element MONITOR.DOC  File: Monitor, Node: Top, Up: (doc/dynamc.doc), Next: Syntax Monitor commands: Commands to monitor various dynamics properties * Menu: * Syntax:: Syntax of the Monitor commands * Properties:: Description of the properties monitored  File: Monitor, Node: Syntax, Up: Top, Next: Properties, Previous: Top [SYNTAX MONItor dihedral transitions] Syntax of the MONItor commands MONItor {DIHEdral} [SHOW] FIRSt unit-number NUNIt integer BEGIn integer - STOP integer SKIP integer [SELEct atom-selection] FIRSt the unit number of the first file of dynamics coordinate sets from which the property is to be calculated. NUNIt the number of units of dynamics coordinate files. Fortran unit numbers must be assigned to the files consecutively from FIRST. BEGIn the first step number for the coordinate set from which the property will be calculated. STOP the last step number for the coordinate set from which the property will be calculated. SKIP the time increment between the step numbers of the coordinates. SELEct selected atoms for which the property is to be monitored. At this time, atoms may be selected only by the atom-selection keywords (e.g. RESID,TYPE,ATOM,RESN,SEGID) and NOT by tag-selections. (see *note select:(doc/select.doc).) DIHE Property: monitor the dihedral transitions. SHOW for monitoring dihedral transitions, print out the step number, the cumulative number of transitions, the dihedral name, the current dihedral angle, and the old and new minimum well positions each time a transition is found. ALL Lots of printout. 
UNIT Unit number to write results (default: outu)  File: Monitor, Node: Properties, Up: Top, Previous: Syntax, Next: Top Properties monitored using the MONItor commands DIHE: Dihedral transitions are monitored for any dihedral angle which can be made from the atoms selected. A transition is defined as a change in the dihedral angle which results in going from one well of the torsion potential to another well, AND which involves crossing at least 30 degrees beyond the barrier at the potential maximum. That is, for rotation about a bond between tetrahedral carbons, the minima are at +60, 180 and -60, while the maxima are at 0, +120 and -120. For an initial angle of +45, a transition is counted if the angle becomes > +150 or < -30. The old minimum was +60, and the new minima would be 180 or -60, respectively. The angle can change by as much as 120 degrees or as little as 60 degrees in going from one well to the next using this algorithm. For bonded atoms which both have trigonal geometry, the minima are +90 and -90, and a transition requires crossing 0 +- 30, or 180 +- 30 degrees. Only transitions for dihedrals with either 2 or 3 periodicity can be counted with the MONIt command. A word of caution: the above algorithm for counting transitions is by no means fool proof, therefore one should always look at the dihedral time series to obtain a more precise number of transitions. This is particularly true for mainchain phi and psi dihedrals which frequently have average positions which are not close to the minima for a tetrahedral atom. Large fluctuations can therefore be mistakenly (in a classical butane-type transition) counted as transitions. C DEC/CMS REPLACEMENT HISTORY, Element NBONDS.DOC C *7 5-FEB-1992 23:37:14 WON "Stote: extended electrostatics" C *6 18-NOV-1991 15:01:14 WON "Updated by B. Brooks and S. Fleischman" C *5 24-OCT-1991 01:33:53 WON "17-OCT-91 NIH update" C *4 13-MAY-1991 20:47:53 FISCHER "Doc. 
for INBFRQ : heuristic update-testing"
C  *3     6-MAY-1991 17:39:55 WON "Info directive fixed"
C  *2     4-FEB-1991 17:32:05 WON "from NIH, 02-Feb-91"
C  *1     8-APR-1990 19:50:35 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element NBONDS.DOC

File: Nbonds, Node: Top, Up: (doc/commands.doc), Next: Syntax

                Generation of Non-bonded Interactions

Nonbonded interactions (frequently abbreviated nbond) refer to the van
der Waals terms and the electrostatic terms between all atom pairs that
are not specifically excluded from nonbond calculations, as for example
directly bonded atoms are *note nbx: (doc/struct.doc)nbx.
These terms are defined on atom pairs and to a first approximation
would require a number of calculations equal to the square of the number
of atoms. To avoid this burden various truncation and approximation
schemes can be employed in the program, breaking the nonbonded
calculation into two parts, initialization and actual energy
calculation.

The method of approximation, cutoffs, and other relevant parameters can
be entered any time the nbond specification parser is invoked. See the
syntax section for a list of all commands that invoke this parser.

* Menu:

* Syntax::      Syntax of the nonbond specification
* Defaults::    Defaults used in the nonbond specification
* Function::    Description of the options
* Tables::      Using nonbond lookup tables in place of analytic
                potential energy functions

File: Nbonds, Node: Syntax, Up: Top, Next: Defaults, Previous: Top

[SYNTAX NBONDs]

   { NBONds       }    { [INBFrq integer]  nonbond-spec }
   { UPDAte ...   }    {                                }
   { ENERgy ...   }    {                                }
   { MINImize ... }    {                                }
   { DYNAmics ... }    {                                }

   COMPARE ... NBOND [nonbond-spec] ...

NOTE: The INBFrq value is remembered. If its value is zero, no
interpretation of [nonbond-spec] will be made and no modifications of
the nonbond lists will be performed. Its default value is -1.

In all cases as many keywords and values as desired may be specified.
The keywords are:

nonbond-spec::= [method-spec] [distances-spec] [misc-specs] [INIT] [RESET]

method-spec::=  [ ELEC electrostatics-spec ] [ VDW vdw-spec ] [ st2-spec ]
                [ NOELectrostatics ]         [ NOVDwaals ]
                [BYCUbe ] [BYGRoup]

electrostatics-spec::=
   [ ATOM ]                                         [ CDIElec ] [ SHIFted ]
   [ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ RDIElec ] [ SWITched ]
   [       [          [ NOGRad ] [ NOQUads ]    ] ]             [ FSWItch ]
   [       [ NOEXtended ]                         ]             [ FSHIft ]
   [ VGROUP [ VSWITched ] ]
   [                      ]

vdw-spec::=     [ VATOM [ VSHIfted ] ]
                [       [ VSWItched ] ]
                [       [ VFSWitch ] ]

st2-spec::=     [ ST2List ] [ ST2Nolist ]

distances-spec::= [general-dist] [vdw-sigma-distances] [warning-dist]

general-dist::= [ CUTNB real ] [ CTONNB real ] [ CTOFNB real ]
                [ CTEXNB real ]

vdw-sigma-distances::= [ SIGCUT real ] [ SIGADD real ]
                       [ SIGON real ] [ SIGOFF real ]

warning-dist::= [ WMIN real ] [ WRNMXD real ] [ SIGMAX real ]
                [ SIGWRN real ]

misc-specs::=   [ EPS real ] [ E14Factor real ] [ NBXM integer]
                [ NORXN ] [ RXNFLD rxnfld-spec ] [ RXNNB rxnfld-spec ]

rxnfld-spec::=  [ EPSEXT real ] [ ORDER integer ] [ SHELL real ]

File: Nbonds, Node: Defaults, Up: Top, Next: Function, Previous: Syntax

The defaults for the nonbond specification reside with the parameter
file. The defaults are specified at the beginning of the van der Waals
section. These defaults are the recommended options.

The following command contains all defaults for one of the older protein
parameter files, and is equivalent in its usage to the command
"NBONDS INIT" when this parameter file is present.
NBONDS ELEC ATOM NOEX NOGR NOQU SWIT RDIE VATOM VDW VSWI VDIS ST2L - CUTNB 8.0 CTEXNB 999.0 CTOFNB 7.5 CTONNB 6.5 - SIGCUT 1.5 SIGADD 0.5 SIGON 1.25 SIGOFF 1.5 - WMIN 1.5 WRNMXD 0.5 SIGMAX 7.5 SIGWRN 0.7 - EPS 1.0 NORXN EPSEXT 80.0 ORDER 10 SHELL 2.0 Values do not change unless explicitly specified, except for the ON/OFF values which cascade when the cutoff values are changed as; CTOFNB=CUTNB-0.5 CTONNB=CTOFNB-1.0 SIGOFF=SIGCUT SIGON =SIGOFF-0.25 WARNING:: These old defaults have been shown to be detrimental to protein behavior. It is generally better to use the defaults in the parameter sets. RECOMMENDED: Presented here is a suggested list of options. Where specifications are missing, substitute the defaults (see NBONDS.DOC for details): For atom based cutoffs: NBONDS ATOM SHIFT CDIE VDW VSHI - CUTNB 13.0 CTOFNB 12.0 CTONNB 8.0 WMIN 1.5 EPS 1.0 or NBONDS ATOM FSWITCH CDIE VDW VSHI - CUTNB 13.0 CTOFNB 12.0 CTONNB 8.0 WMIN 1.5 EPS 1.0 For group based cutoffs (better, but doesn't vectorize well): NBONDS GROUP FSWITCH CDIE VDW VSWI - CUTNB 13.0 CTOFNB 12.0 CTONNB 8.0 WMIN 1.5 EPS 1.0 For extended electrostatics : NBONDS GROUP SWITCH CDIE VDW VSWI EXTEND GRAD QUAD - CUTNB 13.0 CTOFNB 12.0 CTONNB 8.0 WMIN 1.5 EPS 1.0  File: Nbonds, Node: Function, Up: Top, Previous: Defaults, Next: Tables INBFRQ : ======== Update frequency for the non-bonded list. Used in the subroutine ENERGY() to decide whether to update the non-bond list. When set to : 0 --> no updates of the list will be done. +n --> an update is done every time MOD(ECALLS,n).EQ.0 . This is the old frequency-scheme, where an update is done every n steps of dynamics or minimization. -1 --> heuristic testing is performed every time ENERGY() is called and a list update is done if necessary. This is the default, because it is both safer and more economical than frequency-updating. 
Description of the heuristic testing algorithm :
------------------------------------------------
Every time ENERGY() is called, the distance each atom has moved since
the last list-update is computed. If any atom moved by more than
(CUTNB - CTOFNB)/2 since the last list-update was done, then it is
possible that some atom pairs in which the two atoms are now separated
by less than CTOFNB are not in the pairs-list. So a list update is
done. If all atoms moved by less than (CUTNB - CTOFNB)/2 , then all atom
pairs within the CTOFNB distance are already accounted for in the
non-bond list and no update is necessary.

Description of the code for the heuristic testing :
---------------------------------------------------
This section describes how programmers can control the list-updating
behavior when their routines call the ENERGY() subroutine.

All list-updating decisions, whether they are frequency based or
heuristic based, are made in the subroutine UPDECI(ECALLS) , which is
called from only one place : at the very beginning of ENERGY().
UPDECI(ECALLS) can be controlled through INBFRQ (via the CONTRL.FCM
common block) and ECALLS (via the ENERGY.FCM common block) as follows :

   If INBFRQ = +n --> non-bond list update is performed when
                      MOD(ECALLS,n).EQ.0 . Image and H-bond lists are
                      updated according to IMGFRQ and IHBFRQ.
   If INBFRQ =  0 --> non-bond list update is not performed. Image and
                      H-bond lists are updated according to IMGFRQ and
                      IHBFRQ.
   If INBFRQ = -1 --> all lists are updated when necessary (heuristic
                      test).

(Note that ECALLS is incremented by ICALL every time ENERGY(,,,ICALL)
is called. In most cases, ICALL=1 .)

The current implementation of UPDECI() will work (without
modifications) to decide whether the image/crystal non-bond lists need
updating, provided the periodicity parameters don't change (i.e.
constant Volume). UPDECI() is easily adapted to variable Volume
dynamics/minimizations. This is described in comments of the routine
itself.
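The update decision described above can be summarized in a short sketch. It is an illustration only, not the UPDECI source: the function name, argument list and coordinate representation are hypothetical, and only the non-bond list (not the image or H-bond lists) is considered.

```python
import math


def needs_list_update(inbfrq, ecalls, current, reference, cutnb, ctofnb):
    """Decide whether the non-bond list should be rebuilt (illustrative only).

    current   - atom positions now, as (x, y, z) tuples
    reference - atom positions when the list was last updated
    """
    if inbfrq == 0:
        return False                       # no updates of the list
    if inbfrq > 0:
        return ecalls % inbfrq == 0        # fixed-frequency scheme
    if inbfrq != -1:
        raise ValueError("only INBFRQ = 0, +n or -1 are described")
    # Heuristic test: update if any atom moved more than half the buffer
    # between the list cutoff and the energy cutoff.
    threshold = (cutnb - ctofnb) / 2.0
    largest = max(math.dist(a, b) for a, b in zip(current, reference))
    return largest > threshold


old = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
new = [(0.0, 0.0, 0.0), (3.6, 0.0, 0.0)]
print(needs_list_update(-1, 1, new, old, cutnb=13.0, ctofnb=12.0))
```

The factor of one half accounts for the worst case in which two atoms each move toward the other, so that their separation shrinks by twice the single-atom displacement.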
Further computational economy in update-testing :
-------------------------------------------------
A programmer can sometimes skip the heuristic test itself, making the
decision whether to do list-updating even more economical. This option
is only available if the size of the step taken since the last call to
ENERGY() is known. For an example of usage, see the subroutine ENERG()
in TRAVEL.

NON-BOND ENERGY TERMS.
======================

The electrostatic options are separate from the van der Waals options,
though some keywords are shared between them. The following is a
description of all options and keywords.

1) Electrostatics

The ELEC keyword (default) invokes electrostatics. The NOELec keyword
suppresses all electrostatic energy terms and options.

There are two basic methods for electrostatics, GROUp and ATOM. A model
based on the GROUp method is the extended electrostatics model, which
approximates the full electrostatic interaction and eliminates the need
for a cutoff function. This model is based on the partitioning of the
electrostatic term into two contributions. One comes from the
interaction between particles which are spatially close and is treated
by conventional pairwise summation. The second contribution comes from
interactions between particles which are spatially distant from one
another and is treated by a multipole moment approximation.
[The original model was described in B. R. Brooks, R. E. Bruccoleri,
B. D. Olafson, D. J. States, S. Swaminathan, M. Karplus. J. Comp. Chem.,
4, 187, (1983) and more recently in R.H. Stote, D.J. States and
M. Karplus, J. Chimie Physique (1991)]

   A) Functional Forms

   Atom electrostatics indicates that interactions are computed on an
   atom-atom pair basis. There are two options that specify the radial
   energy functional form. The keywords CDIE and RDIE select the basic
   functional form. The SWIT and SHIF keywords determine the long-range
   truncation option.

        CDIE - Constant dielectric. Energy is proportional to 1/R.
RDIE - Distance dielectric. Energy is proportional to 1/(R-squared) SWIT - Switching function used from CTONNB to CTOFNB values. SHIF - Shifted potential acting to CTOFNB and zero beyond. B) Atom electrostatics Atom electrostatics indicates that interactions are computed on an atom-atom pair basis. There are two options that specify the radial energy functional form. The keywords CDIE and RDIE select the basic functional form. The SWIT and SHIF keywords determine the long-range truncation option. [ ATOM ] [ CDIElec ] [ SHIFted ] [ RDIElec ] [ SWITched ] [ FSWItch ] [ FSHIft ] CDIE - Constant dielectric. Energy is proportional to 1/R. RDIE - Distance dielectric. Energy is proportional to 1/(R-squared) SWIT - Switching function used from CTONNB to CTOFNB values. SHIF - Shifted potential acting to CTOFNB and zero beyond. FSWI - Switching function acting on force only. Energy is integral of force. FSHI - Classical force shift method for CDIE (force has a constant offset) C) Group Electrostatics electrostatics-spec::= [ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ CDIElec ] [ SWITched ] [ [ [ NOGRad ] [ NOQUads ] ] ] [ RDIElec ] [ FSWItch ] [ [ NOEXtended ] ] SWIT - Switching function used from CTONNB to CTOFNB values. FSWI - Switching function, but QiQj/Rcut is added before switching. (FSWI has no effect on neutral groups). D) Electrostatic Distances electrostatic-dist::= [ CUTNB real ] [ CTEXNB real ] [ CTONNB real ] [ CTOFNB real ] CTEXNB - defines the cutoff distance beyond which interaction pairs are excluded from the Extended Electrostatics calculation. E) Extended (group) Electrostatics electrostatics-spec::= [ ATOM ] [ CDIElec ] [ SHIFted ] [ GROUp [ EXTEnded [ GRADients ] [ QUADrip ] ] ] [ RDIElec ] [ SWITched ] [ [ [ NOGRad ] [ NOQUads ] ] ] [ [ NOEXtended ] ] EXTE - invokes the extended electrostatics command for calculating long range electrostatic interactions. NOEX - suppress the extended calculation. 
        GRAD - keyword flags the inclusion of the field of the extended
               gradient in calculating the force on each atom, i.e.
               include first and second derivatives.
        QUAD - flags the inclusion of the quadrupole in the multipole
               expansion.

   F) Reaction Fields

   misc-specs::= [ EPS real ] [ E14Factor real ]
                 [ NORXN ] [ RXNFLD rxnfld-spec ] [ RXNNB rxnfld-spec ]

   rxnfld-spec::= [ EPSEXT real ] [ ORDER integer ] [ SHELL real ]

2) Van der Waals Interactions

The VDW keyword (default) invokes the van der Waals energy term. To
suppress this term, the NOVDw keyword may be used.

   A) Distance specified van der Waals Function

   vdw-spec::= [ VSHIfted ] [ VDIStance ] [ VSWItched ]

3) Miscellaneous options and keywords

   A) Dielectric specification

   misc-specs::= [ EPS real ] [ E14Factor real ]

   B) Warning Distance Specifications

   warning-dist::= [ WMIN real ] [ WRNMXD real ] [ SIGMAX real ]
                   [ SIGWRN real ]

        WRNMXD - keyword defines a warning cutoff for maximum atom
                 displacement from the last close contact list update
                 (used in EXTEnded)

   C) Initialization

   D) ST2-ST2 interaction methods

   st2-spec::= [ ST2List ] [ ST2Nolist ]

In all cases as many keywords and values as desired may be specified.
The key words, their functions, and defaults are:

   1) The method to be used

   2) Distance cutoff in generating the list of pairs
        CUTNB value (default 8.0)

   3) Distance cut at which the switching function eliminates all
      contributions from a pair in calculating energies. Once
      specified, this value is not reset unless respecified.
        CTOFNB value (default CUTNB-0.5)

   4) Distance cut at which the smoothing function begins to reduce a
      pair's contribution. This value is not used with SHFT. Once
      specified, this value is not reset unless respecified.
        CTONNB value (default CTOFNB-1.0)

   6) Dielectric constant for the extended electrostatics routines
      (RDIE option sets the dielectric equal to r times the EPS value)
        EPS value (default 1.0 for r dielectric)
        EPS 0.0 or NOELec (zero electrostatic energy)

   7) Warning cutoff for minimum atom to atom distance. Pairs are
      checked during close contact list compilation.
        WMIN value (default 1.5)

   8) Warning cutoff for maximum atom displacement from the last close
      contact list update (used only in EXTEnded)
        WRNMXD value (default 0.5)

ALGORITHMS

There are four algorithms used in calculating the nonbonded energies,
each making different approximations in an attempt to speed the
calculation. Electrostatic interactions are the most difficult to deal
with for two reasons. They do not fall off quickly with distance (so it
is inappropriate to simply ignore all interactions beyond some cutoff),
and they depend on odd powers of r, necessitating expensive square root
calculations for each pair evaluated. The approximations used to make
the electrostatics calculation more tractable are setting the
dielectric constant equal to r, or using a constant dielectric but only
calculating distant interactions periodically (and storing the value in
between).

Setting the dielectric constant equal to the atom-atom distance times a
constant factor (determined by the EPS keyword value) makes the
computation easier by eliminating the need to calculate square roots
and by making the calculated contribution fall off more quickly. It
also introduces problems. The force calculated using an r dependent
dielectric will be larger than the force from a constant dielectric at
short distances (5.0 angstroms or less by comparison to a constant
dielectric of 2.5). In addition, the electrostatic contribution still
falls off relatively slowly and large distance cutoffs are needed. As
the number of atom pairs included will be proportional to the cutoff
cubed, this is a significant disadvantage.
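The 5.0 angstrom figure quoted above can be checked with a short calculation. The sketch below assumes EPS=1.0 for the r dependent dielectric (an assumption, not stated explicitly in the text) and omits the unit conversion constant, which cancels in the comparison; all function names are hypothetical.

```python
def force_cdie(qi, qj, r, eps):
    """Force magnitude for E = qi*qj/(eps*r): |dE/dr| = qi*qj/(eps*r**2)."""
    return abs(qi * qj) / (eps * r ** 2)


def force_rdie(qi, qj, r, eps):
    """Force magnitude for dielectric eps*r: E = qi*qj/(eps*r**2),
    so |dE/dr| = 2*qi*qj/(eps*r**3)."""
    return 2.0 * abs(qi * qj) / (eps * r ** 3)


def crossover_distance(eps_cdie, eps_rdie):
    """Distance at which both force magnitudes are equal.

    Setting 2/(eps_rdie*r**3) = 1/(eps_cdie*r**2) gives r = 2*eps_cdie/eps_rdie.
    """
    return 2.0 * eps_cdie / eps_rdie


print(crossover_distance(2.5, 1.0))
for r in (3.0, 5.0, 8.0):
    print(r, force_rdie(1.0, 1.0, r, 1.0), force_cdie(1.0, 1.0, r, 2.5))
```

Under these assumptions the r dependent force exceeds the constant dielectric force below the crossover distance and falls below it beyond, which agrees with the statement in the text.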
The SHIFt option is similar to SWITch, except that the potential

   E = (QI*QJ/(EPS*R)) * ( 1.0 + R**4/CTOFNB**4 - 2.0*R**2/CTOFNB**2 )

is used when ( R < CTOFNB ) and zero otherwise. This potential and its
first derivative approach zero as R becomes CTOFNB, without the messy
computation of switching functions and steep forces at large R.

CDIE uses a constant dielectric everywhere. This requires a square root
to be calculated in the inner loop of ENBOND, slowing things down a
bit, but it is physically more reasonable and widely employed by other
groups doing empirical energy modelling (e.g. ST2 water). This form
allows a small CUTNB (5.0 angstroms with EPS=2.5) even though the
electrostatic terms are still varying rapidly at that distance. The
short range forces are identical to those calculated with the other
options, reflecting the decrease in dielectric shielding at short
ranges.

The constant dielectric routines compile the close contact list using
the same two stage minimum rectangle box search that is described
above. In this way the efficiency of a residue by residue search is
exploited while being certain that all necessary pairs are included.
For close residue pairs an atom by atom search is then performed. Atom
pairs are either included in the list of close contacts or their
electrostatic interactions are calculated and stored.

Description of the Extended Electrostatics method
-------------------------------------------------
For the long range forces there is effectively no cutoff in the
electrostatic energy when using the Extended Electrostatics model. The
Extended Electrostatics model approximates the full electrostatic
interaction by partitioning the electric potential and the resulting
forces at any point ri into a near and an extended contribution. The
near contribution arises from the charged particles which are spatially
close to ri, while the extended contribution arises from the particles
which are spatially distant from ri.
The total electrostatic potential can be written as a sum of the two.
The near region is defined in terms of a radial distance, CUTNB, for
each atom. Interactions between atoms separated by a distance greater
than CUTNB are calculated using a time saving multipole approximation
when the nbond list is updated. These interactions are stored together
with their first (NOGRad) or first and second (GRADients) derivatives.
Interactions between particles within CUTNB are calculated by the
conventional pairwise additive scheme. (For a more complete development
of the model, see R.H. Stote, D.J. States and M. Karplus, J. Chimie
Physique Vol. 11/12, 1991).

The energy is calculated by explicitly evaluating pairs in the list and
using the stored potentials, fields, and gradients to approximate the
distant pairs. In essence the routines assume that for distant pairs
the atom movements will be small enough that the changes in their
electrostatic interactions can be accurately calculated using local
expansions.

In using this model the GROUp method for constructing the nonbond list
must be used. The interactions between particles within CUTNB are
truncated rather than having a SHIFt or SWITch function applied.
Additionally, as one is calculating all electrostatic interactions in
the system, the dielectric constant should be set to 1.0.

Not Available at this time:
An option is offered to increase the accuracy of residue-residue
interactions by using a multipole expansion of one residue evaluated
for each atom of the other. The cutoff for this treatment is CUTMP. For
residue pairs outside of CUTMP only a single multipole evaluation is
made and a second order polynomial expansion is used to extrapolate to
each atom. Ordinarily this is sufficient and CUTMP is set to 0.0.

IMPLEMENTATION AND DATA STRUCTURES

The initialization and list compilation is performed by the subroutine
NBONDS.
It functions by guessing how much space will be needed to store the close
contact list, allocating that space (and space for electrostatic
potentials and gradients if necessary) on the heap, and calling the
appropriate subroutine to actually compile the nonbonded list (NBONDG).
If sufficient space was not available, 1.5 times as much is allocated and
another attempt is made.

ENBOND evaluates the nonbonded energy, calling EEXEL to evaluate the
stored electric potentials and fields. Double precision is used for all
arithmetic.

All of the nonbonded cutoffs and lists are stored on the heap. BNBND is
the descriptor array passed through most of the program (in some of the
analysis routines an additional array BNBNDC is used for the comparison
data structure). BNBND holds heap addresses and LNBND holds the lengths of
the elements in the data structure. To actually access the data it is
necessary to include INBND.FCM (an index common block) and specify
HEAP(BNBND(xxx)) where xxx is the desired element name in INBND.FCM. This
arrangement has the advantage of allowing dynamic storage allocation and
easy modification of the types of information passed from routine to
routine. The contents of the nonbonded data structure are described in
INBND.FCM.

FAST VECTOR/SHARED-MEMORY PARALLEL ROUTINES

The fast nonbond list routines will be used if all of the following hold:
FASTER is greater than or equal to zero; SPECify FNBL OFF has not been
issued; the VATOM and ATOM keywords have been used; groups and extended
electrostatics have not been invoked; and the cubing method has not been
specified.

File: Nbonds, Node: Tables, Up: Top, Previous: Function, Next: Top

                      Nonbond Lookup Tables

The nonbond energy terms may be specified with a user supplied binary
lookup table. The command

        READ TABLE UNIT int

will invoke this feature and disable all other energy term options. The
nonbond list specifiers will still be used (cutoff distances...).
This feature is not designed for casual users, and is not supported with
test cases. Also, in version 17, there is an uncorrected bug in the second
derivative determination.

To use this feature, first read the common file ETABLE.FCM for a
description of variables, and then create a file that the routine REATBL
(consult the source) can read. The sources for this option are contained
in the file ETABLE.FLX.

C DEC/CMS REPLACEMENT HISTORY, Element PARMFILE.DOC
C *3     6-MAY-1991 17:40:48 WON "Info directive fixed"
C *2     2-MAY-1991 11:13:48 MACKERELL "Update of parameter documentation"
C *1     8-APR-1990 19:50:37 KOTTALAM "charmm documentation"
C DEC/CMS REPLACEMENT HISTORY, Element PARMFILE.DOC

File: Parmfile, Node: Top, Up: (doc/commands.doc), Previous: (doc/usage.doc)Standard Files, Next: Overview

              CHARMM Empirical Energy Function Parameters

This section describes parameters in the CHARMM empirical energy
function.

* Menu:

* Overview::    Overview of CHARMM parameter file by A. D. MacKerell Jr.
* Multiple::    Rules for the use of multiple dihedrals in CHARMM22
* Conversion::  Rules for conversion of old nucleic acid rtf and param
                to CHARMM22 format
* PARMDATA::    Description of Parameter Files available for general use.

File: Parmfile, Node: Overview, Up: Top, Previous: Top, Next: Multiple

                Overview of CHARMM parameter file

               By Alexander D. MacKerell Jr., May 1991

This section of the documentation contains a brief description of the
contents of a parameter file. The CHARMM parameter file contains the
information necessary to calculate energies etc. when combined with the
information from a PSF file for a structure. Information on the keywords
found in the parameter file is in IO.DOC.
(A) * CHARMM example parameter file
    *
(B) BOND
    H    O     500.0   1.00
(C) ANGLe (THETa)
    H    O    H     100.0  104.51   20.0  1.70
(D) DIHEdral (PHI)
    HT   CT   CT   HT    10.0   3   180.0
    X    CT   CT   X     10.0   3   180.0
(E) IMPH
    O    C    CT   N      5.0   1     0.0
    X    C    CT   X      5.0   1     0.0
    X    X    CT   N      5.0   1     0.0
    O    X    X    N      5.0   1     0.0
(F) NBONDed nonbond-spec
    H    0.00  -0.046  0.2245   0.00  -0.023  0.2245
    O    0.00  -0.120  1.8000   0.00  -0.060  1.8000
(G) NBFIX
    H    O    -0.30  1.50   -0.15  1.50
(H) HBONDs hbond terms (IO.DOC)
    H    O    -0.00  1.00
(I) END

The parameter file starts with a title (A) which contains information on
the origins and applicability of that file.

Section (B), BONDs, contains information on all bond force constants and
equilibrium geometries. Here, as in the remainder of the parameter file,
the bonds etc. are specified by the atom type associated with each IUPAC
atom in the topology file.

In section (C), ANGLes or THETas, each angle is specified by 3 atom types
followed by the force constant and equilibrium geometry. If a Urey-Bradley
term is desired between the 1 and 3 atom types of the angle, a second U-B
force constant and equilibrium geometry are included.

Section (D), DIHEdrals (PHI), contains the 4 atom types specifying a
dihedral followed by the force constant, the multiplicity of the dihedral
and the minimum geometry of the dihedral. With dihedrals, wildcards (X),
as shown, may be included for the terminal atoms. Also, multiple dihedrals
of different multiplicities may be specified for a single dihedral as
outlined below.

Improper dihedrals (E), IMPH, used for out of plane motions, are specified
in the same fashion as dihedrals. The use of wildcards, X, is also allowed
in a number of variations. Multiple improper dihedrals are not supported.

The (F) NONBonded VDW parameters may be specified in two ways. Initially
the Tanford-Kirkwood Formula was used where the atom polarizabilities,
number of effective electrons, and (minimum radius)/2 were required.
In this formulation the first term following the atom type is the atom
polarizability, the second term is the number of effective electrons
(which must be positive in order to specify the Tanford-Kirkwood Formula)
and the third term is the (minimum radius)/2. If the second term is
negative, then the first number is ignored, the second term is the
well-depth (epsilon) and the third term is the (minimum radius)/2. Both
formulations use the Lennard-Jones 6-12 formula to determine the VDW
interactions; in the first method the Tanford-Kirkwood Formula is used to
calculate the well-depth (epsilon) and in the second method the well-depth
is given directly. With both formulations a second set of 3 numbers may be
specified to indicate the VDW parameters to be used for the calculation of
1-4 nonbonded interactions. Wildcards (*, %, etc.; see MISCOM.DOC) may be
used with the NONBond as well as the NBFIX and HBOND sections of the
parameter file.

The NBFIX section (G) allows VDW interactions between specific atom pairs
to be modified. This is done by specifying the 2 atom types followed by
the well depth and the minimum radius (not (minimum radius)/2 as in
NBOND). A second well depth and minimum radius may be specified to
determine the 1-4 interactions.

The final section (H) contains the hydrogen bond well depths and minimum
radii for various atom pairs. In current versions of the CHARMM parameter
sets (PARAM19, CHARMM22 protein and nucleic acid parameters) hydrogen
bonding is included in the electrostatic and VDW interactions. Thus, the
HBOND well depth is set to -0.00 and in most calculations IHBFRQ should be
set to 0 to avoid updating the hydrogen bond lists. This facility is still
supported to allow calculations using the Lennart Nilsson nucleic acid
parameters and AMBER parameters, and for analysis of hydrogen bond
geometries.
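
To make the NONBonded and NBFIX conventions concrete, here is a minimal
Python sketch using the H and O entries of the example file above. It is
an illustration, not CHARMM source code. It assumes the usual CHARMM
Lennard-Jones form E = eps*[(Rmin/r)**12 - 2*(Rmin/r)**6] with the
combination rules eps_ij = sqrt(eps_i*eps_j) and Rmin_ij = Rmin_i/2 +
Rmin_j/2, neither of which is spelled out in this section. The names
NONBONDED, NBFIX, pair_parameters and lj_energy are hypothetical, and the
1-4 parameter sets are left out.

```python
import math

# Hypothetical tables: (epsilon, Rmin/2) per atom type from section (F),
# and (well depth, Rmin) per atom-type pair from section (G).
NONBONDED = {"H": (-0.046, 0.2245), "O": (-0.120, 1.8000)}
NBFIX = {frozenset(("H", "O")): (-0.30, 1.50)}


def pair_parameters(type_i, type_j):
    """Return (well depth, Rmin) for a pair of atom types."""
    key = frozenset((type_i, type_j))
    if key in NBFIX:
        # NBFIX gives the pair well depth and the full minimum radius.
        eps, rmin = NBFIX[key]
        return abs(eps), rmin
    eps_i, half_rmin_i = NONBONDED[type_i]
    eps_j, half_rmin_j = NONBONDED[type_j]
    # Assumed combination rules: geometric mean of the well depths,
    # sum of the two (minimum radius)/2 values.
    return math.sqrt(eps_i * eps_j), half_rmin_i + half_rmin_j


def lj_energy(type_i, type_j, r):
    """Lennard-Jones 6-12 energy at separation r (same units as Rmin)."""
    eps, rmin = pair_parameters(type_i, type_j)
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 * ratio6 - 2.0 * ratio6)
```

With these tables an O...O pair uses the combined per-atom values, while
an H...O pair picks up the NBFIX override regardless of the order in which
the two types are given.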
It should be noted that both the NBOND and HBOND keywords are followed by
a number of keywords dictating truncation schemes, 1-4 interaction
treatments and dielectric constants, among others. These specifications
are of the utmost importance for reliable calculations, and deviations
from the default values supplied with the parameter files should be made
with the utmost caution.

File: Parmfile, Node: Multiple, Up: Top, Next: Conversion, Previous: Overview

        Rules for the use of multiple dihedrals in CHARMM22

1) The association of 1 or more dihedrals with different multiplicities
   to a specific dihedral type (as specified by atom types) is specified
   by the presence of 2 or more dihedral parameters in the parameter
   file. When multiple dihedrals are read in the parameter file CHARMM22
   will list those dihedrals.

2) If dihedral angles are AUTOGENERATED, then the RTF should not specify
   them again. Additional dihedrals in the RTF will be ignored and
   warnings given.

3) Without AUTOGENERATE, each dihedral should appear only once in the
   RTF. Multiple listings of a dihedral will be ignored and warnings
   given.

4) The order or position of the dihedral entries associated with a
   specific dihedral is not important; however, it is suggested that they
   be placed sequentially in the parameter file.

5) Wildcards may be used in the parameter file to specify multiple
   dihedrals (i.e. X C1 C2 X); however, all the dihedrals in the
   parameter file associated with that dihedral type must be wildcards.
   Use of wildcards with multiple dihedrals is NOT recommended.

6) Specific dihedral entries always override wildcard entries. For
   example:

        X    C2   C3   X     100.0   1   180.0
        C1   C2   C3   C4    100.0   2   180.0
        X    C2   C3   X     100.0   3   180.0

   will assign the 2-fold term to C1-C2-C3-C4 while 1-fold and 3-fold
   terms would be assigned to C5-C2-C3-C6 and any other dihedral centered
   about the C2-C3 bond.
   This assignment of the multiple terms to a number of dihedrals is why
   the use of wildcards for the specification of multiple dihedrals is
   NOT recommended. The preferred method is as follows:

        C5   C2   C3   C6    100.0   1   180.0
        X    C2   C3   X     100.0   2   180.0
        C5   C2   C3   C6    100.0   3   180.0

   will assign the 1-fold and 3-fold terms to C5-C2-C3-C6 and the 2-fold
   term to C1-C2-C3-C4 and any other dihedral centered about the C2-C3
   bond. This limits the potential for multiple dihedrals being
   mistakenly assigned to a dihedral centered on the C2-C3 bond. Thus, it
   is advised that when creating a multiple dihedral all 4 atom types be
   explicitly stated and, if necessary, new atom types be created to
   avoid conflicts.

7) This design is such that previous CHARMM topology and parameter files
   for proteins are compatible with CHARMM22. However, due to
   complexities in the multiple dihedral setup for the nucleic acid
   sugars (ribose and deoxyribose) the nucleic acid topology and
   parameter files are NOT compatible with CHARMM22. In order to make
   them compatible the following alterations must be performed.
   Alternatively, the altered files may be obtained from Alexander D.
   MacKerell Jr.

File: Parmfile, Node: Conversion, Up: Top, Previous: Multiple, Next: PARMDATA

  Rules for conversion of old nucleic acid rtf and param to CHARMM22 format

ALL-HYDROGEN

Protocol for conversion of all-hydrogen nucleic acid topology and
parameter files (topnah*.inp and parnah*.inp) from a CHARMM21 or previous
format to a format compatible with CHARMM22. This change is due to a new
methodology for the treatment of multiple dihedrals in CHARMM22.

In Topology File (TOPNAH1.INP, TOPNAH1E.INP, TOPNAH1R.INP)

1) Create a new atom type, OSS

2) Convert the atom type of all O4' atoms to OSS

In Parameter File (PARNAH1.INP)

1) Copy all OS parameters (bonds, angles, dihedrals etc.) and in the copy
   change OS to OSS. Be sure that the original OS parameter remains.
   Some OS to OSS copies can be avoided (such as OS P terms); however,
   one must be careful that all the necessary OSS parameters relating to
   O4' are present. Creating extra OSS parameters which are unused is not
   a problem. One exception occurs with the dihedral OS CH CH OS, where
   only one of the terminal OS atoms should be converted to OSS.

2) In the DIHEDRAL (PHI) parameters under the heading "WILMA OLSON SUGAR
   MODEL" the following steps must be performed once all the OSS dihedral
   parameters are created.

   A) In all the explicit OS terms which don't include wildcards (X) or P
      atom types and have both 2 and 3-fold periodicities (2nd of 3
      numbers following the dihedral) the 2nd 3-fold term must be
      commented out with a !.

   B) Of the new explicit OSS terms the following 3-fold terms must be
      commented out with a !.

        OSS  CH   CH   OS    1.4000   3   0.0000
        OH   CH   CH   OSS   1.4000   3   0.0000

Lastly, when generating the structure be sure only the AUTOGENERATE ANGLE
term is used (i.e. do NOT use AUTOGENERATE DIHEDRAL).

At this point the topology and parameter files should be compatible with
CHARMM22 (but not CHARMM21 or a previous version of CHARMM). A test
should be performed on a (deoxy)ribose-containing compound. In this test
the energies should be calculated 1) using CHARMM21 or a previous version
using the original, unmodified topology and parameter files and 2) with
CHARMM22 using the modified OSS-containing topology and parameter files.
These energies should be equivalent.

EXTENDED (UNITED) ATOM

Protocol for the conversion of extended (united, explicit) atom nucleic
acid topology and parameter files from CHARMM21 or previous format to a
format compatible with CHARMM22. This change is due to a new methodology
for the treatment of multiple dihedrals in CHARMM22.

In Topology File (TOPRNA10 or TOPRNA10R)

1) Create 2 new atom types, OSS and OST

2) Convert the atom type of all O4' atoms to OSS except in the patch
   PRES DEOX where it must be changed to atom type OST.
   This conversion to OST must also be performed in any residue, such as
   RESI DRIB, in which deoxyribose is used explicitly.

3) In the patch PRES DEOX add the line:

        ATOM O4'  OST   -0.30   ! (check the charge)

   before the GROUP statement and comment out the terms

        !DELETE DIHE O4' C4' C3' O3' ! WE NEED THIS AS A MULTIPLE TERM IN DEOXY
        !DIHE O4' C4' C3' O3'        ! threefold
        !DIHE O4' C4' C3' O3'        ! twofold

   such that no alterations in the dihedral setup are made.

In Parameter File (PARDNA10.INP)

1) Copy all OS parameters twice (bonds, angles, dihedrals etc.); in the
   first copy change OS to OSS and in the second change OS to OST. Be
   sure that the original OS parameter remains. Some OS to OSS(OST)
   copies can be avoided (such as terms in which OS is adjacent to P);
   however, one must be careful that all the necessary OSS(OST)
   parameters relating to O4' are present. Creating extra OSS(OST)
   parameters which are unused is not a problem. One exception occurs
   with the dihedral OS CH CH OS, where only one of the terminal OS atoms
   should be converted to OSS(OST).

2) In the DIHEDRAL (PHI) parameters under the heading "WILMA OLSON SUGAR
   MODEL" the following steps must be performed once all the OSS(OST)
   dihedral parameters are created.

   A) In all the explicit OS terms which don't include wildcards (X) or P
      atom types and have both 2 and 3-fold periodicities (2nd of 3
      numbers following the dihedral) the 2nd term must be commented out
      with a ! (mostly 3-fold terms and 1 or 2 2-fold terms).

   B) Of the new explicit OSS terms the following 3-fold terms must be
      commented out with a !.

        OSS  CH   CH   OS    1.4000   3   0.0000
        OH   CH   CH   OSS   1.4000   3   0.0000

   C) Maintain all of the OST dihedral terms.

An example of the additions/alterations to pardna10.inp is listed below.
BOND
HO   OSS     450.0000   0.9600
HO   OST     450.0000   0.9600
OSS  CH      292.0000   1.4300
OSS  C2      292.0000   1.4300
OST  CH      292.0000   1.4300
OST  C2      292.0000   1.4300
C3   OSS     292.0000   1.38
C3   OST     292.0000   1.38
C    OSS     292.0000   1.43
C    OST     292.0000   1.43

THETA
OSS  C2   C3      150.5000  111.0000
OSS  C2   CH       70.0000  112.0000
OSS  C2   C2       82.0000  112.0000
OST  C2   C3      150.5000  111.0000
OST  C2   CH       70.0000  112.0000
OST  C2   C2       82.0000  112.0000
C2   CH   OSS      46.5000  111.0000
C2   CH   OST      46.5000  111.0000
C3   CH   OSS      46.5000  111.0000
C3   CH   OST      46.5000  111.0000
CH   CH   OSS      46.5000  111.0000
CH   CH   OST      46.5000  111.0000
OSS  CH   NS       46.5000  111.0000
OSS  CH   NH2E     46.5000  111.0000
OST  CH   NS       46.5000  111.0000
OST  CH   NH2E     46.5000  111.0000
C2   OSS  C2       82.0000  111.5000
CH   OSS  CH       46.5000  111.5000
HO   OSS  CH       46.5000  107.3000
HO   OSS  C2       46.5000  107.3000
C2   OST  C2       82.0000  111.5000
CH   OST  CH       46.5000  111.5000
HO   OST  CH       46.5000  107.3000
HO   OST  C2       46.5000  107.3000
CH   OSS  C3       46.5     107.3
CH   OST  C3       46.5     107.3
C    OSS  C3       46.5     120.5
C    OST  C3       46.5     120.5
O    C    OSS      70.0     120.0
O    C    OST      70.0     120.0
CH   C    OSS      70.0     125.3
NA   C    OSS      70.0     120.0
CH   C    OST      70.0     125.3
NA   C    OST      70.0     120.0
OSS  CH   CS       46.5     111.0
OST  CH   CS       46.5     111.0

PHI
X    CH   OSS  X       0.9000   3     0.0000
X    CH   OST  X       0.9000   3     0.0000
X    C2   OSS  X       0.5000   3     0.0000
X    C2   OST  X       0.5000   3     0.0000
! OSS SUGAR TERMS
OSS  CH   CH   OS      0.5000   2     0.0000
!OSS  CH   CH   OS      1.4000   3     0.0000   Should be commented out
OH   CH   CH   OSS     0.5000   2     0.0000
!OH   CH   CH   OSS     1.4000   3     0.0000   Should be commented out
OSS  CH   CH   CH      0.5000   2     0.0000
OSS  CH   CH   CH      1.4000   3     0.0000
OSS  CH   C2   CH      1.0000   2     0.0000
OSS  CH   C2   CH      1.4000   3     0.0000
OSS  CH   CH   C2      1.4000   3     0.0000
OSS  CH   CH   C2      0.5000   2     0.0000
OSS  C2   C2   C2      1.4      3     0.0
OSS  C2   C2   C2      0.5      2     0.0
! OST SUGAR TERMS
OST  CH   CH   OS      0.5000   2     0.0000
OST  CH   CH   OS      1.4000   3     0.0000
OH   CH   CH   OST     0.5000   2     0.0000
OH   CH   CH   OST     1.4000   3     0.0000
OST  CH   CH   CH      0.5000   2     0.0000
OST  CH   CH   CH      1.4000   3     0.0000
OST  CH   C2   CH      1.0000   2     0.0000
OST  CH   C2   CH      1.4000   3     0.0000
OST  CH   CH   C2      1.4000   3     0.0000
OST  CH   CH   C2      0.5000   2     0.0000
OST  C2   C2   C2      1.4      3     0.0
OST  C2   C2   C2      0.5      2     0.0
! additional terms for tRNA
OSS  CH   CS   CF      1.5      3     0.0
OST  CH   CS   CF      1.5      3     0.0
C2   CH   C    OSS     1.5      3     0.0
C2   CH   C    OST     1.5      3     0.0
X    C    OSS  X       1.8      2   180.00
X    C    OST  X       1.8      2   180.00
! THE FOLLOWING TERMS UNDER THE HEADER
! "WILMA OLSON SUGAR MODEL":
! SHOULD BE COMMENTED OUT
!OS   CH   CH   OS      1.4000   3     0.0000
!OS   CH   CH   CH      1.4000   3     0.0000
!OH   CH   CH   OS      1.4000   3     0.0000
!OS   CH   C2   CH      1.4000   3     0.0000
!OS   CH   CH   C2      0.5000   2     0.0000
!OS   C2   C2   C2      0.5      2     0.0

IMPHI
OSS  X    X    CH     31.5000   0    35.2600
OST  X    X    CH     31.5000   0    35.2600
CH   OSS  C2   NS     31.5000   0    35.2600
CH   OSS  CH   NS     31.5000   0    35.2600
CH   OSS  C2   NH2E   31.5000   0    35.2600
CH   OSS  CH   NH2E   31.5000   0    35.2600
CH   OST  C2   NS     31.5000   0    35.2600
CH   OST  CH   NS     31.5000   0    35.2600
CH   OST  C2   NH2E   31.5000   0    35.2600
CH   OST  CH   NH2E   31.5000   0    35.2600

NBONDED
OSS   0.64   7.0   1.6
OST   0.64   7.0   1.6

Lastly, when generating the structure be sure only the AUTOGENERATE ANGLE
term is used (i.e. do NOT use AUTOGENERATE DIHEDRAL).

At this point the topology and parameter files should be compatible with
CHARMM22 (but not CHARMM21 or a previous version of CHARMM). A test
should be performed on a (deoxy)ribose-containing compound. In this test
the energies should be calculated 1) using CHARMM21 or a previous version
using the original, unmodified topology and parameter files and 2) with
CHARMM22 using the modified OSS-containing topology and parameter files.
These energies should be equivalent.

File: Parmfile, Node: PARMDATA, Up: Top, Previous: Conversion, Next: Top

      Description of Parameter Files available for general use.

Currently parameter files for both proteins and nucleic acids exist using
both all-hydrogen and extended atom representations. The most recent
parameter files are the all-hydrogen sets for proteins,
PAR_ALL22_PROT.INP, and for nucleic acids, PAR_ALL22_NA.INP. The extended
atom protein, PARAM19.INP, and nucleic acid, PARDNA10_22.INP, topologies
are from the Version 19 release.
Alterations in the dihedrals have been made to PARDNA10_22.INP to
reproduce the original results; however, due to the small increase in the
number of atoms upon going from the extended to all-hydrogen
representations in nucleic acids it is recommended that PAR_ALL22_NA.INP
be used. Hopefully, in the not too distant future, new extended atom
parameter files will be developed based on the all-hydrogen files.

References for the all-hydrogen protein and nucleic acid parameter files
are

Proteins: MacKerell, A. D. Jr. et al., Manuscript in Preparation.

Nucleic Acids: MacKerell, A. D. Jr., Wiorkiewicz, J. and Karplus, M.,
Manuscript in Preparation.

It is anticipated that the above will be submitted to J. Am. Chem. Soc.
There will also be a series of papers on the parameterization of various
residue types.

C DEC/CMS REPLACEMENT HISTORY, Element PDETAIL.DOC
C *3     5-JAN-1992 14:47:44 WON "IC pert documented by CLB"
C *2     6-MAY-1991 17:41:53 WON "Info directive fixed"
C *1    12-JUL-1990 10:54:54 KOTTALAM "Charlie's pert documentation"
C DEC/CMS REPLACEMENT HISTORY, Element PDETAIL.DOC

File: PDETAIL, Node: Top, Up: (doc/perturb.doc), Next: Introduction

            Details about TSM Free Energy Calculations

* Menu:

* Introduction::            What will be covered.
* Theory and Methodology::  General discussion.
* Practice::                How to do it.

File: PDETAIL, Node: Introduction, Up: Top, Next: Theory and Methodology, Previous: Top

                          Introduction

For a good overview of free energy simulation methods, the following
references are suggested: M. Mezei and D. L. Beveridge, in Annals of the
New York Academy of Sciences, chapter titled "Free Energy Simulations",
482 (1986) 1; T. P. Straatsma, PhD dissertation, "Free Energy Evaluation
by Molecular Dynamics Simulations", University of Groningen, Netherlands
(1987); S. H. Fleischman and C. L. Brooks III, "Thermodynamics of Aqueous
Solvation: Solution Properties of Alcohols and Alkanes", J. Chem. Phys.,
87, (1987) p. 3029; D. J. Tobias and C. L. Brooks III, J. Chem.
Phys., 89, (1988) 5115-5127; and D.J. Tobias, "The Formation and
Stability of Protein Folding Initiation Structures", Ph.D. dissertation,
Carnegie Mellon University (1991).

In the previous nodes we have generally referred to this area of
molecular simulation as a "perturbation" theory. In fact, none of the
techniques used are perturbation methods. The relationships used for
computing the relative free energy differences are all exact in the
statistical mechanical sense. The use of the term perturbation in this
context arises from the fact that in the pre-number crunching
supercomputer days, various series expansions were derived from these
equations and were in fact perturbation theories. The name thermodynamic
integration might be used; however, common practice has been to apply it
to only one particular formulation (and furthermore not put that under
the rubric of thermodynamic perturbation). Finally, the use of the name
"free energy simulations" is another misnomer for two reasons: 1) we can
calculate the temperature derivative thermodynamic properties as well
(Delta E and Delta S) and 2) the one thing we can't get is absolute free
energies (as van Gunsteren has pointed out, Mother Nature doesn't
integrate all over phase space either). In fact, we generally are limited
to calculating relative changes in free energies, i.e. Delta Delta A's.

In thermodynamic perturbation theory, a system with the potential energy
function U0 is perturbed to one with the potential function U1, and the
resulting free energy difference is calculated as

     A1 - A0 = -kT ln < exp[ -(1/kT)*(U1 - U0) ] >

where k is Boltzmann's constant, T is temperature (degrees K), and A0 and
A1 are the excess Helmholtz free energies of systems 0 and 1,
respectively.
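
As an illustration of the formula above, the following Python sketch
evaluates A1 - A0 from a list of sampled energy differences U1 - U0. It
is not part of CHARMM; the function name fep_delta_a and the constant
K_BOLTZ (Boltzmann's constant in kcal/(mol K)) are assumptions made for
this example. The exponentials are shifted by their largest exponent
before averaging, which leaves the result unchanged but avoids overflow.

```python
import math

# Assumed unit convention for this sketch: energies in kcal/mol.
K_BOLTZ = 0.0019872041  # kcal/(mol K)


def fep_delta_a(delta_u, kT):
    """Return -kT ln < exp[-(U1 - U0)/kT] > for samples of U1 - U0."""
    if not delta_u:
        raise ValueError("need at least one energy difference")
    exponents = [-du / kT for du in delta_u]
    # Factor out the largest exponent so that exp() cannot overflow:
    # ln(mean(exp(e))) = shift + ln(mean(exp(e - shift))).
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kT * (shift + math.log(mean_exp))
```

For example, fep_delta_a(samples, K_BOLTZ * 300.0) would treat samples as
energy differences in kcal/mol collected at 300 K in the ensemble of
system 0.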
Two methods of thermodynamic perturbation are implemented in CHARMM:
1) Chemical perturbation, where the perturbation being considered is a
change in the system's potential function parameters and topology, e.g.,
CH3OH is "mutated" to CH3CH3, and 2) Internal coordinate perturbation,
where the perturbation represents a variation in configuration, and the
potential function remains the same for the perturbed and unperturbed
systems. Each of these is discussed separately below.

File: PDETAIL, Node: Theory and Methodology, Up: Top, Next: Practice, Previous: Introduction

                     THEORY AND METHODOLOGY

* Menu:

* Chemical::    Chemical Perturbation Theory and Methodology
* Internal::    Internal Coordinate Perturbation Theory and Methodology
* References::  Some References on Thermodynamic Perturbation

File: PDETAIL, Node: Chemical, Up: Theory and Methodology, Next: Internal, Previous: Theory and Methodology

                     Chemical Perturbation

If you have read either (Fleischman and Brooks, 1987) or (Straatsma,
1987) or any of the McCammon or Kollman perturbation (oops! that word
again) papers, then you have seen the standard spiel on why getting Delta
A's (or Delta G's) of solvation or drug/enzyme binding, among other
processes, is so difficult, and that if one is satisfied with relative
changes in free energies it is computationally more tractable to
"trans-mutate" various parts of a system in a way that is usually
physically unreasonable but computationally feasible and
thermodynamically equivalent to that obtained from the physical process.
Read some of the aforementioned references if this doesn't ring a bell.

So that's what we are doing - calculating relative changes in free
energies (Delta Delta A) for solvation and small molecule/enzyme binding,
among other things. In the rest of this node, we will discuss a little
bit of the theory (you're better off reading the papers) and a lot about
the actual how-to-do-it in our implementation.
Subsequent nodes discuss the actual implementation and some issues to
consider when attempting this type of calculation.

The Hamiltonian

There are three basic techniques for calculating relative changes in free
energy and their temperature derivative properties: 1) the so-called
"perturbation" approach, 2) "Thermodynamic Integration" (TI), and 3) the
somewhat dubious "slow-growth" technique (which is actually a step-child
of the TI method). In all of the methods we use a hybrid Hamiltonian,

     H(lambda) = H_o + (1 - lambda)^N H_R + lambda^N H_P

where:

     H_o    = "Environment" part of the Hamiltonian
     H_R    = "Reactant" part of the Hamiltonian
     H_P    = "Product" part of the Hamiltonian
     lambda = coupling parameter (extent of transformation)
     N      = integer exponent

The various terms will be explained shortly. First, a bit about our
Weltanschauung regarding free energy simulations. The system is divided
into four sets of atoms: 1) the reactant atoms, 2) the product atoms,
3) the colo atoms and 4) the environment atoms. The reactant and product
atoms are those that are actually being changed. The colo atoms (short
for co-located charge) are those in which only the charge changes in
going from reactant to product. The environment atoms are the rest of the
system (e.g. solvent; parts of a molecule common to both reactant and
product). The reactant and product designations are arbitrary and are
used as a convention to denote the direction in which we are mutating
(i.e. start with reactant, end up with product).

A simple example, taken from (Fleischman and Brooks, 1987), is the
calculation of the relative change in the solvation free energies of
methanol and ethane. This one has been done by virtually everybody that
has written a free energy simulation code. The system is represented by
the water molecules (we used a box of 125 in our study) and the hybrid
methanol/ethane system, with aliphatic methyl groups represented as
extended atoms.
               O1--H1
              /              H
             /              /
  H        C1--C2          O
   \                        \
    O                        H
   /
  H

Using the depiction above, for the transformation of methanol -> ethane
the reactant atoms are the hybrid's O1 and H1; the product atom is the
hybrid's C2 methyl group. The hybrid molecule's C1 methyl group changes
charge as one goes from reactant to product. This is a colo atom; in
going from methanol -> ethane it starts with the methanol methyl group
charge and ends up (at lambda = 1.) with the ethane methyl group charge.
Otherwise, C1 is considered an environment atom. The atoms of the water
molecules constitute the actual environment atoms in this system. If the
hybrid molecule was larger it too could contain environment atoms.

All potential energy terms involving the reactant atoms, as well as the
electrostatic interactions involving colo atoms with their reactant
charges, go into H_R. The kinetic energies of the reactant atoms also are
included in this term. Similarly, the potential energy terms involving
product atoms and the colo product charge electrostatic interactions,
along with the kinetic energies of the product atoms, go into H_P. The
rest of the energy terms are incorporated into H_o.

Note that for a potential energy term to be included in, say, H_R, only
one atom in the given interaction has to be a reactant atom (or, in the
case of an electrostatic interaction, a colo atom). Similarly for product
terms. Electrostatic terms containing colo atoms are effectively
calculated twice, once with the reactant charges and then again with the
product charges. Terms between colo reactant charges and reactant atoms
are avoided and similarly for product atoms. Actually, the programming
details are a bit more complicated than that and if interested see
*NOTE implementation: (pimplem). The outcome is the same as just
described.
It is assumed that when the hybrid molecule is constructed in the residue
topology file, there are no internal coordinate energy terms involving
reactant and product atoms. As yet no checking is done in the program.
Similarly, it is assumed that non-bonded exclusions have been specified
between reactant and product atoms.

In our implementation, the Hamiltonian is constructed exactly as
specified in the equation above. In many papers, that particular form of
the Hamiltonian is given in the theoretical section (or more likely, the
form with N=1, i.e. linear) and in the actual implementation the lambda
dependence of the Hamiltonian is quite a bit more complex. This is done
in those implementations where the force constants and other parameters
in the energy terms are factored by lambda rather than calculating
various energy terms and factoring them. In a statistical mechanical
sense there is no particular reason that forces one to factor the
Hamiltonian consistently like we do. The thermodynamics holds regardless
of path and the equipartition theorem for obtaining the kinetic energy
works just as well (though in other implementations it appears that the
factoring of the kinetic energy is ignored anyway).

However, we feel that there are certain advantages to doing it this way.
First, there is a certain conceptual simplicity in factoring the
Hamiltonian consistently for reactant and product terms en masse. Second,
it makes obtaining the derivatives of the Hamiltonian with respect to
lambda, dH(lambda)/d lambda, programmatically simple. These derivatives
are needed in the TI and the related slow growth methods. Actually, the
current algorithm for the slow growth method in our implementation uses
finite differences for the derivative, as do Kollman and van Gunsteren.
This could easily be changed. Factoring the energy terms rather than
functional parameters permits a more modular design and makes
incorporating changes by others to energy function terms easier.
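
The remark that factoring whole energy terms makes the lambda derivative
simple can be seen in a few lines of Python. This is a sketch of the
hybrid Hamiltonian H(lambda) = H_o + (1 - lambda)^N H_R + lambda^N H_P
given earlier in this node, not CHARMM code; the function names and the
assumption that H_o, H_R and H_P are already available as plain numbers
for one configuration are made up for the illustration.

```python
def hybrid_energy(h_env, h_reac, h_prod, lam, n=1):
    """H(lambda) = H_o + (1 - lambda)^N * H_R + lambda^N * H_P."""
    return h_env + (1.0 - lam) ** n * h_reac + lam ** n * h_prod


def hybrid_energy_dlambda(h_reac, h_prod, lam, n=1):
    """Analytic dH/dlambda; the environment term does not depend on lambda."""
    # d/dlambda (1 - lambda)^N = -N (1 - lambda)^(N-1)
    # d/dlambda lambda^N       =  N lambda^(N-1)
    return (-n * (1.0 - lam) ** (n - 1) * h_reac
            + n * lam ** (n - 1) * h_prod)
```

Because lambda multiplies complete energy terms, the derivative needs
only the same H_R and H_P values that were computed for the energy
itself; no parameter-level derivatives of the individual energy functions
are required.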
The Free Energy Equations

As we said, there are two (maybe three, depending how you count it)
different ways that we obtain the free energy changes. For the
thermodynamic integration method (TI) the following expression is used:

     Delta A = Integral[0 -> 1] < dH(lambda)/d lambda >_lambda d lambda

Expressions for energy and entropy changes can be derived from this
equation (Mezei and Beveridge, 1986) and have been incorporated into our
program. They suffer from very high uncertainties due to the presence of
ensemble averages over the total energy which are then multiplied by
ensemble averages over dH/d lambda. One is apparently better off getting
average energies at the endpoints and subtracting.

The method we have used the most is the thermodynamic perturbation
technique. For this, the free energy change is given as follows:

     Delta A(lambda -> lambda') =
          -kT ln < exp[ -(1/kT)*(V(lambda') - V(lambda)) ] >_lambda

where V(lambda) is the hybrid potential energy evaluated at the given
value of the coupling parameter. Notice that all of the averages are at
lambda. To get the total delta A,

     Delta A = Sum_i Delta A(lambda_i -> lambda'_i)

the pieces are added up. The user must ensure that the whole lambda range
is covered. For example, in the methanol -> ethane calculation, we ran
dynamics at 3 points: lambda = .125, .5 and .875. To cover the range we
calculated delta A's as follows:

     lambda'    lambda    lambda'
      0.000      0.125     0.250
      0.250      0.500     0.750
      0.750      0.875     1.000

I.e., for each lambda at which dynamics were run two delta A's were
calculated, one lower and one higher than the corresponding lambda. This
has been termed "double-wide" sampling. Note that the pieces all join up.

In our implementation we have the capability of calculating the
temperature derivative related thermodynamic properties, delta E and
delta S. This is effected by the use of equations derived by Brooks. See
Fleischman and Brooks, 1987 for the corrected set of equations.
They use a finite difference approximation to the derivatives that avoids the necessity of taking the differences of large averages that would result from using the explicit temperature derivatives.

With both of the aforementioned methods the technique for accomplishing the simulations is called the "window" procedure. In these methods simulations are run at a discrete number of lambda points (we generally use 3 - 6 and long trajectories; other workers use up to 100 lambda points and very short trajectories). In the case of thermodynamic perturbation the total free energy change is pieced together from perturbations done with each "window". In the case of thermodynamic integration the integration is done by a quadrature method. In our implementation, we fit the ensemble average as a function of lambda to a cubic spline polynomial and then integrate the polynomial analytically. No extrapolation to endpoints is done. So if you start at lambda = .125 and end at lambda = .875 (like we do) you can use thermodynamic perturbation to get the end points (.125 -> 0 and .875 -> 1.) and TI for the middle.

An alternative sampling method is termed "slow growth". It is more or less an approximation to the thermodynamic integration method. In this case, instead of lambda being a constant for a given trajectory (as in the window method), the parameter varies monotonically with each time step.

            n steps
             ----
             \
    /\ A  =  /     H(lambda  + delta lambda) - H(lambda )
    --       ----          i                           i
               i

    and    lambda  = lambda    + delta lambda
                 i         i-1

where H is the Hamiltonian and delta lambda = 1/nstep.

File: PDETAIL, Node: Internal, Up: Theory and Methodology, Next: References, Previous: Chemical

                  Internal Coordinate Perturbation

According to the thermodynamic perturbation (TP) theory, the Helmholtz free energy difference, A1 - A0, between system 0, in which the conformational coordinate of interest (e.g.
an internal coordinate) is equal to x, and another system, in which the coordinate has been "perturbed" by the amount dx, is given by the equation

    A1 - A0 = A(x + dx) - A(x) = -kT ln < exp [ -(1/kT)*(U(x + dx) - U(x)) ] >      (1)
                                                                               x

where U is the potential energy, k is Boltzmann's constant, and T is the temperature (degrees K). The <...>x notation denotes a canonical ensemble average over the "reference" ensemble in which the coordinate is equal to x. Although the potential energy may depend on many degrees of freedom, for the sake of simplicity we have only explicitly indicated its dependence on x.

If we assume that the ergodic hypothesis holds, we can equate the ensemble average appearing in equation (1) to the time average computed from an MD simulation, e.g.

    < exp [ -(1/kT)*(U(x+dx) - U(x)) ] >  = (1/N) * Sum { exp [ -(1/kT)*(Ui(x+dx) - Ui(x)) ] }     (2)
                                        x           1->N

where Ui is the value of the potential energy at the ith timestep and N is the number of timesteps in the simulation. Since the average is over the reference ensemble, we must constrain the system so the value of the coordinate of interest is x at each step of the simulation. In other words, we must impose the holonomic constraint

    sigma = x(t) - x(0) = 0                                                   (3)

during the integration of the equations of motion. The conformational coordinate may correspond to a set of internal coordinates. In that case, equation (3) implies a set of holonomic constraints.

In addition to enforcing the conformational constraint, we need to carry out the perturbation (x -> x + dx), calculate the potential energy difference, U(x + dx) - U(x), and restore the constraint at each step of the simulation.
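As a worked illustration of the time-average estimator in equation (2), the following Python sketch turns a list of sampled energy differences, Ui(x+dx) - Ui(x), into a free energy difference. The function name, the kcal/mol units, and the shift by the smallest sample (a common numerical-stability device) are assumptions of this sketch, not a description of the CHARMM code.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); units assumed here


def tp_free_energy(delta_u, temperature):
    """Estimate A(x+dx) - A(x) = -kT ln[(1/N) Sum exp(-dU_i/kT)].

    delta_u holds one value of U_i(x+dx) - U_i(x) per sampled timestep.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    kt = KB * temperature
    # Factor exp(-shift/kT) out of the average so the exponentials stay small.
    shift = min(delta_u)
    mean_exp = sum(math.exp(-(du - shift) / kt) for du in delta_u) / len(delta_u)
    return shift - kt * math.log(mean_exp)
```

If every sample has the same energy difference, the estimate reduces to that value; with scattered samples the result never exceeds the plain average of the differences.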
With the above considerations, the following pseudo-computer code illustrates schematically the implementation of the TP method into an MD simulation:

    set up dynamics; specify constraint and perturbation
    do i = 1,N
       compute potential energy and forces
       take unconstrained dynamics step
       satisfy constraints
       perform perturbation
       compute potential energy
       restore constraints
    end do
    compute averages and thermodynamics

A detailed description of an algorithm for satisfying internal coordinate constraints is given in (Tobias, 1991). We concentrate here on the tasks of specifying and performing the perturbation, and computing the difference in the potential energy of the perturbed and reference systems.

The specification of the perturbation consists of identifying the degree(s) of freedom to be perturbed and the atoms whose positions change as a result of the perturbation. Our implementation allows for perturbations of distances, angles, and torsions between groups of atoms. For example, we may use a distance perturbation to study the breakup of a salt-bridge (ion pair) formed by the sidechains of lysine and glutamic acid, where x might be the distance between the N atom in the lysine sidechain and the carboxyl C atom in the glutamic acid sidechain, and the perturbation would consist of moving the entire glutamic acid residue. Alternatively, we could use an angle perturbation to study the angular dependence of the strength of a hydrogen bond between two amides, where x is the O...H-N angle, and the perturbation moves the entire hydrogen bond donor molecule. Or, we could use a torsional perturbation to study the trans-gauche isomerization in butane, where x is the dihedral angle for methyl group rotation about the central C-C bond, and the perturbation moves a terminal methyl group and the hydrogen atoms on the adjacent methylene group.
In addition to simple perturbations of a single internal coordinate, we can define more complicated perturbations involving more than one internal coordinate in order to study correlated conformational transitions. For example, we could combine perturbations of the phi and psi dihedral angles to study backbone conformational equilibria in peptides (see *Note implementation: (pimplem).).

The procedure for carrying out internal coordinate perturbations during molecular dynamics simulations may be summarized as follows: after choosing an internal coordinate to perturb, and deciding which atoms will be moved by the perturbation, we compute a Cartesian displacement vector which changes the internal coordinate by a specified amount, and add the displacement vector to the positions of the atoms to be moved. Thus, in our implementation, the perturbation can be described as a rigid body movement of the perturbed atoms relative to the unperturbed atoms.

Once we have moved all of the atoms involved in the perturbation, we need to compute the potential energy difference, delta U = U1 - U0 = U(x + dx) - U(x). To do this, we could compute U(x), carry out the perturbation and compute U(x + dx), and simply take the difference. However, this direct route is computationally inefficient, because interaction energies between atoms which are not moved by the perturbation are unnecessarily recomputed. To minimize the computational effort required to compute delta U, we only consider the interactions which change as a result of the perturbation. For this purpose, we partition the system into two parts: the atoms which are moved by the perturbation (denoted by "s" for "solute"), and those which are not (denoted by "b" for "bath"). With this partitioning, we can write the potential energy as a sum of three contributions:

    U(x) = Uss(x) + Usb(x) + Ubb(x),                                          (4)

where Uss, Usb, and Ubb are the solute-solute, solute-bath, and bath-bath interaction energies, respectively.
Clearly, Ubb(x + dx) - Ubb(x) = 0 since the positions of the bath atoms are not changed by the perturbation. Thus,

    U1 - U0 = Uss(x + dx) - Uss(x) + Usb(x + dx) - Usb(x).                    (5)

Since, in typical applications, the number of solute atoms is much smaller than the number of bath atoms, equation (5) represents a large reduction in computational effort over the direct route.

Before we proceed, we point out that when we need U(x + dx) in addition to delta U (e.g. for computation of conformational entropies using finite difference temperature derivatives of the TP free energy (see *Note description: (perturb).)), we can use the following expression:

    U(x + dx) = Uss(x + dx) + Usb(x + dx) + Ubb(x)
              = Uss(x + dx) + Usb(x + dx) + U(x) - Uss(x) - Usb(x),           (6)

since Ubb(x + dx) = Ubb(x). We assume that U(x) is computed when the forces required for the propagation of the dynamics are computed. Thus, we still only need to compute the changes in the solute-solute and solute-bath interaction energies which result from the perturbation.

In general, when we have a choice, we partition the system so that the solute consists of the smallest possible number of atoms. There are two good reasons for this. First, smaller solute partitions require less effort to compute the interaction energies. Second, with smaller solute partitions, there is less of a chance that the solute atoms will "run into" bath atoms as a result of the perturbation. When solute and bath atoms run into one another, there is a large, positive van der Waals contribution to delta U. This is undesirable because large delta U values lead to poorer convergence of the average in equation (2). The partitioning of the system is especially important when the perturbations are carried out in "crowded" environments, such as in solution or in the interior of a protein.

In some cases it is useful to divide the solute partition into two sections, and accomplish the desired perturbation by moving each section by half the perturbation.
For example, to perturb the dihedral angle in butane by dx, we could include both methyl carbons and all of the hydrogens in the solute partition, with the C1 methyl group and C2 methylene hydrogens in one section, and the C4 methyl group and C3 hydrogens in the other section, and move each section by dx/2. This "double move" strategy is useful when the perturbation is carried out on a small molecule in a crowded environment where the movement of n atoms by dx/2 is more favorable than the movement of approximately n/2 atoms by dx. The option to perform perturbations in this fashion is available in our implementation.

In principle, we could get the free energy difference between any two conformations, x0 and x1, in a single simulation using the TP theory expression:

    delta A = A1 - A0 = A(x1) - A(x0)
            = -kT ln < exp [ -(1/kT)*(U(x1) - U(x0)) ] >   .                  (7)
                                                        x0

However, in practice, for typical simulation lengths, the average in equation (7) exhibits acceptable convergence only when delta A <= 2kT (Beveridge & DiCapua, 1989). Thus, if the free energy difference between the conformations x0 and x1, or the free energy barrier separating them, is more than about 2kT, then a single simulation is not sufficient to determine accurately the free energy difference. This problem is circumvented by breaking up the range of the coordinate, x1 - x0, into n segments or "windows", y(i),

    dy = (x1 - x0)/(n + 1);   y(i) = x0 + (i - 1) dy;   i = 1,...,n,          (8)

and running a series of n simulations where the free energy differences,

    delta A(i) = A(y(i+1)) - A(y(i))
               = -kT ln < exp [ -(1/kT)*(U(y(i+1)) - U(y(i))) ] >             (9)
                                                                 y(i)

are computed. Then the free energy difference between x1 and x0 is obtained by summing the results from the n windows, e.g.

              x1
    delta A = Integral (p(delta A(y))/py) dy
              x0

            = Sum  delta A(i),                                                (10)
              1->n

where p(z) denotes the partial derivative of z.
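The bookkeeping behind equation (10) can be sketched in a few lines of Python: the window results are accumulated into a running free energy profile, and windows whose change exceeds the 2kT guideline are flagged. The function names, the kcal/mol units, and the use of the absolute value in the check are assumptions made for this illustration only.

```python
from itertools import accumulate

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); units assumed here


def free_energy_profile(window_delta_a):
    """Running sums of the window results: entry i is A(y(i+1)) - A(y(1))."""
    return list(accumulate(window_delta_a))


def windows_exceeding_2kt(window_delta_a, temperature):
    """Return 1-based indices of windows with |delta A(i)| above 2kT."""
    limit = 2.0 * KB * temperature
    return [i for i, da in enumerate(window_delta_a, start=1) if abs(da) > limit]
```

The last entry of the profile is the total delta A of equation (10); any flagged window is a candidate for a smaller dy.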
Aside from yielding more accurate free energy differences, the window method is attractive because it allows us to map out the free energy surface as a function of the conformational coordinate.

By far the most time consuming task in a molecular dynamics simulation is the evaluation of the forces necessary to propagate the equations of motion. The additional work required for computing the interaction energies needed for the TP free energy differences is relatively small. Thus, it is advantageous to get more than one free energy difference from a single simulation. This is the motivation for using the so-called "double-wide" sampling method (Beveridge & DiCapua, 1989), where the free energy differences A(y + dy) - A(y) and A(y - dy) - A(y) are obtained in one simulation. Furthermore, we can divide dy into m subintervals, dy(m) = dy/m, and compute 2m free energy differences,

    +/- delta A(i,k) = A(y(i) +/- k dy(m)) - A(y(i))
                     = -kT ln < exp[ -(1/kT)*(U(y(i) +/- k dy(m)) - U(y(i))) ] >
                                                                                y(i)
                       k = 1,...,m;                                           (11)

over the range y(i) - dy <= y <= y(i) + dy from a single simulation with x = y(i). Then we sum the free energy differences from the various subintervals (in analogy with equation (10)) to get a free energy surface for each window. This "double-wide, multiple-point" window method allows a higher resolution mapping of the free energy surface with little additional computational effort.

Let us now comment on how dy is chosen. As we have already said, dy should be chosen so that the free energy change in a given window is not more than a couple of kT. In addition, the shape of the free energy surface in a given window can be used to determine a good choice for dy. A reasonable choice for a given system can be made by considering results from short simulations with a modest dy and several subintervals at a couple of values of y in the range of interest. In general, for perturbations in crowded environments (e.g.
in solution or the interior of a protein), excessively large values of dy always result in positive free energy differences. This is because the perturbation results in repulsive van der Waals interactions of the atoms in the solute partition with those in the bath partition. The value of dy where the free energy difference begins to sharply increase can then be regarded as the upper bound on acceptable dy values. Of course, it is possible that the underlying free energy surface really does rise sharply beyond the second subinterval in both directions. That is why we suggest running another test at a different value of y. In addition to running short test calculations, it is also useful to consult previous work to get a preliminary estimate for an acceptable size of a perturbation in a similar system (for several examples, see (Tobias, 1991)). In our implementation, the information needed to calculate conformational thermodynamics (free energies, internal energies, entropies, average interaction energies), and their associated statistical uncertainties, is written to a datafile during a simulation. The data file is subsequently "post-processed" to yield the quantities of interest. The alternative approach is to calculate the average properties of interest as the simulation progresses, and simply write out the final results at the end of the simulation. The latter approach has the advantage that large, cumbersome data files do not need to be saved on a mass-storage device (e.g. disk or tape). However, we prefer the post-processing approach because of the flexibility it gives us in the analysis of the data. For example, we can: examine the time evolution, and hence the "convergence", of the average properties; carry out the averaging on an arbitrary amount of the data; compare various protocols for computing the statistical uncertainties or finite-difference temperature derivatives, etc. We use the method of block averages (a.k.a. 
batch averages) to compute the average properties and their uncertainties (Wood, 1968). In this method, the total number of samples, N, is divided into m "batches" of n samples (mn = N), and the average of the property of interest, < O >i, is computed for each batch i:

    < O >i = (1/n) Sum   O(k,i),                                              (12)
                 k=1->n

where O(k,i) is the kth observation of O in the ith batch. The average of the N samples, < O >, is simply the average of the batch averages:

    < O > = (1/m) Sum   < O >i ;                                              (13)
                i=1->m

and the "uncertainty", std, is estimated from the standard deviation in the batch averages:

    std = ( Sum   [ (< O >i - < O >)**2 / m(m - 1) ] )**1/2                   (14)
          i=1->m

We use equation (14) to compute the uncertainty in the average of the exponential in equation (1). Then we obtain the uncertainty in the free energy (and other thermodynamic functions) by error propagation (Young, 1962), e.g.

    std(delta A)**2 = (p(delta A)/p(z))**2 (std(z))**2 = (kT*std(z)/z)**2     (15)

where z is the average of the exponential in equation (1).

In order for the uncertainty given by equation (14) to be a good estimate of the "true" uncertainty (e.g. in a large number of random samples), the block size must be chosen so that the block averages are uncorrelated (randomly distributed), and the number of blocks is not too small for the evaluation of a meaningful standard deviation. The block size n is typically chosen arbitrarily and possible correlations in the data are ignored. More refined uncertainties can be obtained by considering the actual correlation of the data determined explicitly from the autocorrelation function (Straatsma, et al., 1986). However, we presently have no facility for carrying out the correlation function analysis.

File: PDETAIL, Node: References, Up: Theory and Methodology, Next: Theory and Methodology, Previous: Internal

                  References

Beveridge, D. L. & DiCapua, F. M. (1989), in "Computer Simulations of Biomolecular Systems", eds. van Gunsteren, W. F. & Weiner, P. K. (Escom, Leiden).

Straatsma, T. P. (1987).
"Free Energy Evaluation by Molecular Dynamics Simulations" (Ph.D. dissertation, Department of Physical Chemistry, University of Groningen). Tobias, D. J. (1991). "The Formation and Stability of Protein Folding Initiation Structures" (Ph.D. Dissertation, Department of Chemistry, Carnegie Mellon University). Wood, W. W. (1968), in "Physics of Simple Liquids", eds. Rowlinson, J. S. & Rushbrooke, G. S. (North-Holland, Amsterdam). Young, H. D. (1962). "Statistical Treatment of Experimental Data" (McGraw-Hill, New York).  File: PDETAIL, Node: Practice, Up: Top, Previous: Theory and Methodology, Next: Top Practice In this node we tell you how to actually set up and run free energy simulations. The calculation is done in three steps. The first two steps occur in the same input file - perturbation set up and running the dynamics. The last step, the post-processing, is generally done with a separate input file since the output of several trajectories are usually used. To set up the free energy simulation dynamics input file you start with the usual set up for a dynamics run: psf, coordinates, image input or stochastic boundary condition input etc.. In addition you have to issue free energy simulation (FES) set up commands. Currently the set up input is initiated by the TSM command (*Note syntax: (perturb).) (*Note description: (perturb).). For chemical perturbations, these com- mands define the reactant, product and colo lists; the type of simulation: slow growth or window procedure (both the thermodynamic perturbation and the thermodynamic integration methods can be done with the window proce- dure). For internal coordinate perturbations, the setup commands define the internal coordinate(s) to be perturbed, the set of atoms moved by the perturbation, and how and where the thermodynamic results will be written. 
* Menu:

* CPrac::     Chemical Perturbation Practice
* IPrac::     Internal Coordinate Perturbation Practice

File: PDETAIL, Node: CPrac, Up: Practice, Previous: Practice, Next: IPrac

                  CHEMICAL PERTURBATION - PRACTICE

As currently configured, most of the minimization routines will work using the hybrid V(lambda) potential. We generally do any minimization prior to dynamics with the hybrid molecule unperturbed since we are really concerned with removing bad contacts. It is not guaranteed that V(lambda) will remain available to the minimization routines in the future.

After the FES set up has been entered, flags have been set in the program and data structures created, and dynamics can be run with no changes to the commands used in any other dynamics run. One will normally run some thermalization runs with the data being discarded. For a thermalization run the SAVE command in the FES set up is generally not used. For production runs for TI or Thermodynamic Perturbation (TP) the SAVE option must be issued in the FES set up input. This will result in the output of V(R), V(P) and lambda, among other things, in a formatted file. All this will be discussed below with examples.

* Menu:

* SetUp::     Setting Up the FES Simulation and Running Dynamics
* PostD::     Post-processing the Data
* Optional::  Using Some Optional FES Set Up Commands

File: PDETAIL, Node: SetUp, Up: CPrac, Previous: CPrac, Next: PostD

                  Setting Up the FES Simulation and Running Dynamics

Below is a fragment of the input file for setting up the thermalization of the ethanol -> propane hybrid. Windowing will be used and we can decide at the end whether to post-process the output using TI or TP. Using the representation below the system is partitioned as follows:

               O1--H1
              /
             /
        C1--C2
             \
              \
               C3

The reactant atoms are O1 and H1; the only product atom in this example is C3 and there is one COLO atom, C2. The methyl group C1 is an "environment" atom.
It is present in both reactant and product and in our model its charge does not change in going from reactant to product.

* Ethanol -> Propane
*

! Read topology file
READ RTF CARD
* TOPOLOGY FILE ethanol -> propane
*
   20    1                 ! Version number
MASS     1 H       1.00800 ! hydrogen which can h-bond to neutral atom
MASS    13 CH2E   14.02700 ! -    "    -    two
MASS    14 CH3E   15.03500 ! -    "    -    three
MASS    53 OH1    15.99940 ! hydroxy oxygen

! This is put in to force the necessity of using a GENERATE Noangles
! in the input file. The standard topology files use this statement.
AUTOGENERATE ANGLEs

RESI ETP   0.000
GROU
C1   CH3E    0.        ! environment atom
C2   CH2E    0.265     ! COLO atom the charge is the reactant charge
O1   OH1    -0.7       ! reactant atom
H1   H       0.435  C3 ! reactant atom note the non-bonded exclusion with
GROU
C3   CH3E    0.        ! product atom
BOND C1 C2                 !environment term
BOND C2 O1    O1 H1        !reactant terms
BOND C2 C3                 !product term
! the angles MUST be specified
! note the absence of O1 C2 C3 between reactant and product atoms
ANGLe C1 C2 C3             !product term
ANGLe C1 C2 O1   C2 O1 H1  !reactant terms
! this will be a V(R) term.
DIHED C1 C2 O1 H1
! don't really need it but what the heck.
DONO H1 O1
ACCE O1
IC C1 C2 O1 H1     1.54  111.  180.  109.5  0.96
IC C2 O1 H1 BLNK   0.    0.    0.    0.     0.
IC C3 C2 C1 BLNK   0.    0.    0.    0.     0.
PATCH FIRST NONE LAST NONE
!
END

! Read parameter file
READ PARAM CARD
* parameter file for ETP hybrid.
*
BOND
CH2E CH3E   225.0   1.54
CH2E OH1    400.0   1.42
OH1  H      450.0   0.96
THETA
CH3E CH2E CH3E   45.0   112.5
CH3E CH2E OH1    45.0   111.0
CH2E OH1  H      35.0   109.5
PHI
CH3E CH2E OH1 H   0.5   3   0.0
NONBONDED NBXMOD 5 ATOM CDIEL SHIFT VATOM VDISTANCE VSWIT -
     CUTNB 8.0 CTOFNB 7.5 CTONNB 6.5 EPS 1.0 E14FAC 0.4 WMIN 1.5
!                  Emin      Rmin
!               (kcal/mol)   (A)
H      0.0440   -0.0498   0.8000
CH2E   1.77     -0.1142   2.235    1.77  -0.1  1.9
CH3E   2.17     -0.1811   2.165    1.77  -0.1  1.9
OH1    0.8400   -0.1591   1.6000
HBOND AEXP 4 REXP 6 HAEX 0 AAEX 0 NOACCEPTORS HBNOEXCLUSIONS ALL -
     CUTHB 0.5 CTOFHB 5.0 CTONHB 4.0 CUTHA 90.0 CTOFHA 90.0 CTONHA 90.0
!
H* N%   -0.00   2.0   ! WER potential adjustment
H* O*   -0.00   2.0
END

! read the sequence of one residue
read sequence card
* ETP
*
1
ETP

! Generate the hybrid molecule. Note that we use the NOANGLE command
! because of the AUTOGENERATE ANGLES command in the RTF file.
GENERATE ETP SETUP NOANGLE

! determine the geometry and coordinates
IC SEED 1 C1 1 C2 1 O1
IC PARAM
IC PURGE
IC BUILD

! The Hybrid molecule is built. Now set up the FES stuff.
TSM
! Assign reactant list:
REAC sele etp 1 O1 .or. etp 1 H1 end
! Assign product list:
PROD sele etp 1 C3 end
! Set lambda - we will use TI or TP.
! The lambda dependence of the Hamiltonian will be linear.
! This is the default and the POWEr 1 command is actually unnecessary.
LAMBda .125 POWEr 1
! The common methylene group is a colo atom. Since the charge in the
! rtf was for the reactant the RCHArge command is actually unnecessary.
COLO ETP 1 C2 PCHArge 0. RCHArge 0.265
!
! This is a thermalization run - so no save statement.
! Just terminate the FES setup with an END statement.
END

! Set up dynamics.
! Since we are interested in the thermodynamic properties and not
! the dynamics, we can use Langevin heat bath dynamics to maintain
! temperature equilibration. Lambda is .125.
title
* etp: Ethanol To Propane
* FES run
*

!a simple expedient
shake bond angle

! Set-up Langevin dynamics for temperature control
scalar fbeta set 50.0 sele .not. hydrogen end
!
! open restart file for output
open unit 3 write form name etp0.res
!
dynamics langevin timestep 0.001 nstep 10 nprint 2 iprfrq 2 -
     firstt 298.0 finalt 298.0 twindl -5.0 twindh 5.0 -
     ichecw 1 teminc 60 ihtfrq 20 ieqfrq 200 -
     iasors 0 iasvel 1 iscvel 0 -
     iunwri 3 nsavc 0 nsavv 0 iunvel 0 -
     iunread -1 -                           !{* Nonbond options *}
     inbfrq 10 imgfrq 10 ilbfrq 0 tbath 300.0 rbuffer 0.0 -
     eps 1.0 cutnb 8.0 cutim 8.0 ctofnb 7.75
stop
*END of INPUT*

This file has everything you need to run the example. The topology and parameter input are included. The FES set up was initiated with the TSM command.
The reactant and product lists were specified with REAC and PROD commands that use the standard CHARMM atom selection syntax. Had there been either no reactant atoms or no product atoms, the command would have been REAC NONE or PROD NONE as the case may be. Note that specifying NONE for both would have resulted in an error condition being flagged.

Since we are using the window method we specified LAMBda as being 0.125. We also explicitly specified the lambda dependence of the Hamiltonian as being (1-lambda)**1 for the reactant part and lambda**1 for the product part. Since not entering the POWEr parameter causes a default of 1 for the exponent, it was unnecessary to actually enter it.

There is one COLO atom in the system. The product charge of the C2 methylene extended atom was 0. In the RTF the charge was .265, which is the reactant (ethanol) charge. Since that's what we want for the reactant charge there was actually no need to enter the RCHArge parameter. Again, we put it there for illustrative purposes; the default is to assume that for any COLO atom the charge in the RTF is the reactant charge unless the RCHArge parameter is included in the COLO command. Note that charges can also be changed with the SCALAR command.

We could have chosen a value of the POWEr parameter other than one (i.e. non-linear lambda scaling). This is potentially useful when using the TI method for the free energy change. Non-linear scaling has one major advantage. At lambda = 0 the components of the derivatives dH(lambda)/dlambda due to the product part of the Hamiltonian are identically zero and similarly, at lambda = 1 the components due to the reactant part are zero. This solves the "lambda goes to zero catastrophe" problem.
This is the problem that as lambda approaches zero or one the positions of the atoms affected (mostly product or reactant, respectively, and sometimes environment atoms bonded to them) feel forces that approach a constant or zero value (zero potential energy) and can thus have positions anywhere in phase space. Since the approximations to the ensemble averages are obtained from finite length trajectories, determining values of those quantities becomes a computationally intractable proposition. The TI integral over dlambda will tend to diverge when linear scaling is used. In both the TI and TP methods actually calculating the dynamics trajectory generally will be problematic, with large movements of the atoms resulting in bad van der Waals contacts (the r**12 repulsion eventually is felt) and fraying of bonds with lambda approaching zero or one.

Another way of viewing the situation is that at lambda = 0 or 1 the product or reactant atoms, respectively, do not exist yet. Doing the perturbation to lambda' (or equivalently viewing the derivative, dH/dlambda, as a perturbation to lambda + dlambda) requires having the coordinates of atoms that do not exist yet or any longer. Non-linear scaling and the TI method can be used to avoid this difficulty for the reasons given in the previous paragraph. Another way is to scale the TI integral by a function that reduces the weight of the integrand as lambda -> 0 or 1. This is discussed in Mezei and Beveridge.

For lambda = 0, if use of the TI method with non-linear lambda scaling was planned, we would issue a command, prior to the FES setup, to delete the product atoms from the hybrid molecule rtf:

DELEte ATOMs SELEct etp 1 C3 END

This is a standard CHARMM PSF modification command and would be issued after the segment generation. Alternatively, we could have just used an RTF for ethanol. The FES setup command sequence would be modified slightly from the previous example:

TSM
REAC sele etp 1 O1 .or. etp 1 H1 end
! no product atom at lambda = 0
PROD NONE
! non-linear lambda scaling
LAMBda .125 POWEr 2
END

Note that since there are no product atoms at lambda = 0, the PROD NONE command is issued. Also there is no need for the COLO command. For lambda at 1 we can use an equivalent procedure (left as an assignment for the reader).

In most of our work to date, we have used linear scaling and the TP method. To get around the catastrophe problem, we do not run dynamics at lambda = 0 or 1. Instead we run them at values of lambda a small distance away from 0 or 1 and "perturb" down to the endpoints. One potential problem may occur with this procedure. In cases, such as that of the transformation hydrophobic -> hydrophobic solute in aqueous solution, where water structure rearrangements around the solute are the major contributing factor to the free energy change, not sampling at lambda = 0 or 1 may mean that the significant part of phase space for the rearrangement is not adequately sampled. If in going from reactant -> product (or vice versa) a significant volume becomes newly accessible to the solvent, the presence of the r**-12 repulsive forces from the "almost but not completely disappeared" atoms may conceivably prevent the necessary configurations of the water molecules from appearing in the finite length trajectory. This problem has not been investigated yet.

Non-linear scaling may be preferred for sampling efficiency, a debatable point that has been discussed by a number of researchers. Problems can result since the monotonicity of the integrand in the TI integral is no longer assured. In the case of the TP method, the non-linear scaling forces the use of very small "perturbations" lambda -> lambda'. The non-linear exponent makes the delta V(lambda -> lambda') very large.
For example, if the exponent is 6 and lambda = .5 and lambda' = .25, a not unreasonable "window", the potential energy term for the product gets multiplied by .5**6 = 0.016 for lambda and .25**6 = 0.00024 for lambda'. So one has terms of

    exp(-beta(0.016V  - 0.00024V ))
                    P           P

in the ensemble average for the TP method, causing extremely slow converge