Tutorial:Selecting subsets of atoms

From MSL-Libraries
Revision as of 20:16, 23 March 2010 by Dwkulp (talk | contribs) (Created page with '<font color="red">NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS</font> This is an example on how to select atoms and residues with MSL, following the example program ex…')
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS


This is an example on how to select atoms and residues with MSL, following the example program example_selecting_atoms_and_residues.cpp in the examples/ subdirectory.

Complete source of example_selecting_atoms_and_residues


Background

To compile

% make bin/example_selecting_atoms_and_residues

To run the program

Go to the main directory and run the command (note, the location of the exampleFiles subdirectory needs to be provided as an argument)

% bin/example_selecting_atoms_and_residues exampleFiles

Program description

  • Read a PDB file.
60	System container;
61	if (!container.readPdb(file)) {
		// reading failed, error handling code here
68	}
  • Print the sequence. This PDB happens to have 3 chains.
73	cout << container << endl;

which outputs:

A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU
B: {1}ASP LYS ASP PHE ALA SER GLU LYS LEU ALA GLU LEU {13}VAL
C: {1}ASP ALA LEU VAL ILE LEU {7}THR
  • In each object of the molecular hierarchy, the () operator returns elements of the next step down by index i (as a reference). The following lines demonstrate that.
92	Chain & chn = container(0); // get the first chain in the System (chain "A") by reference
93	Position & pos = chn(0);    // get the first position in the Chain (1) by reference
94	Residue & res = pos(0);     // get the first identity in the Position (ASP) by reference
95	Atom & atm = res(0);        // get the first atom in the Residue (N) by reference
  • The molecular objects can all be printed using the << operator.
97	cout << "First chain of System: " << chn << endl;
98	cout << "First position of the Chain: " << pos << endl;
99	cout << "First identity of the Position: " << res << endl;
100	cout << "First atom of the Residue: " << atm << endl;

which outputs:

First chain of System: A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU

First position of the Chain: A,1 [GLY]
First identity of the Position: A,1,GLY
First atom of the Residue: N    GLY    1  A [    23.784     49.695      6.481] (conf   1/  1) +
  • Let's now loop over the entire hierarchy. The size() function tells us how many elements there are
109	cout << "The System has " << container.size() << " chains" << endl;
110	// loop over each chain and get the number of positions
111	for (unsigned int i=0; i<container.size(); i++) {
112		cout << "   Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
113
114		// loop over each position and get the number of identities
115		for (unsigned int j=0; j<container(i).size(); j++) {
116			cout << "      Position " << j << " (id: " << container(i)(j).getPositionId() << ") has " << container(i)(j).size() << " identity" << endl;
117
118			// loop over each identity and print the name
119			for (unsigned int k=0; k<container(i)(j).size(); k++) {
120				cout << "         Identity " << k << " (id: " << container(i)(j)(k).getIdentityId() << ") is a " << container(i)(j)(k).getResidueName() << endl;
121
122				// loop over each atom and print the name
123				for (unsigned int l=0; l<container(i)(j)(k).size(); l++) {
124					cout << "            Atom " << l << " (id: " << container(i)(j)(k)(l).getAtomOfIdentityId() << ") is a " << container(i)(j)(k)(l).getName() << endl;
125				}
126			}
127		}
128	}

which outputs:

The System has 3 chains
   Chain 0 (id: A) has 8 positions
      Position 0 (id: A,1) has 1 identity
         Identity 0 (id: A,1,GLY) is a GLY
            Atom 0 (id: A,1,GLY,N) is a N
            Atom 1 (id: A,1,GLY,CA) is a CA
            Atom 2 (id: A,1,GLY,C) is a C
            Atom 3 (id: A,1,GLY,O) is a O
      Position 1 (id: A,2) has 1 identity
         Identity 0 (id: A,2,LYS) is a LYS
            Atom 0 (id: A,2,LYS,N) is a N
            Atom 1 (id: A,2,LYS,CA) is a CA
...
  • The hierarchy is System>Chain>Position>Residue>Atom. However, in most cases each Position contains just one residue type, unless design or mutagenesis is involved. The above loop can be rewritten to skip the identities and go from Position to Atom. To do so we use the [] operator in Position. In all objects [] returns an atom (by reference).
138	Atom & aSys = container[0]; // get the first atom (N) of tye system by reference
139	Atom & aChn = chn[0];       // get the first atom (N) of the chain by reference
140	Atom & aPos = pos[0];       // get the first atom (N) of the position by reference
141	Atom & aRes = res[0];       // get the first atom (N) of the identity by reference
  • The total number of atoms is given by the function atomSize(), which is used by the Position in the loop System>Chain>Position>Atom
157	cout << "The System has " << container.size() << " chains" << endl;
158	// loop over each chain and get the number of positions
159	for (unsigned int i=0; i<container.size(); i++) {
160		cout << "   Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
161
162		// loop over each position and get the number of atoms
163		for (unsigned int j=0; j<container(i).size(); j++) {
164			cout << "      Position " << j << " (id: " << container(i)(j).getPositionId() << ") is a " << container(i)(j).getResidueName() << " and has " << container(i)(j).atomSize() << " atoms" << endl;
165
166			// loop over each atom and print the name, note using atomSize instead of size
167			for (unsigned int k=0; k<container(i)(j).atomSize(); k++) {
168				// note using the [] operator
169				cout << "            Atom " << k << " (id: " << container(i)(j)[k].getAtomId() << ") is a " << container(i)(j)[k].getName() << endl;
170			}
171		}
172	}

which outputs:

The System has 3 chains
   Chain 0 (id: A) has 8 positions
      Position 0 (id: A,1) is a GLY and has 4 atoms
            Atom 0 (id: A,1,N) is a N
            Atom 1 (id: A,1,CA) is a CA
            Atom 2 (id: A,1,C) is a C

Back to the tutorial page