Difference between revisions of "Tutorial:Selecting subsets of atoms"

From MSL-Libraries
Jump to navigationJump to search
(Created page with '<font color="red">NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS</font> This is an example on how to select atoms and residues with MSL, following the example program ex…')
 
Line 8: Line 8:
  
 
=== Background ===
 
=== Background ===
 
+
 +
MSL objects can be selected if they inherit the "Selectable" object. 
 +
 +
            Selection By Name:
 +
        Can select based on selection flags that are inside each Atom.
 +
      By default every atom has an "all" selection flag set.
 +
     
 +
        You can create selection flags for atoms by:
 +
          1. Using Atom object:            a.setSelectionFlag("foobar")
 +
          2. Using AtomSelection object:    as.select("foobar, chain A")
 +
 +
               
 +
    Selection By Property:         
 +
          Chain A:    as.select("chain A")
 
=== To compile ===
 
=== To compile ===
 
<source lang="text">
 
<source lang="text">
Line 27: Line 40:
 
// reading failed, error handling code here
 
// reading failed, error handling code here
 
68 }
 
68 }
</source>
 
 
* Print the sequence.  This PDB happens to have 3 chains.
 
<source lang="cpp">
 
73 cout << container << endl;
 
</source>
 
which outputs:
 
<source lang="text">
 
A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU
 
B: {1}ASP LYS ASP PHE ALA SER GLU LYS LEU ALA GLU LEU {13}VAL
 
C: {1}ASP ALA LEU VAL ILE LEU {7}THR
 
</source>
 
 
* In each object of the molecular hierarchy, the () operator returns elements of the next step down by index ''i'' (as a reference).  The following lines demonstrate that.
 
<source lang="cpp">
 
92 Chain & chn = container(0); // get the first chain in the System (chain "A") by reference
 
93 Position & pos = chn(0);    // get the first position in the Chain (1) by reference
 
94 Residue & res = pos(0);    // get the first identity in the Position (ASP) by reference
 
95 Atom & atm = res(0);        // get the first atom in the Residue (N) by reference
 
</source>
 
* The molecular objects can all be printed using the << operator.
 
<source lang="cpp">
 
97 cout << "First chain of System: " << chn << endl;
 
98 cout << "First position of the Chain: " << pos << endl;
 
99 cout << "First identity of the Position: " << res << endl;
 
100 cout << "First atom of the Residue: " << atm << endl;
 
</source>
 
which outputs:
 
<source lang="text">
 
First chain of System: A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU
 
 
First position of the Chain: A,1 [GLY]
 
First identity of the Position: A,1,GLY
 
First atom of the Residue: N    GLY    1  A [    23.784    49.695      6.481] (conf  1/  1) +
 
</source>
 
* Let's now loop over the entire hierarchy.  The ''size()'' function tells us how many elements there are
 
<source lang="cpp">
 
109 cout << "The System has " << container.size() << " chains" << endl;
 
110 // loop over each chain and get the number of positions
 
111 for (unsigned int i=0; i<container.size(); i++) {
 
112 cout << "  Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
 
113
 
114 // loop over each position and get the number of identities
 
115 for (unsigned int j=0; j<container(i).size(); j++) {
 
116 cout << "      Position " << j << " (id: " << container(i)(j).getPositionId() << ") has " << container(i)(j).size() << " identity" << endl;
 
117
 
118 // loop over each identity and print the name
 
119 for (unsigned int k=0; k<container(i)(j).size(); k++) {
 
120 cout << "        Identity " << k << " (id: " << container(i)(j)(k).getIdentityId() << ") is a " << container(i)(j)(k).getResidueName() << endl;
 
121
 
122 // loop over each atom and print the name
 
123 for (unsigned int l=0; l<container(i)(j)(k).size(); l++) {
 
124 cout << "            Atom " << l << " (id: " << container(i)(j)(k)(l).getAtomOfIdentityId() << ") is a " << container(i)(j)(k)(l).getName() << endl;
 
125 }
 
126 }
 
127 }
 
128 }
 
</source>
 
 
which outputs:
 
<source lang="text">
 
The System has 3 chains
 
  Chain 0 (id: A) has 8 positions
 
      Position 0 (id: A,1) has 1 identity
 
        Identity 0 (id: A,1,GLY) is a GLY
 
            Atom 0 (id: A,1,GLY,N) is a N
 
            Atom 1 (id: A,1,GLY,CA) is a CA
 
            Atom 2 (id: A,1,GLY,C) is a C
 
            Atom 3 (id: A,1,GLY,O) is a O
 
      Position 1 (id: A,2) has 1 identity
 
        Identity 0 (id: A,2,LYS) is a LYS
 
            Atom 0 (id: A,2,LYS,N) is a N
 
            Atom 1 (id: A,2,LYS,CA) is a CA
 
...
 
</source>
 
 
* The hierarchy is '''System'''>'''Chain'''>'''Position'''>'''Residue'''>'''Atom'''.  However, in most cases each Position contains just one residue type, unless design or mutagenesis is involved.  The above loop can be rewritten to skip the identities and go from Position to Atom.  To do so we use the [] operator in Position.  In all objects [] returns an atom (by reference).
 
 
<source lang="cpp">
 
138 Atom & aSys = container[0]; // get the first atom (N) of tye system by reference
 
139 Atom & aChn = chn[0];      // get the first atom (N) of the chain by reference
 
140 Atom & aPos = pos[0];      // get the first atom (N) of the position by reference
 
141 Atom & aRes = res[0];      // get the first atom (N) of the identity by reference
 
</source>
 
 
*  The total number of atoms is given by the function ''atomSize()'', which is used by the Position in the loop System>Chain>Position>Atom
 
<source lang="cpp">
 
157 cout << "The System has " << container.size() << " chains" << endl;
 
158 // loop over each chain and get the number of positions
 
159 for (unsigned int i=0; i<container.size(); i++) {
 
160 cout << "  Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
 
161
 
162 // loop over each position and get the number of atoms
 
163 for (unsigned int j=0; j<container(i).size(); j++) {
 
164 cout << "      Position " << j << " (id: " << container(i)(j).getPositionId() << ") is a " << container(i)(j).getResidueName() << " and has " << container(i)(j).atomSize() << " atoms" << endl;
 
165
 
166 // loop over each atom and print the name, note using atomSize instead of size
 
167 for (unsigned int k=0; k<container(i)(j).atomSize(); k++) {
 
168 // note using the [] operator
 
169 cout << "            Atom " << k << " (id: " << container(i)(j)[k].getAtomId() << ") is a " << container(i)(j)[k].getName() << endl;
 
170 }
 
171 }
 
172 }
 
</source>
 
 
which outputs:
 
<source lang="text">
 
The System has 3 chains
 
  Chain 0 (id: A) has 8 positions
 
      Position 0 (id: A,1) is a GLY and has 4 atoms
 
            Atom 0 (id: A,1,N) is a N
 
            Atom 1 (id: A,1,CA) is a CA
 
            Atom 2 (id: A,1,C) is a C
 
 
</source>
 
</source>
  

Revision as of 20:47, 23 March 2010

NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS


This is an example on how to select atoms and residues with MSL, following the example program example_selecting_atoms_and_residues.cpp in the examples/ subdirectory.

Complete source of example_selecting_atoms_and_residues


Background

MSL objects can be selected if they inherit the "Selectable" object.

            Selection By Name:

Can select based on selection flags that are inside each Atom. By default every atom has an "all" selection flag set.

You can create selection flags for atoms by: 1. Using Atom object: a.setSelectionFlag("foobar") 2. Using AtomSelection object: as.select("foobar, chain A")


Selection By Property: Chain A: as.select("chain A")

To compile

% make bin/example_selecting_atoms_and_residues

To run the program

Go to the main directory and run the command (note, the location of the exampleFiles subdirectory needs to be provided as an argument)

% bin/example_selecting_atoms_and_residues exampleFiles

Program description

  • Read a PDB file.
60	System container;
61	if (!container.readPdb(file)) {
		// reading failed, error handling code here
68	}

Back to the tutorial page