Difference between revisions of "Tutorial:Selecting subsets of atoms"

From MSL-Libraries
Jump to navigationJump to search
(Created page with '<font color="red">NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS</font> This is an example on how to select atoms and residues with MSL, following the example program ex…')
 
(Background)
 
(8 intermediate revisions by the same user not shown)
Line 8: Line 8:
  
 
=== Background ===
 
=== Background ===
 +
 +
MSL objects can be selected if they inherit the  '''[[MSL Objects:Selectable|Selectable]]''' object.  So far, Atoms and Residues are Selectable.  The requirement for being selectable is to implement the
 +
virtual "addSelectableFunctions", which define what properties of the object can be selected upon.
 +
 
 +
 +
<font color="blue">Selection By Name</font>
 +
 +
Can select based on selection flags that are inside each Atom.  By default every atom has an "all" selection flag set.
 +
     
 +
You can create selection flags for atoms by:
 +
* Using  '''[[MSL Objects:Atom|Atom]]''' object:     
 +
<source lang="cpp">   
 +
    a.setSelectionFlag("foobar")
 +
</source>
 +
* Using  '''[[MSL Objects:AtomSelection|AtomSelection]]''' object:   
 +
<source lang="cpp">
 +
    as.select("foobar, chain A")
 +
</source>
 +
 +
<font color="blue">Selection By Property</font>
 +
 +
Using '''[[MSL Objects:AtomSelection|AtomSelection]]''' Object, to select all atoms of chain A:
 +
<source lang="cpp">
 +
    as.select("chain A")
 +
</source>
  
 
=== To compile ===
 
=== To compile ===
Line 21: Line 46:
  
 
=== Program description ===
 
=== Program description ===
* Read a PDB file.
+
* Creating an AtomSelection object
 
<source lang="cpp">
 
<source lang="cpp">
60 System container;
+
61 if (!container.readPdb(file)) {
+
AtomSelection sel(sys.getAtomPointers());
// reading failed, error handling code here
+
 
68 }
 
 
</source>
 
</source>
  
* Print the sequence.  This PDB happens to have 3 chains.
+
* Basic selection 'all'
 
<source lang="cpp">
 
<source lang="cpp">
73 cout << container << endl;
+
AtomPointerVector allAts = sel.select("all");
 
</source>
 
</source>
which outputs:
+
 
<source lang="text">
+
A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU
+
* Select chain A atoms and store as "chA" for easy retrieval
B: {1}ASP LYS ASP PHE ALA SER GLU LYS LEU ALA GLU LEU {13}VAL
+
<source lang="cpp">
C: {1}ASP ALA LEU VAL ILE LEU {7}THR
+
 
 +
AtomPointerVector chainAAts = sel.select("chA, chain A");
 +
 
 
</source>
 
</source>
  
* In each object of the molecular hierarchy, the () operator returns elements of the next step down by index ''i'' (as a reference).  The following lines demonstrate that.
+
 +
* Expanding the selection logic , first a long-hand selection string is used. second a short-hand version of the same selection is used.
 
<source lang="cpp">
 
<source lang="cpp">
92 Chain & chn = container(0); // get the first chain in the System (chain "A") by reference
+
 
93 Position & pos = chn(0);    // get the first position in the Chain (1) by reference
+
AtomPointerVector backbone    = sel.select("bb, (((name CA OR name N) OR name C) OR name O)");
94 Residue & res = pos(0);     // get the first identity in the Position (ASP) by reference
+
 
95 Atom & atm = res(0);       // get the first atom in the Residue (N) by reference
+
AtomPointerVector bbShortHand = sel.select("bb2, name CA+N+C+O");
 +
 
 
</source>
 
</source>
* The molecular objects can all be printed using the << operator.
+
 
 +
 
 +
 
 +
* Selections using names from previous selections
 
<source lang="cpp">
 
<source lang="cpp">
97 cout << "First chain of System: " << chn << endl;
+
AtomPointerVector test = sel.select("new, bb AND name O");
98 cout << "First position of the Chain: " << pos << endl;
 
99 cout << "First identity of the Position: " << res << endl;
 
100 cout << "First atom of the Residue: " << atm << endl;
 
 
</source>
 
</source>
which outputs:
 
<source lang="text">
 
First chain of System: A: {1}GLY LYS LYS MET VAL VAL ALA {8}LEU
 
  
First position of the Chain: A,1 [GLY]
+
* Selections using NOT operator
First identity of the Position: A,1,GLY
 
First atom of the Residue: N    GLY    1  A [    23.784    49.695      6.481] (conf  1/  1) +
 
</source>
 
* Let's now loop over the entire hierarchy.  The ''size()'' function tells us how many elements there are
 
 
<source lang="cpp">
 
<source lang="cpp">
109 cout << "The System has " << container.size() << " chains" << endl;
 
110 // loop over each chain and get the number of positions
 
111 for (unsigned int i=0; i<container.size(); i++) {
 
112 cout << "  Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
 
113
 
114 // loop over each position and get the number of identities
 
115 for (unsigned int j=0; j<container(i).size(); j++) {
 
116 cout << "      Position " << j << " (id: " << container(i)(j).getPositionId() << ") has " << container(i)(j).size() << " identity" << endl;
 
117
 
118 // loop over each identity and print the name
 
119 for (unsigned int k=0; k<container(i)(j).size(); k++) {
 
120 cout << "        Identity " << k << " (id: " << container(i)(j)(k).getIdentityId() << ") is a " << container(i)(j)(k).getResidueName() << endl;
 
121
 
122 // loop over each atom and print the name
 
123 for (unsigned int l=0; l<container(i)(j)(k).size(); l++) {
 
124 cout << "            Atom " << l << " (id: " << container(i)(j)(k)(l).getAtomOfIdentityId() << ") is a " << container(i)(j)(k)(l).getName() << endl;
 
125 }
 
126 }
 
127 }
 
128 }
 
</source>
 
  
which outputs:
+
AtomPointerVector notBB = sel.select("notBB, not name CA+C+O+N");
<source lang="text">
+
 
The System has 3 chains
 
  Chain 0 (id: A) has 8 positions
 
      Position 0 (id: A,1) has 1 identity
 
        Identity 0 (id: A,1,GLY) is a GLY
 
            Atom 0 (id: A,1,GLY,N) is a N
 
            Atom 1 (id: A,1,GLY,CA) is a CA
 
            Atom 2 (id: A,1,GLY,C) is a C
 
            Atom 3 (id: A,1,GLY,O) is a O
 
      Position 1 (id: A,2) has 1 identity
 
        Identity 0 (id: A,2,LYS) is a LYS
 
            Atom 0 (id: A,2,LYS,N) is a N
 
            Atom 1 (id: A,2,LYS,CA) is a CA
 
...
 
 
</source>
 
</source>
  
* The hierarchy is '''System'''>'''Chain'''>'''Position'''>'''Residue'''>'''Atom'''.  However, in most cases each Position contains just one residue type, unless design or mutagenesis is involved.  The above loop can be rewritten to skip the identities and go from Position to Atom.  To do so we use the [] operator in Position.  In all objects [] returns an atom (by reference).
+
* Selections using range "-" operator.
 
 
 
<source lang="cpp">
 
<source lang="cpp">
138 Atom & aSys = container[0]; // get the first atom (N) of tye system by reference
+
AtomPointerVector res26 = sel.select("res26, resi 2-6");
139 Atom & aChn = chn[0];      // get the first atom (N) of the chain by reference
 
140 Atom & aPos = pos[0];      // get the first atom (N) of the position by reference
 
141 Atom & aRes = res[0];       // get the first atom (N) of the identity by reference
 
 
</source>
 
</source>
  
* The total number of atoms is given by the function ''atomSize()'', which is used by the Position in the loop System>Chain>Position>Atom
+
* Selection using distances to a sub-selection
 
<source lang="cpp">
 
<source lang="cpp">
157 cout << "The System has " << container.size() << " chains" << endl;
+
AtomPointerVector oxygensHbondSer = sel.select("oxygensHbondSer, name O WITHIN 3.5 OF name OG");
158 // loop over each chain and get the number of positions
 
159 for (unsigned int i=0; i<container.size(); i++) {
 
160 cout << "  Chain " << i << " (id: " << container(i).getChainId() << ") has " << container(i).size() << " positions" << endl;
 
161
 
162 // loop over each position and get the number of atoms
 
163 for (unsigned int j=0; j<container(i).size(); j++) {
 
164 cout << "     Position " << j << " (id: " << container(i)(j).getPositionId() << ") is a " << container(i)(j).getResidueName() << " and has " << container(i)(j).atomSize() << " atoms" << endl;
 
165
 
166 // loop over each atom and print the name, note using atomSize instead of size
 
167 for (unsigned int k=0; k<container(i)(j).atomSize(); k++) {
 
168 // note using the [] operator
 
169 cout << "            Atom " << k << " (id: " << container(i)(j)[k].getAtomId() << ") is a " << container(i)(j)[k].getName() << endl;
 
170 }
 
171 }
 
172 }
 
 
</source>
 
</source>
  
which outputs:
+
 
<source lang="text">
 
The System has 3 chains
 
  Chain 0 (id: A) has 8 positions
 
      Position 0 (id: A,1) is a GLY and has 4 atoms
 
            Atom 0 (id: A,1,N) is a N
 
            Atom 1 (id: A,1,CA) is a CA
 
            Atom 2 (id: A,1,C) is a C
 
</source>
 
  
 
----
 
----

Latest revision as of 22:02, 23 March 2010

NOTE: TUTORIAL WRITING IN PROGRESS DO NOT FOLLOW THIS


This is an example on how to select atoms and residues with MSL, following the example program example_selecting_atoms_and_residues.cpp in the examples/ subdirectory.

Complete source of example_selecting_atoms_and_residues


Background

MSL objects can be selected if they inherit the Selectable object. So far, Atoms and Residues are Selectable. The requirement for being selectable is to implement the virtual "addSelectableFunctions", which define what properties of the object can be selected upon.


Selection By Name

Can select based on selection flags that are inside each Atom. By default every atom has an "all" selection flag set.

You can create selection flags for atoms by:

    
    a.setSelectionFlag("foobar")
    as.select("foobar, chain A")

Selection By Property

Using AtomSelection Object, to select all atoms of chain A:

    as.select("chain A")

To compile

% make bin/example_selecting_atoms_and_residues

To run the program

Go to the main directory and run the command (note, the location of the exampleFiles subdirectory needs to be provided as an argument)

% bin/example_selecting_atoms_and_residues exampleFiles

Program description

  • Creating an AtomSelection object
	
	AtomSelection sel(sys.getAtomPointers());
  • Basic selection 'all'
	AtomPointerVector allAts = sel.select("all");


  • Select chain A atoms and store as "chA" for easy retrieval
	AtomPointerVector chainAAts = sel.select("chA, chain A");


  • Expanding the selection logic , first a long-hand selection string is used. second a short-hand version of the same selection is used.
	AtomPointerVector backbone    = sel.select("bb, (((name CA OR name N) OR name C) OR name O)");

	AtomPointerVector bbShortHand = sel.select("bb2, name CA+N+C+O");


  • Selections using names from previous selections
	AtomPointerVector test = sel.select("new, bb AND name O");
  • Selections using NOT operator
	AtomPointerVector notBB = sel.select("notBB, not name CA+C+O+N");
  • Selections using range "-" operator.
	AtomPointerVector res26 = sel.select("res26, resi 2-6");
  • Selection using distances to a sub-selection
	AtomPointerVector oxygensHbondSer = sel.select("oxygensHbondSer, name O WITHIN 3.5 OF name OG");



Back to the tutorial page