Tutorial:Using regular expression: Selecting sequence motifs of a Chain
This is an example of how to select sequence motifs from Chain objects. A Chain object is used rather than a System object because a regular expression cannot span multiple chains.
This tutorial is in progress; there may be missing example files, missing source files, or bugs in the code.
Complete source of example_regular_expressions.cpp
To compile
% make bin/example_regular_expressions
To run the program
Go to the main directory and run the command (note that the path to a PDB file in the exampleFiles subdirectory must be provided as an argument)
% bin/example_regular_expressions exampleFiles/example0004.pdb
Program description
System sys;
if (!sys.readPdb(file)) {
    // reading failed, error handling code here
}
// Check to make sure chain A exists in sys
if (!sys.exists("A")) {
    // error code here.
}
// Get a Chain object
Chain &ch = sys.getChain("A");
// Regular Expression Object
RegEx re;
// Find 3 Prolines surrounded by two Glycines on one side and three Glycines on the other
string regex = "GG(PPP)GGG";
// Now do a sequence search...
vector<pair<int,int> > matches = re.getResidueRanges(ch,regex);
// Loop over each match.
for (uint m = 0; m < matches.size(); m++) {
    // Loop over each residue for this match
    for (uint r = matches[m].first; r < matches[m].second; r++) {
        // Get the residue
        Residue &res = ch.getResidue(r);
        // .. do something cool with matched residues ...
    }
}
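To make the idea of residue ranges concrete, here is a minimal sketch that does not use MSL at all. It runs the same pattern over a plain one-letter sequence string with the standard library's std::regex and returns the index range of the parenthesized group for each match. The helper name findMotifRanges and the half-open [first, last) convention are assumptions made for this illustration; they are not taken from the MSL RegEx class, whose range convention is not stated here.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MSL): for every match of `pattern` in
// `sequence`, return the half-open index range [first, last) of capture
// group 1, or of the whole match if group 1 is absent or did not participate.
std::vector<std::pair<int, int>> findMotifRanges(const std::string& sequence,
                                                 const std::string& pattern) {
    std::vector<std::pair<int, int>> ranges;
    const std::regex re(pattern);

    // Iterate over all non-overlapping matches in the sequence.
    for (std::sregex_iterator it(sequence.begin(), sequence.end(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // Pick the capture group when available, otherwise the full match.
        const std::size_t g = (m.size() > 1 && m[1].matched) ? 1 : 0;
        const int first = static_cast<int>(m.position(g));
        const int last = first + static_cast<int>(m.length(g));
        ranges.emplace_back(first, last);
    }
    return ranges;
}
```

For the sequence AGGPPPGGGA and the pattern GG(PPP)GGG, this sketch returns the single range (3, 6), which covers the three prolines at indices 3 to 5. The flanking glycines are required for the match but fall outside the reported range.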