Difference between revisions of "Google Summer of Code Ideas"

Latest revision as of 18:28, 2 April 2013

MSL (Molecular Software Libraries) is an open source C++ library of objects related to macromolecular modeling.

Hierarchy of molecular objects in MSL (Kulp et al. 2012)
Structural model of a membrane protein obtained with MSL (LaPointe et al. 2013)
A unique feature of MSL is that it can hold multiple side-chain types (c, d) and/or multiple conformations of a side chain (a, b) at the same position in a protein (Kulp et al. 2012)
A library of side-chain conformations created with MSL (Subramaniam et al., Proteins 2012)


Goal

The goal of MSL is to be an efficient toolbox that lets a programmer build powerful and advanced applications for molecular analysis, modeling, prediction and design. At the same time, MSL is designed with simplicity in mind, so that applications performing simple tasks (for example, measuring the distance between two atoms) can be written quickly.
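
As an illustration of the "simple task" mentioned above, the following is a minimal, self-contained sketch of a distance calculation between two atoms. The Atom struct and the atomDistance function are hypothetical stand-ins written for this example; they are not MSL's own classes or method names.

```cpp
#include <cmath>

// Hypothetical stand-in for an atom; NOT MSL's actual Atom class.
struct Atom {
    double x, y, z;  // Cartesian coordinates
};

// Euclidean distance between two atoms, in the same units as the coordinates.
double atomDistance(const Atom& a, const Atom& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```

With atoms at (0, 0, 0) and (3, 4, 0), atomDistance returns 5.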

History

MSL has been developed over the past 5 years (9 years if its early predecessor is included). It has an extensive code base (over 150 objects and 100,000 lines of code). After years in beta, the first stable version (1.0) was released in 2012.

The code can be downloaded at the MSL SourceForge page and the entire development tree is available at the MSL SVN repository. Several examples are now available in the repository that demonstrate the capabilities of MSL objects in modeling applications.

MSL is described in its primary citation:

  • Kulp DW, Subramaniam S, Donald JE, Hannigan BT, Mueller BK, Grigoryan G and Senes A "Structural informatics, modeling and design with an open-source Molecular Software Library (MSL)", J Comput. Chem. 2012 33(20), 1645-61 (Download PDF)


MSL has been used extensively in scientific publications covering a variety of applications, including protein structure prediction, protein design, protein engineering, and modeling algorithms/methods.

Current development

The main theme of MSL development will continue to be the creation of a flexible yet powerful software framework that is easily accessible to programmers familiar with C++/object-orientation.

We are striving to improve the core framework through:

  • more advanced objects
  • more efficient implementations
  • implementing more modeling algorithms and protocols

Another important direction is making MSL more accessible to a wider audience by:

  • creating interfaces to other programming languages
  • making distribution of MSL easier
  • building ready-to-use applications
  • hosting MSL applications on web servers for public use


There are a number of areas, listed below, in which GSoC students with different backgrounds and skills could contribute to these goals.

Ideas

Embedding MSL code for use in higher-level languages

Brief explanation:

The idea is to make MSL code available to programs written in higher-level languages. This could be done using an interface compiler such as SWIG (see swig.org).
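
To give a sense of what such glue code involves, here is a hand-written sketch of a C-linkage facade around a C++ class, the kind of boundary that foreign-function interfaces in higher-level languages can call. The Point3 class and the point3_* functions are hypothetical and exist only for this example; an interface compiler such as SWIG would instead generate wrapper code from an interface file rather than requiring it to be written by hand.

```cpp
#include <cmath>

// Hypothetical C++ class standing in for an MSL object (NOT part of MSL).
class Point3 {
public:
    Point3(double x, double y, double z) : x_(x), y_(y), z_(z) {}
    double norm() const { return std::sqrt(x_ * x_ + y_ * y_ + z_ * z_); }
private:
    double x_, y_, z_;
};

// C-linkage facade: an opaque handle plus free functions with unmangled names.
extern "C" {
    // Allocates a Point3; the caller must release it with point3_destroy.
    void* point3_create(double x, double y, double z) {
        return new Point3(x, y, z);
    }
    // Length of the vector from the origin to the point.
    double point3_norm(const void* handle) {
        return static_cast<const Point3*>(handle)->norm();
    }
    // Frees a handle obtained from point3_create.
    void point3_destroy(void* handle) {
        delete static_cast<Point3*>(handle);
    }
}
```

The facade keeps the C++ type hidden behind an opaque pointer, so the calling language only needs to understand plain C function signatures.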

Expected results:

Infrastructure and example applications in multiple higher-level languages.

Knowledge Prerequisite:

C++ and one of Python, R or Perl


MSL-Light

Brief explanation:

The objects in MSL are sometimes too complex for simple operations on large molecules. The idea is to implement light-weight versions of some core objects. For example, the AtomContainer object is a lighter version of the System object.
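
A light-weight object in this spirit could be as simple as a flat list of atoms with no chain or residue hierarchy. The sketch below is an assumption about what such an object might look like; LightAtom and LightAtomList are hypothetical names and do not reflect the interface of MSL's AtomContainer or System.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal atom record (NOT an MSL class).
struct LightAtom {
    std::string name;
    double x, y, z;
};

// Hypothetical flat container: one contiguous vector, no hierarchy to maintain.
class LightAtomList {
public:
    void add(const std::string& name, double x, double y, double z) {
        atoms_.push_back(LightAtom{name, x, y, z});
    }
    std::size_t size() const { return atoms_.size(); }
    const LightAtom& operator[](std::size_t i) const { return atoms_[i]; }

    // Geometric center of all atoms; returns the origin for an empty list.
    LightAtom centroid() const {
        LightAtom c{"centroid", 0.0, 0.0, 0.0};
        if (atoms_.empty()) return c;
        for (const LightAtom& a : atoms_) {
            c.x += a.x;
            c.y += a.y;
            c.z += a.z;
        }
        const double n = static_cast<double>(atoms_.size());
        c.x /= n;
        c.y /= n;
        c.z /= n;
        return c;
    }

private:
    std::vector<LightAtom> atoms_;
};
```

Storing atoms contiguously trades the convenience of a full molecular hierarchy for lower memory overhead and faster iteration, which is the trade-off this project idea targets.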

Expected results:

A collection of light-weight objects and methods.

Knowledge Prerequisite:

C++, Data Structures and Algorithms


Smart pointers in MSL

Brief explanation:

MSL uses raw pointers, which are prone to bugs such as dangling pointers and memory leaks.

Expected results:

Replace regular pointers with the appropriate smart pointers in MSL.
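
As a rough sketch of what the conversion could look like, the example below has a parent object own its children through std::unique_ptr, so they are released automatically when the parent is destroyed. The Atom and Residue types here are simplified, hypothetical stand-ins and do not mirror the real MSL classes; which smart pointer fits each MSL object would have to be decided case by case.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified atom (NOT MSL's Atom class).
struct Atom {
    std::string name;
    explicit Atom(std::string n) : name(std::move(n)) {}
};

// Hypothetical owner: the residue holds its atoms through unique_ptr,
// so no manual delete is needed and ownership is explicit in the type.
class Residue {
public:
    // Creates an atom owned by this residue and returns a non-owning reference.
    Atom& addAtom(const std::string& name) {
        atoms_.push_back(std::unique_ptr<Atom>(new Atom(name)));
        return *atoms_.back();
    }
    std::size_t atomCount() const { return atoms_.size(); }

private:
    std::vector<std::unique_ptr<Atom>> atoms_;
};
```

Because each Atom lives in its own heap allocation, a reference returned by addAtom stays valid when the vector grows; it only becomes invalid when the owning Residue is destroyed.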

Knowledge Prerequisite:

C++


Efficient program options management

Brief explanation:

MSL has an advanced OptionParser object that supports name/value based program options. A mechanism built on top of OptionParser is needed to group program options so that they can be shared across MSL applications.

Expected results:

The mechanism developed should facilitate option reuse and, in general, make option handling easy in MSL applications.
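
One possible shape for such a mechanism is sketched below: named groups of options with default values that several programs can share, merged and then overridden by user-supplied values. OptionGroup and resolveOptions are hypothetical names invented for this example, and the sketch deliberately does not call OptionParser, whose interface is not described here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical reusable group of options (NOT part of MSL).
struct OptionGroup {
    std::string name;                             // e.g. "io" or "energy"
    std::map<std::string, std::string> defaults;  // option name -> default value
};

// Merges the defaults of all groups, then overlays the user-supplied values.
// Later groups override earlier ones; user values override all defaults.
std::map<std::string, std::string> resolveOptions(
        const std::vector<OptionGroup>& groups,
        const std::map<std::string, std::string>& userValues) {
    std::map<std::string, std::string> resolved;
    for (const OptionGroup& group : groups) {
        for (const auto& entry : group.defaults) {
            resolved[entry.first] = entry.second;
        }
    }
    for (const auto& entry : userValues) {
        resolved[entry.first] = entry.second;
    }
    return resolved;
}
```

An application would then declare which groups it uses instead of redefining each option, which is where the reuse comes from.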

Knowledge Prerequisite:

C++


MSL Linux Distribution Packages

Brief explanation:

MSL currently has to be compiled from source. Distributing the software as a .deb and/or .rpm package will simplify installation on some of the most popular Linux distributions.

Expected results:

One-click/command MSL installation.

Knowledge Prerequisite:

Linux, Make


Developing a website to run MSL applications

Brief explanation:

There are several MSL applications that are directly useful to structural biologists. For example, the side-chain modeling programs and the structure prediction programs are of general interest, and it would be useful to create a web interface that allows these applications to be used with minimal or no knowledge of MSL.

Expected results:

A website that runs MSL applications on the lab cluster, post-processes the output and makes the results available to the user.

Knowledge Prerequisite:

Scripting Languages and Web Design


Expansion and improvement of the example and test programs

Brief explanation:

MSL ships with a number of small example programs intended as starting points for a programmer interested in adopting the software. MSL also has a number of quality control test programs with known output that are run every time a new commit is made.
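
The idea of a test program "with known output" can be sketched as a check of a computed value against a stored reference within a tolerance. TestResult and checkAgainstKnown below are hypothetical helpers written for this example; they are not taken from the MSL test suite.

```cpp
#include <cmath>
#include <string>

// Hypothetical record of one quality-control check (NOT from MSL's tests).
struct TestResult {
    std::string name;
    bool passed;
};

// Compares a computed value with its known reference value.
// The check passes when the absolute difference is within the tolerance.
TestResult checkAgainstKnown(const std::string& name, double computed,
                             double expected, double tolerance = 1e-6) {
    return TestResult{name, std::fabs(computed - expected) <= tolerance};
}
```

A test program would run a series of such checks and report failure if any of them does not pass, so that a commit that changes numerical results is noticed immediately.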

Expected results:

Expansion and improvement of example and test programs. This is an area that will allow a student to get familiar with the molecular modeling principles implemented in MSL.

Knowledge Prerequisite:

C++. Knowledge of proteins and modeling is beneficial but not required.

Contact

Please email sabareeshs@gmail.com if you are interested in working on any of these ideas and would like to obtain more information.