Team:MIT/Tools/

From 2011.igem.org

(Difference between revisions)
Line 61: Line 61:
* Analyze the runs of the models through histograms and clustering
* Analyze the runs of the models through histograms and clustering
-
''mcell'' is a thin layer on top of the combination of <a href="http://ryanroper.wordpress.com/2011/05/04/multiscale-simulation-with-compucell3d-and-bionetsolver/">BionetSolver</a>, which model internal cell states as systems of ODEs, and <a href="http://compucell3d.org">CompuCell3D</a>, which models cell shape dynamics using the Glazier-Graner-Hogeweg methos. It was created largely in reaction to inconveniences experienced when using BionetSolver and CompuCell3D as detailed  <a href="http://ryanroper.wordpress.com/2011/05/04/multiscale-simulation-with-compucell3d-and-bionetsolver/">here</a>. It is in continuous development.
+
''mcell'' is a thin layer on top of the combination of <a href="http://ryanroper.wordpress.com/2011/05/04/multiscale-simulation-with-compucell3d-and-bionetsolver/">BionetSolver</a>, which model internal cell states as systems of ODEs, and <a href="http://compucell3d.org">CompuCell3D</a>, which models cell shape dynamics using the Glazier-Graner-Hogeweg methos. It was created largely in reaction to inconveniences experienced when using BionetSolver and CompuCell3D as detailed  <a href="http://ryanroper.wordpress.com/2011/05/04/multiscale-simulation-with-compucell3d-and-bionetsolver/">here</a>. It is in continuous development. We encourage you to read through the code and look at our modeling data as examples to understand the framework.
<h3> Quick Links</h3>
<h3> Quick Links</h3>
Line 105: Line 105:
<h3> Directory Structure </h3>
<h3> Directory Structure </h3>
<p><em>mcell</em> uses a directory structure to manage its models. All models are contained in a <em>modeling directory</em>, the location of which is configurable in <em>mcell.py</em>. Every folder in the modeling directory is considered by <em>mcell</em> to be a model. A model folder contains 3 necessary files: a CompuCell3D configuration file (<em>cc3d.xml</em>), a hook file(<em>hook.py</em>), and a file containing <code>ModelSteppable</code> (<em>steppable.py</em>). A model folder also contains files that start with "<em>circuit_</em>", which are all assumed to be SBML models of circuits. These files define a model. </p>
<p><em>mcell</em> uses a directory structure to manage its models. All models are contained in a <em>modeling directory</em>, the location of which is configurable in <em>mcell.py</em>. Every folder in the modeling directory is considered by <em>mcell</em> to be a model. A model folder contains 3 necessary files: a CompuCell3D configuration file (<em>cc3d.xml</em>), a hook file(<em>hook.py</em>), and a file containing <code>ModelSteppable</code> (<em>steppable.py</em>). A model folder also contains files that start with "<em>circuit_</em>", which are all assumed to be SBML models of circuits. These files define a model. </p>
 +
 +
<p>All folders inside the model folder are considered to be runs of the model; the name of the folder is the name of the run.  Inside the run folder, the <em>LatticeData</em> folder contains vtk files that describe the state of the cell shapes at every frame of the model; the <em>CellData.txt</em> file which describes the complete state of every cell for every frame, one frame per line; and the <em>model</em> folder, which contains the final model files with parameters substituted into them. (To run a model, <em>mcell</em> actually loads the model files, adds in the given parameters, writes them to the <em>model</em> folder, and runs CompuCell3D on <em>cc3d.xml</em> in the <em>model</em> folder.)</p>
<h4>Parameters</h4>
<h4>Parameters</h4>
-
<p>Each of these files can contain strings in the form "%(something)s"; when the model is run with <code>model.run("NewRun", something=2)</code>, the parameter <code>2</code> will be substituted in place of that string for the duration of the run.</p>
+
<p>Each of the model files can contain strings in the form "%(something)s"; when the model is run with <code>model.run("NewRun", something=2)</code>, the parameter <code>2</code> will be substituted in place of that string for the duration of the run.</p>
-
<p>All folders inside the model folder are considered to be runs of the model; the name of the folder is the name of the run.  Inside the run folder, the <em>LatticeData</em> folder contains vtk files that describe the state of the cell shapes at every frame of the model; the <em>CellData.txt</em> file which describes the complete state of every cell for every frame, one frame per line; and the <em>model</em> folder, which contains the final model files with parameters substituted into them. (To run a model, <em>mcell</em> actually loads the model files, adds in the given parameters, writes them to the <em>model</em> folder, and runs CompuCell3D on <em>cc3d.xml</em> in the <em>model</em> folder.)</p>
+
<h4>Easy definition of a model</h4>
 +
<p>here are several utility functions in <em>mcell_model_utils.py</em> that allow easier definition of the coupling between internal and external cell states. Note: <em>mcell_model_utils.py</em> should be imported into every model's steppable inside the <code>ModelSteppable</code> class definition for them to work. All functions expect a setup for <code>ModelSteppable</code> as in the <em>Template</em> model.</p>
 +
<p><code>self.setToWeightedValuesOf("a", "b")</code> sets protein "a" in all cells to the value of "b" in the neighboring cells, weighted by contact area. This is used to easily set up a model of surface proteins interacting on adjacent cells.</p>
 +
<p><code>self.updateAdhesionMolecule("a", "b")</code> sets the amount of adhesion molecule "a" in CompuCell to the amount of protein "b" for every cell. This is used in conjunction with the AdhesionFlex plugin; it expects <code>self.adhesionFlexPlugin = CompuCell.getAdhesionFlexPlugin()</code> in the initialization of a steppable.</p>
-
<h3> Command Line Management </h3>
+
<h3>Command Line Management </h3>
<p>To run <em>mcell</em>, run python and type <code>from mcell import *</code>. This initializes a <code>models</code> dictionary containing all the models in your modeling directory. Each model is a fully functional python object! If we have a model called <em>Test</em>, then:
<p>To run <em>mcell</em>, run python and type <code>from mcell import *</code>. This initializes a <code>models</code> dictionary containing all the models in your modeling directory. Each model is a fully functional python object! If we have a model called <em>Test</em>, then:
<ul>
<ul>
Line 122: Line 127:
   <li><code>models['Test'].destroy()</code> irrevocably destroys the model.</li>
   <li><code>models['Test'].destroy()</code> irrevocably destroys the model.</li>
</ul>
</ul>
 +
 +
<h2><a id="analysis"> Rendering and Analysis of Runs </a></h2>
 +
<p>If <code>r</code> is a <code>Run</code> object (taken from something like <code>r = models['Test'].runs['NewRun']</code>), then:
 +
<ul>
 +
  <li><code>r.load()</code> will load the run data into memory. This is time and memory intensive, as, unfortunatly, multicellular runs entail a lot of data.</li>
 +
  <li><code>r.render(species="Reporter", unload=False, pix_size=5)</code> will render the model into <em>movie.avi</em> in the run's folder, coloring the cells proportional to the amount or "Reporter" protein that they express. The data for the model will not be unloaded after the run, and the pixel size of a singly CompuCell3D matrix entry is 5. All these arguments are optional and default to the values given here.</li>
 +
  <li><code>r.render_hist_movie(species="Reporter")</code> will render a movie of the histograms of the values of the "Reporter" protein in all the cells as the simulation evolves. The argument is optional and "Reporter" is the default. The name of the movie will be <em>histograms.avi</em>.</li>
 +
  <li><code>r.frames</code> is an array that, after <code>load</code>ing the model, contains <code>Frame</code> objects that contain the raw data for every frame of the simulation. Specifically, <code>r.frames[0].cell_states[2]['Reporter']</code> contains the value of the "Reporter" protein in cell with id 2 during the first frame. <code>r.frames[0].cell_types</code>, <code>r.frames[0].cell_ids</code>, and <code>r.frames[0].cell_clusters</code> are 2-dimensional arrays containing the cell types, the cell ids, and the cell clusters on the CompuCell3D "playing field"
 +
for that frame.</li>
 +
</ul>
 +
</p>

Revision as of 05:42, 28 September 2011

Navigation

  • G-Level Assembly
  • Geneious
  • mcell Modeling Platform
assembly stuf
geneious stuff

''mcell'' - A Multicellular Modeling Framework

''mcell'' is a small set of Python classes that allows the enterprising modeler to: * Easily create very flexible models of of multicellular dynamics * Manage the models already created through a simple command-line interface * Easily change defined parameters in models * Render the runs of the models in a convenient, simple way * Analyze the runs of the models through histograms and clustering ''mcell'' is a thin layer on top of the combination of BionetSolver, which model internal cell states as systems of ODEs, and CompuCell3D, which models cell shape dynamics using the Glazier-Graner-Hogeweg methos. It was created largely in reaction to inconveniences experienced when using BionetSolver and CompuCell3D as detailed here. It is in continuous development. We encourage you to read through the code and look at our modeling data as examples to understand the framework.

Quick Links

Why?

Over the course of the summer, we tried several ways of modeling multicellular dynamics. The first thing we tried was everyone's trusted favorite, MATLAB. We generated a system of several thousand ODEs, in which groups of 5 ODEs represented the state of a single cell, and were coupled to ODEs describing adjacent cells. Unfortunately, this was very, very slow, and did not allow for the modeling of cell shape dynamics.

Then, we discovered a way to couple CompuCell3D with ODE models of internal cell states. We realized that, if we wanted to do any sort of reasonably flexible modeling, we needed to do something like this. So, we downloaded CompuCell3D, read the documentation, and started modeling various circuits.

Now, CompuCell3D coupled with BionetSolver is a great system. However, we quickly discovered that this wasn't quite enough for our purposes. For instance, CompuCell3D allows one to dynamically change the "cell type", and this the cell color. However, as we were trying to engineer cells, we were interested in the dynamics of the protein expression levels -- in fundamentally continuous quantities -- rather than a discrete notion of "cell type"! CompuCell 3D would not let us visualize the values that we needed -- so we wrote a Python script to render the models the way we wanted to.

Other problems emerged. We wanted to be able to analyze the data in other ways -- in particular, we wanted to see histograms of the levels of expression of some protein. We wanted to be able to cluster cells by the dynamics of some protein. Both of these problems led to more Python scripts. In addition, we realized that trying to manage all our models was becoming quite inconvenient without a unified organization scheme.

So, we had several hastily-written, badly organized Python scripts and many, many folders scattered around. We then decided that this problem needs fixing, and created a small set of classes to unify the object model do the file management, rendering, and analysis, for us -- using a convenient and extensible command-line interface. Thus, mcell was born.

Installation

Dependencies: Python 2.5.6, CompuCell3D 3.5.0, BionetSolver 1.0.0 for Python 2.5, matplotlib, PIL, and hcluster.

Also very useful: Synthetic Biology Workbench (for defining circuits)

Download mcell from here. Unzip the files somewhere into your PATH. Open mcell.py, and edit the variables between #START CONFIG and #END CONFIG. (They have self-explanatory names, and their functions are explained in the file.)

How it works

In mcell models, BionetSolver models the internal cell states, and CompuCell3D models the cell dynamics.

BionetSolver

BionetSolver reads one or several SBML files, each of which define a circuit as a system of chemical reactions in several containers with given rate laws. SBML models are simply XML files, and can be written with a text editor; however, it is much easier to define them using a graphical designer, like JDesigner, or using a simple scripting language, like [Jarnac]. (Both of those editors can be obtained by installing the Synthetic Biology Workbench.

After loading the circuits, BionetSolver is in posession of a system of ODEs that define the internal state of each modeled cell. To simulate each cell, BionetSolver simply Euler-steps its ODE system forward with a fixed time step. Some of the variables in the ODE model (like, say, the concentration of a protein on the neighboring cells) are actually parameters that are continually updated from the CompuCell3D thread.

More documentation on BionetSolver can be found by downloading it from the link above.

CompuCell3D

CompuCell3D uses the Glazier-Graner-Hogeweg model to simulate cell shape dynamics. The playing field is a matrix. An entry of the matrix corresponds to an ID of the cell occupying that spot on the playing field. Thus, a cell is represented as a set of points on the matrix. The medium the cell are in is represented by a single large cell occupying all the "empty space".

To iterate the model forward, CompuCell3D chooses a random point on the matrix -- the invading point -- and a random adjacent point -- the receiving point. If the two points are from different cells, then the receiving point's cell id becomes set to the invading point's cell id with some probability e^-(T*k(state)), where T is set in the configuration file and k(state) is a function of the state of the system that is determined by the plugins defined in the configuration file. (For example, the Contact plugin adds a term to k that makes cells of one cell type want to stick or not stick to cells of another cell type. The Surface plugin adds a term that grows larger as the surface area of the cell deviates from a defined amount.) This procedure is called a Monte Carlo step (MCS). Certain parameters of the plugins -- say, the type, or the adhesiveness of a cell -- can be continuously updated to reflect the internal state of the cell.

Coupling BionetSolver and CompuCell3D

CompuCell3D provides a mechanism to script its function called a Steppable. Thus, to couple the internal states of the cells and the cell shape dynamics, a steppable called ModelSteppable is used. ModelSteppable initializes BionetSolver with circuits before CompuCell starts running its model, but after it has created the initial state of the cells. Then, CompuCell3D takes several MCSs to update the cell shape state. ModelSteppable then loops over the cells, updating the internal states of the cells in the BionetSolver circuits to match the external states, and the external state parameters in CompuCell3D to match the internal states of the cells. The full internal state every cell is printed to one file, and the state of the cell shapes is printed to another file. This cycle then repeats many, many times.

Management of Models

Directory Structure

mcell uses a directory structure to manage its models. All models are contained in a modeling directory, the location of which is configurable in mcell.py. Every folder in the modeling directory is considered by mcell to be a model. A model folder contains 3 necessary files: a CompuCell3D configuration file (cc3d.xml), a hook file(hook.py), and a file containing ModelSteppable (steppable.py). A model folder also contains files that start with "circuit_", which are all assumed to be SBML models of circuits. These files define a model.

All folders inside the model folder are considered to be runs of the model; the name of the folder is the name of the run. Inside the run folder, the LatticeData folder contains vtk files that describe the state of the cell shapes at every frame of the model; the CellData.txt file which describes the complete state of every cell for every frame, one frame per line; and the model folder, which contains the final model files with parameters substituted into them. (To run a model, mcell actually loads the model files, adds in the given parameters, writes them to the model folder, and runs CompuCell3D on cc3d.xml in the model folder.)

Parameters

Each of the model files can contain strings in the form "%(something)s"; when the model is run with model.run("NewRun", something=2), the parameter 2 will be substituted in place of that string for the duration of the run.

Easy definition of a model

here are several utility functions in mcell_model_utils.py that allow easier definition of the coupling between internal and external cell states. Note: mcell_model_utils.py should be imported into every model's steppable inside the ModelSteppable class definition for them to work. All functions expect a setup for ModelSteppable as in the Template model.

self.setToWeightedValuesOf("a", "b") sets protein "a" in all cells to the value of "b" in the neighboring cells, weighted by contact area. This is used to easily set up a model of surface proteins interacting on adjacent cells.

self.updateAdhesionMolecule("a", "b") sets the amount of adhesion molecule "a" in CompuCell to the amount of protein "b" for every cell. This is used in conjunction with the AdhesionFlex plugin; it expects self.adhesionFlexPlugin = CompuCell.getAdhesionFlexPlugin() in the initialization of a steppable.

Command Line Management

To run mcell, run python and type from mcell import *. This initializes a models dictionary containing all the models in your modeling directory. Each model is a fully functional python object! If we have a model called Test, then:

  • models['Test'].runs is a dictionary containing all the runs of the object (which are also Python objects)
  • models['Test'].cc3d_config contains a python object that represents the cc3d.xml for the model. running models['Test'].cc3d_config.edit() would open your favorite text editor to edit the config, and running models['Test'].cc3d_config.copy(dest) will copy the CC3D configuration to dest. The same goes for the hook file(.hook) and the ModelSteppable file(.steppable).
  • models['Test'].circuits is an array containing python objects like described above that reference the circuits in the model.
  • models['Test'].clone('NewModel') creates a new model identical to Test called NewModel. This is incredibly useful for creating models that are similar to other models, something done quite often. Cloning the Template model creates a fairly good setup to begin defining a model.
  • models['Test'].run('NewRun', params) runs Test with parameters params (a dictionary) and creates the NewRun run. Parameters for a model are defined as described in the Parameters section above.
  • models['Test'].destroy() irrevocably destroys the model.

Rendering and Analysis of Runs

If r is a Run object (taken from something like r = models['Test'].runs['NewRun']), then:

  • r.load() will load the run data into memory. This is time and memory intensive, as, unfortunatly, multicellular runs entail a lot of data.
  • r.render(species="Reporter", unload=False, pix_size=5) will render the model into movie.avi in the run's folder, coloring the cells proportional to the amount or "Reporter" protein that they express. The data for the model will not be unloaded after the run, and the pixel size of a singly CompuCell3D matrix entry is 5. All these arguments are optional and default to the values given here.
  • r.render_hist_movie(species="Reporter") will render a movie of the histograms of the values of the "Reporter" protein in all the cells as the simulation evolves. The argument is optional and "Reporter" is the default. The name of the movie will be histograms.avi.
  • r.frames is an array that, after loading the model, contains Frame objects that contain the raw data for every frame of the simulation. Specifically, r.frames[0].cell_states[2]['Reporter'] contains the value of the "Reporter" protein in cell with id 2 during the first frame. r.frames[0].cell_types, r.frames[0].cell_ids, and r.frames[0].cell_clusters are 2-dimensional arrays containing the cell types, the cell ids, and the cell clusters on the CompuCell3D "playing field" for that frame.