We imagine a world in which biologists are able to share their data with the click of a button. A world in which thorough, reliable measuring procedures exist that help scientists characterize their parts in a dependable, user-friendly computing environment. A world in which not just data but also the models supporting that data are shared. A world in which scientists can share not only the burden of modelling but also the burden of computation.
Cumulus is our way of bringing that world a little closer. Cumulus has been developed to run both in the cloud and on a local grid by combining the Azure platform with Windows clients. As a case study, we consider the parameter study applications of our own project.
How we came to building Cumulus
One of the first things we did when we started to model our circuit was to review the most popular modelling tools in the iGEM competition. You can read the results of this review here. In summary, we decided that all the available tools were too constraining and that custom-built software was in order.
The quest for parameters
Even building a simple model of a genetic circuit will give the most experienced modellers pause, mainly because the behaviors of different BioBrick parts are poorly characterized, especially when it comes to parameters. While trying to model our circuit we quickly discovered that most of the parameters we found in the literature were very specific to the publication and did not generalize to other circuits. In our opinion this is because scientists are used to sharing results rather than raw data. If we could find a way to combine all the data available on a part into a single characterization, we might be able to produce more robust results.
Scaling up the computation
Cumulus uses a parameter optimization algorithm that evaluates a single parameter setting in the context of multiple experiments: it simulates all the experiments with the same parameters and then combines the scores. All this simulating and comparing consumes large amounts of computational resources. It would be unreasonable to assume that every scientist can simply amass such grid-computing facilities on their own. We also wanted to make Cumulus an open platform on which everyone can share data. This is why we designed Cumulus as a cloud-enabled application with parallelisation as its core criterion.
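The idea of scoring one parameter setting against several experiments can be illustrated with a minimal sketch. This is not the Cumulus implementation: the `Experiment` structure, the toy production/degradation model, the parameter names `k_prod` and `k_deg`, and the sum-of-squared-errors score are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

Params = Dict[str, float]


@dataclass
class Experiment:
    """One measured time course (hypothetical structure)."""
    induction: float       # assumed relative promoter activity in this experiment
    times: List[float]     # sampling times
    observed: List[float]  # measured values at those times


def simulate(params: Params, exp: Experiment) -> List[float]:
    """Toy model (an assumption): constant production, first-order degradation."""
    steady = exp.induction * params["k_prod"] / params["k_deg"]
    return [steady * (1.0 - math.exp(-params["k_deg"] * t)) for t in exp.times]


def score_experiment(params: Params, exp: Experiment) -> float:
    """Sum of squared errors between simulation and measurement."""
    predicted = simulate(params, exp)
    return sum((p - o) ** 2 for p, o in zip(predicted, exp.observed))


def combined_score(params: Params, experiments: List[Experiment]) -> float:
    """Score one parameter setting against every experiment, then combine.

    Each per-experiment score is independent of the others, so in a
    parallel setting each term could be computed on a separate worker.
    """
    return sum(score_experiment(params, exp) for exp in experiments)
```

Because the same `params` dictionary is passed to every experiment, a setting only scores well if it explains all the data sets at once. The combination step here is a plain sum; a weighted sum would be an equally plausible choice.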
Advantages of Cumulus
The combination of cloud-based computation and generic modelling yields a lot of advantages. Some of them are listed below:
Advantages of a cloud based application
- General public access: everyone can use everyone else's data, models and results.
- Sharing computational resources enables us to reap economies of scale.
- The flexibility of a cloud application helps to avoid the resource waste of under-utilization.
Advantages of our generic modelling approach
- Exchangeable models: share not only your data but also the model which you think describes your parts best.
- Simulate your cells using a simple simulation engine.
- Fit the parameters of your simulation to experimental data at the click of a button.
- Genetic algorithms allow Cumulus to explore very large, high-dimensional parameter spaces.
- Reap the benefits of overlapping experiments. Our modelling system allows you to use data from two different experiments that share some, but not all, BioBrick parts. Hence we can improve the characterization of all parts in both experiments.
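The last two points can be sketched together: a small genetic algorithm fitting per-part parameters to two experiments that share one part. This is only an illustration under stated assumptions, not the algorithm Cumulus uses. The part names, the measured values, the multiplicative toy model, and the specific crossover, mutation and selection choices are all hypothetical.

```python
import random
from typing import Dict

Params = Dict[str, float]

# Hypothetical data: each experiment lists the parts it contains and one
# measured output. Both experiments share "promoter_A".
EXPERIMENTS = [
    (("promoter_A", "rbs_X"), 6.0),
    (("promoter_A", "rbs_Y"), 3.0),
]
PART_NAMES = sorted({part for parts, _ in EXPERIMENTS for part in parts})


def score(params: Params) -> float:
    """Toy model (an assumption): output is the product of part strengths."""
    total = 0.0
    for parts, observed in EXPERIMENTS:
        predicted = 1.0
        for part in parts:
            predicted *= params[part]
        total += (predicted - observed) ** 2
    return total


def evolve(generations: int = 200, pop_size: int = 30, seed: int = 0) -> Params:
    """Minimal genetic algorithm: truncation selection, uniform crossover,
    and multiplicative mutation of one parameter per child."""
    rng = random.Random(seed)
    population = [
        {name: rng.uniform(0.1, 5.0) for name in PART_NAMES}
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=score)
        survivors = population[: pop_size // 2]  # best half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: take each parameter from either parent.
            child = {name: rng.choice((a[name], b[name])) for name in PART_NAMES}
            # Mutation: rescale one randomly chosen parameter.
            mutated = rng.choice(PART_NAMES)
            child[mutated] = max(1e-6, child[mutated] * rng.uniform(0.8, 1.25))
            children.append(child)
        population = survivors + children
    return min(population, key=score)
```

Since both experiments are scored with the same value for `promoter_A`, a candidate has to explain both measurements with one promoter strength. In this toy setup the two measurements only constrain products of strengths, so the absolute values remain ambiguous; more overlapping experiments would narrow them further.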