Performance and future of cellulose modeling

A complete discussion of all the issues surrounding the performance of molecular dynamics simulations is beyond the scope of this book and remains an active area of research in its own right. However, this chapter would not be complete without at least some discussion of the typical resource requirements for MD simulations of celluloses and the types of time scales that can be reached.

MD simulations are computationally intensive, typically requiring access to the world’s most powerful supercomputers in order to reach sufficient timescales (10 ns to 1 μs) for biological systems (ten thousand to several million atoms). The cost arises from the extremely large number of pairwise interactions that must be calculated at each time step, typically on the order of 20 million or more for a system of 80 000 atoms. To access information on a biological timescale, the system must then be propagated through time. As mentioned above, the length of a time step is typically limited to about 2 fs if the motions of all heavy atoms are to be simulated. Obtaining 100 ns of trajectory data therefore requires 5 × 10^7 time steps (100 ns divided by 2 fs), and each time step requires the 20 million nonbonded energy evaluations.
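The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name md_cost and its parameters are illustrative, and the figure of 20 million pair evaluations per step is taken directly from the text rather than derived from a cutoff or neighbor list.

```python
def md_cost(sim_time_ns, timestep_fs, pairs_per_step):
    """Return (number of time steps, total pairwise evaluations).

    Hypothetical helper for illustration; 1 ns = 1e6 fs.
    """
    n_steps = round(sim_time_ns * 1e6 / timestep_fs)
    total_pair_evals = n_steps * pairs_per_step
    return n_steps, total_pair_evals


# Values quoted in the text: 100 ns of trajectory, 2 fs time step,
# roughly 20 million nonbonded pairs per step for an 80 000 atom system.
n_steps, total_pair_evals = md_cost(100, 2, 20_000_000)
print(f"{n_steps:.1e} time steps, {total_pair_evals:.1e} pair evaluations")
```

Under these assumptions a 100 ns trajectory amounts to about 10^15 pairwise evaluations, which gives a sense of why such runs are usually placed on large parallel machines.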

Unlike Monte Carlo simulations, where each individual energy evaluation can be performed independently of other calculations, molecular dynamics simulations involve numerically integrating the equations of motion over time. The next step of an MD trajectory therefore cannot be computed until all previous steps have been computed in order. This makes parallel computation difficult, requiring extremely low-latency interconnects between processors and careful distribution of work. Even then, work can be distributed across multiple CPUs only within a single time step. Multiple time step methods partially decouple the low-frequency motions from the high-frequency motions so that the low-frequency contributions to the dynamics are not calculated on every step, which can increase performance by as much as a factor of two. Significant effort is being expended in developing more efficient molecular dynameics software.
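Both points, the strictly sequential time loop and the saving from evaluating slow forces less often, can be seen in a minimal sketch of a multiple time step integrator. The splitting below follows the reversible scheme commonly called r-RESPA; that this is the specific method the text has in mind is an assumption. The one-dimensional model (a stiff "fast" spring plus a soft "slow" spring, in reduced units) and all names are hypothetical and chosen only for illustration.

```python
def respa_integrate(x, v, fast_force, slow_force,
                    dt_outer, n_inner, n_steps, mass=1.0):
    """Propagate one particle with a two-level multiple time step scheme.

    The slow force is evaluated once per outer step; the fast force is
    evaluated n_inner times per outer step with velocity Verlet.
    """
    dt_inner = dt_outer / n_inner
    f_slow = slow_force(x)
    f_fast = fast_force(x)
    n_slow_evals = 1
    # Each outer step needs the positions produced by the previous one,
    # so this loop cannot be parallelised over time.
    for _ in range(n_steps):
        v += 0.5 * dt_outer * f_slow / mass      # half kick, slow force
        for _ in range(n_inner):                 # fast motion, small steps
            v += 0.5 * dt_inner * f_fast / mass
            x += dt_inner * v
            f_fast = fast_force(x)
            v += 0.5 * dt_inner * f_fast / mass
        f_slow = slow_force(x)                   # one slow evaluation
        n_slow_evals += 1
        v += 0.5 * dt_outer * f_slow / mass      # half kick, slow force
    return x, v, n_slow_evals


# Illustrative model: stiff and soft harmonic springs (reduced units).
K_FAST, K_SLOW = 100.0, 1.0


def fast_force(x):
    return -K_FAST * x


def slow_force(x):
    return -K_SLOW * x


def total_energy(x, v, mass=1.0):
    return 0.5 * mass * v * v + 0.5 * (K_FAST + K_SLOW) * x * x


e_start = total_energy(1.0, 0.0)
x_end, v_end, n_slow = respa_integrate(
    1.0, 0.0, fast_force, slow_force,
    dt_outer=0.02, n_inner=4, n_steps=1000)
e_end = total_energy(x_end, v_end)
print(f"slow-force evaluations: {n_slow}, "
      f"relative energy change: {abs(e_end - e_start) / e_start:.2e}")
```

In this toy case the slow force is evaluated about once for every four fast-force evaluations. In a real simulation the slow part would be the expensive long-range nonbonded term, and the achievable speedup depends on how cleanly the forces separate and on how large the outer step can be made before the integration becomes unstable.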