Molecular modeling

Molecular dynamics and molecular mechanics calculations have been used extensively to examine cellulose, often giving unexpected results. Because the fiber diffraction patterns noted above contain inadequate information, all models for cellulose have been developed to some extent through modeling in which constraints are imposed on the solution to complement the inadequate data sets. The constraints most often used are the dimensions of the unit cell together with assumptions regarding symmetry that allow the development of a reciprocal space. These constraints are not unlike the boundary conditions needed to solve the differential equations that arise in mathematical descriptions of specific physical phenomena.

The molecular dynamics (MD) simulations carried out recently by Matthews and coworkers represent the least constrained analyses (14). The only constraint used was an initial condition corresponding to the published structure of the Iβ form. More recently, similar simulations were carried out with the structure of the Iα form as the initial condition, and the conclusions were essentially the same as those derived from the simulation that started from the Iβ structure. During the course of the simulations, a number of structural fluctuations and changes occurred. Over the length of the simulations, the average unit cell dimensions shifted away from those reported on the basis of diffractometric measurements; here we use the term unit cell only to provide a basis for comparison with the published structures of the same forms. These dimensions varied with position in the aggregate relative to the surfaces and the chain termini. The results, averaged over these cellobiose units, are summarized in Figure 6.4 and compared with the crystallographic unit cell of the Iβ form.

In the simulation, the aggregate was observed to undergo an expansion in which the value of lattice constant a increased from 7.784 to 8.470 Å, while the b value decreased slightly from 8.201 to 8.112 Å. The c value expanded significantly, from 10.380 to 10.512 Å. In addition, the γ angle decreased from 96.5° to almost orthogonal, γ ≈ 90°. The unit cell a-axis (corresponding to the distance between hydrogen-bonded sheets) in this simulation differs considerably from that proposed in the crystallographic models. The terminology regarding lattice structures and unit cells is used here primarily to allow comparisons with the structures proposed on the basis of diffractometric measurements.
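The size of these shifts is easier to judge as relative changes. The following minimal sketch, which is not part of the original analysis, computes the percentage change in each lattice constant and in the volume of a monoclinic cell, V = abc sin γ. The final γ is taken as exactly 90°, which is an assumption, since the text gives only an approximate value.

```python
import math

# Lattice constants quoted in the text (angstroms and degrees).
# gamma = 90.0 for the simulation is an ASSUMPTION: the text says only
# that the angle became "almost orthogonal".
DIFFRACTION = {"a": 7.784, "b": 8.201, "c": 10.380, "gamma": 96.5}
SIMULATION = {"a": 8.470, "b": 8.112, "c": 10.512, "gamma": 90.0}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


def cell_volume(cell):
    """Volume of a monoclinic cell with unique angle gamma: a*b*c*sin(gamma)."""
    return cell["a"] * cell["b"] * cell["c"] * math.sin(math.radians(cell["gamma"]))


changes = {k: percent_change(DIFFRACTION[k], SIMULATION[k]) for k in ("a", "b", "c")}
volume_change = percent_change(cell_volume(DIFFRACTION), cell_volume(SIMULATION))

if __name__ == "__main__":
    for axis, pct in changes.items():
        print(f"{axis}: {pct:+.1f} %")
    print(f"volume: {volume_change:+.1f} %")
```

On these numbers the a-axis grows by roughly 9%, against changes of about 1% in b and c, so nearly all of the expansion lies in the spacing between hydrogen-bonded sheets.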

Another extremely significant change in the structure of the aggregate during the simulations is that many of the C6 primary alcohol groups underwent rotational transitions away from the conformations reported for the diffraction-based structure. This exocyclic group has three low-energy staggered conformations, named TG, GG, and GT, with the first letter in these labels specifying the position of the O6 atom as either trans or gauche with respect to the O5 atom, and the second letter specifying its relationship to the C4 atom (see Figure 6.5) (20). In both the Iα and Iβ diffraction-based structures, all of these exocyclic groups are in the TG conformation. This is also one of the consequences of the constraints of symmetry.
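The labeling scheme can be made concrete with a short sketch that assigns a rotamer label from a measured O5-C5-C6-O6 dihedral angle. The idealized angles of -60°, 60°, and 180° follow the convention commonly used for hexopyranoses and are stated here as an assumption; the function names are illustrative and do not come from the simulation study.

```python
# Idealized staggered values of the O5-C5-C6-O6 dihedral, in degrees.
# These are ASSUMED textbook values, not numbers taken from the chapter.
# First letter: O6 relative to O5; second letter: O6 relative to C4.
IDEAL_OMEGA = {"GG": -60.0, "GT": 60.0, "TG": 180.0}


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify_rotamer(omega_deg):
    """Return the rotamer label whose ideal angle is nearest to omega_deg."""
    return min(
        IDEAL_OMEGA,
        key=lambda name: angular_distance(omega_deg, IDEAL_OMEGA[name]),
    )
```

Applied to the dihedral of every primary alcohol group in every trajectory frame, a classifier of this kind would give the rotamer populations per layer that the discussion below relies on.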


Figure 6.4 Left: the cellulose Iβ crystal unit cell determined by fiber diffraction; right: the trajectory-averaged unit cell for the simulation of the diagonal crystal. Hydrogen atoms are omitted for clarity, and positions obtained by symmetry operations are transparent. (Reproduced in color as Plate 7.)

In the Iβ diffraction-based structure, all these exocyclic groups are in the TG conformation. In this conformation, the exocyclic hydroxyl groups can hydrogen-bond along the chain or to adjacent chains in the same layer, but no hydrogen bonds between layers can occur. For those layers of the aggregate made up of the origin chains, there was little change in structure in the MD simulation from that of the diffraction structure, and the hydrogen bonding pattern remained the same. This result is remarkably similar to the reported crystallographic hydrogen bond network in origin chains, where the O2 hydroxyl group was refined to just one of the two possible hydrogen bond positions (8). However, in the MD simulations, in every other layer in the interior of the aggregate, made up of the center chains in the diffraction-based structure, this primary alcohol group rotated from the starting TG conformation to the GG position. In this GG conformation, three rapidly interchanging hydrogen bond patterns were possible, as shown in Figure 6.6.

Figure 6.5 Nomenclature of primary alcohol conformation. The dihedral angle measured by C4-C5-C6-O6 is shown. (Reproduced in color as Plate 8.)

Figure 6.6 Single frames from the center chain layers illustrating three different hydrogen bond patterns. Left: the pattern is similar to the predominant pattern from the crystal structure, but the rotation to GG makes the HO2-O6 hydrogen bond across the glycosidic linkage impossible; center: the hydrogen bond pattern is very similar to the less occupied pattern from the crystal structure; right: hydrogen bonds from HO6 in a center chain to O2 in an origin layer chain, which is not shown for clarity. (Reproduced in color as Plate 9.)

One of these patterns allowed hydrogen bonding between layers, which was not possible when all the hydroxymethyl groups were in the TG conformation. On the surfaces, where the anhydroglucose monomers were in direct contact with water, the hydrogen bonds to the freely diffusing water molecules helped to introduce considerable disorder into these primary alcohol conformations and promoted frequent transitions, but the interior portions of the aggregate developed two distinct patterns of hydroxymethyl conformations between the center and origin layers. Primary alcohol groups in surface chains alternate between facing toward the interior and facing the solvent, and the conformation of these surface groups corresponds to the local environment. Several NMR studies have determined that the conformations of surface cellulose chains are different from the interior, and as in the new simulation, contain both GG and GT rotamers (20-23). The presence of two rotamers is also consistent with the Raman spectra of Iα and Iβ, to be discussed further in a following section. In the spectra of both Iα and Iβ, the scissors vibration of the methylene group on C6 results in two bands in the region above 1450 cm⁻¹; the methylene scissors vibration is the only one that gives rise to a band above 1450 cm⁻¹. Thus, the occurrence of two bands is consistent with the presence of two rotamers.

Figure 6.7 Views of the central portion of one center chain and one origin chain from the middle of the diagonal crystal, illustrating the inter-plane hydrogen bonds which can occur after the center chain primary alcohol groups rotate to the GG conformation. Hydrogen bonds between layers are indicated with dashed lines. (Reproduced in color as Plate 10.)

In the GG conformation, the primary alcohol groups are essentially perpendicular to the average planes of the anhydroglucose rings and as a result are pointing up and down toward the origin chains of the layers above and below. In this conformation, the exocyclic groups can make good O6-O2 hydrogen bonds between layers. Since under normal conditions cellulose exhibits no tendency for layers to slip relative to one another, the existence of such stabilizing hydrogen bonds may not seem so implausible. However, in this conformation steric clashes between the center chain primary alcohol groups and the origin layers above and below force the center chains to tilt significantly with respect to the plane of their own layer. Such a tilt was also found by Heiner and Teleman (24).

Probably the most significant change that occurred during the simulations was that the aggregate quickly developed a small right-handed twist during the heating and equilibration interval, and the twist remained relatively stable throughout the rest of the simulation. Figure 6.7 illustrates this twist, with the middle hydrogen-bonded sheet shown in detail. In this figure, the average twist angle for each successive glycosidic linkage is shown. These angles are defined as the dihedral angle for the four C1 carbon atoms illustrated as joined by the dark lines in the figure. Although this angle varies considerably near the non-reducing end, apparently because of edge effects, the twist in the middle of the chain is fairly constant at around 1.4 to 1.7° per linkage, with an overall twist for this short oligosaccharide segment of almost 9.9° calculated from the first and last rows (which includes considerable irregularity due to the highly frayed structure of the non-reducing ends).
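Because the twist is reported as a dihedral angle over four C1 atoms, it may help to see how such an angle is obtained from Cartesian coordinates. The sketch below is a generic signed-dihedral routine in plain Python. It is not the analysis code of the simulation study, and the choice of which four atoms to pass in is left to the reader.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the point sequence p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Averaging the returned angle over trajectory frames for each set of four C1 atoms would give per-linkage twist values of the kind quoted above.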

Imposition of the constraint of the symmetry of space group P2₁ confines cellulose chains to an exact twofold helix, and this constraint can be satisfied by many combinations of torsion angles across the glycosidic linkage (reported either as φ_H = H1-C1-O-C4′ and ψ_H = C1-O-C4′-H4′ or as φ_O = O5-C1-O-C4′ and ψ_C = C1-O-C4′-C5′). However, the line connecting twofold helical structures for cellobiose in φ, ψ space does not coincide with a free energy minimum (25, 26). Cellulose oligomers in solution are extended, but do not have a flat ribbon structure (27, 28). The preference of cellulose chains to adopt conformations away from a twofold helix is frustrated in a crystalline state by packing and hydrogen bonding requirements. Equilibrium organization of the aggregate has each of the individual interior chains departing slightly from the flat starting structure, on average forming a right-handed helix. The helix of each chain corresponds to the overall twist of the fiber in a manner similar to the twist seen in protein β-sheets (29, 30).

In addition to the findings of the simulation studies, it is helpful to consider the source of the helical patterns at the level of the individual monomers in cellulose; it is now accepted that anhydrocellobiose is the repeat unit of structure in cellulose, as it implicitly defines the glycosidic linkage as well. The results of the earliest conformational energy mappings available (31, 32) show that two energy minima associated with variations in the dihedral angles of the glycosidic linkage correspond to relatively small left- and right-handed departures from glycosidic linkage conformations consistent with twofold helical symmetry. More recent all-atom conformational energy maps for cellobiose exhibit the same qualitative topology (25). Local minima also represent values of dihedral angles very similar to those reported for cellobiose and methyl β-cellobioside on the basis of crystallographic analyses (9, 10). The relationship between different conformations is represented in Figure 6.8, which was adapted (6) from a diagram first presented by Reese and Skerrett (31).

Figure 6.8 φ/ψ map adapted from Ref. (9). Dashed lines: loci of structures with constant anhydroglucose repeat periods, as noted in ångströms. Dotted lines: loci of structures with constant intramolecular O5-O3′ distances. Solid lines: contours of potential energy minima based on non-bonded interactions in cellobiose. J: cellobiose; W: β-methylcellobioside; n = 2, the twofold helix line; n = 3, the threefold helix line; (R) right-handed; (L) left-handed. The Meyer and Misch structure is at dihedral angle values of 180° and 0°.

Figure 6.8 is a φ/ψ map presenting different categories of information concerning the conformation of the anhydrocellobiose unit as a function of the values of the two dihedral angles about bonds in the glycosidic linkage. ψ is defined as the dihedral angle about the bond between C4 and the glycosidic linkage oxygen and φ as the dihedral angle about the bond between C1 and the glycosidic linkage oxygen. The parallel lines indicated by n = 3(L), 2, and 3(R) represent values of dihedral angles consistent with a left-handed threefold helical conformation, a twofold helical conformation, and a right-handed threefold helical conformation, respectively. A twofold helical conformation inherently has no handedness. Dashed contours represent conformations that have the indicated repeat period per anhydroglucose unit; the innermost represents a period of 5.25 Å, corresponding to 10.5 Å per anhydrocellobiose unit. Two dotted lines indicate conformations corresponding to values of 2.5 and 2.8 Å for the distance between the two oxygen atoms anchoring the intramolecular hydrogen bond between the C3 hydroxyl group of one anhydroglucose unit and the ring oxygen of the adjacent unit; these values bracket the range in which hydrogen bonds are regarded as strong.
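The two numerical relationships in this description are the doubling of the repeat period from anhydroglucose to anhydrocellobiose and the 2.5 to 2.8 Å window for strong hydrogen bonds. They can be written out as a minimal sketch; the function names are illustrative and not taken from the original work.

```python
# Values quoted in the text, in angstroms.
PERIOD_PER_GLUCOSE = 5.25        # innermost repeat-period contour
STRONG_HBOND_RANGE = (2.5, 2.8)  # O...O distances of the two dotted lines


def cellobiose_period(per_glucose):
    """Repeat period per anhydrocellobiose unit (two anhydroglucose units)."""
    return 2.0 * per_glucose


def in_strong_hbond_range(o_o_distance):
    """True if an O...O distance lies between the two dotted contours."""
    low, high = STRONG_HBOND_RANGE
    return low <= o_o_distance <= high
```

A point on the map can then be screened in two steps: first for a repeat period compatible with the observed fiber repeat, then for an oxygen-oxygen distance inside the strong hydrogen bond window.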

The two domains defined by solid lines on either side of the twofold helix line (n = 2) represent the potential energy minima calculated by Reese and Skerrett for different conformations of cellobiose (31). Finally, the points marked by J and W represent the structure of cellobiose determined by Chu and Jeffrey (9) and that of methyl β-cellobioside determined by Ham and Williams (10). The key point to be kept in mind with this diagram is that structures along the twofold helix line and with a repeat period of 10.3 Å per anhydrocellobiose unit possess an unacceptable degree of overlap between the van der Waals radii of the hydrogen atoms on either side of the glycosidic linkage.

Figure 6.8 shows that the structure of the glycosidic linkage in cellulose is not likely to coincide with the line representing twofold helical structures. Rather, it is likely to lie on either side of the twofold helix line, as do the structures of cellobiose determined by Chu and Jeffrey, designated (J), and of β-methylcellobioside determined by Ham and Williams, designated (W). However, because of the repeat distance per anhydroglucose unit, one would expect the glycosidic linkages in cellulose to be much closer to the twofold line than are the two dimeric structures. On the other hand, the solid-state 13C NMR spectra show a splitting of the resonances at C1 and C4. Thus, it seems plausible that values of the glycosidic dihedrals in the cellulose chain might alternate between a small left-handed departure and a slightly larger right-handed departure from the twofold helix line. The net effect would be a slow, long-period, right-handed helical structure. Such an alternating pattern was observed in the stable equilibrium structure at the conclusion of the MD simulation (14). This pattern demonstrates that the long-period helical twist is a consequence of important characteristics of glycosidic linkages in cellulose rather than an artifact of a complex simulation.
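The arithmetic behind this argument can be shown with a toy calculation. The two per-linkage values below are hypothetical, chosen only so that their mean falls within the 1.4 to 1.7° per linkage reported for the simulation; they are not measured or simulated quantities.

```python
# HYPOTHETICAL per-linkage twist contributions in degrees, for illustration.
# Negative means left-handed, positive means right-handed.
LEFT_DEPARTURE = -0.5
RIGHT_DEPARTURE = 3.5


def net_twist(n_linkages, left=LEFT_DEPARTURE, right=RIGHT_DEPARTURE):
    """Total twist after n_linkages that alternate left, right, left, ..."""
    return sum(left if i % 2 == 0 else right for i in range(n_linkages))


def mean_twist_per_linkage(left=LEFT_DEPARTURE, right=RIGHT_DEPARTURE):
    """Average twist per linkage over one left/right pair."""
    return (left + right) / 2.0
```

As long as the right-handed departure is the larger of the two, the sum grows steadily in the right-handed sense, even though alternate linkages twist in opposite directions.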

From a broader perspective, a very important result of the molecular modeling validates the approach represented by the theoretical model used. The finding that the cellulose aggregate is stable reflects the fact that cellulose is insoluble in water beyond the octamer. The stability of the aggregate at equilibrium is not the result of any constraints or boundary conditions imposed on the solution of the equilibrium structure, but rather is evidence that the molecular modeling has captured some essential distinctive properties of cellulose. Indeed, the measure of its true approximation of the nature of cellulose is that the insolubility is predicted for chains that are 12 anhydroglucose units in length. Furthermore, the results of the modeling are consistent with microscopic observations of long-period helical structures, and they explain the structure of the HCH scissors vibration bands in the Raman spectra of Iα and Iβ.

The results are also consistent with the effects of small-diameter fibrillation to be discussed further below.

It is important at this point to return to Figure 6.3, panel B and consider its implications. The application of a long period of 1200 nm to all of the fibrils is intended to allow comparison of the effects of lateral dimensions on the twisting. A period of 1200 nm is evidently too short a period for fibrils of Valonia and Halocynthia, considering that such periods are rarely observed in electron microscopy. On the basis of the observation of a long period of 1200 nm for Micrasterias, we anticipate that the period for a 20 by 20 nm fibril is likely to be 2500 nm or more. That dimension is 2.5 μm and would be well beyond the field size in a high-magnification electron micrograph.
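To relate these long periods to the chain itself, the sketch below converts a period into an average twist per anhydroglucose unit. It rests on two assumptions that are not stated in the text: that one period corresponds to a full 360° turn, and that each anhydroglucose unit advances about 0.52 nm along the fibril axis (roughly half of the cellobiose repeat).

```python
# ASSUMED axial advance per anhydroglucose unit, in nanometres
# (about half of the ~1.04 nm cellobiose repeat).
RESIDUE_LENGTH_NM = 0.52


def twist_per_residue(period_nm, residue_length_nm=RESIDUE_LENGTH_NM):
    """Degrees of twist per anhydroglucose unit, ASSUMING 360 degrees per period."""
    return 360.0 * residue_length_nm / period_nm


def nm_to_um(length_nm):
    """Convert nanometres to micrometres."""
    return length_nm / 1000.0
```

Under these assumptions a 1200 nm period corresponds to about 0.16° per unit and a 2500 nm period to about 0.07° per unit, which shows how slight the twist of a wide fibril is at the level of a single linkage.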

Another important point, most obvious for the 20 by 20 nm fibril in panel B, is that the center chain remains linear though it will be twisted by 90°. The other chains, however, develop some curvature so that the corner chains are obviously quite curved, and curvature increases with distance from the center. Thus, the resistance of the aggregate to inherent tendencies of the cellulose molecules to acquire a helical orientation increases with lateral dimension. The possibility of shear stresses developing within a fibril increases with lateral dimension also. This may well be why load-bearing structures of higher plants have fibrils with such small lateral dimensions. Because of small diameters, they are not likely to develop significant internal shear stresses. Because their association with neighboring fibrils is mediated by water, they can move parallel to each other when under load. Panel B in Figure 6.3 clearly suggests that as the lateral dimensions are reduced, the long-period helical twist can be more easily accommodated. The implications of the long-period twist for the subject of cell wall deconstruction will be discussed further in a following section.
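The dependence of curvature on distance from the fibril axis can be illustrated with idealized helix geometry. The sketch below treats each off-axis chain as a circular helix whose pitch equals the long period, taken as one full turn. Both are simplifying assumptions made here, not results from the chapter.

```python
import math


def helix_curvature(radius_nm, period_nm):
    """Curvature (1/nm) of a helix of given radius and pitch; zero on the axis."""
    c = period_nm / (2.0 * math.pi)  # axial rise per radian of twist
    return radius_nm / (radius_nm ** 2 + c ** 2)


def excess_path_fraction(radius_nm, period_nm):
    """Fractional extra path length of an off-axis chain relative to the axis."""
    c = period_nm / (2.0 * math.pi)
    return math.sqrt(1.0 + (radius_nm / c) ** 2) - 1.0


# Corner chain of a 20 nm x 20 nm square cross-section: half the diagonal.
CORNER_RADIUS_NM = 10.0 * math.sqrt(2.0)
```

For a 1200 nm period, the corner chain of a 20 by 20 nm fibril would have to follow a path about 0.27% longer than the straight center chain, whereas a chain a few nanometres from the axis is almost unaffected. This is the geometric origin of the shear stresses discussed above.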