
Computational Approaches to Study Cellulose Hydrolysis

Michael F. Crowley and Ross C. Walker

8.1 Introduction

Molecular modeling offers a powerful set of tools for probing the atomistic mechanisms of cellulose hydrolysis and will feature prominently in efforts to harness cellulose as a biomass energy resource. It bridges theoretical concepts and proposed mechanisms on one side and experimental data on the other. Molecular modeling is most powerful when used synergistically with experimentation, where hypothesis-driven computational research helps to explain experimental observations and to drive the design of future experiments. Molecular modeling encompasses the entire range of computational approaches available to the molecular biologist, and it is beyond the scope of this book to give a comprehensive overview of all the computational models that exist. Instead, we concentrate on the subset of molecular modeling termed molecular dynamics, which will prove crucial in advancing our understanding of the behavior of celluloses and cellulases on the atomistic scale.

Molecular dynamics (MD) is generally used as a virtual experimental tool to probe the structure, function, kinetic, and thermodynamic properties of substances. In the biomolecular field, it has been invaluable in validating structures, in elucidating mechanisms of structural stability and conformational change, and in understanding interactions between molecules, their ligands, and their constituent parts. Most MD is based on classical molecular mechanics (MM), with a smaller body of work combining quantum mechanics (QM) with MM to produce hybrid QM/MM dynamics methods. Although many of the computational methods used in MD studies of biomolecular systems are mature, having been extensively applied to proteins, small molecules, and nucleic acids, there has been, until recently, comparatively little interest in the use of MD methods for carbohydrates and even less for cellulose. The thrust of this chapter is to document the current state of MD methods and MM force fields, with the intent of inspiring greater use of these tools for the study of cellulosic recalcitrance: to complement experimental studies, to answer questions that are unapproachable by current experimental technology, and to provide experimentalists with a wish list of new experimental targets, mutations, and structural information. We will include the work already accomplished, and outline the currently available methods and the kinds of questions they can answer for systems of the size, complexity, and chemical nature of cellulose and the enzymes and other biomolecules that interact with it.

Biomass Recalcitrance: Deconstructing the Plant Cell Wall for Bioenergy. Edited by Michael E. Himmel © 2008 Blackwell Publishing Ltd. ISBN: 978-1-405-16360-6

Experimental investigation of hydrolysis

To investigate the relative rates of hydrolysis and dehydration, experiments were conducted on xylose, xylan, and xylobiose using a small glass reactor that was heated with microwave energy [CEM-Discover]. In these experiments, products were measured using high pressure liquid chromatography (HPLC). The microwave reactor system consists of an 8 mL glass tube enclosed inside the cavity of a microwave heating unit. Experiments were conducted with 2 mL of solution containing the substrate in an aqueous solution of sulfuric acid (1.2% by weight). The tube was fitted with a Teflon-coated cap and a pressure sensor, which are designed to contain pressures up to 250 psi. The tubes contained a Teflon-coated stir bar, and the temperature was measured with an optical pyrometer. Batch experiments were conducted in which the samples were heated to a fixed temperature.

Hydrolyzates were analyzed by HPLC to determine the concentrations of xylose, xylobiose, and xylose degradation products present in the reaction solutions. The solutions were analyzed using an Agilent 1100 series HPLC with an HPX-87H column and a precolumn (Bio-Rad Laboratories) operated at 65°C. The eluant was 0.01 N H2SO4 flowing at 0.6 mL/min. Samples and standards were injected (10 µL) onto the column after filtering through a 0.45 µm nylon membrane filter (Pall, Acrodisc syringe filter). Solute concentrations were measured with an Agilent 1100 refractive index detector controlled to 45°C and a diode array detector. The detectors were calibrated with a set of four standards for all solutes except xylobiose, which had a single-point calibration. The HPLC was controlled and data were analyzed using Agilent Chemstation software (rev A.09.03).
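The two calibration schemes mentioned above (a four-standard curve and a single-point calibration) can be illustrated with a minimal Python sketch. It assumes a linear detector response; the function names and all standard concentrations and peak areas are hypothetical placeholders, not values from the study.

```python
def fit_calibration(concs, areas):
    """Least-squares line: area = slope * conc + intercept."""
    n = len(concs)
    mean_c = sum(concs) / n
    mean_a = sum(areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concs)
    sxy = sum((c - mean_c) * (a - mean_a) for c, a in zip(concs, areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def single_point_slope(conc, area):
    """Single-point calibration: response assumed to pass through the origin."""
    return area / conc


def concentration(area, slope, intercept=0.0):
    """Convert a measured peak area back to a concentration."""
    return (area - intercept) / slope


# Hypothetical four-standard calibration (e.g. for xylose), g/L vs. peak area.
std_conc = [0.5, 1.0, 2.0, 4.0]
std_area = [1000.0, 2000.0, 4000.0, 8000.0]
slope, intercept = fit_calibration(std_conc, std_area)

# Hypothetical single-point calibration (e.g. for xylobiose).
sp_slope = single_point_slope(1.0, 1800.0)

sample_multi = concentration(3000.0, slope, intercept)
sample_single = concentration(900.0, sp_slope)
```

A single-point calibration cannot detect a nonzero intercept or curvature in the detector response, which is why the multi-standard fit is generally the more reliable of the two.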

Before the microwave heating system was used for kinetic measurements of xylobiose and xylan decomposition, an accurate temperature in the reactor was obtained. The provided optical pyrometer measures the infrared light emitted from the reactor and could provide an inaccurate temperature if the walls of the reactor were cooler than the solution. A more accurate technique for measuring the temperature would be to use a chemical reaction with known activation energy (chemical thermometer). In this study, we used the thermal decomposition of xylose in acid solutions as our chemical thermometer. We measured the decomposition of xylose at a fixed nominal temperature and compared the measured rate constant to the values reported in the literature to extract an effective temperature. The relationship between the rate constant for the decomposition of xylose and the temperature has been reported (19) to be

$k = 0.0453\,\alpha\,\delta\,\gamma_x\,C_H\,e^{-3570(\,T\,)}$ (Equation I)

where $\alpha$ is the ratio of the density of xylose solution to that of the solution without xylose ($\alpha = 1$), $\delta$ is the specific gravity of water at a given temperature relative to the specific gravity at 30°C, $\gamma_x$ is a correlation constant that was empirically determined ($\gamma_x = 0.95$), $C_H$ is the acid concentration, and $T$ is the absolute temperature.

Table 9.2 Measured rate constants and temperature of microwave reactor (a)

Nominal temperature (°C)    Rate constant (s⁻¹)    Effective temperature (°C)
125                         3.7 × 10⁻⁵             117
135                         2.5 × 10⁻⁵             134
145                         1.0 × 10⁻⁴             149
155                         3.7 × 10⁻⁴             163
165                         6.7 × 10⁻⁴             170
175                         2.9 × 10⁻³             188

(a) Determined from Equation I.

These data were then used in Equation I to determine the actual reactor temperatures, which are shown in Table 9.2. As Table 9.2 shows, the effective temperature is typically slightly higher than the nominal temperature measured by the optical pyrometer. Since the microwaves heat the solution directly, it is reasonable that the glass reactor tube would have a lower temperature than the solution. The optical pyrometer measures the temperature on the surface of the glass and it is not surprising that the nominal temperature is lower than the effective temperature in the solution.
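The chemical-thermometer idea can be sketched in a few lines of Python: given a rate law with known parameters, a measured rate constant is inverted to give the temperature the solution must have had. The sketch below assumes a plain Arrhenius form, k = A exp(-Ea/RT), rather than Equation I, whose density and acidity corrections are omitted here. The pre-exponential factor and activation energy are hypothetical placeholders, not literature values for xylose.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def arrhenius_k(temp_k, a_factor, ea):
    """Rate constant from a plain Arrhenius law (a_factor in s^-1, ea in J/mol)."""
    return a_factor * math.exp(-ea / (R * temp_k))


def effective_temperature(k_measured, a_factor, ea):
    """Invert the Arrhenius law: the temperature (K) that would give k_measured."""
    return ea / (R * math.log(a_factor / k_measured))


# Hypothetical parameters, for illustration only.
A_FACTOR = 1.0e13   # s^-1
EA = 1.3e5          # J/mol

# Round trip: compute k at a known temperature, then recover that temperature.
k_at_420 = arrhenius_k(420.0, A_FACTOR, EA)
t_recovered = effective_temperature(k_at_420, A_FACTOR, EA)
```

Because the rate constant depends exponentially on temperature, even a modest error in the measured rate constant shifts the recovered temperature by only a few degrees, which is what makes a reaction with a large activation energy a useful thermometer.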

Information from metagenomics

The extreme oxygen sensitivity and fastidious growth requirements of many rumen microorganisms mean that they are often difficult to culture, or even unculturable. At the very least, the task of completing the description of rumen microbial diversity through cultivation looks daunting. An attractive alternative, therefore, is to build metagenomic libraries of rumen DNA that can be screened for activities of interest, or randomly sequenced to gain information on the genes present. This approach has been successfully applied to the recovery of genes encoding esterases, amylases, cellulases, and xyloglucanases from the rumen (80, 81). Most of the glycoside hydrolases so far recovered appear to have simple domain structures, resembling those from the Prevotella group rather than those typical of known cellulolytic bacteria. This most likely reflects the relative ease of recovery of planktonic, rather than surface-attached, bacteria, and perhaps also the relative ease of lysis of Gram-negative cells. It may also reflect greater numbers of secondary utilizers of plant polysaccharides compared with primary degraders within the community. Nevertheless, the value of the approach is demonstrated by the fact that the collection includes representatives of new enzyme families. Information on sequence diversity can also be obtained without the need for library construction, either through amplification of specific genes by degenerate PCR, or directly by 454 sequencing.

Comparative sequencing and quantification of rRNA

Measuring microbial diversity typically involves sequencing individual 16S rDNA gene sequences to obtain species-level resolution. The power of the approach comes from the ability to amplify, clone, and analyze homologous regions of 16S rDNA from small amounts of sample DNA. Some studies focus on only a small portion of the 16S rDNA, while others survey the entire gene sequence. Sequencing the entire 16S rDNA enhances the accuracy of estimating the species variability of microbial communities, and is preferred when broad comparative determinations are performed.

The question of whether two populations of microorganisms differ in their numbers of species and level of genetic diversity can be answered by comparing the relationships and degrees of divergence among sequences. Depending on amplification and cloning conditions, the technique can overrepresent some species while underrepresenting others. There is also a relatively small potential for error from sequence variation due to PCR replication errors. Variations of the 16S approach include denaturing gradient gel electrophoresis and in situ hybridization.

Bacterial populations, including soil communities, have been characterized using rRNA intergenic spacer analysis (RISA), which determines the variability in the length of the intergenic spacer between the small (16S) and large (23S) ribosomal subunits. The method has been automated (ARISA), and the sensitivity increased by the use of fluorescence-tagged oligonucleotide primers for PCR amplification and for subsequent electrophoresis in an automated capillary electrophoresis system. Fungal ARISA makes use of the length polymorphism of the nuclear ribosomal DNA (rDNA) region that contains the two internal transcribed spacers (ITS) and the 5.8S rRNA gene. Comparative sequencing and quantification of rRNA genes using universal and phylogenetically specific primers has become an established method for detecting and characterizing subgroups of prokaryotic and eukaryotic microorganisms. For instance, 16S and 18S rRNA gene libraries have been constructed to examine the effects of elevated CO2 levels on the composition of microbial communities associated with the rhizosphere of trembling aspen (88). However, a thorough analysis of complex microbial communities using rRNA gene-based libraries requires a huge sequencing effort. To overcome this sequencing limitation, several rapid high-throughput DNA-based molecular methods for rRNA gene-based analysis of microbial communities have been developed. These methods, including terminal restriction fragment length polymorphism (T-RFLP) and denaturing gradient gel electrophoresis (DGGE), provide a DNA fingerprint of the microbial community present in the sample (89-92). While T-RFLP and DGGE are sensitive methods for differentiating between microbial communities, these methodologies do not provide actual DNA sequence information unless the resolved bands are recovered and sequenced (93).

Cellulose

Cellulose microfibrils are insoluble cable-like structures that are typically composed of about 36 hydrogen-bonded glucan chains, each of which contains between 500 and 14 000 β-1,4-linked glucose residues. Cellulose microfibrils comprise the core component of the cell walls that surround each cell. Studies of mutants deficient in secondary cell wall cellulose show very irregular deposition of non-cellulosic polysaccharides and lignin (4). Thus, it is apparent that cellulose is a central scaffold of cell walls.

The cellulose chains in microfibrils are parallel, and successive glucose residues are rotated 180°, forming a flat ribbon in which cellobiose is the repeating unit. The parallel chains are compatible with evidence that the chains in a microfibril are made simultaneously (3). The cellulose chains are held in a crystalline structure by hydrogen bonds and van der Waals forces to form microfibrils. It is not yet known to what extent the “crystallization” of the nascent glucan chains into cellulose microfibrils is facilitated by proteins other than the catalytic enzyme. Jarvis (5) has shown that the two main forms of cellulose (i.e., cellulose Iα and Iβ) can be interconverted by bending. He suggested that the sharp bend that is thought to take place when cellulose emerges from the rosette and becomes appressed to the overlying cell wall may be sufficient to induce the interconversion. Additional forms, which are primarily of interest in the context of industrial uses of cellulose, can be produced from natural cellulose by extractive treatments. For instance, in cellulose II the chains are antiparallel, something that is unlikely to occur in native cellulose. Cellulose I is converted to cellulose II by extraction under strongly alkaline conditions.

The molecular weight of the individual glucan chains that comprise cellulose microfibrils has been difficult to determine because the extraction may lead to degradation. Analyses of secondary wall cellulose in cotton suggest a degree of polymerization (DP) of 14 000-15 000 (6). Primary wall cellulose appears to have lower molecular weight. Brown (7) reports a DP of 8000 for primary wall cellulose. However, Brett (6) reported a low molecular weight fraction with a DP of ~500 and a fraction with a DP of 2000-4000. Brett (6) suggested that the low molecular weight fraction may be chains at the surface of microfibrils, whereas the high DP fraction may be chains in the microfibril interior. Since a DP of 2000 corresponds to about 1 µm of length, the implication is that the primary wall cellulose fibrils, which are frequently observed to be much longer than 1 µm, must be composed of chains with breaks at various locations along the fibrils. As noted below, this is compatible with genetic evidence that a cellulase is required for cellulose synthesis in both plants and bacteria (8, 9). Whatever the exact length, it is apparent that in some cells the fibrils can be extremely long relative to other types of biological macromolecules.
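The relation between DP and chain length used above can be checked with a short Python sketch. It assumes that each glucose residue contributes about 0.515 nm along the chain axis (half of a cellobiose repeat of roughly 1.03 nm); this value is an assumption supplied for illustration, not a figure given in the text, and the function names are hypothetical.

```python
import math

# Assumed axial length per glucose residue, in nm (about half the
# ~1.03 nm cellobiose repeat). Not taken from the text above.
RESIDUE_LENGTH_NM = 0.515


def chain_length_um(dp, residue_length_nm=RESIDUE_LENGTH_NM):
    """Extended length of a glucan chain of the given DP, in micrometers."""
    return dp * residue_length_nm / 1000.0


def min_chains_end_to_end(fibril_length_um, dp):
    """Fewest chains of the given DP needed to span a fibril end to end."""
    return math.ceil(fibril_length_um / chain_length_um(dp))


length_dp2000 = chain_length_um(2000)               # about 1 micrometer
chains_for_5um = min_chains_end_to_end(5.0, 2000)   # chains spanning a 5 um fibril
```

Under this assumption a DP of 2000 gives a chain of about 1 µm, in line with the figure quoted above, so a fibril several micrometers long must contain several chains laid end to end along any one path.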

Based on electron micrographs, the width of cellulose fibrils varies from about 25 to 30 nm in Valonia and other green algae, to about 5-10 nm in most plants (10). The variation in size may indicate that cellulose microfibrils from different sources contain different numbers of chains, and it may reflect variation in the kind or amount of hemicellulose coating on the fibrils. In a study of onion primary wall by solid state NMR (10), the spectral interpretation was consistent with the idea that the 8 nm wide microfibrils were composed of six 2-nm fibrils, each containing about 10 chains. Herth (11) estimated by electron microscopy that the microfibrils of Spirogyra contained 36 glucan chains. Thus, the measurements are generally consistent with the idea that each of the six globules in a rosette is composed of a number of subunits that synthesize 6-10 chains that hydrogen bond to form the 2 nm fibrils. Six of these 2 nm fibrils then bond to form the microfibrils.

The analyses of cellulose structure indicate that cellulose synthase is a highly processive enzyme, that it has many active sites that coordinately catalyze glucan polymerization, that alternating glucan units are flipped 180°, and that interspecies variation exists in the number of glucan chains per fibril, or possibly in the kind or amount of hemicellulose. What is not clear is whether the enzyme participates in facilitating the hydrogen bonding of the glucan chains or whether proximity of the glucan chains as they emerge from the enzyme is sufficient to cause formation of the highly ordered microfibrils.

Cellulose synthase can be visualized by freeze fracture of plasma membranes in vascular plants as symmetrical rosettes of six globular complexes approximately 25-30 nm in diameter. The rosettes have been shown to be cellulose synthase by immunological methods (12). The only known components of cellulose synthase in higher plants are the CESA proteins. The completion of the Arabidopsis genome sequence revealed that Arabidopsis has ten CESA genes that encode proteins with 64% average sequence identity (13, 14), and other species have been found to have similar numbers of CESA proteins (3). The proteins range from 985 to 1088 amino acids in length and have eight putative transmembrane (TM) domains. Two of the TM domains are near the amino terminus and the other six are clustered near the carboxyl terminus. The N-terminal region of each protein has a cysteine-rich domain with a motif that is a good fit to the consensus for a RING-type zinc finger. RING fingers have been implicated in mediating a wide variety of protein-protein interactions in complexes (15). Otherwise, the N-terminal domain is structurally heterogeneous among the ten CESAs in Arabidopsis. The average overall sequence identity of the amino terminal domains is 40%, compared with an average overall identity of 64%.

A large “central domain” of approximately 530 amino acids lies between the two regions of transmembrane domains and is thought to be cytoplasmic. Using this feature to anchor the topology of the protein indicates that the N-terminal domain is also cytoplasmic. The central domain is highly conserved among all the CESA proteins except for a region of approximately 64-91 residues of unknown significance where there is weak sequence identity. The domain contains a motif (Q/RXXRW) that is associated with bacterial cellulose synthases and other processive glycosyltransferases (16), such as chitin and hyaluronan synthases, and glucosylceramide synthase (17). Additionally, a DXXD motif and two other aspartate residues have been associated with this class of enzymes and are referred to collectively as the D,D,D,Q/RXXRW motif. Site-directed mutagenesis experiments on chitin synthase 2 of yeast showed that the conserved aspartic acid residues and the conserved residues in the QXXRW motif are required for chitin synthase activity (18). Similarly, Saxena et al. (19) replaced the aspartate residues in the A. xylinum cellulose synthase and found that they were required for catalytic activity.
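Locating the DXXD and Q/RXXRW motifs in a protein sequence can be sketched with regular expressions in Python. Here X is taken to mean any residue, the function name is hypothetical, and the toy sequence is invented for illustration; it is not a real CESA sequence.

```python
import re

# X in the motif names stands for any residue.
MOTIFS = {
    "DXXD": re.compile(r"D[A-Z]{2}D"),
    "Q/RXXRW": re.compile(r"[QR][A-Z]{2}RW"),
}


def find_motifs(sequence):
    """Return {motif name: [(1-based start position, matched residues), ...]}.

    finditer reports non-overlapping matches only, scanning left to right.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]
    return hits


# Invented toy sequence, for illustration only.
toy_sequence = "MKTDAVDLLGQVLRWAS"
toy_hits = find_motifs(toy_sequence)
```

A pattern match of this kind only flags candidate sites; whether the residues are functionally required is an experimental question, as the mutagenesis studies cited above illustrate.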

Analysis of mutants with defects in secondary wall cellulose has revealed that three separate CESA proteins are required in the same cell at the same time (20) and that the proteins physically interact (21). Thus, it appears that within a cell type there is a single type of complex containing three types of CESA subunits. A detailed summary of the properties of mutations that alter cellulose accumulation has been published recently (3). In brief, null mutations in several of the primary wall CESA proteins are lethal, whereas others are not, presumably because of redundancy. Mutations that eliminate secondary cell wall cellulose are not lethal but impair the structural integrity of vascular cells. In addition to the CESA genes, a number of other proteins have been implicated in the overall process, but the role of these proteins is not understood.

SLOPPY, a general UDP-sugar pyrophosphorylase

This section on NDP-sugar biosynthesis begins with a description of an enzyme we identified and characterized that does not obey the general rule of enzyme specificity (at least not as we were accustomed to it with other enzymes we had purified in the past). This enzyme, named SLOPPY, is responsible for the synthesis of at least six different UDP-sugars and is unique to the plant kingdom. No other genes with a high sequence similarity to Sloppy have been identified in other organisms to date.

D-Fructose-6-P, a product of photosynthesis, is a central precursor for all monosaccharide residues in plants (403). Using labeling experiments, however, it was also elegantly shown that plant cells can readily take up other free sugars such as Rha, Glc, Gal, Xyl, GalA, GlcA, Ara, and Fuc, and incorporate them into polysaccharides. It was assumed that a free sugar was first phosphorylated with ATP and then uridylated or guanylated with UTP or GTP to form the corresponding NDP-sugar. Indeed, numerous kinase and pyrophosphorylase activities were isolated in the late 1950s and early 1960s that catalyzed such reactions. The kinases were never purified, and it was not clear whether different kinases catalyzed the C-1 phosphorylation of each unique sugar or whether one kinase phosphorylated several sugars. We discuss the different kinase activities and specificities below. Similarly, it was not clear whether the subsequent pyrophosphorylation of each sugar-1-P by pyrophosphorylase was specific or not (411, 412). Three independent research groups identified an enzyme in pea (413) and in Arabidopsis (414, 415) that can pyrophosphorylate various sugar-1-phosphates with UTP. The recombinant Arabidopsis protein (At5g52560), termed SLOPPY, has a higher affinity (i.e., lower Km) for GlcA-1-P, but it also catalyzes the conversion of Glc-1-P, Gal-1-P, Xyl-1-P, Ara-1-P, and GalA-1-P to their respective UDP-sugars (414). The enzyme is very efficient and specific for the production of UDP-sugars and shows no detectable activity when TTP, GTP, ATP, or CTP is substituted for UTP. Although SLOPPY has broad sugar-1-P specificity, it cannot accept GalNAc-1-P.

The existence of a non-specific pyrophosphorylase raises basic questions. What is the source of free sugars in plants? Do the free sugars contribute significantly to the flux of NDP-sugars for wall biosynthesis pathways? Are free sugars generated inside or outside the cells? If they are imported, are they derived from long-distance transport, from cell-to-cell transport, or directly from recycled wall? These are central questions that need to be addressed, and they obviously raise more questions. If sugars are indeed recycled from glycan degradation, are there sugar-specific transporters? Recently, a plasma membrane sugar transporter (AtPLT5) was functionally identified in Arabidopsis (416). The transporter is a member of a multigene family and seems to be a “non-specific transporter,” since competition assays indicate that AtPLT5 can transport Xyl, Rib, Ara, Glc, and myo-inositol, but not sucrose. Unfortunately, the transport of other sugars such as Gal, Rha, and Fuc was not determined for AtPLT5, but it is likely that other sugars are transported either by this or by other transporters.

The “recycling” of free sugars into the NDP-sugar pool was termed the “salvage pathway” (403). The relative amount of free sugars released from glycolipids, glycoproteins, wall polysaccharides, and glycosides of small metabolites is hard to quantify. Hence, at this time it is not possible to judge the relative contribution of the salvage pathway to the flux of NDP-sugars versus the carbon flux derived from photosynthesis. In the subsequent subsections we will describe the synthesis of specific NDP-sugars.

Raman spectra

The Raman spectra of two forms of native celluloses, Iα and Iβ, have been discussed in detail elsewhere (13), but are presented here in support of evidence for a helical model for native biological structures. Furthermore, these forms are presented to support findings from the molecular modeling program that two different rotamers of the exocyclic group at C6 occur in the native state. Finally, spectra are presented to illustrate the effects of lateral dimensions on spectral resolution.

The spectra presented in Figure 6.9 were recorded for samples of the alga Valonia ventricosa and the tunicate Halocynthia roretzi. In the alga, the cellulose fibrils are 65% Iα and 35% Iβ; the tunicate cellulose appears to be predominantly Iβ. These spectra are important because the fibrils in both instances are approximately 20 by 20 nm in lateral dimension, so the resolution of the spectra is sufficient to allow confident discussion of their interpretation. Two features of the spectra are noteworthy. The first supports the observation based on molecular modeling that two distinct rotamers of the exocyclic group at C6 occur in the native form, which is evident from the appearance of two distinct bands above 1450 cm⁻¹. Presently, it is not possible to associate the individual bands with the corresponding rotamer, but there is little question that two distinct rotamers occur. The relative intensities of the two bands differ in the spectra of the two forms. In the spectrum of the predominantly Iα Valonia, the lower frequency band is more intense than it is in the spectrum of the predominantly Iβ Halocynthia. The occurrence of two bands above 1450 cm⁻¹ is unique to the two forms Iα and Iβ. The spectra of celluloses II and III both have a single sharp band in this region (33).

Figure 6.9 The Raman spectra of tunicate (Halocynthia roretzi) and Valonia ventricosa celluloses in the Raman-active fundamental regions. (Horizontal axes in cm⁻¹; the first panel spans 250 to 1450 cm⁻¹.)

Since questions have been raised within the cellulose science community regarding the sensitivity of Raman spectra to molecular conformation, we present here some of the considerations that persuade us. Raman spectroscopy is a branch of vibrational spectroscopy complementary to infrared (IR). Raman spectra are no less sensitive to perturbations of molecular structure or environment than IR and are indeed better suited to studies of biological systems because of the very low scattering coefficient of water. Both IR and Raman spectroscopy involve transitions between molecular vibrational states. The key difference is that Raman spectra are more sensitive to vibrational transitions involving highly covalent bond systems, whereas IR spectra are more sensitive to transitions involving highly polar systems of bonds.

Figure 6.10 Raman spectra of Cladophora glomerata cellulose in its native state (I), after conversion to cellulose III in liquid ammonia (III_I), and after recovery in the I form by boiling in water (I_III) [Ref. (12)].

The sensitivity of Raman spectra to molecular conformation was established through extensive normal-coordinate analyses of six classes of model compounds related to saccharides (34-42). The most persuasive are analyses of vibrational spectra of the inositols (34, 39). The inositols are cyclohexane hexols, differing from each other only in the distribution of hydroxyl groups between axial and equatorial orientation and their positions relative to each other. Differences in their spectra leave very little doubt that for molecules where the skeletal structure is made up of C-C and C-O bonds, for which reduced masses, bond energies, and force constants are similar and the coupling of vibrational modes is high, the individual spectra are determined by the organization of atoms in space relative to each other within the molecule.

To demonstrate the sensitivity of Raman spectra to conformation, particularly in the context of celluloses, we present in Figure 6.10 Raman spectra of cellulose from Cladophora glomerata, a freshwater alga that has fibrils very similar to those of Valonia, both with respect to cross section and the balance between Iα and Iβ. Spectrum (I) is for the alga in its native state after purification with an acid chlorite treatment; it is very similar to the spectrum of Valonia presented in Figure 6.9. Spectrum (III_I) is recorded after conversion to cellulose III by treatment in anhydrous liquid ammonia at -30°C; the very high level of order is retained by allowing the ammonia to evaporate gradually at ambient temperature. Spectrum (I_III) is recorded after recovery of cellulose I, primarily in the Iβ form, from cellulose III by boiling in water.

Three comparisons in Figure 6.10 are noteworthy. First are the significant differences between the spectra of native cellulose I and III_I; the key difference between celluloses I and III is one of conformation. Second, the spectrum of I_III recovers similarity to that of the alga but is now less well resolved because of the fibrillation accompanying the conversion back to cellulose I, first noted by Chanzy and coworkers (43). Lateral dimensions of fibrils of cellulose I_III range between 3 and 6 nm, in contrast to the lateral dimensions of the original Cladophora cellulose at 20 by 20 nm. The spectrum of I_III is almost indistinguishable from that of cotton. Thus, lateral dimensions of nanofibrils are clearly very important to the resolution of bands in the spectra. Most of the bands that are very sharp, with relatively low bandwidth, for cellulose I from the native Cladophora with 20 by 20 nm fibrils are now broadened considerably in the spectrum of I_III. Chanzy and coworkers found a similar effect with solid-state 13C NMR spectra. Finally, the Raman spectrum of tunicate Iβ cellulose shown in Figure 6.9 is much more similar to those of the Iα algal celluloses in Figures 6.9 and 6.10 than to that of the fibrillated Iβ sample of I_III.

We conclude that the spectra in Figure 6.9 confirm that the conformations of the Iα and Iβ forms are almost identical and that the results of the diffractometric studies reflect two erroneous assumptions. The first, by Atalla and VanderHart, was that Iα and Iβ are “two distinct crystalline forms” rather than simply “two distinct forms”; the second, by the authors of the diffractometric studies, was the further assumption that Iα and Iβ belong to different crystallographic space groups, which is also in contrast to the findings of Sugiyama and coworkers (12).

CAD

CAD catalyzes the final step in monolignol 1-5 biosynthesis, and the effects of downregulating and mutating bona fide CAD genes have been extensively studied and reported [see Anterola and Lewis (77) for a comprehensive discussion and analysis]. The common phenotype resulting from CAD mutation/downregulation is a red-brown coloration in the xylem region (57, 71, 172, 235-237); Figure 7.12G shows this effect in a stem cross-section from a double CAD mutant (cad-4 cad-5) in Arabidopsis (57, 71). As mentioned above, this coloration has been known for nearly eight decades (58) in the brown midrib mutants, with bm1 associated with expression of the CAD gene (67). This red-brown coloration was also reported as being due to formation of an abnormal “wine-red” lignin in tobacco (236, 237), which was proposed to have good potential for furniture staining, dyes, and so forth. Moreover, the results from such studies and other analyses of presumably lignin-enriched isolates from CAD downregulation/mutation of tobacco were rationalized by several investigators as further evidence for random coupling/combinatorial biochemistry, and thus of lignin’s composition not being particularly important (173, 226).

Our own data, and the interpretations thereof, provide very different findings and insights. More importantly, they give an evolutionary perspective as to why lignins are essentially derived from monolignols 1, 3, and 5 (and, in grasses, partially from monolignol esters 30-32 as well). Specifically, the recent findings now help explain why lignins are not formed from, for example, p-hydroxycinnamaldehydes 19, 21, and 23.

7.6.2.2.1 "RED LIGNIN": A MISNOMER

Comprehensive analyses and reassessment of reports of “red lignin” in tobacco (235-237) resulting from CAD downregulation established that there was no “red lignin” as such (177). Instead, it was simply a pigment [mainly sinapyl aldehyde (23) derived] in near trace amounts in tobacco that could readily be removed under conditions generally used for floral pigment removal (i.e., 0.5% HCl in MeOH) (177). Similar treatment of Arabidopsis “red xylem” also resulted in facile removal of this coloration, with concomitant release of sinapyl aldehyde (23) (71). No evidence was obtained, though, that this red pigment was an integral part of the polymeric lignin. Interestingly, the red coloration could also be reconstituted on either preextracted tobacco xylem tissue cross-sections or polyamine TLC plates by dipping them into a dilute solution of sinapyl aldehyde (23) (177). Furthermore, Bernard-Vailhe et al. (234), using modified Björkman procedures to isolate lignin derivatives from both wild type and CAD downregulated tobacco xylem, had also noted that the red coloration was completely removed from the lignin-derived preparations. These data also contrast with a contribution by Boerjan and coworkers (238), who reported that the “red xylem” coloration in tobacco could not be removed by treatment with sulfuric acid, methanol, butanol/HCl, acetyl bromide, or triethylene glycol. Such an observation, therefore, needs to be independently confirmed, as it appears incongruous with the findings using both tobacco and Arabidopsis.

Carbohydrate force fields

The models, above all, require a force field. With the exception of the all-atom model, each model uses a force field that is custom made for that particular molecular model. Here we restrict our discussion to the all-atom force fields, which are the most commonly used and generally considered to be the most accurate. Thus, from here on, when we specify a force field, we mean an all-atom force field. As mentioned above, there are a number of mature and well-established force fields for modeling amino acids and nucleic acids.
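
For readers unfamiliar with the term, the expression below is a minimal sketch of the additive potential energy function that all-atom force fields of this general class typically use. It is illustrative only: the symbols are assumptions introduced here rather than notation from this chapter, and individual force fields differ in detail (for example, in added cross terms or improper-torsion terms).

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
  + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} \right]
```

Here the first three sums run over covalently bonded terms (bond stretching, angle bending, and torsional rotation), and the last sum runs over nonbonded atom pairs (Lennard-Jones and Coulomb interactions). Parameterizing a force field for carbohydrates amounts to choosing the constants $k_b$, $r_0$, $k_\theta$, $\theta_0$, $V_n$, $\gamma$, $A_{ij}$, $B_{ij}$, and the partial charges $q_i$ so that simulations reproduce reference data.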

The introduction of usable carbohydrate parameters to the AMBER (34, 35), CHARMM (36, 37), OPLS (38, 39), and GROMOS (40) force fields occurred after those force fields were already established and verified for proteins, nucleic acids, and lipids. Carbohydrate research has been driven mainly by food science and has largely concentrated on starches and glycosylated amino acids, with little attention to cellulose. The small amount of unambiguous experimental data, especially structural data, contributes to the reluctance to model cellulose: the force fields cannot be verified as appropriate for cellulose in their current forms, and a more complex reparameterization may be necessary to simulate cellulose reliably. The molecular structures associated with cellulose are quite large in terms of current molecular modeling capabilities and require large computational resources. Although the current carbohydrate force fields have been carefully constructed for small molecules and carbohydrates, they have not been tested extensively on such large structures in the way that the minute details of protein force fields have been adjusted to reproduce known structures and known probabilities of alpha helices and beta sheets. Few researchers are willing to apply serious intellectual or computational effort toward such a speculative endeavor. On the other hand, the same lack of unambiguous experimental data empowers modelers to simulate cellulose and its interactions in order to suggest possible structures and behaviors, eliminate highly unlikely ones, and attribute structural features to their underlying physics, even if the modeling is crude. The stage is set for a significant contribution of MD to the understanding of cellulose structure, function, and behavior as new experimental techniques and data become available and as larger and faster computers become accessible to MD modelers.

Debranching enzymes (accessory enzymes)

Glycosidic side groups connected to xylan and glucomannan main chains are primarily removed by α-glucuronidase, α-arabinofuranosidase, and α-D-galactosidase. Acetyl and hydroxycinnamic acid substituents bound to hemicellulose are removed by acetyl xylan esterases and ferulic/coumaric acid esterases (Figure 10.1). There are clearly different types of side-group cleaving enzymes. Some of them are able to hydrolyze only substituted short-chain oligomers, which first must be produced by the backbone-depolymerizing endoenzymes (xylanases and mannanases). Others are capable of debranching intact polymeric substrates. Most accessory enzymes of the latter type, however, prefer oligomeric substrates. The synergism between different hemicellulolytic enzymes is also seen in the accelerated action of endoglycanases in the presence of accessory enzymes. Side groups that are still attached to oligosaccharides after the hydrolysis of xylans and mannans by xylanase or mannanase, respectively, restrict the action of β-xylosidase and β-mannosidase.

10.4.1 α-glucuronidase

α-Glucuronidase (EC 3.2.1.139) catalyzes the release of glucuronic acid or 4-O-methylglucuronic acid from xylan. Reported substrate specificities vary from activity on long-chain xylans to a requirement for attachment to a terminal non-reducing end xylose (64, 65). At least one membrane-bound bacterial enzyme exhibited activity specifically on soluble xylan-derived oligomers (66). However, synergy between α-glucuronidase and endoxylanase on wheat xylan has been shown to be significant, with simultaneous activity yielding the highest release of 4-O-methylglucuronic acid (67).

10.4.2 α-arabinofuranosidase

α-Arabinofuranosidase cleaves arabinose side chains from the xylan main chain. Many enzymes in this class have demonstrated activity on pectin, arabinan, and arabinoxylan, with the preferred chain length and side-group assignments also varying (13, 68, 69). A comprehensive review is available (69). Synergy with xylanases has been reported, as has synergy with ferulic acid esterase and acetyl xylan esterase (2, 13, 65, 67, 70). As ferulic acid is linked to the arabinose, which in turn is linked to the xylan chain, these synergies are not unexpected. The substrate specificity of this enzyme class is still somewhat muddled (65).