Category Archives: Biomass Recalcitrance


Xyloglucans exist as cell wall components in most species and as storage polymers in the seeds of some species (94). Xyloglucan (XG) comprises 20-25% of the primary walls of dicots, but graminaceous monocots typically contain much less. XG is defined by a β-1,4-glucan backbone in which many glucosyl residues carry α-1,6-linked xylose branches. Xyloglucan from pea had an average molecular mass of 330 kDa, representing a backbone of about 1100 glucose residues and about 500 nm in length (95). In many species the xylose residues are further substituted with β-1,2-linked galactose, which may in turn be linked at the 2-position

Table 5.1 Common elements of single letter code for xyloglucan structure

Code    Structure represented

G       β-D-Glcp*-

X       α-D-Xylp-(1→6)-β-D-Glcp*-

L       β-D-Galp-(1→2)-α-D-Xylp-(1→6)-β-D-Glcp*-

F       α-L-Fucp-(1→2)-β-D-Galp-(1→2)-α-D-Xylp-(1→6)-β-D-Glcp*-

* D-Glucose in chain or at reducing terminus. See Fry and coworkers (97) for details.

to α-L-fucose or arabinose (94). X-ray fiber diffraction studies of tamarind XG indicated a twofold helix similar to cellulose (96). A single letter code has been developed to describe the structure of xyloglucans (Table 5.1) (97).
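The single letter code in Table 5.1 can be read mechanically: each letter stands for one backbone glucose plus whatever side chain it carries. The following is a minimal sketch of that reading, assuming only the four letters listed in the table; the names `SIDE_CHAINS` and `composition` are illustrative and do not come from the cited nomenclature paper.

```python
from collections import Counter

# Side-chain residues attached to one backbone glucose, per Table 5.1.
# Only the four letters from the table are covered (an assumption of this sketch).
SIDE_CHAINS = {
    "G": [],                      # unsubstituted glucose
    "X": ["Xyl"],                 # xylose on the glucose
    "L": ["Xyl", "Gal"],          # galactose on the xylose
    "F": ["Xyl", "Gal", "Fuc"],   # fucose on the galactose
}


def composition(code):
    """Count sugar residues in a xyloglucan fragment written in single letter code."""
    counts = Counter()
    for letter in code:
        if letter not in SIDE_CHAINS:
            raise ValueError(f"unknown code letter: {letter!r}")
        counts["Glc"] += 1                  # every letter contributes one backbone glucose
        counts.update(SIDE_CHAINS[letter])  # plus its side-chain residues
    return dict(counts)


if __name__ == "__main__":
    for fragment in ("XXXG", "XXFG"):
        sugars = composition(fragment)
        print(fragment, sugars, "total residues:", sum(sugars.values()))
```

Read this way, XXXG is a heptasaccharide and XXFG is a nonasaccharide, since each fragment has four backbone glucoses plus its side-chain residues.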

Most species have an XXXG type of XG. However, members of the Poaceae and Solanaceae have an XXGG type in which a pair of arabinose residues replace fucose (98, 99). In most monocots XG contains less xylose and galactose and does not contain terminal fucose. The structure and molecular distribution of the side chains vary in different plant tissues and species (100-102).

XG may be extensively acetylated (103). In sycamore cells, the O-2-linked β-D-galactosyl residue of the nonasaccharide was found to be the dominant site of O-acetyl substitution in XG. Both mono-O-acetylated and di-O-acetylated β-D-galactosyl residues were detected. The degree of O-acetylation of the β-D-galactosyl residue was estimated to be 55-60% at O-6, 15-20% at O-4, and 20-25% at O-3. Approximately 50% of the β-D-galactosyl residues were mono-O-acetylated, 25-30% were di-O-acetylated, and 20% were not acetylated. In tomato (Lycopersicon esculentum), O-acetyl substituents were located at O-6 of the unbranched backbone β-D-Glcp residues, O-6 of the terminal β-D-Galp residue, and/or at O-5 of the terminal α-L-Araf residues (104). Acetylation of XG does not affect the degree to which XG hydrogen bonds to cellulose in vitro (100), and the role of acetylation is unknown. Similarly, the enzymes that acetylate XG are unknown. O-acetylation of galactose residues was considerably reduced in Fuc-deficient mutants (atfut1, mur1, and mur2) that synthesize XG containing little or no Fuc (105). These results suggest that fucosylated XG is a suitable substrate for at least one O-acetyltransferase in Arabidopsis.

Immunoelectron microscopy using antibodies against XG indicates that XG is localized to the cellulose-containing region of the cell wall (106). Hayashi (94) proposed that XG does not have covalent cross-links to other components or, if there are links, that they must be alkali-labile linkages such as O-esters. However, Thompson and Fry (107) have observed cross-links between XG and pectins in rose cells. Brett and coworkers (108) have also observed such cross-links and found that they form in the Golgi. Feruloyl esters of XG have also been observed in maize cell cultures (109).

Pure XG binds to cellulose in vitro in a pH-dependent manner (110). Levy et al. (111) have presented evidence that the structure of the XG side branches may facilitate the binding of XG to cellulose. Native XG-cellulose complexes contain higher ratios of XG to cellulose than can be obtained in vitro, suggesting that XG may be intercalated into the cellulose microfibrils (110). Also, mild alkali does not completely dissociate the complex, and concentrated alkali (e.g., 4 M KOH) is required to completely extract XG. The proposed function of XG binding to cellulose is to prevent aggregation of cellulose fibrils (110), but because single strands of XG may be hydrogen bonded to different cellulose microfibrils (112-114), it may also provide some degree of crosslinking. Thus, XG hydrolysis may be required for growth. Based on the combined chemical and cytological evidence, Pauly and coworkers (100) have developed a model for the cellulose/XG network that posits that XG can have three configurations: hydrogen bonded to the surface of cellulose, cross-linked, and embedded within the microfibril. They propose that the cross-links are the domain that is accessible to enzymes such as xyloglucan endotransglycosylases (XETs), which are thought to play a role in cell wall expansion. This is supported by observations of XET-mediated incorporation of fluorescent XG fragments into XG in expanding cell walls (115). Pauly and coworkers (100) also note that it is not clear to what extent the various XG structures participate in determining the nature of the XG-cellulose association.

The first progress in defining the genes involved in XG synthesis was the identification of the fucosyltransferase that adds the terminal fucose to XG side chains. A 60-kDa fucosyltransferase (FTase) that adds this residue was purified from pea epicotyls (116). Peptide sequence information from the pea FTase allowed the cloning of a homologous gene, AtFUT1, from Arabidopsis. Expression of AtFUT1 in mammalian COS cells produced XG FTase activity in these cells. AtFUT1 shows very little identity with FucTs from other organisms. AtFUT1 and PsFUT1 (the pea XyG FucT homologue) are 62.3% identical (117). Both enzymes contain motifs that had been identified in other FucTs but combine these motifs in a unique manner (117). Three motifs had been identified in α-(1→2)- and α-(1→6)-FucTs. Motifs I and II were present in both the α-(1→2)- and α-(1→6)-enzymes, but a particular version of motif III appeared to be characteristic of each group. AtFUT1 and PsFUT1, however, contain a hybrid motif III that has features of both the α-(1→2)- and α-(1→6)-versions. There are ten genes in Arabidopsis whose encoded amino acid sequences show 35 to 73.8% identity to AtFUT1 (118). The AtFUT1 gene was found to correspond to the fucose-deficient mur2 mutant of Arabidopsis (119).

The galactosyltransferase that contributes to the synthesis of XG side chains was identified by map-based cloning of the MUR3 gene of Arabidopsis, which had previously been identified based on a screen for variation in cell wall polysaccharide composition (120). MUR3 belongs to a large family of Type II membrane proteins that is evolutionarily conserved among higher plants. The enzyme shows sequence similarities to the glucuronosyltransferase domain of exostosins, a class of animal glycosyltransferases that catalyze the synthesis of heparan sulfate, a glycosaminoglycan with numerous roles in cell differentiation and development. Arabidopsis has ten genes encoding proteins with significant sequence similarity to the MUR3 xyloglucan GalT (121).

One of the XG xylosyltransferases (XT1) was identified from Arabidopsis based on sequence similarity to the fenugreek mannan α-1,6-galactosyltransferase (122). Expression of the gene in Pichia pastoris resulted in a protein with cello-oligosaccharide-dependent xylosyltransferase activity. Characterization of the products obtained with cellopentaose as acceptor indicated that the pea and the Arabidopsis enzymes transfer xylose mainly to the second glucose residue from the non-reducing end in an α-(1,6)-linkage to the glucan chain. Arabidopsis has seven related genes, some of which may catalyze addition of xylose to other positions in the repeating unit of XG.

In vitro assays of the glucan synthase involved in synthesis of the XG backbone exhibit maximal activity only if both UDP-glucose and UDP-xylose are present, suggesting that the glucan synthase acts in concert with a xylosyltransferase that adds side chains (94, 123).

The glucan synthase extends existing XG by addition to the non-reducing end but cannot be primed with exogenous primers (94). Also the enzyme does not add xylose to preformed glucans.

Recently, Cocuron and coworkers (124) have obtained evidence that proteins of the cellulose synthase-like 4 (CSLC4) family catalyze synthesis of the XG backbone (99). They expressed CSLC4 genes from Arabidopsis and tamarind along with the Arabidopsis XT1 gene in Pichia pastoris and observed the formation of β-1,4-linked glucan. However, they were unable to detect XG synthase activity in extracts from the cells. Plants carrying mutations in the Arabidopsis CSLC4 gene are deficient in xyloglucan, supporting the proposed role in xyloglucan synthesis (Milne and Somerville, unpublished). If substantiated by further work, the findings of Cocuron and coworkers (124) appear to represent a long-awaited breakthrough in understanding the synthesis of XG (99). The identification of the genes involved in XG synthesis should pave the way for an analysis of how the amount of XG is regulated and of how genetic variation in the amount of XG affects plant growth and development and the properties of cell walls.

Some information about genetic variation in XG is available from analysis of Arabidopsis mutants that were recovered by screening for alterations in total cell wall sugar composition (125, 126). Comparison of the mechanical responses of mur2 (AtFUT1) and mur3 (XG galactosyltransferase) indicated that galactose-containing side chains of xyloglucan make a major contribution to overall wall strength, whereas xyloglucan fucosylation plays a comparatively minor role (127). Thus, it seems unlikely that it will be possible to develop biomass feedstock crops with significant alterations in the structure of XG without also making compensating changes in another cell wall component. Because Arabidopsis has a number of CSLC genes, it has not yet been possible to develop mutant plants with major reductions in the amount of XG to assess the phenotypic consequences of such alterations.

UDP-α-D-galactose (UDP-Gal)

Galactose is a major constituent of diverse pectic polysaccharides including RG-I. The sugar donor, UDP-Gal, is produced by (i) phosphorylation of C-1 of D-Gal by galactokinase (414) followed by pyrophosphorylase-catalyzed conversion of α-D-Gal-1-P and UTP to UDP-Gal, and by (ii) UDP-Glc-4-epimerase (UGE, as described above) that reversibly converts UDP-Glc to UDP-Gal.

1. D-Galactokinase activity was isolated by Neufeld and coworkers in 1960 from mung bean (412). The D-GalK is membrane bound and the activity, unlike L-AraK activity, could not be solubilized with digitonin. The galactokinase gene (GalK, Gal1, At3g06580) was cloned from Arabidopsis by functional complementation of yeast (446) and E. coli (445) galK mutants that are unable to metabolize galactose. Although the Gal1/GalK clone was able to complement the yeast mutant, a definitive determination of the substrate specificity of the recombinant Arabidopsis enzyme (GalK) is still needed to establish whether other “recycled sugars” are substrates. Sequence alignment of various sugar kinase proteins shows that the Arabidopsis GalK shares amino acid sequence similarity (45%) with GalK2, a human kinase with a phosphorylation preference for GalNAc (446). A meaningful alignment could not be obtained between the Arabidopsis GalK and the human GalK1, whose true substrate is Gal (447). Whether At3g06580 encodes a Gal-1-P kinase activity and/or GalNAc-1-P kinase activity remains to be determined biochemically. The subsequent pyrophosphorylation of Gal-1-P to UDP-Gal is likely mediated via “Sloppy” (416), the non-specific UDP-sugar pyrophosphorylase (At5g52560).

2. The second route to form UDP-Gal is with UDP-Glc-4-epimerase (as described above). In humans and fungi, UDP-Gal is synthesized by uridylyltransferase activity (GalT). GalT transfers UMP from UDP-Glc onto Gal-1-P, forming Glc-1-P and UDP-Gal. Such activity and corresponding genes have not yet been described in plants.

Lignin pathway evolution, deposition, and function in vascular anatomical development

7.2.1 Vascular plant diversification and lignification

The highly diversified terrestrial (plant) environment found today originated with the shift of photosynthetic organisms from water to land some 475 million years ago (1-3). After small flora with prevascular water-conducting cells had been established in coastal wetlands, plants next developed tracheids (xylem) (22, 23), and the ability to fortify these cell walls with lignin. This evolutionary step thus gave plants a profoundly different and more efficient way to control water uptake, use and storage, thereby allowing them to become independent of wetlands and diversify onto most landscapes on earth. Further evolution of the phenylpropanoid pathway also led to the development of different types of both lignins and lignified cells, which in turn allowed plants to evolve arborescent growth in order to compete for light and space. By developing diverse height and water usage strategies, vascular plants thus created a myriad of environments that other organisms were then able to inhabit and co-evolve within. While this diversity in cell wall structure is still not fully understood, it reflects a rich source of genetic information that is central to finding methods to alter plant structure for human use without adversely affecting the subject plants (24).

The evolutionary appearance of plant vascular anatomy is quite well represented in both the fossil record and in living plants. General trends in tracheid (i.e., water-conducting cell) secondary cell wall thickenings related to the development of lignification can be seen throughout the vascular plant lineages, including the extant lineages shown in Figure 7.4 (in the order their ancestors appeared in the fossil record) (2, 25-27): 1) Lycopodiophyta, represented by Lycopodium tristachyum, 2) Equisetophyta and Psilotophyta, with Psilotum nudum as an example, 3) Filicophyta, with Pteridium aquilinum representing higher ferns, 4) gymnosperms, e.g., loblolly pine (Pinus taeda), and 5) angiosperms, as represented by alfalfa (Medicago sativa). Tracheids with simple (limited) secondary thickenings, such as the annular and helical formations of proto- and metaxylem, are found in all these plant groups, with the reticulate thickening form found more often in the older lineages, such as the Filicophyta (arrowheads in corresponding images, right-hand column, Figure 7.4). Secondary cell wall thickenings with scalariform pitting also appeared throughout vascular plants, but with greater prevalence within the earlier lineages (through the Filicophyta); scalariform pitting has a ladder-like appearance due to a broad surface area of secondary thickening with elongated pits (arrowhead, Figure 7.4). This form is intermediate between the annular/helical forms described above and the more continuous secondary cell wall thickenings with simple (to more elaborate; not shown) circular pitting of tracheids in gymnosperms (e.g., Pinus taeda) and vessels of (secondary growth) angiosperms (28), such as in alfalfa (see corresponding simple pitting figure). Additionally, non-water-conducting cell types with variable amounts of lignification/secondary cell walls include sclerenchyma and other structural fibers, which have an important mechanical function in stem and branch support. Thick-walled sclerenchyma are also found in some species of the Lycopodiophyta (e.g., the sclerified cortex of L. tristachyum), Psilotophyta (e.g., the sclerified outer cortex of P. nudum), Filicophyta (e.g., the sclerified hypodermis of P. aquilinum), gymnosperms, and angiosperms. Fibers with true lignin are found only in the gymnosperms and angiosperms (e.g., sclerified fibers of M. sativa, Figure 7.4).


Figure 7.4 Lignified vascular and sclerenchyma anatomical structures in extant plant lineages. Proto- and metaxylem of the major plant lineages: Lycopodiophyta (e.g., Lycopodium tristachyum), Psilotophyta (e.g., Psilotum nudum), Filicophyta (e.g., Pteridium aquilinum), gymnosperms (e.g., Pinus taeda), and angiosperms (e.g., Medicago sativa) have very similar annular and helical secondary cell wall thickenings. Some members of each group, but especially the Filicophyta, have reticulate tracheid cell wall thickenings, a more complex form than the helical form. Secondary cell wall thickenings with scalariform pitting may be found in all plant lineages and represent an intermediate form to the simple pitting found in higher plants (i.e., gymnosperms and angiosperms). Brightfield microscopy images of L. tristachyum, P. nudum, and P. aquilinum were taken of hand-cut fresh sections stained with phloroglucinol-HCl to reveal patterns of phenolic deposition (red) in the xylem and sclerified fiber cells. Epifluorescent confocal images of P. taeda and M. sativa were made using cryosections of fresh tissue stained with a combination of acridine orange and ethidium bromide to reveal patterns of lignification. Examples of secondary cell wall thickenings were recorded using brightfield microscopy and hand-cut longitudinal sections of fresh unstained xylem from Equisetum telmateia (a member of Equisetophyta, for the annular and helical examples), Ophioglossum reticulatum (a member of the Filicopsida, for the reticulate example), P. aquilinum (another member of the Filicopsida, for the scalariform-pitted example), and M. sativa (an angiosperm, for the example of vessel secondary cell wall thickenings with simple pitting). (Reproduced in color as Plate 16.)

The physiological functions of the lignins are thus quite distinct from those of either the cell wall celluloses or hemicelluloses. They are formed as biopolymeric entities within the cell wall, and help to reinforce the plant walls of the vasculature. In this way, they enable vascular plants to both form their water/nutrient conducting cells and also to provide a means of withstanding compressive forces acting on the overall plant body. Additionally, generation of the lignified plant cell wall matrices provides a relatively formidable physical barrier to opportunistic pathogens and to other encroaching organisms.

While the “lignin” chemistry of “primitive” plants is still very poorly understood, it is well established that lignin composition in higher plants is essentially derived only from the three monolignols 1, 3, and 5 (Figure 7.1) as indicated earlier (5, 29-31). To a much lesser extent, lignification can also involve some limited participation of related p-hydroxycinnamyl alcohol-monolignol esters 30-32 (Figure 7.2A), such as in grasses (5, 30, 31). This conclusion follows many detailed and exhaustive analyses of a large number of different plant species over a period encompassing 5-10 decades (32, and references therein); this established that a very strong evolutionary pressure had emerged to form lignins from these moieties and not from other non-monolignol phenolics (33). Interestingly, two of the monolignols, p-coumaryl (1) and coniferyl (3) alcohols, which differ only in methoxylation substitution pattern at carbon-3, are the precursors of lignins in some of the primitive (extant) plants, e.g., Psilotum nudum, as well as gymnosperms, e.g., loblolly pine (Pinus taeda) (34, 35). These moieties thus became the p-hydroxyphenyl (H) and guaiacyl (G) aromatic ring components of their lignins (Figure 7.2B). Various gymnosperms (e.g., loblolly pine, black spruce, etc.) are, of course, widely employed as commercial sources of wood for lumber and pulp/paper production.

The subsequent evolution of the angiosperms (flowering plants), by contrast, resulted in additional forms of lignified cell-wall architecture (e.g., true vessels and various fiber types), as well as the elaboration of the monolignol-forming pathway to afford the dimethoxylated monolignol, E-sinapyl alcohol (5), and ultimately the syringyl (S) aromatic ring component of the lignin biopolymers (36) (Figure 7.2B). Interestingly, S-moieties have also been reported to occur in some extant primitive Selaginella plant species (37-39), as well as in other isolated non-angiosperm families including the Dennstaedtiaceae (40) and Podocarpaceae [reviewed by Gibbs (41)]. While the evolutionary significance of such observations is not yet well understood, they may represent an example of convergent evolution.

The angiosperms are also widely used by humanity; for example, many hardwoods are utilized for lumber/pulp and paper production, whereas others (e.g., rice, corn, soybean, etc.) are food crops. In addition, three other angiosperms, thale cress (Arabidopsis thaliana), hybrid poplar (Populus sp.), and alfalfa (Medicago sativa) are currently of considerable scientific interest: the first as a “model” plant species, the second as a possible vehicle for generating biotechnologically modified fast-growing woody crops for fiber, biofuel applications, cellulose to ethanol, etc., and the third as a source of animal nutrition/feedstock.

Localization of lignin in xylem and fiber cells has also previously been studied in a limited number of gymnosperm and angiosperm species (36, 42-49). The earlier work indicated that lignification in gymnosperms begins in the cell wall corners and then proceeds throughout the cell wall, where p-coumaryl alcohol (1, H-unit) is differentially laid down in the cell corners/middle lamella and the coniferyl alcohol (3, G) residues are mainly in the secondary wall layers (42, 44, 46, 47). Syringyl moieties, by comparison, in angiosperms are deposited


Figure 7.5 Lignification and cell wall development in loblolly pine cambial tissue. (A and B) Deposition of lignin in both transverse and longitudinal sections (Y. Nakazawa, unpublished). In Figure 7.5A, lignin is differentially initiated at sites a, a′, b, b′ and c, c′. In Figure 7.5B, the particular cell has initiated lignification at cell corners c, c′, but not yet at a, a′. Lignification is initiated at three pairs of points, first c, c′, then b, b′, then a, a′, in a symmetrical but differential manner. [Cellulose/hemicellulose regions are visualized in green, and lignin is orange/yellow.] (C) Idealized depiction of differential lignin deposition in one tracheary cell type. ^ in A and • in C = lignin initiation sites. A dual-stain method using acridine orange (AO) and ethidium bromide (EB) was used to visualize lignin as a red-orange epifluorescence. The merged confocal images were individually recorded using detection filters specific to 488 nm (excitation) and 522 nm (emission) for AO and 568 nm (excitation) and 598 nm (emission) for EB. (Reproduced in color as Plate 17.)

mainly in fiber cell walls with mixed guaiacyl-syringyl lignins found in vessel cell walls (36, 45).

More specifically, the process of lignin deposition occurs in a highly organized and controlled manner, whereby the cells undergo a polarized deposition of the lignin monomers at different rates in different cell corners (as in loblolly pine developing stems, Figure 7.5A, Y. Nakazawa et al., unpublished, this laboratory). Thus, as incipient xylem and sclerenchyma differentiate, lignin initiation sites at the cell corners/S1 sublayers of the lignifying matrix develop during maturation (see Figure 7.5A). Lignification is extended uniformly, in a continuous thread-like pattern, down the entire length of the developing tracheid toward the cambium (Figure 7.5B). This, in turn, has important biological ramifications, in terms of both monolignol transport and monolignol (radical) alignment, and is apparently consistent with the proposed template polymerization process (discussed below) on preformed primary lignin chains (31, 50-52). Additionally, cell corners furthest from the cambial zone undergo both lignin initiation and lignification prior to those adjacent to the cambium, with the enlarging lignified domains in the former again being uniformly evident down the length of the tracheary element. The exact biochemical processes occurring at the lignin initiation sites are currently a subject of considerable scientific interest, as regards overall control of lignin macromolecular configuration (29).

Figure 7.5C thus shows an idealized lignification model in one cell adjacent to the cambium, whose cell corners are specified as a, a′, b, b′, and c, c′, respectively. In this diagram, only cell corners (c, c′) closest to the xylem have begun to lignify. Lignin deposition at these furthermost cell corners then extends symmetrically along the S1 sublayer from points c and c′ until the two developing zones coalesce, as well as concomitantly extending upwards to the next two adjacent cell corners (i.e., b, b′ in this case). Lignin deposition is subsequently initiated at points b, b′ and symmetrical deposition occurs again in a likewise manner. Finally, lignin deposition in the remaining two corners (a, a′) closest to the cambium is initiated at the last phase, and these zones also begin to expand uniformly, albeit in a delayed manner, until eventually coalescence of lignin within the entire wall is achieved. [Figures 7.5A and 7.5B depict the asymmetry in cell wall thickness/lignin deposition as cell wall development continues, i.e., wall c, c′ is thicker and more heavily lignified than a, a′ at this stage.] Additionally, when cell wall development has occurred, the adjacent unlignified cell (closest to the cambium) is next “conscripted” to undergo lignin assembly/cell wall thickening as before. Such observations thus do not appear to be in agreement with Freudenberg’s original hypothesis of (random) diffusion of monolignol (glucoside) precursors into the cell walls undergoing lignification (53, 54).

In addition to their structural roles in plant stems, lignins provide a physical barrier to opportunistic pathogens (31), and various specialized structures throughout the plant body (e.g., trichomes, harboring an arsenal of plant defense compounds) apparently contain a lignified base. To put this into a more holistic perspective, Figure 7.6 illustrates some of the various tissues and cell wall types that are considered to contain lignified elements in Arabidopsis; these can be readily visualized through expression of the GUS-cinnamyl alcohol dehydrogenase (AtCAD4 and 5) promoter fusion product in the vascular apparatus (55), with CAD encoding the final step in monolignol biosynthesis (31). In this regard, it should also be noted that in this species these two CAD genes (AtCAD4 and 5) are considered to be largely responsible for the penultimate step(s) leading to lignification (discussed later) (56, 57). The main point is that any adverse effect on stem lignification could thus also potentially disrupt other physiological processes in these different tissues and organisms, i.e., whether in terms of structural support and/or in defense system impairment.

Interestingly, for almost three quarters of a century, various lignin mutants (beginning with the brown-midrib mutants in maize) have been described (58-67). Essentially, none currently find application as commercial cultivars, because of the deleterious effects, for example, on the overall plant vasculature, the reproductive system, and so forth. However, recent studies have begun to shed important light on the various genes and enzymes involved in each mutation, and how they affect lignin deposition processes proper. With the recent advances in lignin pathway modulation, biomechanics approaches are also beginning to be increasingly applied in order to begin to correlate such modulations with that of alterations


Figure 7.6 Cinnamyl alcohol dehydrogenase gene expression in the vascular apparatus of Arabidopsis thaliana (55). Selected examples of GUS-visualized expression of AtCAD5 (A, C-F) and AtCAD4 (B). (A) In vascular apparatus, including hydathodes of 2-week-old leaf tissue; (B) at the base of trichomes; (C) in primary and secondary roots; (D) in sepal/petal veins, style, anthers, and stamen filaments of the flower; (E) in the abscission and style regions of the silique; and (F) in vascular cambium, interfascicular cambium, and the developing xylem of the inflorescence stem. (Reprinted from Phytochemistry, vol. 68, Kim, S.-J., Kim, K.-W., Cho, M.-H., Franceschi, V. R., Davin, L. B. & Lewis, N. G., Expression of cinnamyl alcohol dehydrogenases and their putative homologues during Arabidopsis thaliana growth and development: Lessons for database annotations? pp. 1957-1974, Copyright 2007, with permission from Elsevier.) (Reproduced in color as Plate 18.)

of plant structural integrity (68-72). As more comprehensively discussed below, this is an essential first step toward understanding to what extent lignin composition and content can be manipulated.

The enigma of monolignol radical generation

A key factor in ordered lignin macromolecular assembly/configuration is temporal and spatial control over monolignol radical generation (discussed below as regards template-facilitated polymerization). Since the 1950s, various oxidases have been implicated as having roles in lignification; these have included peroxidase (12, 53, 54, 171, 254-257), laccase (12, 53, 54, 258-266), combined peroxidase and laccase (12, 53, 54, 254, 267), coniferyl alcohol oxidase (268-271), (poly)phenol oxidase (272-278), and cytochrome oxidase (279). Their putative involvement has generally relied upon the ability of such enzymes to oxidize monolignols in vitro, even though none of these enzymes has yet engendered formation of products in vitro that faithfully duplicate lignin structure; in some cases, they do not even generate polymeric products. [The reader is again encouraged to review the historical developments as regards consideration of each of these oxidases and their potential for lignification and the mechanistic questions that they raise (31).]

Enhanced sampling and free energy methods

In addition to using analysis methods to look for scientifically relevant information in molecular dynamics trajectories, it is also possible to increase the sampling rate or to force a reaction event to occur. Such approaches are termed enhanced sampling methods and, if used carefully, can be used to test a number of hypotheses about reaction mechanisms.

Restraints. Restraining potentials are used to keep a system within a defined region of con­figuration space. Most restraints take the form of a harmonic potential placed on some defined quantity such as distance between two atoms, distance between two parts of a molecule, radius of gyration, or root mean square distance (RMSD) from some reference state. When restraints are used, the states of the system are sampled for configurations that are near the minimum of the restraining potential. NOE distances, from Nuclear Mag­netic Resonance experiments, can be used as restraints to refine structures by exploring configurations but keeping within the defined NMR structural information. Structural changes can be followed or initiated by slowly changing restraining potentials from an initial state to a final state, thus achieving a change that might never occur with a simple MD simulation. Ensemble averages can be extracted from the biased runs to calculate thermodynamic properties, so simulations using restraints are not only exploratory but also for collecting data in state space that are rarely or never visited in normal MD runs. Steered MD. This term refers to forcing MD simulations to follow a trajectory that is biased by a force or biasing potential much like a restraint but with many variations and not necessarily related to a well-defined potential. One clear example is modeling an Atomic Force Microscopy experiment using either a constant pulling force or constant pulling velocity where one end of a molecule is fixed while the other end is pulled in a particular direction. Another example is targeted MD (TMD) (61) in which a final structure is the target and a force is applied to the system, such as a force on the RMSD from the final structure or the Euclidean distance of some atoms from some target, until the target is reached. 
These methods rarely provide accurate or precise energetic or thermodynamic data, but the approximate data are very useful for probing unknown pathways and for guiding the design of more accurate simulations of the processes or structures of interest. The method of Jarzynski (62, 63) can be used with these methods to gain ensemble statistics if a large number of pulling or targeted simulations are performed, but this approach is often no less computationally demanding than the slower sampling methods because of the sheer number of trajectories that must be run. In a very practical sense, steered and targeted MD are very useful for bringing a system to a structural state of interest for which there is no crystal or NMR structure, such as docking a ligand into a site while allowing the nearly natural reconfiguration of the receptor in the process.
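The Jarzynski relation mentioned above states that the free energy difference ΔF between two states satisfies exp(-ΔF/kT) = <exp(-W/kT)>, where W is the work done in each nonequilibrium pulling run and the average is taken over many runs. The following is a minimal sketch of that estimator, not a production analysis tool. The function name `jarzynski_free_energy`, the constant `KT_300K`, and the work values are hypothetical and chosen only for illustration.

```python
import math

def jarzynski_free_energy(work_values, kT):
    """Estimate a free energy difference from nonequilibrium work values
    using the Jarzynski equality: dF = -kT * ln( mean( exp(-W/kT) ) )."""
    if not work_values:
        raise ValueError("need at least one work value")
    # Exponents -W/kT; subtract the largest one before exponentiating
    # (log-sum-exp trick) so the sum cannot overflow or underflow badly.
    exponents = [-w / kT for w in work_values]
    largest = max(exponents)
    shifted_sum = sum(math.exp(e - largest) for e in exponents)
    log_mean = largest + math.log(shifted_sum / len(exponents))
    return -kT * log_mean

# kT at 300 K in kcal/mol (gas constant R = 0.0019872041 kcal/(mol K)).
KT_300K = 0.0019872041 * 300.0

# Hypothetical work values (kcal/mol) from five pulling simulations.
work = [12.1, 10.4, 15.3, 9.8, 11.7]
delta_f = jarzynski_free_energy(work, KT_300K)
mean_work = sum(work) / len(work)
print(f"Jarzynski estimate: {delta_f:.2f} kcal/mol; mean work: {mean_work:.2f} kcal/mol")
```

The exponential average is dominated by the lowest work values, so the estimate always lies between the smallest observed work and the mean work. That dominance by rare low-work trajectories is one reason so many pulling runs are needed for a reliable estimate.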

Nudged elastic band. The nudged elastic band (NEB) method is a technique for locating low energy transition pathways in biological systems. In NEB (64, 65), the minimum energy path for a conformational change is represented by a series of images of the molecule that describe the path. The images at the endpoints remain fixed in space, while each image in between is connected to its immediate neighbors by "springs" along the pathway that keep each image from sliding down the energy landscape onto adjacent images. By running simulated annealing followed by quenched MD, it is possible to freeze out these images equally spaced along a low energy reaction pathway. The advantage of the NEB method over traditional transition path sampling calculations is that it does not require an initial hypothesis for the pathway. Such simulations, when coupled with QM/MM approaches to allow for bond breaking and formation, will likely prove extremely useful in studying the catalytic action of cellulases acting on cellulose substrates. In particular, such approaches may offer insights into the processive nature of such enzymes.
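The image-and-spring idea can be illustrated on a toy problem. The sketch below applies the standard NEB force decomposition (the true force acts only perpendicular to the path, the spring force only along it) to a hypothetical two-dimensional double-well potential. It relaxes the images by plain steepest descent rather than the simulated annealing and quenched MD described above. The potential, the simple neighbor-difference tangent, and all names and parameters (`potential`, `gradient`, `neb`, `k_spring`, `step`) are assumptions made for illustration only.

```python
import math

def potential(x, y):
    """Toy double well: minima at (-1, 0) and (1, 0), saddle at (0, 0.5) with V = 1."""
    return (x * x - 1.0) ** 2 + 5.0 * (y - 0.5 * (1.0 - x * x)) ** 2

def gradient(x, y):
    """Analytical gradient of the toy potential."""
    u = y - 0.5 * (1.0 - x * x)
    return (4.0 * x * (x * x - 1.0) + 10.0 * u * x, 10.0 * u)

def neb(start, end, n_images=9, k_spring=1.0, step=0.01, n_steps=2000):
    """Relax a chain of images between two fixed endpoints."""
    # Initial guess: straight-line interpolation between the endpoints.
    path = [[start[0] + (end[0] - start[0]) * i / (n_images - 1),
             start[1] + (end[1] - start[1]) * i / (n_images - 1)]
            for i in range(n_images)]
    for _ in range(n_steps):
        new_path = [p[:] for p in path]
        for i in range(1, n_images - 1):      # endpoints stay fixed
            prev, cur, nxt = path[i - 1], path[i], path[i + 1]
            # Unit tangent estimated from the two neighboring images.
            tx, ty = nxt[0] - prev[0], nxt[1] - prev[1]
            norm = math.hypot(tx, ty)
            tx, ty = tx / norm, ty / norm
            # True force with its component along the tangent removed.
            gx, gy = gradient(cur[0], cur[1])
            g_par = gx * tx + gy * ty
            fx, fy = -(gx - g_par * tx), -(gy - g_par * ty)
            # Spring force acts only along the tangent and evens out spacing.
            d_next = math.hypot(nxt[0] - cur[0], nxt[1] - cur[1])
            d_prev = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
            f_spring = k_spring * (d_next - d_prev)
            new_path[i][0] = cur[0] + step * (fx + f_spring * tx)
            new_path[i][1] = cur[1] + step * (fy + f_spring * ty)
        path = new_path
    return path

path = neb((-1.0, 0.0), (1.0, 0.0))
barrier = max(potential(x, y) for x, y in path)
print(f"Highest image energy along the relaxed path: {barrier:.3f}")
```

The straight-line starting path crosses the ridge at an energy well above the saddle. After relaxation the highest image should sit close to the saddle point of the toy surface, whose energy is 1 by construction.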

Understanding cellulases

Most cellulase assays measure the production of reducing sugars from a high molecular weight form of cellulose, as every cleavage event produces a new reducing end. Endocellulases reduce the viscosity of carboxymethylcellulose (CMC), so another way to assay them is to measure the decrease in viscosity of CMC (12). Cellulases also can be assayed by measuring the increase in the number of cellulose particles produced by a cellulase incubated with a cellulose preparation of uniform particle size (13). This assay gave different kinetics of hydrolysis than measuring reducing end increase as it was linear with time and enzyme. It needed particles that were larger than 100 µm in diameter, as smaller particles did not show an increase in number even though they gave an increase in reducing ends (14).

There are two known cellulase mechanisms: hydrolysis with retention of the stereochemistry of the anomeric hydroxyl group, and hydrolysis with inversion of the anomeric hydroxyl group (15). One important difference between these mechanisms is that most retaining enzymes can catalyze transglycosylation as well as hydrolysis, while no known inverting enzymes catalyze transglycosylation (16). Cellulases are named by the family number associated with their catalytic domain (CD) followed by a capital letter that is assigned based on the order in which family members were discovered in a given organism, with A being used for the first (17).

Cellulases are currently the third largest industrial enzyme product worldwide, by dollar volume, because of their use in cotton processing, paper recycling, as detergent enzymes, in juice extraction, and as animal feed additives. However, cellulases will become the largest volume industrial enzyme if ethanol, butanol, or some other fermentation product of sugars produced from biomass becomes a major transportation fuel, as seems likely. Currently, industrial cellulases are almost all produced from aerobic cellulolytic fungi, such as Hypocrea jecorina (Trichoderma reesei) or Humicola insolens. This is due to the ability of these organisms to produce extremely large amounts of crude cellulase (about 130 g/L), the relatively high specific activity of their crude cellulase on crystalline cellulose, and the ability to genetically modify these strains to tailor the set of enzymes they produce, so as to give optimal activity for specific uses.

Most aerobic cellulolytic microorganisms secrete a set of individual cellulases, which contain a carbohydrate-binding module (CBM) joined by a flexible linker peptide to the CD; additional domains are often present. In some cellulases, the CBM is N-terminal to the CD, while in others it is C-terminal, and the location probably does not affect its function. In contrast, most anaerobic microorganisms produce large (>1 million MW) multienzyme complexes, called cellulosomes, which are usually bound to the outer surface of the microorganism (18, 19). Only a few of the enzymes in cellulosomes contain a CBM, but the scaffoldin protein to which they are attached does contain a CBM, which binds the complex to cellulose. In both aerobic and anaerobic organisms, certain cellulases can act synergistically on crystalline cellulose, with the specific activity of some mixtures being more than ten times that of any single cellulase in the mixture (20). Even though cellulose is a homopolymer of glucose, with only a single type of linkage (β-1,4) and with the disaccharide cellobiose as the repeating unit, cellulases are very diverse in their structures, mechanisms, and sequences.

Liquid hot water percolation pretreatment

Pressurized liquid hot water that is percolated or otherwise forced through a packed bed of biomass particles has been shown to result in high removal of both hemicellulose and lignin, with high recovery of hemicellulose-derived sugars (primarily in oligomeric form) and high digestibility of the resulting pretreated solids (47, 48). These processes may be difficult to commercialize because of the high volumes of liquid required to sustain a continuously flowing percolation process, although intermittent-flow approaches have been investigated as a way to reduce the liquid volume requirements (16). Nevertheless, percolation pretreatment techniques are useful in research applications to generate pretreated solids with a wide range of hemicellulose and lignin removal extents for enzymatic hydrolysis and related compositional and ultrastructure studies.

14.5.2 Acidic pretreatments

Dilute acid batch/co-current pretreatment

Dilute acid pretreatment is probably the most thoroughly investigated biomass pretreatment technique. A variety of acidic catalysts have been investigated in numerous batch/co-current dilute acid pretreatment reactor designs on a wide range of woody, herbaceous, and agricultural residue feedstocks. For cost reasons, most dilute acid pretreatment studies have utilized sulfuric acid or gaseous sulfur dioxide (in steam explosion applications), although several processes that utilize nitric, phosphoric, hydrochloric, or carbonic acid have also been investigated. Dilute acid pretreatment studies are widely distributed in the published literature and have been summarized in several review articles (10-15, 49, 50).

Dilute acid batch and co-current pretreatments are generally aimed at achieving near-complete solubilization of the hemicellulose fraction of biomass while also achieving high yields of hemicellulose-derived sugars. Many processes seek to directly achieve monomeric sugar formation, although care must be taken to prevent excessive degradation of the monomeric sugars. If performed properly, dilute acid pretreatment can be very effective at achieving reasonable monomer sugar yields via both hemicellulose hydrolysis and enzymatic digestion of the cellulose in the resulting solids across a range of biomass feedstock types (51). However, in batch or co-current mode, there will likely be some degradation losses of hemicellulose-derived sugars and possibly a requirement for conditioning the hydrolyzate liquid fraction prior to fermentation. Dilute acid pretreatment approaches have been tested in continuous co-current pilot scale reactor systems (52, 53) and have been the subject of intensive process simulation and economic analysis for potential commercial-scale operations using this pretreatment approach (54).

Xylogalacturonan synthesis

HG may contain regions that are substituted with β-D-xylose linked to C-3 of GalA (185, 186, 193, 267). Such regions of xylosylated HG are referred to as xylogalacturonan (XGA) and have been most frequently identified in reproductive tissues of plants including apple (184, 193, 197), cotton and watermelon (185), and pea (342), but also in carrot (186). However, xylogalacturonan has also been detected in Arabidopsis leaves and stems (187), albeit at lower levels than in reproductive and storage tissues such as apple and potato. No gene for XGA:xylosyltransferase (XGA:XylT) has been unambiguously identified. XGA:XylT activity was identified in studies of apiogalacturonan synthesis (341, 342). Although the product was not characterized in detail, at least some of the radioactive xylose appeared to be incorporated into apiogalacturonan and/or HG.

Interestingly, Nakamura and coworkers (263) have reported that in soybean some XGA may be further elongated with β-1,4-linked xylose residues, yielding β-1,4-xylans of up to seven xylosyl residues in length. Such results suggest that HG may, at least in soybean, be a primer or acceptor for a glycosyltransferase or a transglycosylase that establishes a link between pectin and the hemicellulose xylan. Such a linkage would be consistent with the characteristics of the qua1 mutant (mutated in GAUT8) (260) and the irx8 mutant (mutated in GAUT12) (132).

UDP-L-arabinose pyranose (UDP-Arap)

Arabinose (Ara) is an important sugar in plant walls and, with a few exceptions, the predominant form of Ara in plant glycans is the furanose configuration (Araf). However, some polysaccharides, RG-II for example, carry both forms of the arabinose moiety, i.e., Ara-furanose and Ara-pyranose. UDP-Ara (pyranose form) was identified in all plant extracts and is synthesized by (i) the phosphorylation of L-Ara at its C-1 by a membrane-associated L-arabinokinase (412), followed by a pyrophosphorylase that converts L-Ara-1-P and UTP to UDP-Arap, and (ii) a membrane-bound UDP-Xyl 4-epimerase (UXE) that converts UDP-D-Xyl to UDP-L-Arap.

Neufeld and coworkers (411, 412, 468) isolated sugar-kinase activities from different sources of tissues. A membrane fraction from bean was shown to catalyze the C-1 phosphorylation of L-Ara to β-L-Ara-1-P. The same membrane preparation phosphorylated D-Gal to α-D-Gal-1-P. However, the GalK kinase and the AraK kinase are different enzymes, since AraK requires a divalent ion for activity (Mg2+, Mn2+) whereas the GalK kinase requires no additional divalent ion for activity (412). In addition, treatment of membranes with digitonin solubilizes the L-AraK activity but not the D-GalK activity. The AraK is specific for the L-form, since D-Ara (which is common in prokaryotes) is not a substrate.

The ara1 mutant from Arabidopsis, in the At4g16130 locus, has reduced ability to metabolize arabinose and lacks Ara-1-P kinase activity (479). Bioinformatic analysis suggests that Ara1 belongs to a large family including galactokinase, homoserine kinase, mevalonate kinase, and phosphomevalonate kinase (GHMP kinases). The Ara1 protein is speculated to be a Type Ia membrane protein. If the topology is correct, it would be interesting to know whether the catalytic domain is facing the cytosol or the lumen. Direct biochemical assays and substrate specificity studies of the protein encoded by Ara1 were not performed. The subsequent pyrophosphorylation of α-L-Ara-1-P to UDP-Ara can be mediated by "Sloppy," the nonspecific UDP-sugar pyrophosphorylase (413).

UDP-Xyl 4-epimerase (UXE) was shown to reversibly convert UDP-D-Xyl to UDP-Ara (480). Bioinformatic analysis suggests that Uxe1 is a Type II membrane protein whose catalytic domain faces the lumen. A Uxe1-GFP chimera was localized to the Golgi apparatus (480). Two isoforms (UXE1, At1g30620 and UXE2, At2g34850) that share 83% amino acid sequence identity with each other exist in the Arabidopsis genome; two isoforms are in the rice genome, and three UXE isoforms were isolated from barley (Hordeum vulgare) (Zhang and Fincher, unpublished). Since several GTs were able to transfer Ara (pyranose) from UDP-Ara into plant glycans (271, 397), it remains a puzzle when the Ara acquires the furanose configuration. One can predict that the Araf donor has not yet been identified. However, this is unlikely, since the mur4 mutant (affected in the synthesis of UDP-Ara pyranose) lacks glycan consisting of Araf. This could imply that during the arabinosyltransferase-catalyzed reaction the Arap is altered to the Araf form. A specific mutase may exist to convert the Arap to Araf on the glycan itself, similar to the conversion of GlcA to IdoA in proteoglycans (481). Alternatively, UDP-Ara (furanose) is made in plants, as recently confirmed and described in the following section.

Sinapyl alcohol dehydrogenase

Another distinct dehydrogenase, this time from aspen, was claimed to be specific for sinapyl alcohol (5)/syringyl lignin formation (143). This report from the Chiang laboratory was quite unexpected, because the enzyme's actual broad substrate versatility for cinnamyl aldehydes 19-23 eliminated it as being biochemically specific for sinapyl alcohol (5)/syringyl lignin formation (77). That is, as previously noted for all the CADs proper, this enzyme was also substrate versatile. Thus, any substrate specificity, if it exists at all, would presumably be a result of compartmentalization, i.e., where sinapyl aldehyde (23) and the so-called "sinapyl alcohol dehydrogenase" were co-localized. Since this has not been established, the physiological role of this dehydrogenase remains unknown at present. More recently, an X-ray crystal structure for "SAD" was also obtained (144), which contained a vastly different substrate-binding pocket (in terms of both size and amino acid composition) from that of a bona fide CAD, i.e., AtCAD4/5 (133). Specifically, only 2 of the 12 amino acid residues which constitute the CAD substrate-binding pocket were conserved in "SAD," with the substrate-binding pocket for the latter also being considerably larger (133). Taken together, these data suggest an alternative biochemical function for "SAD," as also provisionally suggested by Bomati and Noel (144).

Furthermore, in Arabidopsis, no evidence for any requirement for a "SAD" was obtained, since >94% of all monolignol 1-5 formation for lignification was carried out by AtCAD4/5, both of which share considerable homology (74-83% similarity) to that of bona fide CADs (56). Moreover, the dehydrogenases of highest similarity to the "sinapyl alcohol dehydrogenases" (i.e., At4g37970 and At4g37980) did not reduce the p-hydroxycinnamaldehydes 19-23 to afford the monolignols 1-5 to any considerable extent in vitro (56). Thus, as for the other dehydrogenases described above, their biochemical and physiological roles need to be clarified as well.

A rice mutant, gold hull and internode (gh), was also first described as early as 1917, and is characterized by a reddish-brown color in the hull and internode but not in the midrib. Since then a series of mutants has been identified, i.e., gh1-gh4. Zhang et al. (145) recently characterized the gh2 mutant and showed that the GH2 gene encodes a CAD. The substrate versatile kinetic properties of the recombinant GH2 protein were determined, showing k_enz values for coniferyl (21) and sinapyl (23) aldehydes of ~289,000 versus ~162,000 M^-1 s^-1, respectively, these being in relatively good agreement with kinetic values for AtCAD5 in Arabidopsis (56). The recombinant mutant gh2 protein, however, lost the corresponding activities. Additionally, analyses of CAD activity in all tissues (i.e., panicle, hull, blade, midrib, sheath, internode, and root) of both wild type and gh2 mutant plants showed that the formation of coniferyl alcohol (3) was greatly reduced in the roots, internodes, hulls, and panicles of the gh2 mutant, with no detectable formation of sinapyl alcohol (5). On the other hand, the gh2 mutation apparently had little to no effect on overall lignification (estimated at a 5-6% reduction). Taken together, there is no evidence thus far that "SAD" has the specific biochemical/physiological functions reported earlier (143).