Category Archives: Biomass Recalcitrance

Cellulose microfibrils

Here we first review evidence regarding the native state. Perhaps the most informative observations from a quantitative perspective are those of Hanley et al. regarding the alga Micrasterias denticulata (15) and those of Haigler regarding celluloses formed by the bacterium Acetobacter xylinum (16). In both instances, a long-period helical form is observed.

A micrograph of Micrasterias denticulata provides an excellent demonstration of the regularity of the periodicity. It is shown in Figure 6.1, where the period of about 1200 nm is evident. Panel A shows the micrographs of the fibril, while panel B provides the authors’ rationalization of the appearance of linear segments connected by highly deformed segments within which the 180° turning occurs. The linear segments are about 600 nm each, so that a helical twist of 180° occurs over 600 nm, and two linear segments totaling about 1200 nm correspond to the full period over which a complete turn of 360° occurs.

A similar periodicity has been observed by Haigler in her pioneering studies of the structures of bacterial celluloses from Acetobacter xylinum and their response to different perturbations of their growth environment (16). The period consistently observed is about 600 nm, which has also been reported for celluloses formed by A. xylinum (17). A key point is that the long period of the helical biological structure appears to depend on lateral dimensions. The lateral dimensions of Micrasterias denticulata fibrils are approximately 10 by 20 nm, whereas the lateral dimensions of bacterial cellulose microfibrils are approximately 6-7 nm. Thus, one would expect that the period of the nanofibrils in higher plant cell walls, which have lateral dimensions of the order of 3-5 nm, would be significantly less than 600 nm.

Figure 6.1 A fibril of Micrasterias denticulata as observed in panel A, and the rationalization of the appearance of the dehydrated sample (B) proposed by the authors (15). (Courtesy of Professor Derek Gray, Paprican and McGill University, Montreal, Quebec.)

The conclusion regarding the period of higher plant fibrils has been confirmed by the most recent comprehensive molecular modeling studies carried out for hydrated aggregates of cellulose molecules (14). It is also confirmed by atomic force microscopic images of maize parenchyma cell walls shown in Figure 6.2. Here we see that the fibrils appear to vary in width as indicated by the arrows. The variation in width is indicative of an ellipsoidal cross section as suggested by the ellipsoidal polyhedral model proposed by Ding and Himmel (18). The separation of the narrower sections along an individual fibril appears to be of the order of 200-250 nm. The lateral dimensions of the ellipsoidal polyhedron are approximately 3 by 5 nm.

Figure 6.2 An atomic force microscope (AFM) image of the surface of maize parenchyma cell. The large arrow shows a single microfibril at its narrow point. The two smaller arrows show where it is atop another microfiber. Measurement of the elevation provides an approximate dimension. [Adapted from Himmel, M. E., Ding, S.-Y., Johnson, D. K., Adney, W. S., Nimlos, M. R., Brady, J. W. & Foust, T. D. (2007) Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science, 315, 804-807.]

Before discussing the implications of the molecular modeling studies and patterns of fibril aggregation, it is helpful to clarify the questions regarding symmetry and helical organization with the aid of geometric models of the constrained crystallographic structures and the unconstrained structures manifesting the long-period helical twist. These are illustrated in Figure 6.3, which shows scaled representations of nanofibrils of different sizes. In panel A, they are represented as they would be if they were describable in terms of the symmetry of space groups as implicit in the crystallographically determined structures. They range in cross-sectional size from 2 by 2 nm, which approximates the most elementary fibrils observed, to 20 by 20 nm fibrils representative of algae such as Valonia and Cladophora as well as the tunicate Halocynthia. The length dimension has been scaled to be 300 nm presented in 4 nm intervals to aid in visualization. The geometry of the fibrils was defined by the requirement of translational symmetry along three non-coplanar linear axes in Cartesian space. Note that the angles between axes will not be 90° for many crystallographic space groups, although linear translational symmetry in three dimensions is foundational in space group theory (19). Panel B, in contrast, shows the same fibrils as transformed to reflect a helix with a period of 1200 nm. The figure in panel B represents a 90° turn of the end of each fibril over 300 nm to display the effect of a 360° turn over 1200 nm.
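The arithmetic behind panel B can be made concrete with a short Python sketch. It assumes an idealized, uniform twist rate (a full 360° per period) and a square cross section; the function names are illustrative only and do not come from the source, and the Micrasterias micrographs suggest that in dehydrated samples the turning is in fact concentrated in short deformed segments.

```python
import math

def twist_angle_deg(length_nm: float, period_nm: float) -> float:
    """Rotation of the cross section after length_nm, assuming uniform twist."""
    return 360.0 * length_nm / period_nm

def corner_displacement_nm(width_nm: float, length_nm: float, period_nm: float) -> float:
    """Distance a corner of a square cross section has moved, relative to the
    untwisted (panel A) fibril, at a distance length_nm along the axis."""
    radius = width_nm * math.sqrt(2) / 2          # centre-to-corner distance
    theta = math.radians(twist_angle_deg(length_nm, period_nm))
    return 2 * radius * math.sin(theta / 2)       # chord length for rotation theta

# Fibril widths quoted in the text: 2 nm and 20 nm, over 300 nm of a 1200 nm period.
for width in (2, 20):
    angle = twist_angle_deg(300, 1200)
    shift = corner_displacement_nm(width, 300, 1200)
    print(f"width {width} nm: twist {angle:.0f} deg, corner shift {shift:.1f} nm")
```

Under these assumptions a 90° turn moves each corner by exactly one fibril width, so the departure from translational symmetry grows in proportion to the lateral dimension of the fibril.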


It is immediately obvious that assumptions underlying crystallographic analyses are approximations, the implications of which cannot be ignored. For this reason, we need to adopt new terminology to avoid confusion. We suggest no longer using the terms crystal or crystalline, but rather use the term “aggregate” to indicate ordered fibrils. We regard them as highly ordered biological structures that do not meet the classical criteria for crystallinity illustrated in panel A of Figure 6.3. However, since they are periodic at the molecular level as well, we anticipate that they will diffract X-rays in a pattern that approximates that expected from the structures depicted in panel A of Figure 6.3.

The central problem for crystallographic analyses is that the helical twist in fibrils eliminates the possibility of constructing a reciprocal space, and such a construction is essential for the interpretation of diffractometric data. Given this observation, one must ask why it has been ignored for approximately 100 years since crystallinity in cellulose was first proposed and diffractometric studies were undertaken. We suggest that a key obstacle has been transformation of native celluloses in the course of isolation. Before reviewing the matter further, it is useful to begin consideration of the native state in living plants by reviewing the results of the molecular modeling program.

CCR, CAD, F5H, and COMT downregulation/mutation, and the enigma of monolignol radical generation

Since the early 1930s or so, there have been a number of reports indicating that various agronomically important plant species, i.e., maize (Zea mays), sorghum (Sorghum bicolor), and pearl millet (Pennisetum glaucum), can produce so-called brown-midrib mutants. The maize mutants (bm1-bm4) were spontaneous mutations (58-60) and were later shown to have lower lignin contents than wild-type lines (62), whereas both sorghum (227) and pearl millet (228) mutants were generated via chemical mutagenesis. For sorghum, out of the 19 individual mutant lines obtained, two (bmr12 and bmr18) had estimated lignin contents reduced by 42 and 45%, respectively, when compared to wild type (227). Since then a spontaneous mutation has also been identified in sorghum (bmr26) (229). To our knowledge, none of these mutants has yet found commercial application since its discovery, owing to defects such as increased brittleness, increased lodging, and delayed growth and flowering (230).

Some of these lines have been characterized as mutations in CAD (bm1) (67) and COMT [bm3 (66, 231) and bmr12/bmr18/bmr26 (229)]. As noted earlier, our previous metabolic flux analyses (34) had suggested that under the conditions employed none of these steps would “normally” have a rate-limiting role in terms of carbon allocation to the monolignol/lignin pathway. Nor would modulation of F5H be anticipated to alter carbon (metabolic flux) allocation. Effects of manipulating each of these steps are thus discussed below.

Molecular model types

There are a number of variations on the classical force field approach described above. One can categorize the different methods by considering the concept of a molecular model. A molecular model is defined by its basic elements, such as atoms and bonds; the nature of the interactions of these elements, described by a force field equation such as that discussed above; the parameters of the potential; and the method used for the representation of solvent. Together these elements form the model. There is a tight connection between a model type and an accompanying force field in the sense that the molecular model and the kinds of behaviors to be modeled determine what needs to be in the force field. However, there can be several force fields, with different parameters and functional forms, for the same molecular model type. With this important distinction between the model type and the force field in mind, we discuss the most commonly used model types, then the most popular force fields, and finally give a brief description of current solvent model types. Each of these topics, coupled with the background to classical molecular mechanics described above, deserves a full chapter in itself, and so we present only a brief overview of what is most popular and what we believe to be the essential tools in current state-of-the-art research. For a more in-depth discussion the reader can refer to Allen and Tildesley (9), Grant and Richards (10), Frenkel and Smit (11), Leach (12), Jensen (13), and Cornell et al. (6).

The most detailed and also most used classical mechanical model for biomolecular simulation is the “All-Atom” model in which the basic element of the model is an atom with properties of mass, partial charge, and an atom type which describes its bonding properties and van der Waals parameters. The energy function of a given arrangement of atoms in an all-atom approach typically conforms to the AMBER force field equation described above, which is itself an all-atom model. An all-atom molecular model is created by describing all the atoms and choosing a force field. A full description of atoms includes specifying which atoms are bonded together, what kind of atoms they are, such as an sp2 carbon, and positioning them in space. The “United Atom” model (14-17) is the same as an all-atom model except that aliphatic hydrogen atoms are combined with the carbon to which they are bonded to form a “united” atom with most of the properties of the carbon, but a larger van der Waals radius and a larger mass. This model requires a different force field than the all-atom model.
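The distinction between the two model types can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Atom record, the to_united_atoms helper, the atom type labels, and the partial charges of the toy methanol molecule are hypothetical and are not taken from AMBER or any other force field. For simplicity every hydrogen bonded to a carbon is treated as aliphatic, and the enlarged van der Waals radius mentioned above is not modeled, because it would have to come from a united-atom force field.

```python
from dataclasses import dataclass, field

@dataclass
class Atom:
    name: str
    element: str
    mass: float        # atomic mass units
    charge: float      # partial charge in units of e (illustrative values)
    atom_type: str     # hypothetical force-field type label
    bonded: list = field(default_factory=list)  # names of bonded atoms

def to_united_atoms(atoms):
    """Fold each hydrogen bonded to a carbon into that carbon."""
    by_name = {a.name: a for a in atoms}
    absorbed = {h for a in atoms if a.element == "C"
                for h in a.bonded if by_name[h].element == "H"}
    united = []
    for a in atoms:
        if a.name in absorbed:
            continue  # this hydrogen now lives inside its carbon
        hs = [by_name[n] for n in a.bonded if n in absorbed] if a.element == "C" else []
        united.append(Atom(
            name=a.name,
            element=a.element,
            mass=a.mass + sum(h.mass for h in hs),        # larger mass
            charge=a.charge + sum(h.charge for h in hs),  # net charge conserved
            atom_type=f"{a.atom_type}_UA{len(hs)}" if hs else a.atom_type,
            bonded=[n for n in a.bonded if n not in absorbed],
        ))
    return united

# Toy all-atom methanol; charges are made up but sum to zero.
methanol = [
    Atom("C1", "C", 12.011, 0.12, "CT", ["H1", "H2", "H3", "O1"]),
    Atom("H1", "H", 1.008, 0.02, "HC", ["C1"]),
    Atom("H2", "H", 1.008, 0.02, "HC", ["C1"]),
    Atom("H3", "H", 1.008, 0.02, "HC", ["C1"]),
    Atom("O1", "O", 15.999, -0.65, "OH", ["C1", "HO"]),
    Atom("HO", "H", 1.008, 0.47, "HO", ["O1"]),
]
united = to_united_atoms(methanol)
print([(a.name, a.atom_type, round(a.mass, 3)) for a in united])
```

The six-site all-atom description collapses to three sites, while the hydroxyl hydrogen is kept explicitly because it is not bonded to carbon.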

Reduced models currently in use are the bead (18) or GO (19, 20) models, models used in large-scale normal mode analysis, and large subunit modeling such as is used in virus capsid assembly modeling (21). Bead models use a whole residue as the element of the model: each residue, such as an amino acid in a protein, is represented by a single sphere with size and interaction properties, and each bead is bonded to other beads. There is a simple force field for this kind of model, which can include attractions, repulsions, and bonding properties such as angles and dihedrals. It is used primarily for studies of folding of biopolymers. For large-scale normal mode analysis, described later, the Hessian matrix can become impossibly large at 9N² elements for a system of N atoms. An elastic-network model, in which each alpha carbon of a protein is connected to every other alpha carbon by a spring, reproduces the lowest frequency modes (22-24), which are generally the modes of interest, reducing the size of the problem by at least an order of magnitude. The granularity of the problem has been increased further by combining multiple residues into blocks with only rotational and translational degrees of freedom, bonded together and with an elastic spring network; this model is the RTB model (25). The largest granularity that is worth mentioning is the subunit model used in simulating virus capsid assembly (26), in which each unit represents the basic subunit of a virus, which contains three to nine proteins. Each subunit is a rigid body that interacts with other subunits through interaction points with both attractive and repulsive forces. One could envision using this kind of modeling for interactions between the subunits of the plant cell wall.
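The scaling argument for the Hessian can be checked with a few lines of Python. This is a sketch under stated assumptions: the matrix is stored densely with 8-byte floating-point entries, and the system sizes (50,000 atoms, 3,200 residues) are hypothetical numbers chosen only to illustrate the point.

```python
def hessian_entries(n_sites: int) -> int:
    """Elements in the (3N x 3N) Hessian for N sites, i.e. 9 * N**2."""
    return (3 * n_sites) ** 2

def hessian_gib(n_sites: int, bytes_per_entry: int = 8) -> float:
    """Dense storage in GiB, assuming 8-byte floats (an assumption)."""
    return hessian_entries(n_sites) * bytes_per_entry / 2**30

# Hypothetical system: all atoms versus one site per residue (alpha carbons).
n_atoms, n_residues = 50_000, 3_200
for label, n in (("all-atom", n_atoms), ("alpha-carbon network", n_residues)):
    print(f"{label}: {hessian_entries(n):,} entries, {hessian_gib(n):.2f} GiB")
```

Because the number of entries grows with the square of the number of sites, cutting the site count by an order of magnitude shrinks the dense Hessian by roughly two orders of magnitude.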


Xyloglucan is one of the major hemicellulose polymers in growing primary cell walls of various plant species. Due to analytical problems, it has been difficult to differentiate it from cellulose and xylan. Xyloglucan is closely associated with cellulose microfibrils via hydrogen bonding, thus providing the load-bearing network of the cell wall, which protects the cell wall from collapsing due to osmotic stress. Most research efforts have been focused on determination of plant enzymes responsible for control and modification of the expanding cell wall (42-60). Although not as heavily branched as xylans, the xylose and other substituents on xyloglucan can make enzymatic digestion more complicated than that of cellulose and β-glucan.

Xyloglucan has a cellulose-like backbone of β-1,4-D-glucopyranose residues to which α-D-xylose residues are attached at the C-6 position. Generally, 60-75% of the glucose residues are branched with xylose, except in grasses, which have a lower degree of substitution. Xylose can form side chains with D-galactopyranose and L-fucopyranose saccharides, and rarely with L-arabinofuranose. Plant cell wall degrading enzymes, such as endoglucanases, xyloglucan endotransglycosylases, and exoglycosidases like α-fucosidases or β-galactosidases, have been reported to digest xyloglucan. Also, some cellulases of Trichoderma reesei are able to hydrolyze the xyloglucan backbone (61). A new class of polysaccharide-degrading enzymes comprises the specific xyloglucanases, or xyloglucan-specific endoglucanases, which can attack the backbone also at substituted glucose residues (62).

Some xyloglucanases require a specific xylose substitution pattern while others are more general (43). This determination seems to be dependent on the binding sub-sites in specific endoglucanases (61). A xyloglucanase from Aspergillus niger has been shown to be active against several β-glucans, but to have the highest activity against tamarind xyloglucan (63). This, combined with its lack of synergy with cellulases, indicates a specificity different from that of traditional endoglucanases. A plant-specific enzyme believed to be responsible for modification of xyloglucan in the cell wall through endo-hydrolysis and glycosyl transferase activities has also been characterized (59, 60).

The cellulosome rationale

Figure 13.9 Solubilization of concentrated suspensions of microcrystalline cellulose by the combined cellulosome-β-glucosidase system. The figures show the time course of solubilization of the indicated concentrations of Avicel using purified preparations of the C. thermocellum cellulosome (8 μg per mg cellulose) together with the A. niger β-glucosidase (0.04 cellobiase units per mg substrate). At very high substrate concentrations, a second sample of the combined enzyme system was applied (arrow) to achieve near-complete solubilization of substrate (dashed line). Without the added β-glucosidase (-βG), only meager levels of cellulose solubilization are observed.

Why did some cellulolytic anaerobes evolve such a complicated mechanism, producing an intricate multi-component conglomerate of enzymes, CBMs, and other functional modules, assembled into a discrete type of complex that is located in bundle-like organelles over the cell surface? Such an expense in metabolic energy would imply that the bacterium gains a significant compensation for the effort. A biochemical rationale for the enhanced cellulolytic activity of cellulosomes on recalcitrant forms of cellulose was proposed upon its discovery (88, 120):

The (cellulosome) complex apparently comprises various different forms of cellulases, each of which may bear separate specificities toward different quaternary structures on the complex cellulose substrate. The major organizational role of this complex might be designed for effective delivery to the substrate as well as to bring into proximity the various complementary enzymes (e.g., exo and endocellulases). In addition, the complex may be structured in such a way as to enable the protection of various product intermediates and to facilitate their transfer to other cellulase components for further hydrolysis. In any event, the cellulase subunits seem to be arranged within the CBF (cellulosome) complex in a defined supramolecular fashion designed for highly efficient cellulose degradation.

Proof of the targeting and proximity effects was eventually realized through the use of artificial “designer cellulosomes,” which enabled controlled incorporation of selected cellulases into a cellulosome-like complex (144, 145). For this purpose, a chimeric scaffoldin that contains divergent cohesins and matching dockerin-bearing enzymes can be mixed in vitro to form minicellulosomes of defined composition and spatial arrangement (Figure 13.10). The capacity to prepare cellulosomes of uniform composition and to control the types of enzymes included therein has contributed a better understanding of the factors important for efficient cellulosome action. Thus, the two major factors that serve to enhance deconstruction of recalcitrant forms of cellulose are indeed targeting to the substrate surface by the scaffoldin-borne cellulose-binding module (CBM), and the consequent proximity of the enzyme components.


The incorporation of enzymes into cellulosomes would ensure that different complementary activities are all contained in the same microscopic area of the cellulose substrate rather than being dispersed statistically over its surface. Separation of relatively small amounts of enzymes by large distances would thwart the synergistic action of complementary enzymes, each of which bears a CBM that prevents its free diffusion as in the case of the free (non-cellulosomal) enzymes. The free systems are limited by the fact that such enzymes are statistically isolated and hence do not benefit from the full complementary action of the other enzymes. The “proximity effect” of the cellulosomes counteracts this problem. The “remedy” for the free systems is the production of very large quantities of enzymes to overcome their solitude, which is the strategy taken by aerobic bacterial and fungal systems. In fact, this may be the reason why cellulose-binding CBMs are appended to non-cellulolytic enzymes: both cellulosomes and the free enzyme systems use CBMs to prevent the useless dispersion of their enzymes (whether cellulolytic or not). Disruption of the sites of the cellulose-hemicellulose interface is a key to potent digestion of the plant cell wall. It is imperative that the different enzymes, the cellulases and hemicellulases, are in close proximity to each other. Because of their large size (relative to bacteria), the fungi probably deliver their cellulase mixture in a localized specific area (the tip of the hypha, perhaps), and the enzymes then remain together in that area as a function of their respective CBMs. In contrast, due to its smaller size, it is in C. thermocellum’s best interest to remain attached to a multienzyme system capable of accommodating almost any type of enzyme needed to break down completely a piece of plant cell wall without having to move. This may be the reason that cellulosomes appear to be so superior to the free enzyme systems.

With the exception of the cellulosome cluster in C. acetobutylicum, which probably produces a crippled complex (83, 84), it can be seen from the few known genomes of cellulosome-producing bacteria that there is a much larger number of dockerin-containing enzymes that can be attached to a scaffoldin than there are slots (cohesins) on the scaffoldin. In these bacteria, there are more hemicellulose-degrading enzymes than there are cellulases. This implies that cellulosome organization is beneficial not only for crystalline cellulose digestion, but equally for the digestion of hemicelluloses, pectins, and other plant cell wall polysaccharides. This holds true even for C. thermocellum, a bacterium capable of assimilating only cellulose and its degradation products. Nevertheless, it appears to serve as the major polysaccharide-degrading bacterium in its ecosystem, passing on the surplus (e.g., cellobiose) and/or superfluous (e.g., xylose, etc.) sugars to its companion (saccharolytic) strains that share the same locale. Particular types of plant cell wall-derived biomass require very precise enzyme mixtures. In this context, previous attempts to “benchmark” one cellulase “system” against another did not particularly reflect an intrinsic ability but the “fit” of the enzyme mixture with respect to the actual substrate used. In other words, it is hard to compare one complex mixture to another.

For many decades, it was believed that the solution to efficient degradation of recalcitrant cellulose would be found in conventional engineering approaches by employing a set of soluble enzymes, such as the then-known endo- and exo-glucanases. The amount of financial resources invested, set against the expectations of society, resulted in great disappointment that served to minimize further scientific activity in this type of research. Very little has been achieved on the scientific front during the interim period. However, some positive results in this area have been recorded, including the discovery of the cellulosome (the topic of this chapter) and the realization that the bacterial cellulases are multi-modular entities that exhibit distinct functionalities. It is hoped that in the future these novel contributions will encourage a new burst of serious scientific and applied efforts in this area. The resolution and harnessing of the cellulosome (146) may thus provide a renewed opportunity for combating the difficulties encountered in digesting recalcitrant cellulosic biomass in an effective, cost-efficient manner.


The authors are grateful to Claire Boisset, Henri Chanzy, Pedro M. Coutinho, the late Martin Schulein, Ely Morag, Yuval Shoham, Ilya Borovok, Henri-Pierre Fierobe, Jean-Pierre and Anne Belaich, Marco Rincon and Harry Flint for their input, discussions, and collaboration, past and present. This work was funded by Research Grants 394/03 and 422/05 from the Israel Science Foundation (Jerusalem), a grant from the Alternative Energy Research Initiative, and by grants from the United States-Israel Bi-national Science Foundation (BSF), Jerusalem, Israel. EAB is the incumbent of The Maynard I. and Elaine Wishner Chair of Bio-organic Chemistry at The Weizmann Institute of Science.

Cellulase regulation

Regulation of cellulase synthesis by C. thermocellum is an important feature of the physiology of this microorganism, particularly in light of the substantial investment of ATP that cellulase synthesis represents (34, 35). Johnson and coworkers (36) reported that synthesis of true cellulase activity (i.e., degradation of crystalline cellulose) was markedly repressed by cellobiose. mRNAs corresponding to endoglucanases CelA, CelF, and CelD were found to be regulated at the level of transcription by a mechanism analogous to catabolite repression (37). The numbers of CelS (the most dominant cellulolytic enzyme of the cellulosome) and CipA (the cellulosomal scaffolding protein) transcripts per cell were shown to decrease with increasing growth rate (38, 39). CelS transcript levels were found to be higher for growth under cellobiose limitation than for growth under nitrogen limitation (38), and control of scaffoldin and CelS transcription was shown to involve a housekeeping Sigma-A factor (39).

Stevenson and Weimer (40) investigated expression profiles of 17 genes involved in cellulose hydrolysis, intracellular phosphorylation, catabolite repression, and fermentation end product formation as determined by real-time PCR in continuous cultures grown on cellobiose and cellulose. Thirteen genes displayed modest (fivefold or less) differences in expression in response to varied growth rate or substrate. By contrast, cipA, celS, and manA genes displayed 10-fold reduced levels when grown on cellobiose at dilution rates of >0.05/h, suggesting that at least some cellulosomal components are transcriptionally regulated.

Zhang and Lynd (32) investigated the regulation of cell-specific cellulase synthesis (defined as mg cellulase/g cell dry weight) by C. thermocellum using an ELISA protocol based on an antibody raised against a peptide sequence from the scaffoldin protein (31). We found that cellulase synthesis in Avicel-grown batch cultures was nine times greater than in cellobiose-grown batch cultures. In substrate-limited continuous cultures, however, cellulase synthesis with Avicel-grown cultures was greater than that in cellobiose-grown cultures by 1.3- to 2.4-fold, depending on the dilution rate. Continuous cellobiose-grown cultures maintained at either high dilution rates or high feed substrate concentration resulted in decreased cellulase synthesis, with a large (sevenfold) decrease between 0 and 0.2 g/L cellobiose and a much more gradual further decrease for cellobiose concentrations >0.2 g/L.

Several factors suggest that cellulase synthesis in C. thermocellum is regulated by carbon catabolite repression (CCR). These factors include: 1) substantially higher cellulase yields observed during batch growth on Avicel as compared to cellobiose, 2) a strong negative correlation between cellobiose concentration and cellulase yield in continuous cultures with varied dilution rate at constant feed substrate concentration and also with varied feed substrate concentration at constant dilution rate, and 3) the presence of sequences corresponding to key elements of catabolite repression systems in the C. thermocellum genome. CCR-mediated control of cellulosome synthesis in C. thermocellum is supported by the observation that the three key components of a CCR system are present in the C. thermocellum genomic sequence (http://genome.ornl.gov/microbial/cthe/): a LacI/GalR family regulatory protein, an HPr protein and an HPr kinase, and a 14-bp cis-acting catabolite responsive element (CRE) binding sequence. Several putative LacI/GalR family genes are found in C. thermocellum. We were able to locate many (>100) putative CRE sequences, including two putative CREs inside the cipA structural gene (+953, +5231), using the more degenerate CRE consensus sequence (WGWNANC/GNTNNCW). Substantial degeneracy of CRE sequences is supported by results from B. subtilis. For example, Chauvaux (41) found only 29 CRE sequences based on a consensus sequence with 7 of the 14 bases degenerate, whereas Moreno and coworkers (42) found, using DNA arrays, that ~330 genes were regulated by CCR. Moreover, whole genome analysis of B. subtilis indicates that the CRE sequence is not strictly conserved, and that CRE variation provides a means to alter the affinities of regulatory proteins to CRE sequences, thereby modulating regulation (42, 43).
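A search for putative CRE sites with a degenerate consensus can be sketched in Python with the standard re module. Several things here are assumptions rather than the authors' procedure: the slash in the printed consensus is read as a marker of the center of the 14-bp site, so the motif is taken as WGWNANCGNTNNCW; W and N follow the usual IUPAC meanings (A or T; any base); the function names and the toy sequence are made up; and only the given strand is scanned.

```python
import re

# IUPAC codes needed for this motif: W = A or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}

def motif_to_regex(motif: str):
    """Compile a degenerate motif; the lookahead lets overlapping sites match."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile(f"(?=({body}))")

def find_sites(sequence: str, motif: str):
    """Return (1-based start, matched site) for every hit on the given strand."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]

CRE = "WGWNANCGNTNNCW"                        # assumed reading of the consensus
toy = "CCCC" + "TGAAAGCGCTTTCA" + "GGGG"      # hypothetical test sequence
print(find_sites(toy, CRE))                   # [(5, 'TGAAAGCGCTTTCA')]
```

A genome-scale scan would also need to search the reverse complement and to convert hit positions into gene-relative coordinates such as the +953 and +5231 positions quoted above.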


Pectin is likely the most structurally complex family of polysaccharides in nature (Figures 5.2 and 5.3). Pectin is particularly abundant in primary walls, i.e., those walls surrounding growing and dividing cells and the terminal wall in many cells of the soft parts of the plant. Pectin is also abundant in the middle lamella, which is the junction between adjacent cells. Pectin comprises ~35% of the polysaccharides in dicot and non-graminaceous monocot primary walls, and 2-10% of the wall in the grasses (157, 158). Pectin is also present in the walls of gymnosperms, pteridophytes, and bryophytes as well as Chara, a charophycean alga, which is believed to be the closest extant relative of land plants (159). Although pectin is not a major component of secondary walls, it is present as the outer layer of secondary walls and can represent ~5% of harvested tree wood. Thus, depending on the plant and tissue used, pectin will be present in the biomass used for biofuel production and, since it comprises a complex interconnected matrix in the wall, likely affects the recalcitrance of biomass to deconstruction for biofuel production.

Figure 5.2 Representative structures of the three pectic polysaccharides HG, RG-I, and RG-II.

Pectins have multiple roles in plant defense, growth, and development (158, 160). They provide wall structure (161), bind and exchange apoplastic anions and macromolecules (162, 163), influence cell-cell adhesion (164, 165), and are involved in cell signaling (166, 167). Pectins have roles in pollen tube growth (168), seed hydration (169-171), leaf abscission (172), guard cell function (173), organ formation (174, 175), fruit development (158), and possibly water movement (176). Pectic oligosaccharides are intercellular signal molecules (177) in plant development (166) and defense responses (178, 179). Mutant plants with altered pectin structure may be dwarfed (161, 180), have brittle leaves (164), reduced numbers of side shoots and flowers (175), and reduced cell-cell adhesion (135, 181).

Figure 5.3 Schematic structure of pectin showing the three main pectic polysaccharides homogalacturonan (HG), rhamnogalacturonan I (RG-I), and rhamnogalacturonan II (RG-II) linked to each other. A region of substituted galacturonan known as xylogalacturonan (XGA) is also shown. The legend of the original figure identifies the symbols for glucuronic acid, acetyl groups, and methyl groups. The representative pectin structure shown is not quantitatively accurate; HG should be increased 12.5-fold and RG-I increased 2.5-fold to approximate the amounts of these polysaccharides relative to each other in plant walls. The monosaccharide symbols used are either from the Symbol and Text Nomenclature for Representation of Glycan Structure, Nomenclature Committee, Consortium for Functional Glycomics (http://www.functionalglycomics.org/glycomics/molecule/jsp/carbohydrate/carbMoleculeHome.jsp) or from D. Mohnen. (The figure is modified from http://www.uk.plbio.kvl.dk/plbio/cellwall.htm.) (Reproduced in color as Plate 4.)

Pectin is defined as a family of plant cell wall polysaccharides that contain 1,4-linked galacturonic acid (157). Galacturonic acid (GalA) comprises roughly 70% of total cell wall pectin and is a major component of the three major types of pectic polysaccharides: homogalacturonan (HG), rhamnogalacturonan I (RG-I), and the substituted galacturonans for which rhamnogalacturonan II (RG-II) is the most ubiquitous and structurally invariant member (157). In addition, pectin includes the less abundant substituted galacturonan xylogalacturonan (XGA) (182-187) and apiogalacturonan (AG) (158, 188-191). The complex structure of the pectic polysaccharides makes the study of pectin synthesis challenging. It is estimated that at least 58 enzymes are required to synthesize pectins, including methyltransferases, acetyltransferases and numerous glycosyltransferases (192).

HG accounts for ~65% of pectin (193, 194) and is a homopolymer of α-D-1,4-linked GalA residues (Figures 5.2 and 5.3) that is partially methylesterified at the C-6 carboxyl (157, 195), may be O-acetylated at O-2 or O-3 (196-199), and may contain other esters whose structure remains unclear (200-204). RG-I accounts for 20-35% of pectin (194) and is a family of polysaccharides with an alternating [→4)-α-D-GalA-(1→2)-α-L-Rha-(1→] backbone (Figures 5.2 and 5.3). Between 20 and 80% of the rhamnosyl residues are substituted with side chains composed predominantly of linear and branched α-L-Araf and β-D-Galp (157, 205). The main types of side chains include α-1,5-linked L-arabinan with some 2- and 3-linked arabinose or arabinan branching, β-1,4-linked D-galactans with some 3-linked L-arabinose or arabinan branching, and β-1,3-linked D-galactan with β-6-linked galactan or arabinogalactan branching (205). RG-I side branches may also contain α-L-Fucp, β-D-GlcpA, and 4-O-Me β-D-GlcpA residues (206). The composition and length of RG-I side chains vary between cell types and between plant species (158, 160). RG-II accounts for ~10% of pectin (158, 194) and contains 12 different types of sugars in over 20 different linkages. The HG backbone of RG-II is substituted at O-2 and O-3 with four structurally complex oligosaccharides, A-D (159) (Figures 5.2 and 5.3). RG-II in the plant generally occurs as RG-II dimers crosslinked by borate diesters (159). The 4-linked galacturonans that are substituted at O-3 with D-xylose (the xylogalacturonans, XGA) are often found in reproductive tissues (157, 186, 193, 207), whereas galacturonans substituted at O-2 or O-3 by D-apiofuranose (188, 189) (the apiogalacturonans, AG) are restricted to selected aquatic monocots (e.g., Lemna).
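As a rough consistency check on the proportions quoted above, the stated shares can be tabulated and summed. This is only a sketch: the names `PECTIN_SHARES` and `total_share_range` are illustrative, and treating "~65%" and "~10%" as single point values is an assumption made for the arithmetic.

```python
# Approximate shares of total pectin quoted in the text, in percent,
# stored as (low, high) ranges.
# Assumption: "~65%" (HG) and "~10%" (RG-II) are treated as point values.
PECTIN_SHARES = {
    "HG": (65.0, 65.0),
    "RG-I": (20.0, 35.0),
    "RG-II": (10.0, 10.0),
}


def total_share_range(shares):
    """Return the (low, high) sums of the per-polysaccharide ranges."""
    low = sum(lo for lo, _ in shares.values())
    high = sum(hi for _, hi in shares.values())
    return low, high


low, high = total_share_range(PECTIN_SHARES)
print(f"Sum of quoted shares: {low:.0f}-{high:.0f}% of pectin")
```

The summed range brackets 100%, which suggests the quoted figures are approximate and mutually compatible. The less abundant XGA and AG are not included because the text gives no percentages for them.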

When walls are isolated from the plant, the pectic polysaccharides appear to be covalently cross-linked since harsh chemical treatments or digestion by pectin-degrading enzymes is required to isolate HG, RG-I, and RG-II separately from each other. It is not known, however, how the pectic polysaccharides are covalently linked to each other or to other polymers in the wall. It is also not clear where and how that cross-linking occurs, i.e., via the action of glycosyltransferases in the Golgi or by transglycosylases or other enzymes in the wall. The available data (208) support a model whereby HG, RG-I, and RG-II are linked via their backbones. However, due in part to the uncertainty of how and where the pectic polysaccharides are cross-linked, it is currently not possible to predict the complete repertoire of biosynthetic enzymes that are needed to synthesize pectin. Furthermore, although the general types of pectic polysaccharides are similar in different plant species, there is a growing body of evidence showing that species-, cell-type-, and developmental state-specific differences in pectin structure exist, thereby making it likely that the number and types of enzymes required to synthesize pectin will depend on the plant, tissue, and developmental state of the cells of interest.

Finally, a knowledge of the structure of the “mature” polysaccharides in the wall, or at least those that can be isolated from the wall and characterized, does not necessarily reflect the structures as they are synthesized, but rather the structures as they are inserted into the wall and after they have been modified by wall-localized enzyme-catalyzed (and chemical) reactions. In the following discussion of pectin, there is no attempt to define species-specific differences in pectin synthesis, since the study of species-specific tailoring of pectin structures and synthesis is only just beginning. Rather, this review emphasizes our current understanding of the biosynthetic enzymes required for the basic pectin structures that appear to be common to all species. Although only a few of the genes encoding pectin biosynthetic enzymes have been confirmed by demonstration of enzymatic activity of the encoded proteins, recent progress in identifying genes encoding putative pectin biosynthetic enzymes makes it likely that more pectin biosynthetic genes will be functionally identified in the near future. The availability of such genes should facilitate the elucidation of how diverse pectin biosynthetic enzymes work together, likely within protein complexes (Atmodjo and Mohnen, unpublished results), to synthesize the multifunctional family of pectic polysaccharides.

Several comprehensive reviews on pectin biosynthesis (158,192,205,209), as well as more general reviews on plant wall biosynthesis (72, 99, 126, 155, 210-213, 214) and strategies to identify wall biosynthetic glycosyltransferases (118, 215-218) and regulation of cell wall synthesis (219-221) have previously been published. This review will attempt to merge recent advances in pectin synthesis with the prior studies so as to reflect our current understanding of pectin synthesis.

UDP-α-D-glucuronic acid (UDP-GlcA)

In 1952, Dutton and Storey discovered that UDP-GlcA acts as a glucuronosyl donor in the synthesis of glucuronides by liver enzymes. In plants, UDP-GlcA is a key intermediate serving as a branch point between UDP-hexose (six carbons) and UDP-pentose (five carbons) sugars. UDP-GlcA is the precursor for UDP-D-xylose, UDP-L-arabinose, UDP-apiose, and UDP-galacturonic acid, which together contribute to the synthesis of over 40% of cell wall polysaccharides. UDP-GlcA is made by (i) phosphorylation of D-GlcA at C-1 by a membrane-bound kinase (411), followed by a pyrophosphorylase activity that converts α-D-GlcA-1-P and UTP to UDP-GlcA; (ii) NAD-dependent oxidation of UDP-Glc to UDP-GlcA by UDP-Glc dehydrogenase (UGD, UDPGDH); and (iii), by a controversial pathway, oxidation of UDP-Glc to UDP-GlcA by a bifunctional alcohol dehydrogenase, ADH/UDPGDH.
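The three routes to UDP-GlcA and its downstream products can be summarized as a small lookup structure. The sketch below only restates what the paragraph says; the class name `Route`, its field names, and the helper `uncontested_routes` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    label: str           # numbering used in the text: "i", "ii", "iii"
    substrate: str       # starting compound named in the text
    enzymes: tuple       # enzyme activities named in the text, in order
    controversial: bool  # the text flags only route (iii) as controversial


ROUTES_TO_UDP_GLCA = (
    Route("i", "D-GlcA", ("membrane-bound kinase", "pyrophosphorylase"), False),
    Route("ii", "UDP-Glc", ("UDP-Glc dehydrogenase (UGD, UDPGDH)",), False),
    Route("iii", "UDP-Glc", ("bifunctional ADH/UDPGDH",), True),
)

# Nucleotide sugars the text lists as derived from UDP-GlcA.
PRODUCTS_OF_UDP_GLCA = (
    "UDP-D-xylose",
    "UDP-L-arabinose",
    "UDP-apiose",
    "UDP-galacturonic acid",
)


def uncontested_routes(routes):
    """Return the labels of routes the text does not flag as controversial."""
    return [r.label for r in routes if not r.controversial]


print(uncontested_routes(ROUTES_TO_UDP_GLCA))
```

Keeping the "controversial" flag as data rather than dropping route (iii) preserves the text's distinction without taking a position on it.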

In the 1960s the pathway from myo-inositol to cell wall glycans was proposed as a significant metabolic pathway. Early experiments with 3H-inositol demonstrated that the label was readily incorporated into cell wall polysaccharides (452). More recently, a labeling experiment with inositol in Arabidopsis showed that radioactivity is found only in the uronic acids, arabinose, and xylose that were released from wall glycans (453). The ability of inositol to drive the synthesis of GlcA was named “the myo-inositol oxidation pathway.” We will discuss the myo-inositol and salvage pathways separately.

Phenylalanine formation

There has long been uncertainty as to how plants produce the (essential) aromatic amino acid, phenylalanine (6) (Figure 7.8). Originally considered to be formed via transamination of phenylpyruvate (35) (80), work pioneered by Jensen and colleagues (81) and later by Siehl and Conn (82) provided evidence for an alternate pathway using arogenate (37); however, the arogenate dehydratases detected in crude extracts were apparently not stable enough for purification, and no encoding genes were identified.

Figure 7.8 Proposed biosynthetic pathway from prephenate (36) and arogenate (37) to Phe (6) in plants (83) and nitrogen recycling mechanism (85-88).

In Arabidopsis, using a data mining approach, we identified six putative arogenate dehydratases (ADTs) and/or prephenate dehydratases (PDTs) based on (relatively low) sequence homology to known bacterial PDTs (83). Subsequent expression of each of the recombinant proteins in a NUS-His-tagged form established that each converted arogenate (37) into phenylalanine (6) more efficiently than it converted prephenate (36) into phenylpyruvate (35) (83), i.e., indicative of a six-membered arogenate dehydratase family in Arabidopsis. It will be instructive to next establish the nature of this Phe-forming metabolic network, that is, to determine which of these genes are involved in phenylpropanoid/lignin metabolism, protein formation, and/or other biochemical processes, as well as to what extent this gene family is functionally redundant (due to co-expression).

Early beginnings: the Freudenberg (random coupling) and the Forss (regular repeating unit) models for lignins

Although progress was made from the 1920s to the 1960s, this was not the best period in which to study native lignin macromolecular configuration scientifically. This was not only because of the serious technological limitations of the period relative to today, but also because there was no indication that phenolic radical-radical coupling could be controlled in any specific manner. Accordingly, with the best of intentions, the structural depictions envisaged for lignins and how they were formed were highly speculative at that time, to say the very least. In any event, this ultimately led to two widely divergent views of lignin structure from the Freudenberg and Forss laboratories, respectively.