Category Archives: Biomass Recalcitrance

Non-bonded cutoffs and long range electrostatics

The non-bonded interaction terms are the most computationally demanding aspect of a force field calculation. There are N(N − 1)/2 pairwise interactions, where N is the number of atoms in the system. Bulk solution is most commonly represented with periodic boundary conditions, in which the unit cell is replicated infinitely in three dimensions. In this case, the number of interactions for atoms in the primary cell becomes infinite and the standard pairwise electrostatic interaction term becomes a divergent sum. A reduction in the number of non-bonded interactions is thus required in order to make the computation tractable. Since the magnitude of the van der Waals interaction between atoms decreases rapidly with distance, it is possible to truncate the Lennard-Jones potential without introducing significant errors in the calculation. Unfortunately, the electrostatic interactions are longer ranged, and truncating them can introduce significant errors into the calculation. Much effort has been expended over the years to develop effective cutoff methods that allow the electrostatic interaction to be truncated at some distance, typically below 15 Å. However, all these methods suffer from problems arising due to cancellation of errors, and it is now almost universally accepted that cutoffs should not be used unless a method is used which allows the “missing” energy to be calculated. One such method, now commonly used in explicit solvent simulations, is the Particle Mesh Ewald (PME) method (8), which divides the electrostatic calculation into a direct space, pairwise evaluation and a reciprocal space calculation. The direct space part is evaluated as a regular pairwise interaction within a cutoff, typically 8-10 Å, while the remainder of the “missing” electrostatic contribution from the infinitely replicated system is included by calculating the charge field on a grid and then using Fast Fourier Transforms to obtain the potential and force at each atom. This reduces the scaling of the calculation from N² to N ln N while at the same time avoiding the approximations introduced by use of a cutoff.
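The following is a minimal Python sketch of the arithmetic behind this argument, not a force field implementation. It counts pairwise interactions, compares the size of a Lennard-Jones term with a Coulomb term at a typical cutoff distance, and estimates the gain from N ln N scaling. The function names and the parameter values (epsilon, sigma, partial charges) are hypothetical and chosen only for illustration; the Coulomb constant is the usual conversion factor for energies in kcal/mol, distances in Å, and charges in units of e.

```python
import math

# Coulomb conversion factor: kcal*Angstrom/(mol*e^2)
COULOMB_CONSTANT = 332.0637


def pair_count(n_atoms):
    """Number of unique atom pairs, N(N - 1)/2."""
    return n_atoms * (n_atoms - 1) // 2


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy (kcal/mol) at separation r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2):
    """Coulomb energy (kcal/mol) between point charges q1, q2 (in e) at r (Angstrom)."""
    return COULOMB_CONSTANT * q1 * q2 / r


def scaling_ratio(n_atoms):
    """Ratio of N^2 to N ln N work, ignoring prefactors."""
    return n_atoms / math.log(n_atoms)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    epsilon, sigma = 0.15, 3.2   # kcal/mol, Angstrom
    q1, q2 = 0.4, -0.4           # partial charges in e
    cutoff = 10.0                # Angstrom

    print("pairs for 1000 atoms:", pair_count(1000))
    print("LJ energy at cutoff:", lennard_jones(cutoff, epsilon, sigma))
    print("Coulomb energy at cutoff:", coulomb(cutoff, q1, q2))
    print("N^2 / (N ln N) for 100000 atoms:", scaling_ratio(100_000))
```

With these assumed parameters the Lennard-Jones term at 10 Å is orders of magnitude smaller than the Coulomb term. This is why the former can be truncated safely while the latter needs a method such as PME to recover the long-range contribution.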

β-Glucanases

Endoglucanases are most often referred to in the context of cellulases. Though these enzymes have a high affinity toward cellulose, they often exhibit cross reactivity with other β-glucans, including xyloglucan and mixed β-(1→3,1→4)-glucans. As with the endoxylanases, these enzymes cleave β-(1→4) or β-(1→3) linkages in the interior of the glucose chain, generating two new chain ends, one reducing and one non-reducing. Different enzymes are required to cleave the two forms of β-glucan (28, 30-35). Enzymatic degradation of β-glucan is accomplished through glycosyl hydrolase family 12 enzymes (EC 3.2.1.4). Although these endo-acting enzymes are active on β-(1→4) glycosidic linkages, they are differentiated from other β-(1→4)-acting enzymes by being able to hydrolyze the β-(1→4) linkages in mixed β-(1→3,1→4)-linked polysaccharides. Glucan endo-1,3-β-D-glucosidase [β-(1→3) glucanase] (EC 3.2.1.39) is an endo-acting glycosyl hydrolase that acts on β-(1→3) glucan, but has very limited activity on the mixed linkage β-glucan. Endo-1,3(4)-β-glucanase [β-(1→3,1→4) glucanase] (EC 3.2.1.6) is also an endo-acting glycosyl hydrolase. There is also an exo-acting glycosyl hydrolase that is active on β-(1→3) glucan: glucan 1,3-β-glucosidase (EC 3.2.1.58) acts by processively releasing glucose from the non-reducing end of β-(1→3) glucan.

Much of the initial work on cellulases and endoglucanases was concentrated on the enzymology of the fungus Trichoderma reesei. This fungus produces several cellulases which act synergistically in the degradation of cellulose. In T. reesei, Cel7B is a major endoglucanase, forming about 6-10% of the total T. reesei cellulase (36, 37). It has wide activity against solid and soluble substrates, such as CMC, as well as against xylan and glucomannans (38). The endoglucanase Cel5A also has activity against solid (celluloses) and soluble (CMC, mannan) substrates (39-41), but not against xylans. This enzyme comprises up to 10% of the total cellulases in T. reesei (36, 37). The minor endoglucanases (Cel12A, Cel45A) are reported to hydrolyze solid and soluble (including glucomannan) substrates with diverse specific activities.

Degradation of cellulose by the C. thermocellum cellulosome

The efficiency of a crude cell-free cellulase system from C. thermocellum for the biodegradation of crystalline cellulose was first reported by Johnson et al. (140). These authors claimed that the activity of this cellulase system toward cotton was at least 50 times higher than that of the extracellular cellulase system from T. reesei. This level of disparity has since been tempered somewhat, and elevated (~4-fold) levels of cellulose degradation have been estimated in favor of the cell-surface cellulases from C. thermocellum over the free cellulase system of T. reesei (16, 141). It is in fact very hard to assess this difference, since equilibrating and estimating equivalent crude or isolated preparations of cellulases from two different species is difficult in itself. Nevertheless, such attempts have consistently shown that the C. thermocellum cellulosome is superior in its cellulolysis of recalcitrant cellulosic substrates when compared to the free fungal cellulase systems (Figure 13.8). The principal family-48 processive cellulase is decisive to the observed decomposition of the substrate, since cellulosome preparations that are deficient in this enzyme display reduced levels of hydrolysis on recalcitrant forms of cellulose (142). In any case, it is important to note that although bacterial cellulosomes seem to exhibit enhanced activity compared to that of the fungal enzymes, anaerobic bacteria produce much smaller quantities of cellulolytic enzymes (< 1 g/L) than do the fungi (~100 g/L). In view of this imbalance, industry has consistently turned to the economically favorable fungal enzymes, which are thus preferred in all current industrial applications of cellulases.

Figure 13.6 Transmission electron microscopy of cellulosome-induced degradation of bacterial cellulose ribbons. (A) Untreated substrate. (B) Bacterial cellulose following 3 hours of digestion using preparations of the C. thermocellum cellulosome. (C) As in (B), but after 6.5 hours of digestion.

Figure 13.7 Transmission electron microscopy of cellulosome-induced degradation of Valonia cellulose microcrystals. (A) Untreated microcrystals. (B) Valonia microcrystals following 16 days of digestion using preparations of the C. thermocellum cellulosome. The arrows indicate pointed microcrystals, characteristic of the unidirectional action of exo-acting cellulase.

Figure 13.8 Cellulase activity of the C. thermocellum cellulosome versus those of the free cellulase systems from the fungi T. reesei (Hypocrea) and Humicola insolens. Horizontal axis: incubation time (hours). Symbols: ■ C. thermocellum cellulosome; • C. thermocellum cellulosome (-Cel48S); ▲ Humicola insolens; ♦ Trichoderma reesei. With one exception, the cells were grown on microcrystalline cellulose (Avicel), and crude preparations of the extracellular enzymes (cellulosome or free cellulases) were produced. The curve (filled circles) denoting C. thermocellum (-Cel48S) represents a cellulosome preparation derived from cells grown on cellobiose instead of cellulose, under conditions that result in highly reduced quantities of the Cel48S cellobiohydrolase in the cellulosome (27, 53, 142). Avicel was employed as a substrate and subjected to treatment using equivalent amounts of the bacterial cellulosome or fungal cellulase preparations.

The cellulolytic potential of the cellulosome on Avicel was in fact demonstrated many years ago (143). For these experiments, cellulosome action was enhanced by inclusion of the Aspergillus niger β-glucosidase, which served to counteract the inhibitory effect of cellobiose on the cellulosomal enzymes. The β-glucosidase severs the β-1,4 bond of cellobiose to produce two molecules of the non-inhibitory glucose product. Without this added enzyme, the course of cellulose hydrolysis by the cellulosome is rapidly impeded. In its presence, however, relatively low concentrations of cellulose in suspension (20 g/L) are readily degraded to completion in a relatively short time period (Figure 13.9). Complete digestion of concentrated cellulose suspensions (200 g/L) is also attained, provided that optimal amounts of cellulosome complex are included in the reactor (Figure 13.9, arrow).

CBP advances

16.3.1 Native cellulolytic microorganisms

C. thermocellum, an anaerobic, thermophilic, Gram-positive bacterium, exhibits one of the highest rates of cellulose utilization among described microorganisms (15). C. thermocellum produces a cellulase complex, or “cellulosome,” a substantial fraction of which is bound to the cell surface under most culture conditions (30-33). Because the turnover number of cellulases on insoluble substrates is much lower than that of most enzymes acting on soluble substrates, substantial amounts of cellulase (~2-20% of cellular protein by weight) are required to support cellulolytic cells growing on cellulose. Because cellulose is an insoluble substrate, cellulases must be secreted across the cell membrane so that they can access and hydrolyze insoluble cellulose to soluble sugars that cells can assimilate.

Mixed linkage glucans

Walls of the grasses contain mixed-linkage (1,3;1,4)-β-D-glucans (MLGs), which are not present in walls of dicotyledons or most other monocotyledonous plants (147). Some algae and liverworts may also have MLGs (148). The (1,3;1,4)-β-D-glucans have an unusual structure consisting of an unbranched, unsubstituted glucan chain with two linkages arranged in a non-repeating, but non-random, fashion. The glucan chains consist primarily of cellotriosyl and cellotetraosyl units separated by single (1→3)-β-linkages (149). MLGs can be synthesized in vitro from Golgi membrane fractions with UDP-Glc as a substrate (150, 151). The amount of UDP-Glc used in the assay alters the ratio of cellotriosyl and cellotetraosyl units, indicating that this is not a fixed property of the biosynthetic enzyme (152).

Recently, following prescient speculation as to possible structural similarity between cellulose synthase and the MLG synthase (153), it was shown that expression of a rice CslF gene in Arabidopsis led to the accumulation of (1,3;1,4)-β-D-glucan in that plant (154). Thus, it appears that CSLF encodes the mixed glucan synthase. The generally low levels of (1,3;1,4)-β-D-glucan in walls of the transformed Arabidopsis plants are consistent with the concept that limiting levels of other components might be required for high-level synthesis of the polysaccharide or its transfer to the cell wall. Similarly, the preferential deposition of the (1,3;1,4)-β-D-glucan in the epidermal layers of the transgenic Arabidopsis lines, despite the fact that transgene expression was driven by the constitutive 35S promoter, could indicate that the epidermal cells contain ancillary factors that are not abundant in other cells of the leaf. The identification of the gene opens up new approaches to understanding the fascinating process by which an enzyme catalyzes substantially different transferase reactions (151, 153).

The role of the MLGs is not clear. The MLGs are synthesized in relatively large amounts during growth and may coat cellulose microfibrils during the synthesis and expansion phase, but they are degraded when elongation ceases (155, 156).

UDP-L-rhamnose (UDP-Rha)

Rhamnose is a major sugar moiety in pectin and in various glycosides of secondary metabolites. UDP-Rha is the activated sugar for the synthesis of flavonoids (448); however, the activated rhamnose-donor form for pectin synthesis has not been determined. Previously, it was suggested that synthesis of UDP-Rha from UDP-Glc is mediated by three separate enzymes, similar to the conversion of TDP-Glc to TDP-Rha in bacteria (403). UDP-Glc is first modified to the UDP-4-keto-6-deoxyGlc intermediate by UDP-Glc 4,6-dehydratase. The intermediate is then modified in the presence of NAD(P)H by a 3,5-epimerase and 4,6-keto-reductase to form UDP-L-β-Rha. A debate in the literature as to whether two or three different enzymes are involved in UDP-Rha synthesis came to an end with the functional cloning of NRS/ER (134) from Arabidopsis (At1g6300). The activity of recombinant NRS/ER demonstrates irrefutably that the 3,5-epimerase and 4,6-keto-reductase activities reside in one polypeptide. Interestingly, in vitro NRS/ER accepts both TDP- and UDP-4-keto-6-deoxyGlc as substrates to form TDP-Rha and UDP-Rha, respectively. Although TDP-Glc is found in plants (403) and several enzymes can generate TDP-Glc in vitro, the physiological significance of the ability of NRS/ER to generate TDP-Rha is unclear. Only by isolating a pectin rhamnosyltransferase and characterizing its donor specificity can the true nature of the NDP-Rha form be conclusively determined.

The Arabidopsis genome contains three genes (At1g78570, At3g14790, At1g53500), each of which encodes a large protein (~670 aa) having two domains: an N-terminal domain (~330 aa long) that shares amino acid sequence similarity with 4,6-dehydratase, followed by a C-terminal domain (~320 aa long) that shares over 80% sequence identity with NRS/ER. The C-terminal domain of At1g78570 has enzyme activity similar to that of NRS/ER (134). Mutations in At1g53500, mum4 (449) and rhm2 (450), result in decreased amounts of Rha and GalA sugar moieties in RG-I structures isolated from seed mucilage. These mutants provided the first genetic evidence for the involvement of these genes (which we named URS, for UDP-Rha synthase) in rhamnose synthesis. More recently, when these genes were recombinantly expressed in yeast, Oka and coworkers (451) reported that all of the Arabidopsis URS genes (also named RHM/MUM) have UDP-rhamnose synthase activity and, interestingly, are highly inhibited by UDP-xylose.

Pioneers of monolignol biosynthesis, recent progress, and metabolic flux analyses

Both the monolignol and the shikimate-chorismate biochemical pathways have been extensively studied over a period spanning nearly five decades. As regards the former, the reader is highly encouraged to review the two most comprehensive treatises on monolignol biosynthesis (31) and monolignol pathway genetic manipulation (77). The first provides an historical account of the pioneering work in enzyme identification, enzyme isolation, and subsequent gene cloning in the monolignol/lignin pathway, whereas the second largely attempts to identify predictable trends in downregulating/mutating various pathway steps, e.g., following application of standard molecular biological approaches. Interestingly, the five-decade-held view that lignins were randomly assembled had apparently all but deterred many researchers in this field from systematically studying, in a productive and predictive hypothesis-driven manner, the effects of manipulating the lignin-forming apparatus.

Moreover, although there are occasional reports describing the monolignol pathway as having been redrawn in the past few years (78), in hindsight this is really not the case. Instead, a number of discoveries had been made (some up to a quarter of a century ago) whose significance had not been appreciated by various researchers at the time. We, therefore, briefly review the monolignol pathway, the highlights of recent years, and the recent advances made in understanding the nature of metabolic pathway flux associated with monolignol/lignin formation. Furthermore, because of the completion of the Arabidopsis genome sequencing in 2000 (79), an emphasis is also placed on the study of the phenylpropanoid-forming biochemical machinery in that organism, as it is currently the most extensively studied.

PROTEINS WITH ESTABLISHED ROLES IN OXIDATION/POLYMERIZATION: PEROXIDASES

Peroxidases form large multigene families in plants whose full physiological/biochemical roles and functions still remain poorly understood. Unlike laccases, however, a direct role for peroxidases in lignification has been demonstrated. That is, downregulation of a peroxidase (TP60) in tobacco (N. tabacum) gave transformants with lignin levels reduced by circa 40-50% (257). Additionally, phloroglucinol-HCl staining suggested that the vasculature had been weakened (Figure 7.12H), although no quantitative structural testing on the plant stems was carried out. Thus, at present, the only oxidative enzyme demonstrated to have a role in monolignol oxidation/lignification is peroxidase. As before, more detailed analyses are required to fully ascertain the effects of peroxidase downregulation on lignification/cell wall structure(s).

7.6.2.1 Summary

Studies of CCR downregulation/mutation gave rise to severely dwarfed phenotypes; in Arabidopsis, typical G/S lignins were biosynthesized, albeit at a delayed rate. There was no convincing evidence for replacement of monolignols with other non-monolignol moieties, e.g., feruloyl tyramine (60) or acetosyringone (61). The studies with CAD, F5H, and COMT were also most informative: while many of the mutants (e.g., CAD, COMT) have been known for almost three-quarters of a century, the biochemical basis of how they (CAD and COMT) disrupt the normal proposed template polymerization (see later) has now apparently come to light, i.e., via limited substrate degeneracy on the proposed lignin-forming template. The effects of this (attempted) degeneracy were not, though, structurally beneficial, and thus help explain why neither p-hydroxycinnamaldehydes nor 5-hydroxyconiferyl alcohol (4) evolved as substrates proper for lignification. Additionally, the F5H mutant (fah1-2) and the C4H-F5H overexpressing lines afforded two lignins with altered G and S levels, as did downregulation/mutation of COMT. Significantly, the patterns of interunit linkage frequency established that (based on monomer/dimer release) lignification was apparently proceeding in a similar (controlled) manner in each case. The data are explained through limited (substrate) degeneracy during proposed template polymerization.

Potential of mean force and umbrella sampling

The potential of mean force (PMF) is used to get information about how the free energy of a system changes as it moves along a particular trajectory, which could represent a conformational change or chemical reaction. The key relationship is

$$P(x) \propto e^{-A(x)/k_b T} \tag{8.10}$$

where P(x) is the probability of state x, A(x) is the Helmholtz free energy of state x, kb is Boltzmann’s constant, and T is the temperature. State x is defined as a configuration or reaction coordinate and is constrained to a particular place in configuration space while the rest of the system is averaged over all accessible states. The PMF is useful in following the change in Helmholtz free energy as the system moves along the trajectory coordinate, most often to find the change in free energy between the initial state, the transition state, and the final state. In some cases, this probability can be obtained from a simple MD simulation, run long enough that all states x are sampled frequently enough to yield valid probabilities. Straight MD breaks down when the barriers are high enough that the states around the transition state are not visited even in very long simulations. The answer to this problem is to use some method of enhanced sampling, such as umbrella sampling (66). Umbrella sampling is implemented by applying a biasing potential, usually harmonic (similar in shape to an inverted umbrella), centered at various points along the defined trajectory or reaction path. The umbrellas are made wide enough, and the points close enough, that the sampling within each umbrella overlaps with that of the adjacent umbrellas. The system samples within each umbrella potential and gives information about the probabilities of states within the umbrella, though they are biased. These probabilities can be unbiased, but they will still contain an unknown constant. The pieces can be spliced directly back together by lining up the overlap regions; more commonly, the more exact weighted histogram analysis method (WHAM) (67) is used to produce the full PMF for the trajectory.
Other methods, not discussed here, which are useful for the same problems as umbrella sampling and can often be used in cases where umbrella sampling does not work, are: locally enhanced sampling (68), replica exchange (69), and lambda dynamics (70).
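The relationship in Equation 8.10 and the unbiasing step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a WHAM implementation: it converts a histogram of visited states into a free energy profile using A(x) = −kbT ln P(x) + C, and removes a harmonic bias w(x) from a single umbrella window using A(x) = −kbT ln P_biased(x) − w(x) + C. The function names, the bin layout, and the force constant are hypothetical; energies are assumed to be in kcal/mol.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)


def _shift_to_zero(values):
    """Shift a profile so its lowest finite value is zero (fixes the constant C)."""
    finite = [v for v in values if v != math.inf]
    if not finite:
        return values
    lowest = min(finite)
    return [v - lowest for v in values]


def pmf_from_histogram(counts, kT):
    """A(x) = -kT ln P(x) for each bin; unvisited bins are reported as inf."""
    total = sum(counts)
    raw = [-kT * math.log(c / total) if c > 0 else math.inf for c in counts]
    return _shift_to_zero(raw)


def harmonic_bias(x, center, k_bias):
    """Umbrella potential w(x) = 0.5 * k_bias * (x - center)^2."""
    return 0.5 * k_bias * (x - center) ** 2


def unbias_window(bin_centers, counts, center, k_bias, kT):
    """Remove the harmonic bias from one window; the result is known up to a constant."""
    total = sum(counts)
    raw = [
        -kT * math.log(c / total) - harmonic_bias(x, center, k_bias)
        if c > 0 else math.inf
        for x, c in zip(bin_centers, counts)
    ]
    return _shift_to_zero(raw)


if __name__ == "__main__":
    kT = K_B * 300.0
    # Hypothetical histogram from an unbiased run: the middle bin is visited most.
    print(pmf_from_histogram([100, 1000, 100], kT))
```

Each window processed this way yields a profile that is correct only up to its own constant. That is the reason the windows must overlap, so that adjacent pieces can be aligned or combined by WHAM.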

Cellulose-binding domains

CBMs play an important role in the ability of cellulases to degrade crystalline cellulose. They have little or no effect on the activity of most cellulases on a soluble cellulose derivative (CMC), amorphous cellulose, or oligosaccharides (40). One role of a CBM is to anchor the cellulase to the insoluble cellulose, so that the CD remains close to the substrate. The flexible linker that separates the CBM from the CD allows the CD to access regions of the cellulose adjacent to the bound CBM. Some workers have proposed that a CBM can also disrupt the structure of cellulose, making it more accessible to the CD (41, 42), but this is still controversial (43). This activity would be equivalent to the Cx activity proposed by Reese in his early discussion of the nature of cellulases (44). CBMs have also been reported to target cellulases to specific regions of cellulose, presumably to the regions where they will be most active (45). It has been shown that family 2 CBMs can diffuse on the surface of cellulose without dissociation, giving them the ability to readily access new regions of a cellulose particle after the CD has hydrolyzed cellulose near the original site of binding (46). Other CBMs that bind to crystalline cellulose may also have this ability, since they appear to bind in a similar way.

There are many CBM families, with 45 listed on the CAZy web site (http://afmb.cnrs-mrs.fr/CAZY/fam/acc_CBM.html). Not all CBMs bind cellulose, as many families contain chitin-binding domains, xylan-binding domains, or mannose-binding domains. Some CBMs can bind to several polymers, while others are specific for only one. Labeled CBMs are being used to stain plant materials, and different members of a given family can give very different staining, showing that there is even greater binding specificity than is seen with pure substrates (47). Almost all known enzymes that have high activity on insoluble substrates contain a substrate-binding domain in addition to a CD, so that the presence of such a domain is a general property of this type of enzyme.

All fungal cellulase CBMs are in family 1 and they are small, containing about 30 AA. Most aerobic bacterial cellulase CBMs are in family 2 and they are larger, containing about 120 AA. The CBMs on cellulosomal scaffoldins are in family 3. Most of the CBMs in these three families bind to crystalline cellulose and have a relatively flat binding surface that usually contains three aromatic residues, spaced so that they can bind to three adjacent glucose residues in a cellulose molecule. They also contain a number of residues that can hydrogen bond to the cellulose chain, but site-directed mutagenesis has shown that the aromatic residues are essential for high affinity binding, while the other residues play a secondary role (48). The CBMs in families 4 and 6 bind to single cellulose molecules, and their binding sites are in a groove (49). A number of cellulases contain multiple CBMs; in some cases they are from the same family and in other cases from different families. It has been shown that the affinity of a protein containing two CBMs can be significantly higher than that of a protein with only one CBM (50). Atomic force microscopy of the binding of a family 1 CBM to cellulose found that the bound CBM was present in aggregates, not as single domains (49).

Binding of a cellulase to cellulose is an important step in hydrolysis and there have been many studies of cellulase binding (50-54). Cellulases that contain a CBM bind tightly to cellulose, but cellulase CDs bind cellulose weakly. Mutations that dramatically reduce activity can lead to enhanced CD binding, showing that the cleavage products bind more weakly than the intact substrate. The extent of binding is directly related to the accessible surface area of the substrate, and accessible surface area also determines the rate of hydrolysis of a substrate. Much of the surface area of cellulose is in the interstitial region between microfibrils of cellulose (irregularly shaped pores), so that the size of a cellulase will affect how much of the cellulose surface area is available for its binding. There is good evidence that binding to cellulose does not fit the Langmuir isotherm, and in many cases binding appears to be irreversible (53). At low protein concentrations, several family 2 CBMs were shown to bind reversibly, and in this region binding might be to the external surface; at higher extents of binding, where binding might be occurring in pores, it was irreversible (53). In a study of the binding of mixtures of two cellulases, synergism in binding was observed in several mixtures (54).
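For reference, the Langmuir isotherm mentioned above assumes reversible binding to independent, equivalent sites, giving B = Bmax·K·F / (1 + K·F), where F is the free protein concentration and B is the bound amount. The following Python sketch shows one simple way to check whether a data set is consistent with this model: under the Langmuir model, F/B plotted against F is a straight line. The function names and the synthetic parameter values are hypothetical and are not taken from the studies cited; real cellulase binding data, as noted above, often deviate from this behavior.

```python
def langmuir_bound(free, b_max, k_a):
    """Bound amount predicted by the Langmuir isotherm at free concentration `free`."""
    return b_max * k_a * free / (1.0 + k_a * free)


def fit_langmuir_linear(free_values, bound_values):
    """Least-squares fit of F/B = 1/(b_max * k_a) + F/b_max; returns (b_max, k_a).

    Systematic curvature in F/B versus F suggests the data do not follow
    the Langmuir model.
    """
    xs = list(free_values)
    ys = [f / b for f, b in zip(free_values, bound_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    b_max = 1.0 / slope
    k_a = slope / intercept
    return b_max, k_a


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed parameters.
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [langmuir_bound(f, b_max=2.0, k_a=0.5) for f in free]
    print(fit_langmuir_linear(free, bound))
```

The linearized fit is used here only because it makes the Langmuir assumption easy to inspect. A nonlinear fit would weight experimental errors more evenly.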