Category Archives: Biomass Recalcitrance

Approaches to current questions about structure and hydrolysis

There are two major approaches to the hydrolysis of cellulose: acid hydrolysis and enzymatic hydrolysis. The enzymatic process is poorly understood and must contain the solution to the recalcitrant nature of cellulosic degradation. The enzymes can be modeled, as can their interactions with cellulose and even the process of enzymatic hydrolysis itself. The techniques that will probe these processes and mechanisms are numerous and range from reduced models to all-atom QM/MM and thermodynamic integration. Using reduced models, structural stabilities and solvation free energies can be determined quickly. Normal mode analysis, elastic network models, and quasiharmonic analysis can probe the major structural modes of motion of cellulose, cellulases, xylans, lignins, and their mutual complexes. Mutational studies, using thermodynamic integration, can be performed to reveal the effects on structure and on kinetic behaviors, and even on reaction energetics and mechanisms. Umbrella sampling is a key tool for understanding the binding affinities of different binding or catalytic domains on cellulose, or the relative binding affinities on different faces of cellulose or even at different locations on the same face. QM/MM is a tool for probing the hydrolysis reaction inside a cellulase catalytic site. The method has reached a stage of development at which performance is sufficient, and the QM approximations are good enough, to follow a reaction quantum mechanically while treating the nonreactive portion of the system classically, giving reasonable answers at a computational cost not much higher than that of purely classical simulations. These numerical experiments are expected to yield exceptionally useful information about the release of energy from the reaction and the accompanying structural changes.
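As an illustration of how thermodynamic integration turns simulation output into a free energy difference, the sketch below integrates window-averaged values of dU/dλ over the coupling parameter λ with the trapezoidal rule. It is a minimal sketch only: the function name `ti_free_energy` and the numerical values are hypothetical placeholders, not results from any simulation, and production work would use the estimators supplied with an MD package.

```python
def ti_free_energy(lambdas, dudl_means):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas    -- strictly increasing coupling-parameter values
    dudl_means -- ensemble average of dU/dlambda at each lambda window
    """
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching (lambda, <dU/dlambda>) points")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        if width <= 0:
            raise ValueError("lambda values must be strictly increasing")
        # Area of one trapezoid between neighboring lambda windows.
        total += 0.5 * width * (dudl_means[i] + dudl_means[i - 1])
    return total


# Hypothetical window averages in kcal/mol (illustrative values only).
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl = [4.0, 3.0, 2.0, 1.0, 0.0]
delta_g = ti_free_energy(lambdas, dudl)
print(f"Estimated free energy change: {delta_g:.3f} kcal/mol")
```

In a mutational study the same integral would be evaluated for the bound and unbound states, and the difference of the two results would give the relative effect of the mutation.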

The steered molecular dynamics, targeted MD, and pulling methods are the tools of choice for initial examination of the decrystallization of cellulose fibers into cellodextrin chains suitable for hydrolysis to mono- and disaccharides. These kinds of numerical experiments can suggest the energy barriers associated with decrystallization and point to more detailed studies, such as obtaining potential of mean force (PMF) profiles from umbrella sampling runs or free energies of decrystallization from Jarzynski pulling experiments. Beyond that, the role the solvent plays in all the aforementioned processes can be carefully quantified, helping to select the most likely mechanisms and rule out unlikely ones.
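The Jarzynski estimate mentioned above can be sketched in a few lines. The free energy difference is obtained from the exponential average of the nonequilibrium work over repeated pulls, ΔF = -kT ln⟨exp(-W/kT)⟩. This is a minimal sketch under stated assumptions: the function name and the work values are hypothetical, the units are assumed to be kcal/mol, and real pulling data usually need many trajectories because the average is dominated by rare low-work pulls.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def jarzynski_free_energy(works, temperature=300.0):
    """Free energy estimate from a list of work values (kcal/mol)."""
    if not works:
        raise ValueError("at least one work value is required")
    beta = 1.0 / (K_B * temperature)
    # Shift by the smallest work value so the exponentials cannot overflow.
    w_min = min(works)
    shifted_sum = sum(math.exp(-beta * (w - w_min)) for w in works)
    return w_min - math.log(shifted_sum / len(works)) / beta


# Hypothetical work values from repeated pulling runs (illustrative only).
pull_works = [10.0, 12.0, 15.0, 11.0]
delta_f = jarzynski_free_energy(pull_works)
mean_work = sum(pull_works) / len(pull_works)
print(f"Jarzynski estimate: {delta_f:.2f} kcal/mol (mean work {mean_work:.2f})")
```

The estimate never exceeds the mean work, which reflects the dissipation present in any finite-speed pull.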

Cellulases from Trichoderma reesei

The most studied aerobic cellulolytic microorganism is the fungus Hypocrea jecorina, originally called Trichoderma reesei. It was isolated and studied by Drs Reese and Mandels at the Army Quartermaster Lab in Natick, MA, during World War II, because it was degrading the cotton fabrics used by the army for tents, gun straps, etc. on islands in the Pacific Ocean (56). The original goal of this work was to find cellulase inhibitors, which was not achieved, as only the toxic ions, Hg and Ag, are good inhibitors. However, this group also carried out many studies on the organism and its crude cellulase, which then led to the development of high-producing mutant strains by Dr Eveleigh; these were used to develop the strains used for industrial cellulase production by several companies (57). The most abundant cellulase produced by T. reesei is the reducing end-specific exocellulase, Cel7A (cellobiohydrolase I), which makes up about 70% of the cellulase protein secreted by T. reesei (58). The next most abundant cellulase is Cel6A (CBH II), which makes up a further 10% of T. reesei secreted cellulase. T. reesei crude cellulase contains seven endoglucanases, of which Cel7B (EGLI) is the most abundant. In addition, Cel5A (EGLII), Cel12A (EGLIII), Cel61A (EGLIV), Cel45A (EGLV), Cel5B, and Cel61B are also present in T. reesei secreted cellulase. Most of the T. reesei cellulases contain a family I CBM; the exceptions are Cel5B, Cel12A, and Cel61B. It is not clear why T. reesei produces so many endocellulases, but Cel12A was shown to have expansin activity as well as cellulase activity (59). Expansin is present in plants and appears to disrupt the hydrogen bonds that bind different carbohydrate chains together in plant cell walls, so that it may make the chains more accessible to hydrolytic enzymes. The least studied endocellulase is Cel61A, which has extremely low cellulase activity. It is quite surprising that when a set of thermophilic fungal cellulases was screened for the ability to stimulate the activity of T. reesei crude cellulase, a number of them were able to increase it about threefold, and the component that was most active in giving this stimulation was a family 61 enzyme (60). Little is known about the role of Cel45A in cellulose degradation (61). Another protein secreted by T. reesei is swollenin, a low molecular weight protein that has no catalytic activity but appears to disrupt the structure of cellulose microfibers, possibly by breaking hydrogen bonds (62).

Most of the T. reesei cellulases are glycosylated, and glycosylation appears to protect the cellulases from proteolysis (63). The linker peptide is particularly susceptible to proteolysis and T. reesei secretes proteases (64), so protection from proteolysis may be an important role for the O-linked glycosylation found on the linker peptide (65). The role of the N-linked glycosylation on the CD is unclear at this time. There is a great deal of heterogeneity in the glycosylation of any given cellulase, and this causes heterogeneity of each enzyme during gel electrophoresis and column chromatography (66).

There have been extensive studies of the regulation of cellulase synthesis in T. reesei, and it appears that regulation is complex (67). Glucose strongly represses cellulase synthesis, and the β-1,2-linked glucose disaccharide, sophorose, induces synthesis. A number of transcription factors that can bind to cellulase promoters have been identified in T. reesei; some of these are activators and some are repressors. The exact mechanisms that regulate cellulase synthesis are still not completely understood.

Cellulose-dissolving solvents

Another category of solvent pretreatment involves the use of cellulose-dissolving solvents, such as cadoxen, concentrated mineral acids, DMSO, and zinc chloride (10, 12). While these agents can be effective at directly releasing sugars from the carbohydrate fractions of biomass and/or producing a solid residue containing cellulose that is highly digestible by enzymes, the use of such solvents in pretreatment processes for the production of fuels and commodity chemicals from biomass will be challenging due to the expense of such catalysts, catalyst recycle requirements, and the requirement for clean process streams for subsequent biological conversions.

14.5.4 Supercritical fluid pretreatments

Biomass pretreatment processes using supercritical fluids to extract lignin from biomass feedstock have been investigated. A number of different supercritical fluids (alone or in mixtures) have been tested, although the most common approaches utilize water, carbon dioxide, or ammonia (14, 68). While supercritical pretreatment conditions can effectively remove lignin and produce pretreated biomass that exhibits good enzymatic digestibility, the economic viability and practical operation of processes at supercritical operating conditions have not been effectively demonstrated. Of greatest concern are the extremely high pressure requirements (generally above 10 MPa) of these processes.

Location of pectin synthesis

All available evidence, including autoradiographic pulse chase studies using wall biosynthetic precursors (222, 223), immunocytochemical studies using anti-pectin-specific antibodies (224-226), and subcellular fractionation and topology studies of pectin biosynthetic enzymes (227-231), indicates that pectin is synthesized in the Golgi and transported to the wall in membrane vesicles. Plant cells, unlike animal cells, have multiple Golgi, and thus pectin synthesis occurs simultaneously in numerous Golgi stacks in the cell (225, 232). The synthesized pectin and other macromolecules are targeted to the wall by the movement of Golgi vesicles, presumably along actin filaments that have myosin motors (233).

Immunocytochemical studies also indicate that the synthesis of different regions of the pectic polysaccharides occurs in different Golgi cisternae as pectin moves from the cis, through the medial, and to the trans-Golgi. For example, the use of antibodies specific to different regions of HG and RG-I suggests that HG and RG-I synthesis begins in the cis-Golgi (225, 234, 235) and continues with more extensive decoration of the backbones as the polymers move through the medial Golgi (224, 225, 235) and into the trans-Golgi cisternae (225, 235). Additional modifications of the pectic glycan structure also appear to proceed in a more-or-less organized manner, with HG (236, 237) and RG-I (106, 234, 238, 239) initially synthesized in less modified forms in the cis- and medial-Golgi and becoming more modified (e.g., methylesterified) (236) in the medial- and trans-Golgi (225, 235, 240-242). HG is believed to be transported to the plasma membrane and inserted into the wall as a highly methylesterified polymer (214, 237, 243-245), and once in the wall, HG is deesterified to varying degrees by pectin methylesterases (246) in the wall or at the cell plate (245). The deesterification of HG converts it to a more negatively charged form (240, 247-250), which is then available to bind ions, enzymes, proteins, and other HG molecules through Ca++ salt bridges. It is believed that a spatial partitioning of HG esterification and deesterification occurs in the wall, based on localization of esterified HG throughout the cell wall (237, 240-242, 243, 245, 250, 251), while relatively unesterified HG is more restricted to the middle lamella. This conclusion is supported by the frequently observed absence of unesterified HG epitopes in the trans-Golgi vesicles. However, since some cell types, such as melon callus cells (240), contain unesterified HG in the trans-Golgi, it is possible that HG may be inserted into the wall in a relatively unesterified form, at least in some cells. Also, since specific pectic epitopes localize to different Golgi compartments in different cell types (5, 225, 234, 237, 244), it is likely that the specific localization of the diverse pectin biosynthetic enzymes may vary in a cell type-, species-, and development-specific manner (226, 252-255). It must be noted, however, that the interpretation of immunocytochemistry results can be difficult, since the absence of a signal using an epitope-specific antibody may be due to masking of the epitope by additional glycosylation or some other modification (e.g., methylation, acetylation, feruloylation). Thus, the lack of a particular immunocytochemical signal is not sufficient to conclude that a particular pectin biosynthetic event does not occur in a cell. Information on the presence of the biosynthetic enzyme activity or the actual wall carbohydrate structure itself is required.

The myo-inositol pathway

In plants, the first step in myo-inositol synthesis is the cyclization of D-Glc-6-P to myo-inositol-1-P (Ino-1P) by 1L-myo-inositol 1-phosphate synthase. In Arabidopsis, two functional isoforms were reported, At4g39800 and At2g22240 (454). The second step involves dephosphorylation of Ino-1P to myo-inositol by myo-inositol monophosphatase (IMPase; EC 3.1.3.25) (455). Multiple distinct but highly conserved IMPase isoforms are found in each plant species. Three IMPases were identified in tomato (456). In Arabidopsis, a conserved IMPase-like protein (At3g02870) was proposed by Gillaspy to act as IMPase; however, biochemical and genetic data indicate that At3g02870 encodes L-Gal-1-P phosphatase (457, 458). It is possible that other IMPase-like proteins in Arabidopsis (for example, At1g31190, At4g39120) provide the Ino-1P phosphatase activity that forms myo-inositol. The identification of the true IMPase gene product is critical to evaluate what controls the pathway to shunt Ino-1P to the myo-inositol oxidation pathway. Free myo-inositol is oxidized by inositol oxygenase (MIOX; EC 1.13.99.1) to D-GlcA. Arabidopsis contains four MIOX isoforms (453).

It would be interesting to determine if the myo-inositol oxidation pathway operates independently of the pathway leading to synthesis of UDP-GlcA from UDP-Glc. This knowledge could aid in determining which flux of sugars the plant uses to facilitate wall synthesis in specific tissues. For example, myo-inositol in seed is stored as phytic acid (inositol hexaphosphate). During germination, phosphatases provide a rapid source of inositol, which is converted, in part, to GlcA. Hence, this would provide a source of UDP-GlcA for wall pectin synthesis. However, during germination, rapid synthesis of L-ascorbate from myo-inositol also occurs. The relationship and coordination of the supply of sugars to wall glycans and to ascorbate synthesis must be better understood at all stages of growth.

Metabolic flux analyses and transcriptional profiling in the monolignol pathway

These analyses have been most useful in predicting the outcome of various manipulations in the lignin-forming pathway. That is, previous studies, whereby monolignol 1 and 3 formation could be induced in loblolly pine (Pinus taeda) cell suspension cultures, enabled us to gain important insights into factors controlling metabolic flux to both p-coumaryl (1) and coniferyl (3) alcohols (34, 35). Thus, by increasing levels of available sucrose, the monolignol-forming pathway could be induced, with the cells secreting the monolignols (in the presence of an H2O2-scavenger) into the culture medium.

The data (based on both measuring various pathway metabolite levels and transcript profiles) provided quite informative insights: the first was that (regulation of) carbon allocation to the pathway was controlled upstream through the amounts of Phe (6) being made available, as well as through the differential activities of both cinnamate-4-hydroxylase (C4H) and p-coumarate-3-hydroxylase (pC3H). Furthermore, metabolite analyses also indicated that formation of both p-coumaryl (1) and coniferyl (3) alcohols could be differentially induced, suggesting the existence of distinct metabolic control over segments (i.e., H versus G) within the monolignol/lignin forming processes through differential modulation of pC3H activity. Beyond the hydroxylation steps, other downstream enzymatic steps (see Figure 7.1) were not considered to be rate limiting, at least under the conditions employed in the studies. [Of course, any enzymatic step becomes rate limiting if abolished or "knocked out."] Transcriptional profiling data of each of the known steps involved in monolignol biosynthesis (available at that time) also appeared to support this analysis and interpretation (35).

Computational Approaches to Study Cellulose Hydrolysis

Michael F. Crowley and Ross C. Walker

8.1 Introduction

Molecular modeling provides a powerful set of tools for probing the atomistic mechanisms of cellulose hydrolysis, and will feature prominently in efforts to harness cellulose as a biomass energy resource. It provides a bridge between theoretical concepts and proposed mechanisms and the experimental data. Molecular modeling is most powerful when used in a synergistic fashion with experimentation, where hypothesis-driven computational research is used to help explain theoretical observations and to drive the design of future experiments. Molecular modeling encompasses the entire range of computational approaches available to the molecular biologist, and it is beyond the scope of this book to give a comprehensive overview of all the computational models that exist. Instead we will concentrate on the subset of molecular modeling termed molecular dynamics, which will prove crucial in advancing our understanding of the behavior of celluloses and cellulases on the atomistic scale.

Biomass Recalcitrance: Deconstructing the Plant Cell Wall for Bioenergy. Edited by Michael E. Himmel © 2008 Blackwell Publishing Ltd. ISBN: 978-1-405-16360-6

Molecular dynamics (MD) is generally used as a virtual experimental tool to probe the structure, function, kinetic, and thermodynamic properties of substances. In the biomolecular field, it has been invaluable in validating structures, in elucidating mechanisms of structural stability and conformational change, and in understanding interactions between molecules, their ligands, and their constituent parts. Most MD is based on classical molecular mechanics (MM), with a smaller amount of work using quantum mechanics (QM) together with molecular mechanics to produce hybrid QM/MM dynamics methods. Although many of the computational methods used in molecular dynamics studies of biomolecular systems are mature, having been extensively applied to proteins, small molecules, and nucleic acids, there has been, until recently, comparatively little interest in the use of MD methods for carbohydrates and even less for cellulose. The thrust of this chapter will be to document the current state of MD methods and MM force fields with the intent of inspiring greater use of the tools for the study of cellulosic recalcitrance: to complement experimental studies, to answer questions that are unapproachable by current experimental technology, and to provide to experimentalists a wish list of new experimental targets, mutations, and structural information. We will include the work already accomplished, and outline the currently available methods and the kinds of questions they can answer for systems of the size, complexity, and chemical nature of cellulose and the enzymes and other biomolecules that interact with it.
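To make the idea of classical MD concrete, the sketch below propagates a single particle on a harmonic spring with the velocity Verlet scheme, a time integrator commonly used in MD codes. It is a toy illustration only: the spring constant, mass, time step, and function names are hypothetical and in reduced units, and a real force field for cellulose or a cellulase would replace `harmonic_force` with forces summed over all bonded and nonbonded terms.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations for one particle in one dimension."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


K_SPRING = 1.0  # hypothetical spring constant (reduced units)
MASS = 1.0      # hypothetical particle mass (reduced units)


def harmonic_force(x):
    """Restoring force of a harmonic spring centered at x = 0."""
    return -K_SPRING * x


def total_energy(x, v):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, MASS, dt=0.01, n_steps=1000)
e_start = total_energy(*trajectory[0])
e_end = total_energy(*trajectory[-1])
print(f"energy drift over 1000 steps: {abs(e_end - e_start):.2e}")
```

Near-conservation of the total energy over the run is the usual first check that the time step is small enough for the forces involved.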

Experimental investigation of hydrolysis

In order to investigate the relative rates of hydrolysis and dehydration, experiments were conducted on xylose, xylan, and xylobiose using a small glass reactor that was heated with microwave energy [CEM-Discover]. In these experiments, products were measured using high-pressure liquid chromatography (HPLC). The microwave reactor system consists of an 8 mL glass tube enclosed inside the cavity of a microwave heating unit. Experiments were conducted with 2 mL of solution containing the substrate in an aqueous solution of sulfuric acid (1.2% by weight). The tube was fitted with a Teflon-coated cap and a pressure sensor, which are designed to contain pressures up to 250 psi. The tubes contained a Teflon-coated stir bar and the temperature was measured with an optical pyrometer. Batch experiments were conducted in which the samples were heated to a fixed temperature.

Hydrolyzates were subjected to chemical analysis using HPLC to determine the concentrations of xylose, xylobiose, and xylose degradation products present in the reaction solutions. The solutions were analyzed using an Agilent 1100 series HPLC with an HPX-87H column and a precolumn (Bio-Rad Laboratories) operated at 65°C. The eluant was 0.01 N H2SO4 flowing at 0.6 mL/min. Samples and standards were injected (10 µL) onto the column after filtering through a 0.45 µm nylon membrane filter (Pall, Acrodisc Syringe filter). Solute concentrations were measured with an Agilent 1100 refractive index detector controlled to 45°C and a diode array detector. The detectors were calibrated with a set of four standards for all solutes except xylobiose, which had a single-point calibration. The HPLC was controlled and data were analyzed using Agilent Chemstation software (rev A.09.03).
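The two calibration schemes described above can be sketched as follows: a least-squares line through four standards, and a single-point calibration that assumes the detector response passes through the origin. This is a minimal sketch; the function names, concentrations, and peak areas are hypothetical placeholders and are not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching calibration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area, slope, intercept):
    """Invert the multi-point calibration line for an unknown sample."""
    return (area - intercept) / slope


def single_point_concentration(area, std_area, std_conc):
    """Single-point calibration: response assumed proportional to concentration."""
    return area * std_conc / std_area


# Hypothetical four-standard calibration (g/L versus detector peak area).
std_conc = [0.5, 1.0, 2.0, 4.0]
std_area = [1000.0, 2000.0, 4000.0, 8000.0]
slope, intercept = fit_line(std_conc, std_area)
xylose_conc = concentration_from_area(3000.0, slope, intercept)

# Hypothetical single standard for a solute calibrated at one point only.
xylobiose_conc = single_point_concentration(1500.0, std_area=3000.0, std_conc=1.0)
print(f"xylose: {xylose_conc:.2f} g/L, xylobiose: {xylobiose_conc:.2f} g/L")
```

A single-point calibration cannot detect a nonzero intercept or nonlinearity, so concentrations obtained that way carry more uncertainty than those from the four-standard fit.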

Before the microwave heating system was used for kinetic measurements of xylobiose and xylan decomposition, an accurate temperature in the reactor was obtained. The provided optical pyrometer measures the infrared light emitted from the reactor and could provide an inaccurate temperature if the walls of the reactor were cooler than the solution. A more accurate technique for measuring the temperature would be to use a chemical reaction with known activation energy (chemical thermometer). In this study, we used the thermal decomposition of xylose in acid solutions as our chemical thermometer. We measured the decomposition of xylose at a fixed nominal temperature and compared the measured rate constant to the values reported in the literature to extract an effective temperature. The relationship between the rate constant for the decomposition of xylose and the temperature has been reported (19) to be

k = 0.0453 α δ γ_x C_H e^(-3570( T ))    (Equation I)

where α is the ratio of the density of the xylose solution to that of the solution without xylose (α = 1), δ is the specific gravity of water at a given temperature relative to the specific gravity at 30°C, γ_x is a correlation constant that was empirically determined (γ_x = 0.95), C_H is the acid concentration, and T is the absolute temperature.

Table 9.2 Measured rate constants and temperature of microwave reactor^a

Nominal temperature (°C)    Rate constant (s^-1)    Effective temperature (°C)
125                         3.7 × 10^-5             117
135                         2.5 × 10^-5             134
145                         1.0 × 10^-4             149
155                         3.7 × 10^-4             163
165                         6.7 × 10^-4             170
175                         2.9 × 10^-3             188

^a Determined from Equation I.

These data were then used in Equation I to determine the actual reactor temperatures, which are shown in Table 9.2. As Table 9.2 shows, the effective temperature is typically slightly higher than the nominal temperature measured by the optical pyrometer. Since the microwaves heat the solution directly, it is reasonable that the glass reactor tube would have a lower temperature than the solution. The optical pyrometer measures the temperature on the surface of the glass and it is not surprising that the nominal temperature is lower than the effective temperature in the solution.
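The chemical-thermometer idea can be illustrated by inverting a rate law for temperature. The sketch below assumes a generic Arrhenius form, k = A exp(-Ea/(R T)), rather than the published correlation of Equation I, and the pre-exponential factor and activation energy are hypothetical placeholders chosen only to make the example run. With the actual correlation, the same inversion would be carried out with its own constants and correction factors.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def arrhenius_rate(temp_k, prefactor, activation_energy):
    """Rate constant from a generic Arrhenius law (assumed form, not Equation I)."""
    return prefactor * math.exp(-activation_energy / (R_GAS * temp_k))


def effective_temperature(k_measured, prefactor, activation_energy):
    """Temperature (K) at which the Arrhenius law reproduces a measured rate constant."""
    if not 0.0 < k_measured < prefactor:
        raise ValueError("rate constant must lie between 0 and the prefactor")
    return activation_energy / (R_GAS * math.log(prefactor / k_measured))


# Hypothetical kinetic parameters (placeholders, not literature values).
PREFACTOR = 1.0e12          # s^-1
ACTIVATION_ENERGY = 1.3e5   # J/mol

# Round trip: generate a rate constant at a known temperature, then recover it.
true_temp_k = 423.15
k_at_true_temp = arrhenius_rate(true_temp_k, PREFACTOR, ACTIVATION_ENERGY)
recovered_temp_k = effective_temperature(k_at_true_temp, PREFACTOR, ACTIVATION_ENERGY)
print(f"recovered temperature: {recovered_temp_k - 273.15:.2f} °C")
```

Because the rate constant depends exponentially on temperature, a modest error in the measured rate translates into only a few degrees of error in the effective temperature, which is what makes a reaction usable as a thermometer.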

Information from metagenomics

The extreme oxygen sensitivity and fastidious growth requirements of many rumen microorganisms mean that they may often be difficult to culture, or even unculturable. At the very least, the task of completing the description of rumen microbial diversity through cultivation looks daunting. An attractive alternative, therefore, is to build metagenomic libraries of rumen DNA that can be screened for activities of interest, or randomly sequenced to gain information on the genes present. This approach has been successfully applied to the recovery of genes encoding esterases, amylases, cellulases, and xyloglucanases from the rumen (80, 81). Most of the glycoside hydrolases so far recovered appear to have simple domain structures, resembling those from the Prevotella group rather than those typical of known cellulolytic bacteria. This is most likely to reflect the relative ease of recovery of planktonic, rather than surface-attached, bacteria, and perhaps also the relative ease of lysis of Gram-negative cells. It may also reflect greater numbers of secondary utilizers of plant polysaccharides compared with primary degraders within the community. Nevertheless, the value of the approach is demonstrated by the fact that the collection includes representatives of new enzyme families. Information on sequence diversity can also be obtained without the need for library construction, either through amplification of specific genes by degenerate PCR, or directly by 454 sequencing.

Comparative sequencing and quantification of rRNA

Measuring microbial diversity typically involves sequencing individual 16S rDNA gene sequences to obtain species-level resolution. The power of the approach comes from the ability to amplify, clone, and analyze homologous regions of 16S rDNA from small amounts of sample DNA. Some studies focus on only a small portion of the 16S rDNA, while others survey the entire gene sequence. Sequencing the entire 16S rDNA enhances the accuracy of estimating the species variability of microbial communities, and is preferred when broad comparative determinations are performed.

The question of whether two populations of microorganisms have different numbers of species and different levels of genetic diversity can be answered by comparing the relationships and degrees of divergence among sequences. Depending on amplification and cloning conditions, the technique can overrepresent some species while underrepresenting others. There is also a relatively small potential for error from sequence variation due to PCR replication errors. Variations of the 16S approach include denaturing gradient gel electrophoresis and in situ hybridization.
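One simple way to express the degree of divergence between two aligned sequences is the proportion of aligned positions at which they differ (often called the p-distance). The sketch below computes it for every pair in a small set. It is illustrative only: the sequences are short made-up strings rather than real 16S data, the function names are hypothetical, and published analyses normally use dedicated alignment and phylogenetics software with corrected distance models.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing positions between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared


def distance_matrix(sequences):
    """Symmetric matrix of pairwise p-distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = p_distance(sequences[i], sequences[j])
    return matrix


# Made-up aligned fragments standing in for cloned 16S rDNA sequences.
clones = ["ACGTACGTAC", "ACGTACGTTC", "ACGAACG-TC"]
distances = distance_matrix(clones)
for row in distances:
    print(" ".join(f"{value:.3f}" for value in row))
```

Clustering sequences whose pairwise distance falls below a chosen threshold is one common way to turn such a matrix into a count of species-level groups; the threshold itself is a convention that should be stated with the results.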

Bacterial populations, including soil communities, have been characterized using rRNA intergenic spacer analysis (RISA), which determines the variability in the length of the intergenic spacer between the small-subunit (16S) and large-subunit (23S) rRNA genes. The method has been automated (ARISA), and the sensitivity increased by the use of fluorescence-tagged oligonucleotide primers for PCR amplification and for subsequent electrophoresis in an automated capillary electrophoresis system. Fungal ARISA makes use of the length polymorphism of the nuclear ribosomal DNA (rDNA) region that contains the two internal transcribed spacers (ITS) and the 5.8S rRNA gene. Comparative sequencing and quantification of rRNA genes using universal and phylogenetically specific primers has become an established method for detecting and characterizing subgroups of prokaryotic and eukaryotic microorganisms. For instance, 16S and 18S rRNA gene libraries have been constructed to examine the effects of elevated CO2 levels on the composition of microbial communities associated with the rhizosphere of trembling aspen (88). However, a thorough analysis of complex microbial communities using rRNA gene-based libraries requires a huge sequencing effort. To overcome this sequencing limitation, several rapid high-throughput DNA-based molecular methods for rRNA gene-based analysis of microbial communities have been developed. These methods, including terminal restriction fragment length polymorphism (T-RFLP) and denaturing gradient gel electrophoresis (DGGE), provide a DNA fingerprint of the microbial community present in the sample (89-92). While T-RFLP and DGGE are sensitive methods for differentiating between microbial communities, these methodologies do not provide actual DNA sequence information unless the resolved bands are recovered and sequenced (93).