Cellulolytic aerobic bacteria

Almost all aerobic cellulolytic bacteria secrete sets of individual cellulases, and most of these cellulases contain one or more CBMs. Two such organisms, whose cellulases have been well characterized, are Cellulomonas fimi and T. fusca. The sets of six cellulases produced by these two organisms are similar in their activity, CD families and CBMs, but differ significantly in their sequences and domain order, suggesting that these organisms did not obtain their cellulase genes from a common ancestor, but rather that each set is the result of convergent evolution. In five of the six corresponding cellulase pairs from each species, the family 2 CBM is at the opposite end of the enzyme (23). Both these organisms are actinomycetes and they are found in soils, but C. fimi is mesophilic with an optimum growth temperature near 30°C, while T. fusca is moderately thermophilic with an optimum growth temperature of 50°C. An interdisciplinary group at the University of British Columbia has studied C. fimi cellulases, and this group has made many important contributions to cellulase research (74).

T. fusca is often found in compost piles, rotting hay, and manure piles. The genome sequence of T. fusca was determined in 2000 by the DOE Joint Genome Institute and the finished 3.7 Mb sequence is available at: http://genome.jgi-psf.org/finished_microbes/thefu/thefu.home.html (75). C. fimi and T. fusca are not closely related, as C. fimi is in the suborder Micrococcineae while T. fusca is in the suborder Streptosporangineae. It is interesting that Streptomyces lividans, which is a relative of T. fusca, contains five cellulase genes that are similar to those present in T. fusca and, in every one, the family 2 CBM is in the same location as it is in T. fusca, suggesting that these sets of genes did come from a common ancestor.

In addition to the six cellulases, T. fusca secretes a number of other proteins when it is grown on cellulose. Among them are a xyloglucanase, a xylanase, a family 81 β-1,3-endoglucanase, and two proteins that bind to cellulose but do not appear to have any catalytic activity (76-79). The xyloglucanase and the xylanase both contain a family 2 CBM and bind tightly to cellulose. The CBM on the xylanase also binds to xylan, which is unusual, as most family 2 CBMs bind only to one of these polymers (80). It seems likely that the role of the xyloglucanase is to degrade the xyloglucan bound to cellulose, giving the cellulases access to the cellulose, rather than to allow T. fusca to utilize xyloglucan, since T. fusca does not grow on xyloglucan even though it completely degrades it to oligomers (76). Further evidence for this conclusion is that a xyloglucan-cellulose composite, which was not degraded by a mixture of pure cellulases, was degraded when pure T. fusca xyloglucanase was added to the cellulase mixture (76). The role of the two cellulose-binding proteins is not known, but they do stimulate the activity of several T. fusca cellulases in the initial phase of the reaction, when the cellulases are assayed at low levels (79).

Cellulase synthesis in T. fusca and many cellulolytic bacteria is regulated by at least two mechanisms: induction by cellobiose and laminaribiose (the β-1,3-glucose disaccharide), and repression by any good carbon source including cellobiose (78, 80, 81). This makes sense, as the extracellular enzymes secreted by T. fusca grown on cellulose make up about 50% of the total protein in the culture. If there is sufficient sugar for growth or no cellulose, there is no reason to synthesize cellulases. The cellobiose required for induction is produced from cellulose by the uninduced level of secreted cellulase, which is mainly Cel6A. CelR is a regulatory protein that is a member of the LacI gene family (82), and it binds to a 14 base inverted repeat sequence: TGGGAGCGCTCCCA. This sequence is upstream of the start site of all six T. fusca genes coding for secreted cellulases, as well as the genes for a number of other secreted proteins induced by growth on cellulose, and an operon that codes for a cytoplasmic β-glucosidase and a putative cellobiose transport system (83). The CelR gene is just upstream of this operon. The binding of CelR to the regulatory sequence is inhibited by cellobiose, as expected. Since laminaribiose also induces cellulase synthesis, it probably also binds to CelR, but this has not been tested. At this time it is not known whether induction by laminaribiose is useful for T. fusca, or if it is an accidental result of the low specificity of the CelR sugar-binding site. A puzzling finding is that the Cel6B and Cel48A genes, which each encode an exocellulase, have a second CelR-binding site about 200 bases upstream of the start site of the gene, and this site is not present in the upstream sequences of the other cellulase genes. These two cellulases are made in equal amounts and together they make up over 70% of the secreted cellulase protein. At this time there is no information about the mechanism by which good carbon sources inhibit cellulase synthesis in T. fusca.
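The inverted-repeat character of the CelR-binding sequence can be checked directly: a perfect inverted repeat reads the same as its own reverse complement. The following is a minimal Python sketch, not a method from the cited studies. The function names and the example upstream string are hypothetical and serve only as illustration; only the 14 base site itself comes from the text.

```python
# Minimal sketch: check that the CelR site is a perfect inverted repeat
# and locate exact copies of it in an upstream region.
# Only CELR_SITE is taken from the text; everything else is illustrative.

CELR_SITE = "TGGGAGCGCTCCCA"  # 14 base CelR-binding sequence quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(seq):
    """A perfect inverted repeat equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def find_sites(upstream, site=CELR_SITE):
    """Return 0-based start positions of exact matches to `site`."""
    upstream = upstream.upper()
    positions = []
    start = upstream.find(site)
    while start != -1:
        positions.append(start)
        start = upstream.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(is_inverted_repeat(CELR_SITE))
    # Hypothetical upstream region, made up for this example.
    example_upstream = "AAAA" + CELR_SITE + "CCCC" + CELR_SITE
    print(find_sites(example_upstream))
```

This sketch reports exact matches only. Searching real upstream sequences for sites such as the second one described for Cel6B and Cel48A would probably need to tolerate mismatches, which is not attempted here.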

Saccharophagus degradans is an aerobic marine plant cell wall degrading organism, whose genome was recently sequenced. Preliminary analysis of its 13 cellulase genes shows that it contains ten family GH-5 enzymes, which are most similar to endoglucanases and most of them also contain cellulose-binding domains. It also contains two family GH-9 endoglucanase genes, one of which encodes both a family 10 and a family 2 CBM (84).

Cytophaga hutchinsonii is an aerobic cellulolytic bacterium and the DOE Joint Genome Institute determined the DNA sequence of its genome (http://genome.jgi-psf.org/finished_microbes/cythu/cythu.home.html) (85). While it contains a number of cellulase genes, most of these genes lack a CBM and all of them appear to code for endoglucanases. These results clearly distinguish C. hutchinsonii from all other aerobic cellulolytic bacteria whose cellulases are known, and suggest that it uses a different mechanism for degrading cellulose than the secretion of a synergistic set of free cellulases used by the other well-studied aerobic cellulolytic microorganisms. None of its cellulase genes encode dockerin sequences, so it does not appear to produce cellulosomes, as anaerobic cellulolytic bacteria do. Thus, it appears that C. hutchinsonii has a third mechanism for degrading cellulose. It is interesting that the anaerobic rumen bacterium, Fibrobacter succinogenes, whose genome sequence was determined by TIGR funded by a USDA grant to the North American Consortium for Genomics of Fibrolytic Ruminal Bacteria, also does not code for any known processive cellulases, and only one of the many endocellulases that have been cloned and sequenced from it appears to bind to cellulose (86, 87). F. succinogenes does not appear to encode dockerin domains and a scaffoldin gene has not been identified. Thus, F. succinogenes also probably uses a novel mechanism for degrading cellulose. F. succinogenes grows very rapidly on cellulose, so its cellulose-degrading mechanism must be very efficient (88). One possible mechanism for these organisms is the one proposed for starch degradation by Bacteroides thetaiotaomicron (89). In this mechanism, starch is bound to a complex present in the outer membrane and individual molecules are transported into the periplasmic space, where they are degraded by starch-degrading enzymes. This mechanism would not require processive cellulases, as individual cellulose molecules would be readily degraded by endoglucanases. If this is the process by which cellulose is degraded, it will be very interesting to determine the mechanism by which the outer membrane proteins are able to bind and transport individual cellulose molecules. It is possible that this information would allow the design of new cellulases or cellulose-modifying proteins, which would be able to increase the rate of cellulose degradation by free cellulases.

Site-directed mutagenesis of cellulase genes has been used to identify residues required for activity and substrate binding. Extensive studies of T. fusca Cel6A have been carried out (90-92) and a number of Cel6A residues have been identified that are essential for activity or binding. From these studies, a detailed mechanism has been proposed for Cel6A, which differs from the standard cellulase mechanism in that there is no evidence for an essential catalytic base, although there is a conserved Asp residue (Asp 79) that is important for activity and could be functioning as a nonessential catalytic base (91). Because it is in a flexible loop, it is not clear where this residue is positioned during bond cleavage, but it is unlikely to be in the position seen for the catalytic base in other cellulase families, which is on the other side of the scissile bond from the essential catalytic acid, which is Asp 117 in Cel6A (91). Loop movement is important in catalysis by Cel6A, as mutation to Ala of either of two adjacent Gly residues at one end of the loop reduces activity on CMC to 6% or 16% of WT activity (90).

In all WT Cel6Acd structures that contain bound sugars, the glucose molecule in subsite -1 is distorted, as is seen with most cellulases. Tyr73 is a conserved residue in family GH-6 and, when it is mutated to Phe, Cel6A activity is reduced to 10% of WT but, when it is mutated to Ser, activity is abolished (93). In a structure of the Cel6Acd Tyr73Ser mutant enzyme, the glucose in the -1 subsite is not distorted (94). This result is consistent with molecular modeling of Cel6A, showing that the normal glucose conformation would overlap the position of Tyr73 (95). It also suggests that distortion of the glucose is essential for activity, since the Ser mutation eliminates both distortion and activity, while the Phe mutation, which does not eliminate the distortion, only partially reduces activity (93). Tyr73 was reported to be the essential hydrophobic platform residue in family GH-6, and the mutagenesis data are consistent with this proposal (96). Cel6A hydrolyzes CMC and amorphous cellulose (SC) 100 times faster than filter paper and 50 times faster than bacterial cellulose. Furthermore, mutations that alter key active site residues cause much greater reductions in activity on CMC and SC than on the crystalline substrates (91). These results show that there are different rate-limiting steps for the two types of substrates. It seems likely that the rate-limiting step for CMC and SC is bond cleavage, while for crystalline substrates it may be the binding of the substrate into the active site. In order to design new cellulases with higher activity on crystalline substrates, it is necessary to increase the rate of the rate-limiting step. Unfortunately, at this time the detailed mechanism by which a segment of a cellulose molecule is removed from its neighbors and bound into the active site is not known for any cellulase, so it is not possible to rationally design mutant enzymes to increase the rate of this step.
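The reasoning about rate-limiting steps can be made concrete with a minimal two-step kinetic sketch in Python. It is an illustration under stated assumptions, not a model taken from the cited studies: hydrolysis is treated as two irreversible steps in series (substrate binding into the active site, then bond cleavage), so the observed rate is 1 / (1/k_bind + 1/k_cleave). All rate constants below are hypothetical values chosen only to show the trend.

```python
# Minimal sketch (assumed two-step serial model, hypothetical rate constants):
# a mutation that slows bond cleavage has a large effect when cleavage is
# rate limiting and a small effect when substrate binding is rate limiting.


def observed_rate(k_bind, k_cleave):
    """Overall rate for two irreversible steps in series (arbitrary units)."""
    return 1.0 / (1.0 / k_bind + 1.0 / k_cleave)


def residual_activity(k_bind, k_cleave, cleave_factor):
    """Fraction of WT activity left when k_cleave is scaled by cleave_factor."""
    wt = observed_rate(k_bind, k_cleave)
    mutant = observed_rate(k_bind, k_cleave * cleave_factor)
    return mutant / wt


if __name__ == "__main__":
    # Hypothetical constants: binding is fast for soluble or amorphous
    # substrates and slow for crystalline substrates.
    k_cleave = 10.0
    k_bind_soluble = 1000.0
    k_bind_crystalline = 0.1
    cleave_factor = 0.1  # hypothetical active site mutation: 10-fold slower cleavage

    print(residual_activity(k_bind_soluble, k_cleave, cleave_factor))
    print(residual_activity(k_bind_crystalline, k_cleave, cleave_factor))
```

With these assumed numbers, the same 10-fold cleavage defect removes most of the activity on the fast-binding substrate but leaves activity on the slow-binding substrate nearly unchanged, which is the qualitative pattern described for active site mutations on CMC and SC versus crystalline substrates.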

Similar, but less extensive, studies have been carried out on the exocellulase, T. fusca Cel6B. They showed that inserting a disulfide bond that joined the two loops covering the Cel6B active site to form a tunnel reduced activity by only 50%, so loop opening, if it occurs, is clearly not required for activity (97). These studies also found two mutations that increased activity on crystalline cellulose, but only when the enzyme was assayed alone, not in synergistic mixtures. Several mutations were found that reduced inhibition by cellobiose, with only a small loss of activity on crystalline cellulose. Finally, two mutations reduced activity on bacterial cellulose without reducing activity on CMC or SC, suggesting that these residues participated in a step that was only required for crystalline cellulose hydrolysis (97).

Extensive site-directed mutagenesis studies have been carried out on the processive endocellulase, T. fusca Cel9A (98, 99). These studies identified key residues that are important for activity, including a Glu residue that functions as the catalytic acid, and two conserved Asp residues, which are hydrogen bonded to the catalytic water molecule in the structure of the enzyme lacking bound sugars. Mutation of either Asp residue drastically reduces activity, even though only one functions as the catalytic base (99). In the Cel9A structure, the catalytic base forms a hydrogen-bonded network with a conserved His residue and a conserved Tyr residue. Mutation of either of these residues drastically reduces activity, showing that this network is important for catalysis. The conserved Tyr residue is probably the hydrophobic platform residue in Cel9A, even though a different Tyr was proposed to have this function (96). When Y429, proposed as the platform residue, was mutated, the mutant enzyme still retained about 10-30% of WT activity on various substrates, so it clearly does not have that role in catalysis (99). Several mutations increase activity on CMC, while reducing activity on bacterial cellulose. In most cases, these mutations increase the size of the active site cleft, which may allow binding of modified sugars. A number of CD mutations reduced processivity and all these mutations were in the -2 to -4 subsites and would be expected to weaken binding to these sites, suggesting that the loss of processivity was due to decreased affinity of these subsites for the cellulose chain bound to the family 3 CBM, which is part of the active site of this enzyme. This result is consistent with the results of a docking study of Cel7A, which calculated the binding energy of each of the glucose-binding subsites in this enzyme for glucose and found that the energies increased in a way that would draw the cellulose chain into the active site, consistent with its processive activity (100).

There have been extensive mechanistic and structural studies of GH-5 endocellulases, which are retaining hydrolases (101, 102). The GH-5 enzymes are extremely diverse in their sequences, in their detailed structures, and in their substrate specificities. A structural study of Acidothermus cellulolyticus endocellulase, Cel5A (E1), showed that the eight totally conserved residues in this family were all in the active site and were close to the bound substrate (101). There are other residues, which are not conserved in this family, that also interact with the substrate, providing further evidence of the diversity in this family. Structural studies of Bacillus agaradhaerens endocellulase Cel5A have determined structures for every step during bond cleavage and have identified the hydrogen-bonding network to the substrate at each step (102, 103). As was found for E1, there are both conserved and nonconserved residues present in these networks. Site-directed mutagenesis of Pyrococcus horikoshii endocellulase Cel5A showed that the conserved nucleophile was essential for activity, but that the conserved acid/base residue was not important for activity (104). This is a very unusual finding but, since there are several other conserved residues adjacent to the acid/base residue, this result is not due to an incorrect alignment, so that it provides even more evidence for the extensive diversity of GH-5 enzymes. All of the studies described here used low molecular weight substrates and studied events during catalysis, but not the placement of a cellulose molecule into the active site, which appears to be a key step for crystalline cellulose hydrolysis.

11.9 Outlook

In the last decade, there has been a large increase in our understanding of cellulase structure-function relationships, but there is still a considerable amount that needs to be learned. This is particularly true for crystalline cellulose hydrolysis, where the exact role of CBMs in hydrolysis, how cellulose molecules are bound into cellulase active sites, and how cellulose structure influences this process all need to be determined. Only when these processes are understood will it be possible to engineer cellulase active sites to achieve more efficient hydrolysis of specific biomass substrates. This ability should improve the economics of converting biomass to liquid fuels, possibly leading to sustainable production of non-greenhouse gas producing liquid fuels.