Metagenomics

Lignocellulose-containing environments harbor a natural degradative microbiome composed of a genomically diverse set of organisms. Many of these organisms, particularly the underrepresented ones, are missed in culture yet may make significant metabolic contributions to surrounding organisms in these complex environments (Ley et al. 2005, 2008a, b; Sogin et al. 2006; Turnbaugh et al. 2006). Microbiologists began using culture-independent methods, collectively termed metagenomics, to circumvent the low isolation rates (less than 1%) often seen with culture-based techniques in environmental samples (Handelsman 2004; Schloss 2008). Metagenomics provides genomic access to the entire population of microorganisms and allows these microbes to be analyzed without cultivation, in the context of their natural habitat.

Traditional metagenomic analyses generally begin with the total genomic DNA extracted from the community. The DNA can be digested with restriction enzymes, ligated into a vector, and propagated in a host, often Escherichia coli. For a sequence-driven analysis, clones can be chosen at random and subsequently sequenced. For function-driven analyses, clones can be screened for phylogenetic markers, enzymatic activity, or antibody binding. Heterologous gene expression then allows for physiological identification of small molecules or proteins. Metagenomic studies began with traditional Sanger sequencing. When microscopic enumeration was compared with colony counts and the resulting numbers of microbes cultured, it became apparent that the large majority of organisms had been overlooked in these traditional studies (Schloss and Handelsman 2003; Handelsman 2004). Indeed, the sequencing and assembly of large-insert libraries has been hypothesized to allow reconstruction of a nearly complete microbial genome (DeLong 2004). As demand arose for genomic tools that would give a more accurate picture of the functional distribution of the microbial diversity present, sequencing of large-insert libraries, traditionally used in single-organism genomics, was applied to total community DNA. This allows clones to be screened for functional diversity, resulting in novel gene discovery and linking genetics to functional expression for each of the selected clones. Of particular significance to this review, metagenomic analysis of a bovine rumen expression library identified 22 glycoside hydrolase clones, four of which potentially represent previously undescribed families of glycoside hydrolases (Ferrer et al. 2005). A novel polyphenol oxidase (laccase) from this bovine rumen expression library has also been identified and characterized, and it was suggested that this enzyme might play a role in ryegrass lignin digestion (Beloqui et al. 2006).
Massive metagenome sequencing was also recently applied to another lignocellulose-degrading community, the termite hindgut (Warnecke et al. 2007). This extensive data set showed a diversity of bacterial genes for cellulose and xylan hydrolysis, mainly from spirochete and fibrobacter species. Clearly, the termite hindgut, like the rumen, is a microbial community specialized toward plant cell wall degradation and is a potentially important source of novel enzymes for more woody substrates.
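The restriction-digest step of the cloning workflow described earlier in this section can be illustrated in silico. The following is a minimal sketch only: the choice of EcoRI (recognition site GAATTC, cut after the first G) is an assumption made for illustration, since the text does not name a particular enzyme, and the function name `digest` and the example sequence are hypothetical.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    `cut_offset` is the position inside the site where the top strand is cut.
    The defaults model EcoRI (G^AATTC); this enzyme is an illustrative
    assumption, not one named in the text.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])  # fragment up to the cut point
        start = cut
        pos = sequence.find(site, pos + 1)     # look for the next site
    fragments.append(sequence[start:])         # remainder after the last cut
    return fragments


# Hypothetical 18-bp sequence containing two EcoRI sites.
example = "AAGAATTCCCGAATTCTT"
fragments = digest(example)
print(fragments)
```

Each fragment produced this way corresponds to a candidate insert that would be ligated into a vector. Real library construction usually relies on partial digestion or mechanical shearing to obtain a chosen insert size range, which this sketch does not model.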

With the advent of next-generation sequencing technologies, sequence-based metagenomic approaches have moved away from cloning techniques, which introduce their own levels of bias, toward a more random sequencing strategy, pyrosequencing (Ronaghi et al. 1996, 1998; Margulies et al. 2005). We have recently used this approach to examine randomly sampled pyrosequence data from three fiber-adherent rumen microbiomes and one pooled rumen liquid phase sample (Brulc et al. 2009). This genomic analysis revealed that, in the rumen microbiome, the dominant enzymes are those that attack the easily available side chains of complex plant polysaccharides and not the more recalcitrant main chains, even when cellulose is present as a substrate. Furthermore, when compared to the termite hindgut microbiome, there are fundamental differences in the glycoside hydrolase content, with the termite hindgut microbiome containing more enzymes that are involved in degradation of cellulose (GH5, 9, 44, and 74) and xylan (GH10 and 11). Thus, in these lignocellulose-degrading microbiomes, CAZyme content appears to be diet driven (forages and legumes or wood). Therefore, when searching for novel microbial enzymes that deconstruct plant cell walls, the environment chosen as a genetic resource should match the substrates to be utilized.
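A comparison of this kind can be sketched as a tally of glycoside hydrolase (GH) family assignments per metagenome. The example below is a minimal illustration: the family groupings follow the families named above, but the function name `main_chain_fraction` and all counts are hypothetical placeholders and are not data from Brulc et al. (2009) or Warnecke et al. (2007).

```python
# GH families named in the text as acting on cellulose and xylan main chains.
CELLULOSE_FAMILIES = {"GH5", "GH9", "GH44", "GH74"}
XYLAN_FAMILIES = {"GH10", "GH11"}


def main_chain_fraction(gh_counts: dict[str, int]) -> dict[str, float]:
    """Return the share of all GH hits assigned to main-chain families."""
    total = sum(gh_counts.values())
    if total == 0:
        raise ValueError("no glycoside hydrolase hits to summarize")
    cellulose = sum(n for fam, n in gh_counts.items() if fam in CELLULOSE_FAMILIES)
    xylan = sum(n for fam, n in gh_counts.items() if fam in XYLAN_FAMILIES)
    return {"cellulose": cellulose / total, "xylan": xylan / total}


# Placeholder counts, invented purely to show the calculation.
hypothetical_sample_a = {"GH5": 10, "GH9": 5, "GH10": 5, "GH3": 50, "GH43": 30}
hypothetical_sample_b = {"GH5": 30, "GH9": 20, "GH10": 15, "GH11": 5, "GH3": 30}

for name, counts in [("A", hypothetical_sample_a), ("B", hypothetical_sample_b)]:
    shares = main_chain_fraction(counts)
    print(name, {family: round(share, 2) for family, share in shares.items()})
```

Expressing each group as a fraction of all GH hits, rather than as raw counts, keeps metagenomes of different sequencing depth comparable. A real analysis would additionally need a statistical test before concluding that two profiles differ.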