Molecular Biology Capacity in Microalgae

11.2.1 Genomics and Molecular Biology of Microalgae

Microalgae-based biocrude production is an established technology, but compared with conventional fossil fuel extraction it is energetically unattractive, and its chemistry is poorly understood. Improvements in process chemistry are necessary for microalgae biocrude to compete successfully with fossil fuels and non-algal biofuel technologies and to reach its full potential. While conventional strategies for strain development can yield significant improvements, genetic modification (GM) has the potential to improve aspects of biocrude production more rapidly and potentially to greater effect. As the primary aim of hydrothermal liquefaction (HTL) is to generate more biocrude product per unit biomass with reduced energy costs, the initial biomass quality and yield, as well as aspects of the HTL chemistry (e.g. N and S content), may be amenable to GM strategies. The first step is to determine which traits would be helpful for HTL processing; the second is to identify how manipulating algal genetics can produce those traits.

Genetic Research in Microalgae. To engineer beneficial traits into production strains, sufficient knowledge of algal biology is required to conduct targeted optimisation. This knowledge comprises both an understanding of the most appropriate effects to target and the methods that enable their engineering. Just as bacterial engineering rests upon a deep knowledge of bacterial biochemistry and genetics, algal GM biotechnology needs to rest upon a firm foundation of fundamental research into the way that algal genomes work. Despite the commonality of fundamental genetic mechanisms across the span of life on Earth, great variety is also present, and a consistent lesson is that 'the devil is in the detail' with respect to individual organisms. Consequently, the specifics matter greatly. Further, much biological variability will have accumulated from the ancient origins of algal phyla and their early divergence from plants and animals. Much specific knowledge of algal gene regulation will therefore be required before skilful, efficient and routine genetic manipulation is possible. The recently expanded library of available algal genomes is a welcome advance but is of limited utility until these genomes are systematically mapped, curated, annotated and understood, a much more time-consuming task than the actual sequencing. Systematic approaches such as the generation of knockout mutants of all Chlamydomonas genes at Stanford University (Zhang et al. 2014) and the transcriptomic (FANTOM) approaches pioneered at RIKEN in Japan (Forrest et al. 2014) are needed to assign biological functions to specific genes quickly and with certainty, and to curate algal genomes to the standard of mammalian genomes. While microalgal genomes are undoubtedly simpler than the human genome, the resources allocated to studying them are minuscule by comparison, and the molecular toolkit is sparse; the lack of specific antibodies is a particular gap.

Advancements in Genomics. Genome sequencing and sequence analysis are an important first step in deepening our understanding of microalgal systems and ultimately developing improved engineering processes. Only a very small number of genomes are available, particularly when considered against the huge microalgal species diversity; however, the number of genome sequencing programs is steadily increasing (see Table 11.1). The National Center for Biotechnology Information (NCBI) now contains 25 green algae genomes either in full, as scaffolds, or for which sequencing is currently underway (www.ncbi.nlm.nih.gov/genomes). Furthermore, as novel bioinformatic tools (e.g. KEGG assignments accessible at www.genome.jp/kegg, and BioModels databases accessible at www.ebi.ac.uk/biomodels-main) become available online, they will enable researchers to predict and characterise gene regulatory pathways, forecast outcomes of metabolic shifts and functionally annotate de novo genomes of diverse algal species.
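One simple quantity that the genome sizes and gene counts in Table 11.1 make available is gene density (genes per megabase). The following is a minimal sketch only: the figures are copied from Table 11.1 for a few species where both values are listed, while the function names `gene_density` and `rank_by_density` are hypothetical and introduced here purely for illustration.

```python
# Genome size (Mb) and number of genes, copied from Table 11.1 for a few
# species where both values are listed. The selection is illustrative.
TABLE_11_1 = {
    "Chlamydomonas reinhardtii": (121, 15143),
    "Volvox carteri": (138, 14437),
    "Chlorella variabilis": (46, 9791),
    "Thalassiosira pseudonana": (32, 13025),
    "Emiliania huxleyi": (168, 38549),
}


def gene_density(size_mb, n_genes):
    """Return the number of annotated genes per megabase of genome."""
    if size_mb <= 0:
        raise ValueError("genome size must be positive")
    return n_genes / size_mb


def rank_by_density(table):
    """Return (species, genes per Mb) pairs, densest genome first."""
    rows = [(species, gene_density(size, genes))
            for species, (size, genes) in table.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for species, density in rank_by_density(TABLE_11_1):
        print(f"{species:28s} {density:7.1f} genes/Mb")
```

Such a ranking says nothing about gene function; it only indicates how compact each annotated genome is, which is one reason annotation effort does not scale simply with genome size.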

Genetic Mechanisms. The existence of functional microRNAs in Chlamydomonas (Molnar et al. 2007) demonstrates that much of the convoluted genomic biology being revealed in mammals can also be expected in these simple organisms. The general schema of molecular pattern receptors, signal transduction mechanisms, and complex transcription factor-mediated feedback control of nuclear genes is to be expected, and many of the protein motifs will be familiar (e.g. helix-loop-helix transcription factors). However, given the evolutionary distance between different algal clades and between algae and land plants, it is to be expected that apart from highly conserved central mechanisms (core metabolism, cell replication, and mitochondrial and photosynthetic machinery), many baroque variations remain to be discovered. Algal genetics lags far behind algal physiology, much of which is common to plants in specific detail as well as general principles. To fill this gap, high-throughput gene analysis and bioinformatics will be critical for rapid mapping of the overall territory, even if painstaking molecular analysis is still needed for final validation of proposed biochemical and information pathways.

The algal genes that have so far been studied in detail illustrate this need. Significant changes to cell status, such as nutrient limitation (sulphate, nitrogen, iron, copper), lead not to up-regulation of a few receptors or import proteins, but to coordinated changes in thousands of genes, which resemble the waves of altered gene expression seen in multicellular organisms. Only high-throughput mapping can provide the necessary background to support the efficient dissection of these biological responses. Apart from nutrient limitation, the kinds of coordinated responses which might be expected include photoacclimation, responses to predators and pathogens, differentiation-like developmental programs and adaptations to environmental niches. Fortunately, many of the tools developed for the study of other organisms can readily be adapted for algal biology. These include powerful genome-editing platforms either already developed [zinc finger nucleases, TALENs (Gao et al. 2014; Sizova et al. 2013)] or under development [CRISPR/Cas (Sander and Joung 2014)]. Although not yet routine, the ability to conduct precise genome engineering will greatly advance the speed and scope of algal GM production.

Table 11.1 Update on available algal genome sequences, ongoing and future genome sequencing projects

| Class | Species | Strain | Project type | Genome size (Mb) | No. genes | References |
|---|---|---|---|---|---|---|
| Chlorophytes (green algae) | Chlamydomonas reinhardtii | CC-503 | Genome | 121 | 15,143 | Merchant et al. (2007) and Proschold et al. (2005) |
| | Chlamydomonas incerta | | EST | ND | | http://tbestdb.bcm.umontreal.ca/searches/login.php |
| | Volvox carteri | UTEX2908 | Genome | 138 | 14,437 | Prochnik et al. (2010) |
| | Dunaliella salina | CCAP19/18 | Genome | | | Joint Genome Institute (JGI) |
| | Chlorella variabilis (former: Chlorella vulgaris) | NC64A | Genome | 46 | 9,791 | Blanc et al. (2010) |
| | Haematococcus pluvialis | | | | | Grossman (2007) |
| | Scenedesmus obliquus | | | | | Grossman (2007) |
| | Oedogonium cardiacum | | Chloroplast genome | | | Grossman (2007) |
| | Pseudendoclonium akinetum | | Chloroplast genome | | | Pombert et al. (2005) |
| | Coccomyxa subellipsoidea | C-169 | Genome | 49 | 9,915 | Blanc et al. (2012) |
| | Botryococcus braunii | | Genome | | | Joint Genome Institute |
| | Mesostigma viride | | EST | | | http://tbestdb.bcm.umontreal.ca/searches/login.php |
| | Nephroselmis olivacea | | EST | | | http://tbestdb.bcm.umontreal.ca/searches/login.php |
| | Ulva linza | | EST | | 6,519 | Zhang et al. (2012) |
| | Leptosira terrestris | | Chloroplast genome | | | de Cambiaire et al. (2007) |
| | Pedinomonas minor | | Plastid genomes | | | Grossman (2007), project to be |
| | Monoraphidium neglectum | SAG 48.87 | Genome | | 16,761 | Bogen et al. (2013) |
| Stramenopiles (diatoms) | Thalassiosira pseudonana | CCMP1335 | Genome | 32 | 13,025 | Armbrust et al. (2004) |
| | Thalassiosira oceanica | | Genome | 92 | 34,684 | Lommer et al. (2012) |
| | Phaeodactylum tricornutum | CCAP1055/1 | Genome | 27 | 10,398 | Bowler et al. (2008) |
| | Fragilariopsis cylindrus | CCMP1102 | Genome | 81 | | Joint Genome Institute |
| | Pseudo-nitzschia multiseries | CLN-47 | Genome | | | Joint Genome Institute |
| | Amphora sp. | CCMP2378 | Genome | | | Raymond and Kim (2012) |
| | Attheya sp. | CCMP212 | Genome | | | Raymond and Kim (2012) |
| | Fragilariopsis kerguelensis | | | | | T. Mock, U. East Anglia, UK |
| | Ectocarpus siliculosus | Ec32 | Genome | 214 | 16,256 | Cock et al. (2010) |
| | Aureococcus anophagefferens | CCMP1984 | Genome | 57 | 11,522 | Gobler et al. (2011) |
| Haptophytes | Emiliania huxleyi | CCMP1516 | Genome | 168 | 38,549 | Read et al. (2013) |
| | E. huxleyi | RCC1217 | Genome | | | The Genome Analysis Centre (TGAC), UK |
| | E. huxleyi | CCMP371 | EST | | | University of Iowa, USA |
| | Phaeocystis antarctica | | | | | Joint Genome Institute |
| | Phaeocystis globosa | | | | | Joint Genome Institute |
| | Pavlova lutheri | | EST | | | University of Montreal |
| | Isochrysis galbana | CCMP1323 | EST | | | University of Montreal |

11 Genetic Engineering for Microalgae Strain Improvement..

Case Studies. A number of genetic responses in algae have been described, mainly in relation to key physiological processes such as photosynthesis, nutrient limitation and circadian rhythm. These include the analysis of the transcriptional responses of the light-harvesting complex (LHC) genes to light and circadian signals, the carbon concentrating mechanism (CCM) in response to CO2 limitation, and responses to iron, copper and sulphur limitation.

Few of the estimated 234 transcription factors and regulators initially identified bioinformatically in the Chlamydomonas genome (Riano-Pachon et al. 2008) have even tentative roles assigned to them. Although no promoters have been comprehensively analysed, several have been cloned and their behaviours studied and utilised for experimental systems. The best examples are the light-harvesting antenna genes, which are regulated both by light and by circadian mechanisms. In addition to promoter regulation, post-transcriptional regulation mediated by the mRNA-binding protein CHLAMY1, which is composed of two subunits (C1 and C3), has been demonstrated. In turn, an E-box-like promoter element has been shown to be involved in the regulation of the circadian rhythm protein C3 (Seitz et al. 2010), and some binding factors have been isolated. Regulatory factors controlling the CCM have been identified [CCM1 (Fang et al. 2012; Fukuzawa et al. 2001); LCR1 (Ohnishi et al. 2010; Yoshioka et al. 2004)]. Iron-responsive elements have been identified in several genes [Fox1 (Allen et al. 2007; Fei et al. 2009), Atx1, Fbp1, Fld, Fea1], while the copper response regulator CRR1 has been shown to mediate responses to copper and zinc (Malasarn et al. 2013; Sommer et al. 2010) and to anaerobiosis [HydA1 (Lambertz et al. 2010; Pape et al. 2012) and Fdx5]. Other nutrient uptake regulatory genes include SAC1 for sulphur (Davies et al. 1996; Moseley et al. 2009) and PSR1 for phosphate (Moseley et al. 2009; Wykoff et al. 1999). Although this represents a beginning, it pales in comparison with the extensive analyses of animal genomes, and when set against the ~15,000 genes of Chlamydomonas, it is unlikely that this subset will provide an adequate basis for modelling promoter function in algae in general.
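To make the idea of a promoter element concrete, the sketch below scans a DNA sequence for the canonical E-box consensus CANNTG (N being any base). This is an assumption made for illustration: the text describes an "E-box-like" element, whose exact sequence is not given here and may differ from the consensus. The function name `find_eboxes` and the promoter fragment are hypothetical and are not taken from any algal gene.

```python
import re

# Canonical E-box consensus CANNTG (N = any base). The "E-box-like" element
# mentioned in the text may deviate from this, so the pattern is an assumption.
# The lookahead lets overlapping sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based position, hexamer) for every CANNTG site in seq.

    CANNTG is its own reverse-complement pattern, so scanning one strand
    finds the sites on both strands.
    """
    seq = seq.upper()
    return [(match.start(), match.group(1)) for match in EBOX.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment, invented for illustration only.
    promoter = "ttgaCACGTGaccgtCAGCTGttaa"
    for position, site in find_eboxes(promoter):
        print(position, site)
```

A sequence match of this kind only nominates candidate sites; as the surrounding text stresses, binding and regulatory function still have to be confirmed experimentally.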

Some analysis has started in species other than Chlamydomonas, including Dunaliella (Jia et al. 2012; Lu et al. 2011; Park et al. 2013), and some crossover is expected from plant gene analysis, especially in Arabidopsis. A start has also been made in understanding the role of mRNA regulation (Schulze et al. 2010; Wobbe et al. 2009) and chromatin remodelling in Chlamydomonas (Strenkert et al. 2011, 2013). While miRNA regulation has been demonstrated (Molnar et al. 2007; Yamasaki et al. 2013), little detail is available, nor is epigenetics well understood. In summary, the detail and breadth of examples typical of the regulation of mammalian promoters and their resultant mRNAs is sorely lacking for algal genomes. Consequently, close study of a set of promoter control mechanisms as models is badly needed and will greatly advance the level of understanding in this area, enabling much more sophisticated photosynthetic engineering, including the discovery of useful inducible/repressible promoters, and the ability to manipulate metabolic pathways and cellular strategies which are normally tightly regulated by photosynthesis.

Lipid and starch accumulation, photoprotection and cellular replication, for example, are all cellular functions which are desirable to control for biotechnology applications. Abundant proteins including rubisco and LHC proteins represent substantial cellular resources. Some LHC adaptive functions are important to retain or enhance; others are potentially dispensable under bioreactor conditions or can even reduce biomass growth if allowed to operate naturally. Resource allocation within a cell is complex (Pahlow and Oschlies 2013) and only partly within our control, as over- or under-production of specific metabolites can be detrimental to the fitness of the organism and feedback regulation in algae is incompletely understood. Opportunities therefore exist to engineer excretion of the end product (e.g. H2 produced from water via the photosynthetic machinery; volatile metabolic intermediates (Melis 2013); specific secretion mechanisms for proteins and lipids).

The study of gene regulation has traditionally proceeded through intensive analysis of specific cases. As a broad understanding has evolved of the kinds of mechanisms that are present in biological systems, the emphasis has shifted to high-throughput analyses, starting with microarrays and mass mutant libraries, and it is to be expected that this will quickly generate large amounts of data once algal genomics matures. Nonetheless, there are very few case studies of algal genetic mechanisms, and the study of particular cases will still be vital to anchor, interpret and calibrate the results of mass data acquisition.