Evolution of Molecular Marker Types

DNA molecular markers started with restriction fragment length polymorphism (RFLP), which refers to differences in restriction sites between two or more DNA samples. After a DNA sample is digested into pieces by restriction enzymes, the resulting restriction fragments are separated according to their lengths and detected by hybridization with labeled nucleotide probes. Although now largely outdated, RFLP was the first-generation DNA profiling technique used for genetic diversity analysis and linkage map construction in switchgrass (Hultquist et al. 1996; Missaoui et al. 2005b, 2006).
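The principle behind RFLP can be illustrated with a small in silico digest. The sketch below is only an illustration: the two allele sequences are made up, the function name `digest` is hypothetical, and EcoRI (which cuts G^AATTC) is chosen simply as a familiar enzyme. It shows how a single base change that destroys a restriction site alters the set of fragment lengths.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the position inside the recognition site where the
    strand is cut (1 for EcoRI, which cuts between G and AATTC).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


ECORI_SITE = "GAATTC"

# Hypothetical allele A carries two EcoRI sites.
allele_a = "ATGC" * 3 + ECORI_SITE + "TTAACCGG" * 2 + ECORI_SITE + "CGCGAT"
# Hypothetical allele B differs by one base (A -> G) inside the second site.
allele_b = allele_a[:36] + "G" + allele_a[37:]

fragments_a = digest(allele_a, ECORI_SITE, cut_offset=1)
fragments_b = digest(allele_b, ECORI_SITE, cut_offset=1)

print(fragments_a)  # [13, 22, 11]
print(fragments_b)  # [13, 33]
```

In a real RFLP assay, only the fragments that hybridize to the labeled probe would be visible, so the polymorphism appears as a shift in band size rather than as a full list of lengths.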

The next widely used first-generation marker system is random amplified polymorphic DNA (RAPD). RAPD does not require any prior DNA sequence information from the target organism, so it is cheaper to develop than RFLP. It has been used for switchgrass genetic diversity and evolution studies (Gunter et al. 1996; Casler et al. 2007). One major disadvantage of RAPD is its low reproducibility and instability, which result from slippage and the low specificity of random primer binding.

Simple sequence repeat (SSR) markers, also known as microsatellites, are repeats of short nucleotide sequences, usually with six or fewer bases per core repeat. SSRs are highly variable in the number of repeats at a specific locus and are distributed throughout eukaryotic genomes. In addition, SSR markers are amplified using the polymerase chain reaction (PCR), which involves fewer experimental steps, a lower cost, and a smaller amount of DNA template than RFLP, thus allowing rapid generation of data from a relatively small amount of plant tissue. They have been widely used as second-generation DNA markers in linkage map construction, quantitative trait locus (QTL) mapping, gene cloning, germplasm diversity studies, cultivar identification, and marker-assisted selection.
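As a rough illustration of the definition above, a sequence can be scanned for tandem repeats of a core motif of one to six bases. This is a minimal sketch, not a substitute for dedicated SSR-mining tools: the function name `find_ssrs`, the threshold of four repeats, and the example sequence are all assumptions made for the illustration.

```python
import re


def find_ssrs(sequence: str, max_motif_len: int = 6, min_repeats: int = 4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The lazy quantifier makes the regex try the shortest core motif first,
    so "GAGAGAGA" is reported as (GA)4 rather than (GAGA)2.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Made-up sequence containing a (GA)6 and a (CAT)5 microsatellite.
example = "ACGT" + "GA" * 6 + "TTC" + "CAT" * 5 + "GG"

ssrs = find_ssrs(example)
print(ssrs)  # [(4, 'GA', 6), (19, 'CAT', 5)]
```

Two alleles that differ only in the repeat count at such a locus yield PCR products of different lengths, which is the polymorphism that SSR genotyping scores.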

SSR markers are available for switchgrass. Tobias and colleagues reported the primer sequences of 32 effective SSR markers developed from a switchgrass expressed sequence tag (EST) project (Tobias et al. 2005, 2006). Later, Tobias et al. (2008) developed an additional 830 EST-derived SSR markers. Not long after, 185 and 1,030 genomic SSRs were developed by two research groups, respectively, by sequencing SSR-enriched genomic libraries (Okada et al. 2010; Wang et al. 2011). Recently, 538 effective EST-SSRs were reported by our group (Liu et al. 2013b). Our experiments clearly showed that genomic SSRs are more polymorphic than EST-SSRs in switchgrass (Wang et al. 2011; Liu et al. 2012).

Single nucleotide polymorphisms (SNPs) are single-nucleotide variations in sequence and represent the most abundant type of genetic polymorphism in plant genomes (Kwok 2001). While the majority of SNPs have no biological consequence, a fraction of the substitutions have functional significance and are the basis for plant diversity. Compared to SSRs, SNPs are, to some extent, more amenable to high-throughput automated genotyping assays that allow samples to be genotyped faster and more economically (Rafalski 2002; Ha and Boerma 2008; Han et al. 2011).

SNP discovery can be divided into two approaches: global and regional. Global SNP discovery is generally time- and labor-consuming. It is limited by the amount of funding available and by the need for a whole genome sequence to serve as the reference against which all other sequencing data can be compared. In contrast, regional SNP discovery is relatively inexpensive and relies mostly on direct DNA sequencing. SNP detection technologies have evolved from expensive, time-consuming, and labor-intensive processes to some of the most highly automated, efficient, and relatively inexpensive methods of DNA marker detection (Kwok and Chen 2003; Han et al. 2011).

Two complete switchgrass chloroplast (cp) genomes were sequenced from upland (‘Summer’) and lowland (‘Kanlow’) ecotypes, and a total of 116 SNPs were identified (Young et al. 2011). Because these markers were developed from the cp genome, their application in breeding is limited by the maternal inheritance of the cp genome. Recently, Ersoz et al. (2012) established EST libraries of leaf tissues from thirteen diverse switchgrass cultivars, which represented upland and lowland ecotypes as well as tetraploid and octoploid genomes. These libraries were sequenced on ABI 3730 instruments, producing 100,000 EST sequences. Subsequently, they generated reduced-representation genomic libraries from the same samples, which were massively sequenced as short reads (35 bp) on a first-generation Illumina Genome Analyzer. Using the ESTs as a reference framework, these short reads were assembled, and over 149,000 SNPs were identified. In addition, by combining these data with the 500,000 ESTs previously published by Tobias et al. (2005, 2008), 25,000 additional SNPs were identified from the entire EST collection (Ersoz et al. 2012).
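The comparison step at the heart of sequence-based SNP discovery can be sketched in a few lines. The example below assumes that the sample sequence has already been aligned to the reference without gaps; the function name `find_snps` and both sequences are hypothetical. Real pipelines must additionally handle alignment, sequencing errors, read depth, and, in a polyploid such as switchgrass, variation between homoeologous copies.

```python
def find_snps(reference: str, sample: str):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must be pre-aligned and of equal
    length; sites with ambiguous bases (anything other than A, C, G, T)
    are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for index, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base == sample_base:
            continue
        if ref_base in "ACGT" and sample_base in "ACGT":
            snps.append((index + 1, ref_base, sample_base))
    return snps


# Made-up aligned sequences differing at two sites.
reference_seq = "ATGCCGTAAGCT"
sample_seq = "ATGCTGTAAGCA"

snps = find_snps(reference_seq, sample_seq)
print(snps)  # [(5, 'C', 'T'), (12, 'T', 'A')]
```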

Next-generation sequencing (NGS) technologies are making a substantial impact on crop breeding. Genotyping by sequencing (GBS) is an emerging technology built on NGS platforms. It uses the abundant SNPs generated by sequencing for genetic research. Compared with other marker systems, GBS reduces sample handling time, uses fewer PCR reactions, and lowers costs when genotyping at high resolution. In addition, restriction enzymes can be used to reduce genome complexity and avoid the repetitive sequences of the genome, which is essential for extending the approach to the switchgrass genome. In maize and barley, GBS proved effective, and over 200,000 sequence tags were mapped (Elshire et al. 2011). Its use in switchgrass is ongoing (http://www.maizegenetics.net/snp-discovery-in-switchgrass), and it is expected to play a substantial role in genotyping, although this will depend to a large degree on funding availability.