ASSIGNMENT OF ANNOTATION FROM ARABIDOPSIS THALIANA

To extend the functional terms associated with C. reinhardtii genes, additional terms were inferred by homology to the annotation set of the plant Arabidopsis thaliana (thale cress). Orthologous proteins were identified by sequence similarity, and the results were then filtered to retain only mutual best hits between the two sets of protein sequences. The annotation of the corresponding A. thaliana protein was used to supplement GO terms and was similarly expanded to include term ancestry. The A. thaliana annotations of the MapMan Ontology [33] and the MetaCyc Pathway database [2] were also used to provide more complete annotation coverage of the C. reinhardtii genome.
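
The mutual-best-hit filter and the ancestry expansion described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that similarity search results are available as (query, subject, score) tuples in which a higher score means a better hit, and that the ontology is given as a simple term-to-parents mapping. All function names, protein identifiers, and term identifiers below are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples; a higher
    score is assumed to mean a better hit. Ties keep the first hit seen.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(forward_hits, reverse_hits):
    """Keep only pairs that are each other's best hit in both directions."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}


def expand_with_ancestors(terms, parents):
    """Return the given terms plus all of their ancestors.

    `parents` maps a term to its direct parent terms (assumed structure).
    """
    expanded = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in expanded:
            expanded.add(term)
            stack.extend(parents.get(term, ()))
    return expanded


def transfer_annotations(orthologs, annotations, parents):
    """Assign each query the ancestry-expanded terms of its ortholog."""
    return {
        query: expand_with_ancestors(annotations[subject], parents)
        for query, subject in orthologs.items()
        if subject in annotations
    }


# Toy data with hypothetical identifiers ("cre*" for the alga, "ath*" for the plant).
forward = [("cre1", "ath1", 200.0), ("cre1", "ath2", 150.0),
           ("cre2", "ath2", 90.0), ("cre3", "ath1", 80.0)]
reverse = [("ath1", "cre1", 210.0), ("ath2", "cre1", 140.0),
           ("ath2", "cre2", 60.0)]
annotations = {"ath1": {"T:child"}}
parents = {"T:child": ["T:parent"], "T:parent": ["T:root"]}

orthologs = mutual_best_hits(forward, reverse)
inferred = transfer_annotations(orthologs, annotations, parents)
```

In this toy data only the pair cre1/ath1 survives the filter: cre2 and cre3 each have a best hit in the other set, but that hit's own best match is cre1. Requiring reciprocity in this way trades coverage for fewer spurious transfers between paralogs.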

TABLE 2: Number of gene identifiers associated with annotation databases

Identifier Type   Total Gene IDs   KEGG   Reactome   Panther   Gene Ontology   MapMan   KOG    Pfam   InterPro
JGI v3.0          14598            5348   2740       1147      6563            5214     9139   7166   7532
JGI v4.0          16706            4232   1949       1085      7568            3171     9973   7305   8151
Augustus v5.0     16888            4686   2983       1673      4334            3160     5123   8202   5202
Augustus u10.2    17302            4583   3326       1913      6956            3892     8977   8691   7464

Number of Chlamydomonas reinhardtii identifiers with at least one functional annotation for each primary database, shown per identifier type.