Current Annotations

Annotation Details and Downloads

The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
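As a minimal sketch of working with these files, the following Python reads a gzip-compressed annotation file. It assumes the tab-delimited GO annotation file (GAF) layout, in which lines beginning with ! are header comments. The function name read_gaf_gz and the file name in the usage note are illustrative and not part of any GO tooling.

```python
import gzip


def read_gaf_gz(source):
    """Yield each annotation line of a gzip-compressed GAF file as a list of columns.

    source may be a file path or a binary file object (gzip.open accepts both).
    """
    with gzip.open(source, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Header and comment lines start with "!"; blank lines carry no data.
            if not line or line.startswith("!"):
                continue
            yield line.split("\t")
```

For example, iterating over read_gaf_gz("annotations.gaf.gz") and printing cols[1], cols[4], and cols[6] would list the object ID, GO ID, and evidence code of each annotation; the file name is a placeholder.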

Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.

These files can also be downloaded via FTP. We recommend FTP for the larger files, such as the UniProt dataset, because the web-based download may not work correctly for them.

Filtered Files

These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that particular taxon IDs can only be included within the association files provided by specific projects; please see the list of the authoritative groups for the major model organisms.
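The taxon restriction can be pictured as a simple lookup. The sketch below is not the QC checks script itself. It assumes that GAF column 13 holds the taxon (for example taxon:10090), and the AUTHORITY table is a small illustrative excerpt rather than the authoritative list; the function name keep_annotation is hypothetical.

```python
# Illustrative excerpt only: taxon ID -> project assumed to be authoritative for it.
AUTHORITY = {
    "taxon:10090": "MGI",      # Mus musculus
    "taxon:7227": "FlyBase",   # Drosophila melanogaster
}


def keep_annotation(columns, providing_project):
    """Return True if this row may stay in a file supplied by providing_project."""
    # Column 13 (index 12) can hold two IDs joined by "|"; the first is the
    # taxon of the annotated gene product.
    taxon = columns[12].split("|")[0]
    owner = AUTHORITY.get(taxon)
    # Taxa without a listed authority are unrestricted in this sketch.
    return owner is None or owner == providing_project
```

Under these assumptions, a mouse annotation would be kept in a file supplied by MGI and dropped from a file supplied by any other project.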

Statistics as of March 19, 2019

Filtered Annotation File Downloads
Species | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY)
Agrobacterium tumefaciens str. C58 | PAMGO | 79 | 183 | 183 | 3/19/2019
Arabidopsis thaliana | TAIR | 32272 | 236757 | 208145 | 3/19/2019
Aspergillus nidulans | AspGD | 156034 | 622927 | 87098 | 3/19/2019
Comprehensive Microbial Resource [multispecies] | JCVI | 58454 | 144550 | 144550 | 3/19/2019
Bos taurus | GO Annotations @ EBI | 18569 | 119105 | 21528 | 3/19/2019
Caenorhabditis elegans | WormBase | 14224 | 118133 | 53023 | 3/19/2019
Candida albicans | CGD | 56475 | 311367 | 43318 | 3/19/2019
Canis familiaris | GO Annotations @ EBI | 16758 | 116849 | 5725 | 3/19/2019
Danio rerio | ZFIN | 24266 | 222349 | 98004 | 3/19/2019
Dickeya dadantii | PAMGO | 125 | 304 | 304 | 3/19/2019
Dictyostelium discoideum | dictyBase | 9292 | 74940 | 34967 | 3/19/2019
Drosophila melanogaster | FlyBase | 14420 | 113740 | 104795 | 3/19/2019
Escherichia coli | EcoCyc & EcoliHub | 5246 | 25938 | 25938 | 3/19/2019
Gallus gallus | GO Annotations @ EBI | 16561 | 136688 | 57613 | 3/19/2019
Gene Ontology Normal Usage Tracking System [multispecies] | GONUTS | 193 | 282 | 282 | 10/14/2016
Homo sapiens | GO Annotations @ EBI | 19782 | 476348 | 398441 | 3/19/2019
Leishmania major | Sanger GeneDB | 2809 | 6289 | 6289 | 3/19/2019
Magnaporthe grisea | PAMGO | 11062 | 26627 | 26627 | 3/19/2019
Mus musculus | MGI | 24844 | 406676 | 331824 | 3/19/2019
Oomycetes | PAMGO | 29 | 125 | 125 | 3/19/2019
Oryza sativa | Gramene | 40512 | 47260 | 47260 | 3/19/2019
Protein Data Bank [multispecies] | GO Annotations @ EBI | 197003 | 1898136 | 211568 | 1/23/2018
Plasmodium falciparum | Sanger GeneDB | 2337 | 6162 | 6162 | 3/19/2019
Pseudomonas aeruginosa PAO1 | PseudoCAP | 1535 | 3701 | 3687 | 3/19/2019
Rattus norvegicus | RGD | 21005 | 443747 | 316897 | 3/19/2019
Reactome [multispecies] | CSHL & EBI | 11597 | 104059 | 104059 | 3/19/2019
Saccharomyces cerevisiae | SGD | 6442 | 119100 | 68408 | 3/19/2019
Schizosaccharomyces pombe | PomBase | 5402 | 54637 | 51297 | 3/19/2019
Solanaceae | SGN | 799 | 1349 | 1349 | 3/19/2019
Sus scrofa | GO Annotations @ EBI | 19661 | 140986 | 7297 | 3/19/2019
Trypanosoma brucei | Sanger GeneDB | 6390 | 19134 | 19134 | 3/19/2019

Unfiltered Files

These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out. The unfiltered files may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.

Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
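One way to guard against such duplicates is to key each annotation on the fields that identify it and keep only the first occurrence. The sketch below assumes the GAF column order and treats two lines as duplicates when their database, object ID, qualifier, GO ID, reference, evidence code, and with/from columns all match. That definition of a duplicate is an assumption made for illustration, not the consortium's rule, and the name merge_unique is hypothetical.

```python
# Zero-based GAF columns used as the identity of an annotation (an assumption):
# DB, DB object ID, qualifier, GO ID, reference, evidence code, with/from.
KEY_COLUMNS = (0, 1, 3, 4, 5, 6, 7)


def merge_unique(*sources):
    """Yield annotation lines from several iterables of lines, dropping repeats."""
    seen = set()
    for source in sources:
        for line in source:
            line = line.rstrip("\r\n")
            if not line or line.startswith("!"):
                continue  # skip header comments and blank lines
            columns = line.split("\t")
            key = tuple(columns[i] for i in KEY_COLUMNS)
            if key not in seen:
                seen.add(key)
                yield line
```

Passing the lines of the filtered file first means that, where an annotation appears in both sets, the filtered copy is the one that is kept.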

Statistics as of March 19, 2019

Unfiltered Annotation File Downloads
Species | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY)
Protein Data Bank [multispecies] | GO Annotations @ EBI | 344485 | 6501220 | 2749109 | 7/17/2018
Reactome [multispecies] | CSHL & EBI | 11597 | 104479 | 104479 | 12/17/2018
UniProt [multispecies] | GO Annotations @ EBI | 0 | 0 | 0 | 12/31/1969

In the tables above, gene association counts are given for all evidence codes and, separately, for all evidence codes except IEA (Inferred from Electronic Annotation). The IEA code means that there has been no human involvement in assigning the association; see the GO evidence code documentation for more details.
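To make the two counts concrete, here is a minimal sketch that tallies annotations from GAF lines. It assumes that the evidence code sits in column 7 and that a gene product is identified by columns 1 and 2 together. How the statistics in the tables were actually computed is not stated here, so the figures this produces may differ from those shown; the name summarize is hypothetical.

```python
def summarize(lines):
    """Count gene products, all annotations, and non-IEA annotations in GAF lines."""
    products = set()
    total = 0
    non_iea = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue  # skip header comments and blank lines
        columns = line.split("\t")
        products.add((columns[0], columns[1]))  # DB plus DB object ID
        total += 1
        if columns[6] != "IEA":  # evidence code column
            non_iea += 1
    return {"gene_products": len(products), "annotations": total, "non_iea": non_iea}
```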

gp2protein files

The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
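A minimal sketch of loading such a mapping follows. It assumes a two-column, tab-delimited layout in which the first column is the database object ID, the second holds one or more protein identifiers separated by semicolons, and lines starting with ! are comments. Check the layout of the actual files before relying on it. The function name load_gp2protein is hypothetical.

```python
def load_gp2protein(lines):
    """Build a dict mapping each database object ID to a list of protein IDs."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip comments and blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # object with no protein listed
        object_id, proteins = parts[0], parts[1]
        # Assumed convention: several accessions are separated by semicolons.
        mapping[object_id] = [p for p in proteins.split(";") if p]
    return mapping
```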
