Simple Search: This search form allows you to enter basic information (locus name, Gene Model ID, Transcript ID, Translation ID, Gene symbol, Gene name), including partial names, to search for a gene and/or gene model.
Use the wildcards '%' or '*' to find matches that contain your search term. '^' at the beginning of search term will find matches that start with that term. '$' at the end of search term will find matches that end with that term.
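
As a rough illustration of the wildcard rules above, the following Python sketch translates a search term into a regular expression. It is not the site's actual implementation: the function name search_term_to_regex is hypothetical, and both the case-insensitive matching and the "match anywhere when no anchor is given" behavior are assumptions.

```python
import re

def search_term_to_regex(term):
    """Translate a search term into a compiled regex (hypothetical helper)."""
    anchored_start = term.startswith("^")
    anchored_end = term.endswith("$")
    core = term[1:] if anchored_start else term
    core = core[:-1] if anchored_end else core

    # '%' and '*' both stand for "any run of characters"; all other
    # characters are escaped so they match literally.
    parts = re.split(r"[%*]", core)
    body = ".*".join(re.escape(part) for part in parts)

    pattern = ("^" if anchored_start else "") + body + ("$" if anchored_end else "")
    # Case-insensitive matching is an assumption, not documented behavior.
    return re.compile(pattern, re.IGNORECASE)

names = ["lg1", "lg2", "liguleless1", "Zm00001d002005"]
starts_with_lg = [n for n in names if search_term_to_regex("^lg").search(n)]
print(starts_with_lg)
```

With the sample names, "^lg" keeps only lg1 and lg2, while "1$" keeps lg1 and liguleless1.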

Submit (see a sample gene model query or locus query)

    (upper limit on results is 2,000 records)
Display records per page.

More Examples: lg1, liguleless1, Zm00001d002005, GRMZM2G036297, DAA35605, Zm00001d002005_T001

Advanced Search

Check the boxes next to the fields you want to search. To find records that have any value for a field, check its box and leave the criteria unchanged.

Show only genes:
from :
of :
on :

Search for Gene Models by Sequence

Enter sequence, GenBank IDs or gene model names (Zmdddddadddddd, ZEAMMB73_xxxx or GRMZMxxxxxx): Sample
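
The three gene model name formats listed above can be told apart with simple patterns. The sketch below is only an illustration: it assumes that "d" stands for a digit and "a" for a lowercase letter in Zmdddddadddddd (consistent with the example Zm00001d002005), and that the "x" placeholders are alphanumeric characters. The function name gene_model_scheme is hypothetical.

```python
import re

# Patterns are assumptions derived from the placeholder formats in the text.
ID_SCHEMES = {
    "Zm": re.compile(r"^Zm\d{5}[a-z]\d{6}$"),
    "ZEAMMB73": re.compile(r"^ZEAMMB73_[A-Za-z0-9]+$"),
    "GRMZM": re.compile(r"^GRMZM[A-Za-z0-9]+$"),
}

def gene_model_scheme(identifier):
    """Return the naming scheme of a gene model name, or None if none matches."""
    ident = identifier.strip()
    for name, pattern in ID_SCHEMES.items():
        if pattern.match(ident):
            return name
    return None

for example in ["Zm00001d002005", "GRMZM2G036297", "DAA35605"]:
    print(example, gene_model_scheme(example))
```

Anything that matches none of the patterns (a raw sequence, a GenBank ID, or a transcript ID with a suffix) returns None and would need separate handling.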
Amino acid

Translate Gene Model IDs - download

   Enter list - 8,000 gene model limit: (Example list)
Translate to:


Alternatively, download the full gene model associations list between v3 and all other assemblies in our database

Download By Region and Gene Model Set

Gene model set: Chromosome:
Model type: Data type:
Start position: End position:
(enter positions without commas or spaces, or leave both empty for the entire chromosome)
   Enter two markers to get gene models within the span. Both must be on the same chromosome. Only applicable for assemblies with aligned markers.
Enter a list of Gene Models, Transcripts, and/or Proteins to retrieve their positions on a given assembly in a tab-delimited format.
Output type:       Submit
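
For readers who prefer to filter a downloaded GFF file locally, the region rules above can be sketched in Python as follows. This is a minimal illustration, not the site's code: the function names are hypothetical, the sample records are invented placeholders, and treating "within the span" as "fully contained in the span" is an assumption.

```python
def parse_position(text):
    """Return an int for a digits-only string, or None for an empty one."""
    if text == "":
        return None
    if not text.isdecimal():
        raise ValueError(f"positions must contain digits only, got {text!r}")
    return int(text)

def genes_in_region(gff_lines, chromosome, start="", end=""):
    """List (attributes, start, end) for gene features on one chromosome."""
    lo, hi = parse_position(start), parse_position(end)
    if (lo is None) != (hi is None):
        raise ValueError("give both positions or leave both empty")
    hits = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[0] != chromosome or cols[2] != "gene":
            continue
        g_start, g_end = int(cols[3]), int(cols[4])
        # Both positions empty means the entire chromosome.
        if lo is None or (g_start >= lo and g_end <= hi):
            hits.append((cols[8], g_start, g_end))
    return hits

# Invented placeholder records, not real gene models.
sample = [
    "##gff-version 3",
    "chr1\thyp\tgene\t100\t400\t.\t+\t.\tID=geneA",
    "chr1\thyp\tgene\t450\t900\t.\t-\t.\tID=geneB",
]
print(genes_in_region(sample, "chr1", "50", "500"))
```

A position such as "1,000" raises ValueError, mirroring the instruction to enter positions without commas or spaces.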

Download Sequence for Gene Model List

When downloading sequence please specify which type of input you are entering. For genomic please use the gene model name (e.g. GRMZM2G165390). For cDNA, CDS, and mRNA please use the transcript ID (e.g. GRMZM2G165390_T01). For protein please use the translation ID (e.g. GRMZM2G165390_P01). If you enter only the gene model ID, please choose if you want to see all transcripts or only the canonical transcripts.
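
The relationship between ID suffix and input type described above can be sketched as a small Python helper. The names input_type and OUTPUTS_BY_INPUT are hypothetical, and the suffix patterns (_T or _P followed by digits) are inferred from the examples in the text rather than taken from a formal specification.

```python
import re

# Which output types each input type supports, following the paragraph above.
OUTPUTS_BY_INPUT = {
    "gene model": ["genomic"],
    "transcript": ["cDNA", "CDS", "mRNA"],
    "translation": ["protein"],
}

def input_type(identifier):
    """Guess the input type of an ID from its suffix (assumed convention)."""
    if re.search(r"_T\d+$", identifier):
        return "transcript"
    if re.search(r"_P\d+$", identifier):
        return "translation"
    return "gene model"

for example in ["GRMZM2G165390", "GRMZM2G165390_T01", "GRMZM2G165390_P01"]:
    kind = input_type(example)
    print(example, kind, OUTPUTS_BY_INPUT[kind])
```

A check like this can catch a mixed list before submission, for example transcript IDs pasted into a request for genomic sequence.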

   Enter list - 8,000 gene model limit: (Example list)
Input type:
Output type:


Gene Model Downloads

The current gene model set for the representative maize genome, B73 v5, is Zm00001e.1.

Zm00001e.1 gene model GFF
Zm00001e.1 gene model cDNA fasta
Zm00001e.1 gene model CDS fasta
Zm00001e.1 gene model genomic fasta
Zm00001e.1 gene model protein fasta

Gene model cross-references across maize genome assemblies

Please note that there is not a 1-to-1 correspondence between all gene models in all annotations. Some gene models are unique to specific genome assemblies, some have been split or merged between annotation or assembly versions, and some do not appear in the same syntenic locations.
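
To make the "not 1-to-1" caveat concrete, the sketch below labels each gene model in one annotation by how it relates to another annotation. It assumes the cross-reference has already been loaded as (old ID, new ID) pairs, with None where no counterpart exists; the actual layout of the downloadable files may differ. The function name and the IDs are hypothetical.

```python
def classify_pairs(pairs):
    """Label each old ID as one-to-one, split, merged, or no counterpart."""
    old_to_new = {}
    new_to_old = {}
    for old, new in pairs:
        partners = old_to_new.setdefault(old, set())
        if new is not None:
            partners.add(new)
            new_to_old.setdefault(new, set()).add(old)

    labels = {}
    for old, news in old_to_new.items():
        if not news:
            labels[old] = "no counterpart"
        elif len(news) > 1:
            # One old model maps to several new ones. Mixed split/merge
            # cases are also reported as "split" in this simple sketch.
            labels[old] = "split"
        elif len(new_to_old[next(iter(news))]) > 1:
            labels[old] = "merged"
        else:
            labels[old] = "one-to-one"
    return labels

# Hypothetical IDs for illustration only.
pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "b3"),
         ("a3", "b4"), ("a4", "b4"), ("a5", None)]
print(classify_pairs(pairs))
```

Models that do not appear in the same syntenic location cannot be detected from ID pairs alone and are outside the scope of this sketch.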

Complete B73 v4 (Zm00001d.2) gene model cross reference
Complete B73 v3 (5b+) gene model cross reference
Complete B73 v2 (5b) gene model cross reference
Complete B73 v1 (4b) gene model cross reference

Older gene model downloads

Gene model set Zm00001d.2 corresponds to Gramene release 36.
Zm00001d.2 gene model cDNA fasta
Zm00001d.2 gene model ncRNA fasta
Zm00001d.2 gene model translations fasta
Zm00001d.2 gene model GFF3
Gene model set Zm00001d.1 corresponds to Gramene release 32.
Zm00001d.1 gene model cDNA fasta
Zm00001d.1 gene model ncRNA fasta
Zm00001d.1 gene model translations fasta
Zm00001d.1 gene model GFF3
Gene model set 5b+ for B73 RefGen v3 corresponds to Gramene release 21.
5b+ gene model cDNA fasta
5b+ gene model ncRNA fasta
5b+ gene model translations fasta
5b+ gene model GFF3
B73 RefGen_v3 MAKER-P gene models

Gene model set Zm00001d.provisional holds low confidence gene models that were not included in the Zm00001d.2 annotation.
Zm00001d.provisional (low confidence) gene model GFFs
Zm00001d.provisional (low confidence) gene model transcripts
Zm00001d.provisional (low confidence) gene model proteins
Cross reference for 5b+ GRMZM and ZEAMMB73 IDs
5b.60: Filtered Gene Set for B73_RefGen_v2
5a.59: Working Gene Set for B73_RefGen_v2
4a.53: Filtered Gene Set for B73_RefGen_v1
4a.53: Working Gene Set for B73_RefGen_v1

B73 Reference Genome Assembly and Gene Model Issues

We need your help! Please report any assembly or gene model structure problems. This includes misassembled regions, evidence for closing gaps, gene models that should be merged or split, evidence supporting low-confidence gene models, et cetera. All issues will be shared with the maize community and with the team charged with improving the B73 assembly and gene models.

All open gene model issues
All resolved gene model issues

All open assembly issues
All resolved assembly issues

About The Current Gene Model Set

The current gene model set (i.e. structural assembly annotation) is Zm00001e.1.

See the 2016 Whole-Genome Assembly and Annotation nomenclature document for an explanation of the assembly and annotation identifiers. This nomenclature was first adopted for the Zm-B73-REFERENCE-GRAMENE-4.0 / Zm00001d assembly and structural annotation, and is used for subsequent assemblies and annotations of B73 and other accessions.

The Zm00001e.1 gene model set for Zm-B73-REFERENCE-NAM-5.0 is the current recommended set. Other gene model sets are provided for comparison.

Gene model sets and assemblies:
Set          Assembly                       Gramene/Ensembl Plants version
Zm00001e.1   Zm-B73-REFERENCE-NAM-5.0
Zm00001d.2   Zm-B73-REFERENCE-GRAMENE-4.0   36/54
Zm00001d.1   Zm-B73-REFERENCE-GRAMENE-4.0   32/50
5b+          B73 RefGen_v3                  18/36 - 31/49
5b           B73 RefGen_v2                  7/25 - 17/37
4a           B73 RefGen_v1

Reference gene model releases
Gene models for the B73 genome assembly are provided at both MaizeGDB and Gramene. Nomenclature guidelines for gene models, as agreed to by the maize research community, indicate that gene model sets are named with the associated assembly identifier. For the B73 reference genome, this is Zm00001d. Gramene, which manages these gene models, uses a different versioning system.

Bold font indicates the current official gene model set.

Version      Gramene           Date                Changes
             v38/56 - v43/61   12/7/17 - 3/15/19   Changes limited to gene models outside the reference set (ENSRNA, ncRNA, and inferred organelle gene models)
             v37/55            09/21/17            3722 new non-coding gene models, using non-standard prefix, "ENSRNA"; 2318 ncRNA gene models have changed transcripts
Zm00001d.2   v36/54            06/07/17            transcripts changed for 28 miRNAs
             v35/53            04/02/17            published in Nature; transcripts changed for 547 gene models
             v34/52            12/14/16            transcripts changed for 28 miRNAs
             v33/51                                174 Mt and Pt gene models added, transcripts changed for 3127 gene models
Zm00001d.1   v32/50            09/28/16            initial release

Gene Model Functional Annotations and Orthologs

Functional Annotations (B73 RefGen_v2 only)
Phytozome: Functional Annotations (B73 RefGen_v3 and v4; log-in required)
Freeling Lab: Syntenic Orthologs (mapped to RefGen_v2)

Gene Models with Associated Genes (B73 RefGen_v3)

Classical Genes:   table   tab delimited
MaizeGDB curated genes:   table   tab delimited
All associated genes:   table   tab delimited

Gene Models with UniformMu insertions

About the UniformMu project

Genomic coordinates for Zm-B73-REFERENCE-GRAMENE-4.0 (aka B73 RefGen_v4):
Release 9 Excel spreadsheet
Release 9 Excel spreadsheet with gene structure

List of gene models from the B73 RefGen_v3 Filtered Gene Set that have UniformMu insertions:
Release 8 Excel spreadsheet

List of gene models from the B73 RefGen_v2 Filtered Gene Set that have UniformMu insertions including 100 bp upstream or downstream:
Release 7 Excel spreadsheet
Release 8 Excel spreadsheet

List of gene models from the B73 RefGen_v2 Filtered Gene Set that have UniformMu insertions in exons:
Release 7 Excel spreadsheet
Release 8 Excel spreadsheet

Zm-B73-REFERENCE-NAM-5.0/Zm00001e.1 Information

In-depth metadata for Zm-B73-REFERENCE-NAM-5.0 is available here.
See the paper for B73 RefGen_v1 here, and for Zm-B73-REFERENCE-GRAMENE-4.0 here.

Counts for each chromosome.
Chromosome Accession Protein Coding miRNA Transposable Element Low Confidence
Chromosome 1 NC_024459.3 5905 14 2209
Chromosome 2 NC_024460.3 4737 22 2209
Chromosome 3 NC_024461.3 4737 16 1571
Chromosome 4 NC_024462.3 4115 20 1826
Chromosome 5 NC_024463.3 4480 24 1681
Chromosome 6 NC_024464.3 3290 11 1223
Chromosome 7 NC_024465.3 3108 10 1193
Chromosome 8 NC_024466.3 3561 13 1288
Chromosome 9 NC_024467.3 2973 7 1191
Chromosome 10 NC_024468.3 2684 17 1034
Unmapped 319 0 357
Nuclear Total 39,324 154 15,516
Annotations: Zm00001e.1

Zm-B73-REFERENCE-GRAMENE-4.0/Zm00001d Stats

Gene Feature Value
Average protein-coding transcript size 7638 bp
Average low confidence transcript size 6981 bp
Average transposable element size unavailable
Average Exon size 156 bp
Average Number of exons per gene 4 exons
Maximum exons per gene 81 exons (Zm00001d040166)
Average Intron size 578 bp
Average Coding region size 207 bp

Top 20 genes

These were the most searched-for genes in 2017.

1 lg1*
2 matl1
3 br2
4 vp1
5 o2
6 wx1
7 bz1
8 tb1
9 o2
10 sh2
11 lg1
12 pl1
13 sh1
14 kn1
15 ae1
16 su1
17 fea3
18 p1
19 b1
20 rap2
* lg1 is the example gene name in the gene search page

NCBI B73_v4 annotation release 101

The NCBI B73_v4 annotation release 101 was developed independently at NCBI using the NCBI Eukaryotic Genome Annotation Pipeline on B73 RefGen_v4. The final set of annotated features comprises, in order of preference, pre-existing RefSeq sequences and a subset of well-supported Gnomon-predicted models. It is built by evaluating together at each locus the known RefSeq transcripts, the features projected from curated RefSeq genomic alignments and the models predicted by Gnomon.

The NCBI B73_v3 annotation release 100 is available here.

[Image: gene model frequency for Zm-B73-REFERENCE-GRAMENE-4.0]

Gene Model Terms

Associated Genes: Associated Genes are genes that have been linked to a gene model by hand curation.

Canonical: The canonical transcript is defined as either the longest CDS, if the gene has translated transcripts, or the longest cDNA. Note: a canonical transcript is not always the first transcript (T01) or the longest transcript.

Non-canonical: All other transcripts for a gene model that are not the canonical transcript.
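
The canonical transcript rule above can be expressed as a short selection function. The Python sketch below is an illustration under stated assumptions: the Transcript class and its field names are hypothetical, a CDS length of 0 stands for "not translated", and ties are broken by input order, which the definition does not specify.

```python
from dataclasses import dataclass

@dataclass
class Transcript:
    transcript_id: str
    cdna_length: int
    cds_length: int = 0  # 0 means the transcript is not translated (assumption)

def canonical(transcripts):
    """Pick the longest CDS if any transcript is translated, else the longest cDNA."""
    translated = [t for t in transcripts if t.cds_length > 0]
    if translated:
        return max(translated, key=lambda t: t.cds_length)
    return max(transcripts, key=lambda t: t.cdna_length)

# Hypothetical gene with two translated transcripts.
gene = [
    Transcript("gene_T01", cdna_length=3000, cds_length=900),
    Transcript("gene_T02", cdna_length=2500, cds_length=1200),
]
print(canonical(gene).transcript_id)
```

In this example gene_T02 is selected even though gene_T01 is both listed first and longer overall, which matches the note that the canonical transcript is not always T01 or the longest transcript. An empty list raises ValueError.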

Evidence Type: The source of evidence to support the gene model.

Model Types:

Protein Coding: A gene model with supporting evidence.
miRNA: Small, non-coding RNA.
TE: Transposable elements.
Low Confidence: A gene model with little or no supporting evidence.
WGS: (Versions 5a.59 and earlier) Working Gene Set. This set merges new annotations performed on RefGen_v2 with RefGen_v1 4a gene models mapped onto V2. New annotations were achieved by an evidence-based method (Gramene GeneBuilder) and complemented with de novo Fgenesh models performed on masked DNA.
FGS: (Versions 5b.60 and earlier) Filtered Gene Set. The filtered set was generated by screening the working set to remove pseudogenes, TE-encoded genes, and low-confidence hypothetical models.

Transcript Classes:

WH: With homology to a known non-transposable element in the NR (non-redundant) database at GenBank. Protein-coding gene.
NH: No homology in the NR (non-redundant) database at GenBank. Hypothetical gene or pseudogene.
TE: With homology to a known transposable element (TE) in the NR (non-redundant) database at GenBank. Transposable element.

Discussion of Gene Data