Nonhuman Genetic Terms
- [T]he word mouse … comes originally from the
- Sanskrit mush derived from a verb meaning to
- steal. … Mice and rats, through their voracious
- activities in grain larders and as carriers of disease,
- inflicted considerable losses in food and lives upon
- ancient civilizations.
- H. C. Morse III1(p6)
- A very obvious gap in our understanding of human
- genome evolution lies in the complete absence of any
- mapping data from the eutherian orders most distantly
- related to man, particularly the edentates. We
- would urge anyone with an interest in the genetics of
- the aardvark and the armadillo to consider a unique
- mapping project which will be at the forefront (alphabetically,
- at least) of the comparative mapping
- J. A. Marshall Graves et al2(p964)
Comparative genome analysis has shown that eukaryote species share genes to a great extent.3 Therefore, similar or identical names designate the same gene across species whenever possible. Italicization of gene symbols is uniformly observed.
Animal gene symbols resemble human gene symbols (see 15.6.2, Human Gene Nomenclature, and below).4,5 However, unlike human gene symbols, animal gene symbols typically use or include lowercase letters and punctuation marks. Editors of medical publications may follow author style for animal gene symbols.
Gene terminology for the laboratory mouse (Mus musculus domesticus) and laboratory rat (Rattus norvegicus), often seen in medical publications because of the common use of those species in investigating diseases affecting humans, is prototypic of such style.
Mouse and Rat Gene Nomenclature
Mouse and rat gene nomenclature guidelines were unified in 2003 by the International Committee on Standardized Genetic Nomenclature for Mice and the Rat Genome and Nomenclature Committee.6
Mouse and rat gene symbols resemble human symbols in several respects.6,7 They are descriptive, short (preferably 3 to 5 characters), and italicized. Symbols begin with letters, not numbers. They contain roman letters in place of Greek letters and arabic numerals in place of roman numerals.
Mouse and rat gene symbols differ from human symbols in using lowercase letters. Symbols usually contain an initial capital. Capital letters within a mouse gene symbol may indicate the laboratory code or code for another species/vector (see below). A symbol with all lowercase letters (ie, no initial capital) indicates a recessive trait. Mouse and rat gene symbols may contain hyphens and other punctuation.
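The surface rules above lend themselves to a quick mechanical check. The following is a minimal sketch in Python, not an official validator: the function name `check_mouse_symbol` and the keys it returns are hypothetical, and the only authoritative check remains a lookup in the databases named below.

```python
import re

# Any character in the Unicode "Greek and Coptic" block.
GREEK = re.compile(r"[\u0370-\u03FF]")


def check_mouse_symbol(symbol: str) -> dict:
    """Return simple style observations for a mouse or rat gene symbol.

    These are heuristics drawn from the prose rules only; a symbol that
    passes them is not necessarily an approved symbol.
    """
    return {
        # Symbols begin with letters, not numbers.
        "starts_with_letter": symbol[:1].isalpha(),
        # Roman letters are used in place of Greek letters.
        "has_greek_letter": bool(GREEK.search(symbol)),
        # Symbols are preferably 3 to 5 characters long (advisory only).
        "preferred_length": 3 <= len(symbol) <= 5,
        # No initial capital suggests a symbol named for a recessive trait.
        "lowercase_initial": symbol[:1].islower(),
    }
```

For example, `check_mouse_symbol("nmd")` reports a lowercase initial, whereas `check_mouse_symbol("Nidd1")` does not; both fall within the preferred length.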
The central source for mouse gene terms is the Mouse Genome Database (http://www.informatics.jax.org),8 and for rats, RatMap (http://ratmap.gen.gu.se) and the Rat Genome Database (http://rgd.mcw.edu).9 Gene names and symbols may be verified by means of the search features at those sites.
Style rules and conventions for mouse and rat gene symbols are shown in Tables 8 through 10. (Note: The gene descriptions in the tables that follow are based on but not identical to the approved gene names available in the Mouse Genome Informatics database, which are more complete and do not use Greek letters and other typographic variants. For instance, in searching for a term with α, one would type in “alpha.”) The Mammalian Orthology Query Form (http://www.informatics.jax.org/searches/homology_form.shtml) allows comparative searches of 20 vertebrate species. Note that a given letter or letter combination often but not always signifies a conventional usage. For instance, l at or near the end of a symbol often, but not always, indicates “like.”
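The advice to type a spelled-out name in place of a Greek letter when searching can be sketched as a small text transformation. This is an illustrative Python sketch only: the mapping `GREEK_NAMES` and the function `spell_out_greek` are hypothetical names, the mapping covers just the Greek letters that occur in this section, and the databases themselves may apply different matching rules.

```python
# Spelled-out names for the Greek letters used in this section's examples.
GREEK_NAMES = {
    "\u03b1": "alpha",  # Greek small letter alpha
    "\u03b2": "beta",   # Greek small letter beta
    "\u03bc": "mu",     # Greek small letter mu
    "\u00b5": "mu",     # micro sign, often typed in place of mu
}


def spell_out_greek(term: str) -> str:
    """Replace each known Greek letter in a search term with its name."""
    return "".join(GREEK_NAMES.get(ch, ch) for ch in term)
```

With this sketch, a description such as "hemoglobin β-chain complex" becomes "hemoglobin beta-chain complex" before it is entered as a search term.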
Table 9. Conventions for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)
Mouse Gene Symbol
Mouse Gene Description
Human Gene Symbol (When Available)
breast cancer 1
same as human symbol except for case
caffeine metabolism QTL 1
q: quantitative locus
complement component 4 binding protein, pseudogene 1
DNA segment, Chr 10, Massachusetts Institute of Technology 1
symbol for DNA segment identified only in the mouse; includes laboratory code (see “Laboratory Codes”)
DNA segment, Chr 17, human D21S56
H21 indicates DNA segment resides on human chromosome 21
glucose-6-phosphate dehydrogenase X-linked
similar but not identical to human gene symbol
guanine binding protein, related sequence 1
-rs: related sequence
gene trap locus 10
Gt: gene trap
gene trap ROSA 26, Philippe Soriano
vector in parentheses; laboratory code indicated (see “Laboratory Codes” section)
histocompatibility 2, class II antigen A, α
hemoglobin β-chain complex
same as human symbol except for case
heterochromatin, Chr 9
Harvey rat sarcoma virus oncogene 1
see also 15.6.3, Oncogenes and Tumor Suppressor Genes
IGHMBP2 (formerly nmd)
immunoglobulin heavy chain μ binding protein 2 (formerly neuromuscular degeneration)
name change with new information about gene
lethal, Chr 17, University of Wisconsin 9
initial l: lethal
β1 laminin, subunit 1
hyphen separates 2 adjacent numbers
P lysozyme structural
12S RNA, mitochondrial
mast cell protease-like
Nidd1, Nidd2, Nidd3, Nidd4
non-insulin-dependent diabetes mellitus 1, 2, 3, 4
same stem (root) for gene families
name change (formerly Gtl1-13)
rRNA, chromosome 13 cluster
T-cell receptor β-chain
TRB@ (formerly TCRB; @ signifies gene family; see 15.6.2 Human Gene Nomenclature)
telomeric sequence, Chr 10, centromere end
Tel: telomere; 10: Chr 10; p: short arm
transgene insertion 1, Fred Van Leuven
Tg: transgene; parenthetic material: inserted gene, in this case the human gene APOE; Vln: founder or “laboratory of” designation
Table 8. Style Rules for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)
Mouse Gene Symbol
Mouse Gene Description
Human Gene Symbol (When Known)
lowercase initial capital because named for mutant recessive trait
initial capital, otherwise lowercase, Greek letter changed to roman
Greek letter changed to roman and moved to end of symbol
gene trap, ROSA 26, Philippe Soriano
parentheses may be used
symbol does not begin with number
RN5S1@ (@ signifies gene family; see 15.6.2, Human Gene Nomenclature)
Table 10. Conventions for Mouse Gene Symbols Identified in Collaborative Sequencing Efforts (Examples)a
Mouse Gene Symbol
Mouse Gene Description
Human Gene Symbol (When Available)
RIKEN cDNA 0610005C13 gene
RIKEN symbol assigned to sequence that does not match known genes in other species; Rik: RIKEN Institute, Japan
CDC42 effector protein (Rho GTPase binding) 3; formerly 3200001F04Rik
RIKEN symbol changed when gene identified in another organism
cDNA sequence BC023055
BC indicates sequence from Mammalian Gene Collection of the National Institutes of Health
aldolase 2, B isoform, formerly BC016435
Mammalian Gene Collection symbol changed when gene identified in another organism
cDNA sequence AF179933
Genbank symbol for genes with no other information available in other organisms or sequencing efforts
palmitoyl-protein thioesterase 2, formerly AA672937 and 0610007M19Rik
Genbank sequence ID withdrawn when gene identified in other organism
a See also Database Identifiers for Genomic Sequences in 15.6.1, Nucleic Acids and Amino Acids.
A mouse allele symbol consists of a mouse gene symbol, often with a superscript. As with mouse gene symbols, mouse allele symbols are italicized.
Allele symbols can be verified within the records of a mouse gene:
▪ Search for the gene symbol at http://www.informatics.jax.org/javawi2/servlet/WIFetch?page=markerQF
▪ Click on link for the gene symbol that has been located
▪ Under Phenotypes, click on the numeric link after “all phenotypic alleles”
Conventions and rules for mouse allele symbols are shown in Table 11. In a phenotype expression, a superscript plus sign indicates wild-type, eg:
Table 11. Rules and Conventions for Mouse Allele Terms (Examples)
Convention or Rule Illustrated
recessive trait, thus begins with lowercase; because there is no superscript indicating an allelic term, use context to clarify
dominant trait, thus begins with capital; because there is no superscript indicating an allelic term, use context to clarify
situs inversus viscerum allele of dynein, axon, heavy chain 11 gene
allele superscript designation is lowercase (recessive)
Akita allele of insulin 2 gene
allele superscript designation has initial capital (dominant)
dystrophia muscularis allele, Jackson 2, of α2-laminin gene (second allele discovered at the Jackson Laboratory)
laboratory code included in superscript (see “Laboratory Codes” section); hyphens used
underwhite dominant brown alleles of membrane-associated transporter protein gene
multiple alleles separated by hyphen in superscript
which indicates a phenotype with a mutant neurofibromatosis allele (targeted mutation 1, Frederick Cancer Research and Development Center) and the wild-type neurofibromatosis allele.
Chromosome nomenclature is similar for mice and humans (see 15.6.4, Human Chromosomes). However, in mice, rearrangement terms are capitalized. The following listing and subsequent examples are from the International Committee on Standardized Genetic Nomenclature for Mice10:
homogeneous staining region
As with human chromosomes, lowercase p represents the short arm and lowercase q the long arm. When specific mouse chromosomes are referred to, the word Chromosome is capitalized (and abbreviated Chr after first mention), eg:
Human chromosome 1 shows extensive homology to several mouse chromosomes, especially Chromosome (Chr) 4 and Chr 1.
Chromosome anomaly symbols usually include a unique laboratory code (see the “Laboratory Codes” section) and a series number, eg:
fifth inversion found by Roderick
37th translocation found at Harwell
Chromosome number appears in parentheses:
inversion in Chr 2
Semicolons separate numbers of chromosomes involved in translocations:
translocation involving Chr 4 and Chr X
Periods indicate the centromere in robertsonian translocations:
robertsonian translocation involving Chr 9 and Chr 19
In insertions, the donor chromosome number comes first:
insertion from Chr 7 to Chr 1
For further rules and conventions for chromosomes, see the chromosome nomenclature section of the Mouse Genome Informatics website.10
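The punctuation rules above (chromosome numbers in parentheses, semicolons for translocations and insertions, a period for the centromere in robertsonian translocations) can be sketched as a small formatting helper. This is a hedged Python sketch: the function `anomaly_symbol` is hypothetical, and the rearrangement prefixes (`In`, `T`, `Rb`), the laboratory codes (`Rk`, `H`), and the placement of the series number before the laboratory code are assumptions based on common Mouse Genome Informatics usage, because the listing of rearrangement terms is not reproduced in this text.

```python
def anomaly_symbol(prefix, chromosomes, series, lab_code, separator=";"):
    """Compose a chromosome anomaly symbol from its parts.

    prefix      -- capitalized rearrangement abbreviation (assumed, eg "T")
    chromosomes -- chromosome numbers, donor first for insertions
    series      -- series number of the anomaly
    lab_code    -- registered laboratory code (assumed, eg "H")
    separator   -- ";" for translocations and insertions,
                   "." to mark the centromere in robertsonian translocations
    """
    inside = separator.join(str(c) for c in chromosomes)
    return f"{prefix}({inside}){series}{lab_code}"
```

Under these assumptions, `anomaly_symbol("T", [4, "X"], 37, "H")` would yield `T(4;X)37H`, and passing `separator="."` gives the robertsonian form.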
Laboratory registration codes appear as 1- to 4-letter symbols in animal genetic terminology, including chromosomal, DNA locus, and mouse strain nomenclature (see below). Such codes help identify specific colonies, which is useful in genetic studies that can extend over many generations. Laboratory codes are registered with the Institute for Laboratory Animal Research at the National Academy of Sciences in Washington, DC, and may be located at http://dels.nas.edu/ilar_n/ilarhome.11 These codes uniquely identify an investigator, laboratory, or institution that breeds rodents or rabbits. Laboratory codes have initial capitals and appear without expansion. Examples are as follows:
Arb: Arthritis and Rheumatism Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases
Ddd: University of Durham, Drug Dependence Group
J: The Jackson Laboratory
N: National Institutes of Health
Ty: Benjamin A. Taylor, The Jackson Laboratory
Wil: Jean Wilson, University of Texas
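The shape of a laboratory code (1 to 4 letters with an initial capital) can be checked with a short pattern. This Python sketch is illustrative only: the name `is_lab_code_shaped` is hypothetical, and it assumes that the letters after the initial capital are lowercase, as in the examples above. It tests the form of a code, not whether the code is actually registered.

```python
import re

# One capital letter followed by up to three lowercase letters (assumed form).
LAB_CODE = re.compile(r"[A-Z][a-z]{0,3}")


def is_lab_code_shaped(code: str) -> bool:
    """Return True if the string has the general shape of a laboratory code."""
    return LAB_CODE.fullmatch(code) is not None
```

All 6 example codes listed above (Arb, Ddd, J, N, Ty, Wil) satisfy this pattern.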
Mouse strain names12 are registered at the Mouse Genome Informatics website (http://www.informatics.jax.org/mgihome/submissions/submissions_menu.shtml). Mouse strain names are available at http://www.informatics.jax.org/external/festing/search_form.cgi. (Rat strain names are registered at the Rat Genome Database.9)
Mouse strain names consist of capital letters or combinations of capital letters and numbers:
A few earlier strains have names that are entirely numeric, eg:
atherosclerosis in CBA/J mice
FVB/N mice used as controls
A serial number may precede the laboratory code, eg, the 10 before the J in this example:
Exceptions to the initial capital after the virgule exist in the case of 2 well-known strains (not substrains) of mouse:
Many standard laboratory mouse strains are derived from crosses dating back to the early 20th century or even older lines, and the names reflect abbreviations for characteristics:
dilute, brown, nonagouti
However, mouse strain names are not expanded.
Strain names may be abbreviated using approved abbreviations, eg:
Note that some abbreviations are the same as some names of different strains (eg, the strain C and the abbreviation C), so context must clarify. Additional abbreviations are available at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.
Abbreviations and the letter X are used to indicate recombinant inbred strains (female parental strain first), eg:
BALB/c x C57BL
Capital F followed by a number in parentheses may appear after a strain designation to indicate the number of inbred generations:
20 inbred generations
For further guidelines on mouse strain nomenclature, see the Mouse Genome Informatics website at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.12
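The structure of a simple substrain name (strain, virgule, optional serial number, laboratory code) can be sketched as a small parser. This is a minimal Python sketch under stated assumptions: the names `StrainName` and `parse_strain` are hypothetical, the sample name `C57BL/10J` is used only to illustrate a serial number before a laboratory code, and the pattern handles only a single laboratory code after the virgule. Names that do not fit this simple form, including the lowercase exceptions noted above, are returned with no serial number or laboratory code.

```python
import re
from typing import NamedTuple, Optional


class StrainName(NamedTuple):
    strain: str
    serial: Optional[str]
    lab_code: Optional[str]


# Optional serial number followed by one laboratory code (assumed simple form).
SUBSTRAIN = re.compile(r"(\d*)([A-Z][a-z]{0,3})")


def parse_strain(name: str) -> StrainName:
    """Split a strain name at the virgule into strain, serial, and lab code."""
    strain, _, rest = name.partition("/")
    match = SUBSTRAIN.fullmatch(rest)
    if match is None:
        # No virgule, or a designation that is not [serial]LabCode.
        return StrainName(strain, None, None)
    serial, code = match.groups()
    return StrainName(strain, serial or None, code)
```

For example, `parse_strain("CBA/J")` separates the strain `CBA` from the laboratory code `J`, which the list of laboratory codes identifies as The Jackson Laboratory.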
Gene symbols for the fruit fly Drosophila melanogaster generally combine capital and lowercase letters or, for recessive phenotypes, are all lowercase. This convention is also observed for gene names. Gene symbols may include punctuation.13,14 A source for background on Drosophila gene names is FlyNome.15 Nomenclature rules and symbol search are available at FlyBase.13
suppressor of Hairy wing
transfer RNA:ser7:23Ea (ser7: seventh isoform of serine; 23E: map position)
As with mouse alleles, Drosophila alleles are indicated with superscripts:
Hnr, Hnr2 (Henna gene, eye color-defective alleles)
Parentheses indicate mutation in the gene:
Mutation symbols consist of 1- or 2-letter terms plus a number:
A characteristic of a mutation may be indicated by a 2-letter ending set in roman type:
hc17ts (ts: temperature sensitive)
Microorganism Gene Nomenclature
Gene symbols for the fungus Saccharomyces cerevisiae consist of 3 capital letters plus a number (or, occasionally, a number-letter) ending19:
adenylate cyclase regulatory protein
cytochrome c oxidase chain Va
This represents a change from earlier style in which all-lowercase symbols were used for loci named for recessive mutations (the preponderance of symbols) and all-capital symbols for loci named for dominant mutations. Allele symbols still follow the case convention (ie, capital for dominant, lowercase for recessive).
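The current form of a yeast gene symbol (3 capital letters plus a number, occasionally with a letter after the number) can be expressed as a pattern. This Python sketch is illustrative: the name `is_yeast_gene_shaped` is hypothetical, and the sample strings in the usage note are assumptions chosen only to show the pattern. It checks form, not whether a symbol is registered.

```python
import re

# Three capital letters, a number, and an optional trailing capital letter.
YEAST_GENE = re.compile(r"[A-Z]{3}\d+[A-Z]?")


def is_yeast_gene_shaped(symbol: str) -> bool:
    """Return True if the string has the general shape of a yeast gene symbol."""
    return YEAST_GENE.fullmatch(symbol) is not None
```

Under this pattern, strings such as `ADE2` and `COX5A` are accepted, whereas an all-lowercase string such as `ade2`, which would follow the case convention for a recessive allele, is not treated as a gene symbol.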
Bacterial Gene Nomenclature
Gene terms typically consist of an italicized lowercase 3-letter abbreviation often with an uppercase locus designator. The phenotype or encoded entity (eg, enzyme) is in all roman letters with an initial capital.14,20,21
AraA (L-arabinose isomerase)
Asr (acid shock protein)
imp (formerly ostA)
OstA (organic solvent intolerance; imp: increased membrane permeability)
SodA (superoxide dismutase, manganese)
SodB (superoxide dismutase, iron)
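The relationship between a bacterial gene term and the name of its product (the same letters with an initial capital) can be sketched in a few lines. This Python sketch is illustrative: the names `is_gene_shaped` and `product_name` are hypothetical, italic type is not modeled, and the pattern assumes the typical form described above (3 lowercase letters with an optional uppercase locus designator).

```python
import re

# Three lowercase letters, optionally followed by one uppercase locus designator.
GENE = re.compile(r"[a-z]{3}[A-Z]?")


def is_gene_shaped(term: str) -> bool:
    """Return True if the string has the typical shape of a bacterial gene term."""
    return GENE.fullmatch(term) is not None


def product_name(gene: str) -> str:
    """Return the product name: the gene term with an initial capital."""
    if not is_gene_shaped(gene):
        raise ValueError(f"not shaped like a bacterial gene term: {gene!r}")
    return gene[0].upper() + gene[1:]
```

For example, `product_name("ostA")` gives `OstA`, matching the product form shown in the examples above.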
A number of bacterial genome databases are available on the Internet. The National Center for Biotechnology Information sponsors Entrez Genome (http://www.ncbi.nlm.nih.gov/entrez: under Search, select Gene, then search for the gene in question).
Alleles are designated with a number after the uppercase letter or, when not assigned to a locus, after a hyphen. Wild-type alleles are designated with a superscript plus sign:
Retroviral Gene Nomenclature
group-specific core antigen gene
regulator of viral protein expression
transactivator of viral transcription
viral protein R
viral protein U
viral protein X
Compare typographic style of gene names and their products (p stands for protein, gp for glycoprotein):
Gene Product (Protein or Polypeptide)
Protein Products (Examples)
p6, p7, p17, p24
p12, p32, p66/51
1. Morse HC III. The laboratory mouse—a historical perspective. In: Foster HL, Fox F, eds. The Mouse in Biomedical Research. Vol 1. Orlando, FL: Academic Press Inc; 1981:6-10.
2. Marshall Graves JA, Wakefield MJ, Peters J, Searle AG, Womack JE, O'Brien SJ. Report of the Committee on Comparative Gene Mapping. In: Cuticchia AJ, ed. Human Gene Mapping 1994: A Compendium. Baltimore, MD: Johns Hopkins University Press; 1995:962-1016.
3. Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25-29. Also available at http://www.geneontology.org/GO_nature_genetics_2000.pdf. Accessed April 21, 2006.
4. ARKdb. http://www.thearkdb.org. Accessed April 21, 2006.
5. RatMapGroup. RATMAP: the Rat Genome Database. http://ratmap.gen.gu.se/. Accessed April 21, 2006.
6. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of genes, genetic markers, alleles, and mutations in mouse and rat. http://www.informatics.jax.org/mgihome/nomen/gene.shtml#genenom. Updated January 2005. Accessed April 21, 2006.
7. Maltais LJ, Blake JA, Chu T, Lutz CM, Eppig JT, Jackson I. Rules and guidelines for mouse gene, allele, and mutation nomenclature: a condensed version. Genomics. 2002;79(4):471-474. Also available at http://www.informatics.jax.org/mgihome/nomen/short_gene.shtml. Accessed April 21, 2006.
8. Jackson Laboratory. MGI: Mouse Genome Informatics. http://www.informatics.jax.org. Updated April 20, 2006. Accessed April 21, 2006.
9. RGD: Rat Genome Database. http://rgd.mcw.edu. Updated April 17, 2006. Accessed April 21, 2006.
10. International Committee on Standardized Genetic Nomenclature for Mice. Rules for nomenclature of chromosome aberrations. http://www.informatics.jax.org/mgihome/nomen/anomalies.shtml. Accessed April 21, 2006.
11. ILAR: Institute for Laboratory Animal Research. Laboratory Code Registry. http://dels.nas.edu/ilar_n/ilarhome. Accessed April 21, 2006.
12. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of mouse and rat strains. http://www.informatics.jax.org/mgihome/nomen/strains.shtml. Updated January 2005. Accessed April 21, 2006.
13. FlyBase: a database of the Drosophila genome. http://flybase.net. Accessed April 21, 2006.
14. Stewart A, ed. TIG Genetic Nomenclature Guide. Tarrytown, NY: Elsevier Trends Journals; 1995.
15. FlyNome: a database of Drosophila nomenclature. http://www.flynome.com. Accessed April 21, 2006.
16. Hodgkin J. Recommended genetic nomenclature for Caenorhabditis elegans. http://elegans.swmed.edu/Genome/nomen.html.01_10_25. Accessed April 21, 2006.
17. Nicholas FW. Online Mendelian Inheritance in Animals (OMIA). http://www.angis.org.au/omia. Updated October 16, 2003. Accessed April 21, 2006.
18. Rangel P, Giovannetti J. Genomes and Databases on the Internet: A Practical Guide to Functions and Applications. Norfolk, England: Horizon Scientific Press; 2002.
19. SGD gene naming guidelines. http://www.yeastgenome.org/gene_guidelines.shtml. Accessed April 21, 2006.
20. Demerec M, Adelberg EA, Clark AJ, Hartman PE. A proposal for a uniform nomenclature in bacterial genetics. Genetics. 1966;54(1):61-76.
21. Journal of Bacteriology 2006 instructions to authors. http://jb.asm.org/misc/itoa.pdf. Accessed April 21, 2006.
22. Guatelli JC, Siliciano RF, Kuritzkes DR, Richman DD. Human immunodeficiency virus. In: Richman DD, Whitley RJ, Hayden FD, eds. Clinical Virology. 2nd ed. Washington, DC: ASM Press; 2002:685-729.