Nonhuman Genetic Terms 

Chapter: Nomenclature
Author(s): Harriet S. Meyer

  [T]he word mouse … comes originally from the Sanskrit mush derived from a verb meaning to steal. … Mice and rats, through their voracious activities in grain larders and as carriers of disease, inflicted considerable losses in food and lives upon ancient civilizations.
      H. C. Morse III1(p6)

  A very obvious gap in our understanding of human genome evolution lies in the complete absence of any mapping data from the eutherian orders most distantly related to man, particularly the edentates. We would urge anyone with an interest in the genetics of the aardvark and the armadillo to consider a unique mapping project which will be at the forefront (alphabetically, at least) of the comparative mapping effort.
      J. A. Marshall Graves et al2(p964)

Comparative genome analysis has shown that eukaryote species share genes to a great extent.3 Therefore, similar or identical names designate the same gene across species whenever possible. Italicization of gene symbols is uniformly observed.

Vertebrates

Animal gene symbols resemble human gene symbols (see 15.6.2, Human Gene Nomenclature, and below).4,5 However, unlike human gene symbols, animal gene symbols typically use or include lowercase letters and punctuation marks. Editors of medical publications may follow author style for animal gene symbols.

Gene terminology for the laboratory mouse (Mus musculus domesticus) and laboratory rat (Rattus norvegicus), often seen in medical publications because of the common use of those species in investigating diseases affecting humans, is prototypic of such style.

Mouse and Rat Gene Nomenclature

Mouse and rat gene nomenclature guidelines were unified in 2003 by the International Committee on Standardized Genetic Nomenclature for Mice and the Rat Genome and Nomenclature Committee.6

Mouse and rat gene symbols resemble human symbols in several respects.6,7 They are descriptive, short (preferably 3 to 5 characters), and italicized. Symbols begin with letters, not numbers. They contain roman letters in place of Greek letters and arabic numerals in place of roman numerals.

Mouse and rat gene symbols differ from human symbols in using lowercase letters. Symbols usually contain an initial capital. Capital letters within a mouse gene symbol may indicate the laboratory code or code for another species/vector (see below). A symbol with all lowercase letters (ie, no initial capital) indicates a recessive trait. Mouse and rat gene symbols may contain hyphens and other punctuation.

The central source for mouse gene terms is the Mouse Genome Database (http://www.informatics.jax.org),8 and for rats, RatMap (http://ratmap.gen.gu.se) and the Rat Genome Database (http://rgd.mcw.edu).9 Gene names and symbols may be verified by means of the search features at those sites.

Style rules and conventions for mouse and rat gene symbols are shown in Tables 8 through 10. (Note: The gene descriptions in the tables that follow are based on but not identical to the approved gene names available in the Mouse Genome Informatics database, which are more complete and do not use Greek letters and other typographic variants. For instance, in searching for a term with α, one would type in “alpha.”) The Mammalian Orthology Query Form (http://www.informatics.jax.org/searches/homology_form.shtml) allows comparative searches of 20 vertebrate species. Note that a given letter or letter combination often but not always signifies a conventional usage. For instance, l at or near the end of a symbol often, but not always, indicates “like.”
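The search tip in the note above (type “alpha” rather than the Greek letter) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `spell_out_greek` and the short letter table are hypothetical and are not part of any Mouse Genome Informatics tool.

```python
# Hypothetical helper (not an MGI API): spell out Greek letters in a gene
# description before typing it into a database search box.
GREEK_NAMES = {
    "α": "alpha",
    "β": "beta",
    "γ": "gamma",
    "δ": "delta",
    "κ": "kappa",
    "μ": "mu",
}


def spell_out_greek(text: str) -> str:
    """Replace each Greek letter listed in GREEK_NAMES with its spelled-out name."""
    return "".join(GREEK_NAMES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    for description in ("α-fetoprotein", "hemoglobin β-chain complex"):
        print(spell_out_greek(description))
```

The table covers only a handful of letters; any letter missing from it passes through unchanged, so it would need extending for other descriptions.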

Table 8. Style Rules for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)

| Mouse Gene Symbol | Mouse Gene Description | Rule Illustrated | Human Gene Symbol (When Known) |
| --- | --- | --- | --- |
| a | nonagouti | lowercase initial letter because named for a recessive mutant trait | ASIP |
| Afp | α-fetoprotein | initial capital, otherwise lowercase; Greek letter changed to roman | AFP |
| B2m | β2-microglobulin | no subscript | B2M |
| Gla | α-galactosidase | Greek letter changed to roman and moved to end of symbol | GLA |
| Gt(ROSA)26Sor | gene trap, ROSA 26, Philippe Soriano | parentheses may be used | |
| Rn4.5s | 4.5S RNA | period permissible | |
| Rn5s | 5S RNA | symbol does not begin with number | RN5S1@ (@ signifies gene family; see 15.6.2, Human Gene Nomenclature) |

Table 9. Conventions for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)

| Mouse Gene Symbol | Mouse Gene Description | Convention Illustrated | Human Gene Symbol (When Available) |
| --- | --- | --- | --- |
| Brca1 | breast cancer 1 | same as human symbol except for case | BRCA1 |
| Cafq1 | caffeine metabolism QTL 1 | q: quantitative locus | |
| C4bp-ps1 | complement component 4 binding protein, pseudogene 1 | -ps: pseudogene | C4BPB |
| D10Mit1 | DNA segment, Chr 10, Massachusetts Institute of Technology 1 | symbol for DNA segment identified only in the mouse; includes laboratory code (see “Laboratory Codes”) | |
| D17H21S56 | DNA segment, Chr 17, human D21S56 | H21 indicates DNA segment resides on human chromosome 21 | D21S56 |
| G6pdx | glucose-6-phosphate dehydrogenase X-linked | similar but not identical to human gene symbol | G6PD |
| Gna-rs1 | guanine binding protein, related sequence 1 | -rs: related sequence | GNL1 |
| Gtl10 | gene trap locus 10 | Gt: gene trap | |
| Gt(ROSA)26Sor | gene trap ROSA 26, Philippe Soriano | vector in parentheses; laboratory code indicated (see “Laboratory Codes” section) | |
| H2-Aa | histocompatibility 2, class II antigen A, α | | HLA-DQA1 |
| Hbb | hemoglobin β-chain complex | same as human symbol except for case | HBB |
| Hc9 | heterochromatin, Chr 9 | Hc: heterochromatin | |
| Hras1 | Harvey rat sarcoma virus oncogene 1 | see also 15.6.3, Oncogenes and Tumor Suppressor Genes | HRAS |
| Ighmbp2 (formerly nmd) | immunoglobulin heavy chain μ binding protein 2 (formerly neuromuscular degeneration) | name change with new information about gene | IGHMBP2 |
| l17Wis9 | lethal, Chr 17, University of Wisconsin 9 | initial l: lethal | |
| Lamb1-1 | β1 laminin, subunit 1 | hyphen separates 2 adjacent numbers | LAMB1 |
| Lzp-s | P lysozyme structural | s: structural | |
| mt-Rnr1 | 12S RNA, mitochondrial | mt: mitochondrial | MT-RNR1 |
| Mcptl | mast cell protease-like | l: like | |
| Nidd1, Nidd2, Nidd3, Nidd4 | non-insulin-dependent diabetes mellitus 1, 2, 3, 4 | same stem (root) for gene families | |
| Nup160 | nucleoporin 160 | name change (formerly Gtl1-13) | NUP160 |
| Rnr13 | rRNA, chromosome 13 cluster | | |
| Tcrb | T-cell receptor β-chain | | TRB@ (formerly TCRB; @ signifies gene family; see 15.6.2, Human Gene Nomenclature) |
| Tel10p | telomeric sequence, Chr 10, centromere end | Tel: telomere; 10: Chr 10; p: short arm | |
| Tg(APOE)1Vln | transgene insertion 1, Fred Van Leuven | Tg: transgene; parenthetic material: inserted gene, in this case the human gene APOE; Vln: founder or “laboratory of” designation | |
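The case relationship between mouse and human symbols shown in the tables can be sketched as a pair of small Python helpers. Both function names are hypothetical. `guess_mouse_case` is only a casing heuristic and not an orthology lookup: it would turn G6PD into G6pd, whereas the approved mouse symbol listed above is G6pdx, so any result should still be verified in the databases named earlier.

```python
def guess_mouse_case(human_symbol: str) -> str:
    """Apply mouse-style casing (initial capital, remainder lowercase) to a human symbol.

    This is a heuristic only; the approved mouse symbol may differ.
    """
    return human_symbol[:1].upper() + human_symbol[1:].lower()


def same_except_case(mouse_symbol: str, human_symbol: str) -> bool:
    """Return True when two symbols differ only in letter case (or not at all)."""
    return mouse_symbol.upper() == human_symbol.upper()


if __name__ == "__main__":
    pairs = [("Brca1", "BRCA1"), ("Hbb", "HBB"), ("G6pdx", "G6PD")]
    for mouse, human in pairs:
        print(mouse, human, same_except_case(mouse, human), guess_mouse_case(human))
```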

Table 10. Conventions for Mouse Gene Symbols Identified in Collaborative Sequencing Efforts (Examples)a

| Mouse Gene Symbol | Mouse Gene Description | Convention Illustrated | Human Gene Symbol (When Available) |
| --- | --- | --- | --- |
| 0610005C13Rik | RIKEN cDNA 0610005C13 gene | RIKEN symbol assigned to sequence that does not match known genes in other species; Rik: RIKEN Institute, Japan | |
| Cdc42ep3 | CDC42 effector protein (Rho GTPase binding) 3; formerly 3200001F04Rik | RIKEN symbol changed when gene identified in another organism | CDC42EP3 |
| BC023055 | cDNA sequence BC023055 | BC indicates sequence from Mammalian Gene Collection of the National Institutes of Health | C10orf83 |
| Aldob | aldolase 2, B isoform; formerly BC016435 | Mammalian Gene Collection symbol changed when gene identified in another organism | ALDOB |
| AF179933 | cDNA sequence AF179933 | GenBank symbol for genes with no other information available in other organisms or sequencing efforts | |
| Ppt2 | palmitoyl-protein thioesterase 2; formerly AA672937 and 0610007M19Rik | GenBank sequence ID withdrawn when gene identified in other organism | PPT2 |

a See also Database Identifiers for Genomic Sequences in 15.6.1, Nucleic Acids and Amino Acids.

Mouse Alleles

A mouse allele symbol consists of a mouse gene symbol often with a superscript. As with mouse gene symbols, mouse allele symbols are italicized.

Allele symbols can be verified within the records of a mouse gene. Allele searches are also available at http://www.informatics.jax.org/searches/allele_form.shtml.

Conventions and rules for mouse allele symbols are shown in Table 11. In a phenotype expression, a superscript plus sign indicates wild-type, eg (superscripts are shown here in angle brackets):

Nf1<tm1Fcr>/Nf1<+>

which indicates a phenotype with a mutant neurofibromatosis allele (targeted mutation 1, Frederick Cancer Research and Development Center) and the wild-type neurofibromatosis allele.

Table 11. Rules and Conventions for Mouse Allele Terms (Examples)

| Allele Symbol | Allele Name | Convention or Rule Illustrated |
| --- | --- | --- |
| abn | abnormal | recessive trait, thus begins with lowercase; because there is no superscript indicating an allelic term, use context to clarify |
| Dbf | doublefoot | dominant trait, thus begins with capital; because there is no superscript indicating an allelic term, use context to clarify |
| Dnahc11<iv> | situs inversus viscerum allele of dynein, axon, heavy chain 11 gene | allele superscript designation is lowercase (recessive) |
| Ins2<Akita> | Akita allele of insulin 2 gene | allele superscript designation has initial capital (dominant) |
| Lama2<dy-2J> | dystrophia muscularis allele, Jackson 2, of α2-laminin gene (second allele discovered at the Jackson Laboratory) | laboratory code included in superscript (see “Laboratory Codes” section); hyphens used |
| Matp<Uw-dbr> | underwhite dominant brown alleles of membrane-associated transporter protein gene | multiple alleles separated by hyphen in superscript |

Mouse Chromosomes

Chromosome nomenclature is similar for mice and humans (see 15.6.4, Human Chromosomes). However, in mice, rearrangement terms are capitalized. The following listing and subsequent examples are from the International Committee on Standardized Genetic Nomenclature for Mice10:

Cen: centromere

Del: deletion

Df: deficiency

Dp: duplication

Hc: pericentric heterochromatin

Hsr: homogeneous staining region

In: inversion

Is: insertion

MatDf: maternal deficiency

MatDi: maternal disomy

MatDp: maternal duplication

Ms: monosomy

Ns: nullisomy

PatDf: paternal deficiency

PatDi: paternal disomy

PatDp: paternal duplication

Rb: robertsonian translocation

T: translocation

Tc: transchromosomal

Tel: telomere

Tet: tetrasomy

Tg: transgenic insertion

Tp: transposition

Ts: trisomy

UpDf: uniparental deficiency

UpDi: uniparental disomy

UpDp: uniparental duplication

As with human chromosomes, lowercase p represents the short arm and lowercase q the long arm. When specific mouse chromosomes are referred to, the word Chromosome is capitalized (and abbreviated Chr after first mention), eg:

Human chromosome 1 shows extensive homology to several mouse chromosomes, especially Chromosome (Chr) 4 and Chr 1.

Chromosome anomaly symbols usually include a unique laboratory code (see the “Laboratory Codes” section) and a series number, eg:

In5Rk

fifth inversion found by Roderick

T37H

37th translocation found at Harwell

Chromosome number appears in parentheses:

In(2)5Rk

inversion in Chr 2

Semicolons separate numbers of chromosomes involved in translocations:

T(4;X)37H

translocation involving Chr 4 and Chr X

Periods indicate the centromere in robertsonian translocations:

Rb(9.19)163H

robertsonian translocation involving Chr 9 and Chr 19

In insertions, the donor chromosome number comes first:

Is(7;1)40H

insertion from Chr 7 to Chr 1

For further rules and conventions for chromosomes, see the chromosome nomenclature section of the Mouse Genome Informatics website.10
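The structure of the chromosome anomaly symbols described above (rearrangement term, optional chromosome numbers in parentheses, series number, laboratory code) can be sketched as a small Python parser. This is a minimal illustration under stated assumptions: the names `Anomaly` and `parse_anomaly` are hypothetical, the lookup table holds only a few of the rearrangement terms listed above, and the regular expression covers only the forms shown in the examples, not the full committee rules.

```python
import re
from dataclasses import dataclass
from typing import List

# A few of the rearrangement terms listed above; extend as needed.
REARRANGEMENTS = {
    "Del": "deletion",
    "Dp": "duplication",
    "In": "inversion",
    "Is": "insertion",
    "Rb": "robertsonian translocation",
    "T": "translocation",
    "Tp": "transposition",
}

# Term, optional "(chromosomes)", series number, 1- to 4-letter laboratory code.
ANOMALY = re.compile(
    r"^(?P<term>[A-Z][A-Za-z]*)"
    r"(?:\((?P<chromosomes>[0-9XY;.]+)\))?"
    r"(?P<series>\d+)"
    r"(?P<lab>[A-Z][a-z]{0,3})$"
)


@dataclass
class Anomaly:
    kind: str
    chromosomes: List[str]
    series: int
    lab_code: str


def parse_anomaly(symbol: str) -> Anomaly:
    """Split a chromosome anomaly symbol such as T(4;X)37H into its parts."""
    match = ANOMALY.match(symbol)
    if match is None or match.group("term") not in REARRANGEMENTS:
        raise ValueError(f"unrecognized anomaly symbol: {symbol!r}")
    raw = match.group("chromosomes")
    # Semicolons separate chromosomes; a period marks the centromere (Rb).
    chromosomes = re.split(r"[;.]", raw) if raw else []
    return Anomaly(
        kind=REARRANGEMENTS[match.group("term")],
        chromosomes=chromosomes,
        series=int(match.group("series")),
        lab_code=match.group("lab"),
    )


if __name__ == "__main__":
    for example in ("In5Rk", "In(2)5Rk", "T(4;X)37H", "Rb(9.19)163H", "Is(7;1)40H"):
        print(example, parse_anomaly(example))
```

For insertions, the first chromosome in the returned list is the donor, following the order given in the symbol.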

Laboratory Codes

Laboratory registration codes appear as 1- to 4-letter symbols in animal genetic terminology, including chromosomal, DNA locus, and mouse strain nomenclature (see below). Such codes help identify specific colonies, which is useful in genetic studies that can extend over many generations. Laboratory codes are registered with the Institute for Laboratory Animal Research at the National Academy of Sciences in Washington, DC, and may be located at http://dels.nas.edu/ilar_n/ilarhome.11 These codes uniquely identify an investigator, laboratory, or institution that breeds rodents or rabbits. Laboratory codes have initial capitals and appear without expansion. Examples are as follows:

Arb: Arthritis and Rheumatism Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases

Ddd: University of Durham, Drug Dependence Group

J: The Jackson Laboratory

N: National Institutes of Health

Ty: Benjamin A. Taylor, The Jackson Laboratory

Wil: Jean Wilson, University of Texas

Mouse Strains

Mouse strain names12 are registered at the Mouse Genome Informatics website (http://www.informatics.jax.org/mgihome/submissions/submissions_menu.shtml). Mouse strain names are available at http://www.informatics.jax.org/external/festing/search_form.cgi. (Rat strain names are registered at the Rat Genome Database.9)

Mouse strain names consist of capital letters or combinations of capital letters and numbers:

A

BXH

CBA

C57BL

FVB

HDA32

A few earlier strains have names that are entirely numeric, eg:

129

A substrain is indicated by a term following the strain name after a virgule, usually the laboratory registration code (see above), eg:

129/J

A/J

atherosclerosis in CBA/J mice

FVB/N mice used as controls

A serial number may precede the laboratory code, eg, the 10 before the J in this example:

C57BL/10J

Exceptions to the initial capital after the virgule exist in the case of 2 well-known strains (not substrains) of mouse:

BALB/c

C57BR/cd
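The strain and substrain patterns above can be sketched as a short Python parser. This is an illustration under stated assumptions, not an official validator: the names `StrainName` and `parse_strain` are hypothetical, and the 2 lowercase exceptions are simply hard-coded from the examples above.

```python
import re
from typing import NamedTuple, Optional

# The 2 strains named above whose designation after the virgule is lowercase.
LOWERCASE_EXCEPTIONS = {"BALB/c", "C57BR/cd"}

# Strain name, then optionally "/", an optional serial number, and a
# laboratory code with an initial capital (1 to 4 letters).
STRAIN = re.compile(
    r"^(?P<strain>[A-Z0-9]+)"
    r"(?:/(?P<serial>\d+)?(?P<lab>[A-Z][a-z]{0,3}))?$"
)


class StrainName(NamedTuple):
    strain: str
    serial: Optional[int]
    lab_code: Optional[str]


def parse_strain(name: str) -> StrainName:
    """Split a mouse strain or substrain name into strain, serial number, and laboratory code."""
    if name in LOWERCASE_EXCEPTIONS:
        # These are strains, not substrains, so the whole name is the strain.
        return StrainName(name, None, None)
    match = STRAIN.match(name)
    if match is None:
        raise ValueError(f"unrecognized strain name: {name!r}")
    serial = int(match.group("serial")) if match.group("serial") else None
    return StrainName(match.group("strain"), serial, match.group("lab"))


if __name__ == "__main__":
    for example in ("A", "129/J", "CBA/J", "C57BL/10J", "BALB/c"):
        print(example, parse_strain(example))
```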

Many standard laboratory mouse strains are derived from crosses dating back to the early 20th century or even older lines, and the names reflect abbreviations for characteristics:

A

albino

BALB

Bagg, albino

DBA

dilute, brown, nonagouti

However, mouse strain names are not expanded.

Strain names may be abbreviated using approved abbreviations, eg:

B

C57BL

C

BALB/c

Note that some abbreviations are the same as some names of different strains (eg, the strain C and the abbreviation C), so context must clarify. Additional abbreviations are available at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.

Abbreviations and the letter X are used to indicate recombinant inbred strains (female parental strain first), eg:

CXB

BALB/c × C57BL

Capital F followed by a number in parentheses may appear after a strain designation to indicate the number of inbred generations:

F(20)

20 inbred generations

For further guidelines on mouse strain nomenclature, see the Mouse Genome Informatics website at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.12

Invertebrates

Drosophila melanogaster

Gene symbols for the fruit fly Drosophila melanogaster generally combine capital and lowercase letters, or are all lowercase when named for recessive phenotypes. This convention is also observed for gene names. Gene symbols may include punctuation.13,14 A source for background on Drosophila gene names is FlyNome.15 Nomenclature rules and symbol search are available at FlyBase.13

| Gene Symbol | Name |
| --- | --- |
| Ppi | Preproinsulinlike |
| SerT | Serotonin transporter |
| su(Hw) | suppressor of Hairy wing |
| tRNA:S7:23Ea | transfer RNA:ser7:23Ea (ser7: seventh isoform of serine; 23E: map position) |

As with mouse alleles, Drosophila alleles are indicated with superscripts:

Hnr, Hnr2 (Henna gene, eye color-defective alleles)

Caenorhabditis elegans

The gene symbols for this nematode (roundworm) consist of 3 lowercase letters, hyphen, arabic numeral (sometimes a decimal), and, sometimes, a roman numeral after a space14,16:

dpy-1

dpy-5 I

let-37 X

sir-2.1

Parentheses indicate mutation in the gene:

let-37(mn138)

Mutation symbols consist of 1- or 2-letter terms plus a number:

mn138

A characteristic of a mutation may be indicated by a 2-letter ending set in roman type:

hc17ts (ts: temperature sensitive)
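The C. elegans patterns above can be sketched with 2 regular expressions in Python. This is a minimal illustration: the names `GENE`, `MUTATION`, and `parse_gene` are hypothetical, the placement of a parenthetic mutation before the roman numeral is an assumption, and so is the restriction of the roman numeral to I through V and X.

```python
import re
from typing import Dict, Optional

# 3 lowercase letters, hyphen, arabic numeral (sometimes a decimal),
# optional mutation in parentheses, optional roman numeral after a space.
GENE = re.compile(
    r"^(?P<name>[a-z]{3})-(?P<number>\d+(?:\.\d+)?)"
    r"(?:\((?P<mutation>[a-z]{1,2}\d+)\))?"
    r"(?: (?P<chromosome>I|II|III|IV|V|X))?$"
)

# 1- or 2-letter term plus a number, with an optional 2-letter ending.
MUTATION = re.compile(
    r"^(?P<letters>[a-z]{1,2})(?P<number>\d+)(?P<ending>[a-z]{2})?$"
)


def parse_gene(symbol: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the parts of a C. elegans gene symbol, or None if it does not fit the pattern."""
    match = GENE.match(symbol)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for example in ("dpy-1", "dpy-5 I", "let-37 X", "sir-2.1", "let-37(mn138)"):
        print(example, parse_gene(example))
    print(MUTATION.match("hc17ts").groupdict())
```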

OMIA

Online Mendelian Inheritance in Animals is the counterpart to Online Mendelian Inheritance in Man (OMIM; see 15.6.2, Human Gene Nomenclature).17,18

Microorganism Gene Nomenclature

Yeasts

Gene symbols for the fungus Saccharomyces cerevisiae consist of 3 capital letters plus a number (or, occasionally, a number-letter) ending19:

| Gene Symbol | Name |
| --- | --- |
| ACT1 | actin |
| CDC25 | adenylate cyclase regulatory protein |
| COX5A | cytochrome c oxidase chain Va |

This represents a change from earlier style in which all-lowercase symbols were used for loci named for recessive mutations (the preponderance of symbols) and all-capital symbols for loci named for dominant mutations. Allele symbols still follow the case convention (ie, capital for dominant, lowercase for recessive).
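The current symbol pattern (3 capital letters plus a number or number-letter ending) can be sketched as a one-line check in Python. The function name `is_yeast_gene_symbol` is hypothetical, and the pattern reflects only the description above, not the complete naming guidelines.

```python
import re

# 3 capital letters, a number, and an optional single capital letter.
YEAST_GENE = re.compile(r"^[A-Z]{3}\d+[A-Z]?$")


def is_yeast_gene_symbol(symbol: str) -> bool:
    """Return True if the symbol fits the 3-capital-letters-plus-number pattern."""
    return YEAST_GENE.match(symbol) is not None


if __name__ == "__main__":
    for example in ("ACT1", "CDC25", "COX5A", "act1"):
        print(example, is_yeast_gene_symbol(example))
```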

Bacterial Gene Nomenclature

Gene terms typically consist of an italicized lowercase 3-letter abbreviation often with an uppercase locus designator. The phenotype or encoded entity (eg, enzyme) is in all roman letters with an initial capital.14,20,21

| Gene Symbol | Phenotype (Explanation) |
| --- | --- |
| araA | AraA (L-arabinose isomerase) |
| asr | Asr (acid shock protein) |
| imp (formerly ostA) | OstA (organic solvent intolerance; imp: increased membrane permeability) |
| katE | KatE (catalase) |
| sodA | SodA (superoxide dismutase, manganese) |
| sodB | SodB (superoxide dismutase, iron) |

A number of bacterial genome databases are available on the Internet. The National Center for Biotechnology Information sponsors Entrez Genome (http://www.ncbi.nlm.nih.gov/entrez; under Search, select Gene, then search for the gene in question).

Alleles are designated with a number after the uppercase letter or following a hyphen, when not assigned to a locus. Wild-type alleles are designated with a superscript plus sign:

ara+

araA1

ara-23

sodA1
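The bacterial gene, allele, and phenotype patterns above can be sketched in Python as follows. This is an illustration only: the names `describe_symbol` and `phenotype_term` are hypothetical, the wild-type plus sign is written inline because plain text cannot show a superscript, and the capitalization helper does not account for renamed genes (for example, imp, whose phenotype term remains OstA).

```python
import re
from typing import Dict, Optional, Union

# 3-letter lowercase abbreviation, then one of:
#   an uppercase locus designator with an optional allele number (araA1),
#   a hyphen and allele number when not assigned to a locus (ara-23),
#   or a plus sign for the wild-type allele (ara+).
SYMBOL = re.compile(
    r"^(?P<abbr>[a-z]{3})"
    r"(?:(?P<locus>[A-Z])(?P<allele>\d+)?"
    r"|-(?P<unassigned>\d+)"
    r"|(?P<wild_type>\+))?$"
)


def describe_symbol(symbol: str) -> Dict[str, Union[str, bool, None]]:
    """Split a bacterial gene or allele symbol into gene, allele number, and wild-type flag."""
    match = SYMBOL.match(symbol)
    if match is None:
        raise ValueError(f"unrecognized bacterial symbol: {symbol!r}")
    allele: Optional[str] = match.group("allele") or match.group("unassigned")
    return {
        "gene": match.group("abbr") + (match.group("locus") or ""),
        "allele": allele,
        "wild_type": match.group("wild_type") is not None,
    }


def phenotype_term(gene_symbol: str) -> str:
    """Capitalize the first letter of a gene symbol to form the roman phenotype term."""
    return gene_symbol[:1].upper() + gene_symbol[1:]


if __name__ == "__main__":
    for example in ("ara+", "araA1", "ara-23", "sodA1"):
        print(example, describe_symbol(example))
    print(phenotype_term("katE"))
```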

Retroviral Gene Nomenclature

Human immunodeficiency virus and other retroviruses contain 3 main structural genes and a number of regulatory genes22 (see also 15.6.3, Oncogenes and Tumor Suppressor Genes):

Structural:

env: envelope gene

gag: group-specific core antigen gene

pol: polymerase gene

Regulatory:

nef: negative factor

rev: regulator of viral protein expression

tat: transactivator of viral transcription

vif: viral infectivity

vpr: viral protein R

vpu: viral protein U

vpx: viral protein X

Compare typographic style of gene names and their products (p stands for protein, gp for glycoprotein):

| Gene | Gene Product (Protein or Polypeptide) | Protein Products (Examples) |
| --- | --- | --- |
| env | Env | gp41, gp120 |
| gag | Gag | p6, p7, p17, p24 |
| pol | Pol | p12, p32, p66/51 |
| nef | Nef | p27 |
| rev | Rev | p19 |
| tat | Tat | p14 |
| vif | Vif | p24 |
| vpr | Vpr | p15 |
| vpu | Vpu | p16 |
| vpx | Vpx | p14 |

References

1. Morse HC III. The laboratory mouse—a historical perspective. In: Foster HL, Fox F, eds. The Mouse in Biomedical Research. Vol 1. Orlando, FL: Academic Press Inc; 1981:6-10.

2. Marshall Graves JA, Wakefield MJ, Peters J, Searle AG, Womack JE, O'Brien SJ. Report of the Committee on Comparative Gene Mapping. In: Cuticchia AJ, ed. Human Gene Mapping 1994: A Compendium. Baltimore, MD: Johns Hopkins University Press; 1995:962-1016.

3. Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25-29. Also available at http://www.geneontology.org/GO_nature_genetics_2000.pdf. Accessed April 21, 2006.

4. ARKdb. http://www.thearkdb.org. Accessed April 21, 2006.

5. RatMapGroup. RATMAP: the Rat Genome Database. http://ratmap.gen.gu.se/. Accessed April 21, 2006.

6. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of genes, genetic markers, alleles, and mutations in mouse and rat. http://www.informatics.jax.org/mgihome/nomen/gene.shtml#genenom. Updated January 2005. Accessed April 21, 2006.

7. Maltais LJ, Blake JA, Chu T, Lutz CM, Eppig JT, Jackson I. Rules and guidelines for mouse gene, allele, and mutation nomenclature: a condensed version. Genomics. 2002;79(4):471-474. Also available at http://www.informatics.jax.org/mgihome/nomen/short_gene.shtml. Accessed April 21, 2006.

8. Jackson Laboratory. MGI: Mouse Genome Informatics. http://www.informatics.jax.org. Updated April 20, 2006. Accessed April 21, 2006.

9. RGD: Rat Genome Database. http://rgd.mcw.edu. Updated April 17, 2006. Accessed April 21, 2006.

10. International Committee on Standardized Genetic Nomenclature for Mice. Rules for nomenclature of chromosome aberrations. http://www.informatics.jax.org/mgihome/nomen/anomalies.shtml. Accessed April 21, 2006.

11. ILAR: Institute for Laboratory Animal Research. Laboratory Code Registry. http://dels.nas.edu/ilar_n/ilarhome. Accessed April 21, 2006.

12. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of mouse and rat strains. http://www.informatics.jax.org/mgihome/nomen/strains.shtml. Updated January 2005. Accessed April 21, 2006.

13. FlyBase: a database of the Drosophila genome. http://flybase.net. Accessed April 21, 2006.

14. Stewart A, ed. TIG Genetic Nomenclature Guide. Tarrytown, NY: Elsevier Trends Journals; 1995.

15. FlyNome: a database of Drosophila nomenclature. http://www.flynome.com. Accessed April 21, 2006.

16. Hodgkin J. Recommended genetic nomenclature for Caenorhabditis elegans. http://elegans.swmed.edu/Genome/nomen.html.01_10_25. Accessed April 21, 2006.

17. Nicholas FW. Online Mendelian Inheritance in Animals (OMIA). http://www.angis.org.au/omia. Updated October 16, 2003. Accessed April 21, 2006.

18. Rangel P, Giovannetti J. Genomes and Databases on the Internet: A Practical Guide to Functions and Applications. Norfolk, England: Horizon Scientific Press; 2002.

19. SGD gene naming guidelines. http://www.yeastgenome.org/gene_guidelines.shtml. Accessed April 21, 2006.

20. Demerec M, Adelberg EA, Clark AJ, Hartman PE. A proposal for a uniform nomenclature in bacterial genetics. Genetics. 1966;54(1):61-76.

21. Journal of Bacteriology 2006 instructions to authors. http://jb.asm.org/misc/itoa.pdf. Accessed April 21, 2006.

22. Guatelli JC, Siliciano RF, Kuritzkes DR, Richman DD. Human immunodeficiency virus. In: Richman DD, Whitley RJ, Hayden FD, eds. Clinical Virology. 2nd ed. Washington, DC: ASM Press; 2002:685-729.
