15.6.5 Nonhuman Genetic Terms
[T]he word mouse … comes originally from the Sanskrit mush derived from a verb meaning to steal. … Mice and rats, through their voracious activities in grain larders and as carriers of disease, inflicted considerable losses in food and lives upon ancient civilizations.
H. C. Morse III1(p6)

A very obvious gap in our understanding of human genome evolution lies in the complete absence of any mapping data from the eutherian orders most distantly related to man, particularly the edentates. We would urge anyone with an interest in the genetics of the aardvark and the armadillo to consider a unique mapping project which will be at the forefront (alphabetically, at least) of the comparative mapping effort.
J. A. Marshall Graves et al2(p964)
Comparative genome analysis has shown that eukaryote species share genes to a great extent.3 Therefore, similar or identical names designate the same gene across species whenever possible. Italicization of gene symbols is uniformly observed.
Vertebrates.
Animal gene symbols resemble human gene symbols (see 15.6.2, Human Gene Nomenclature, and below).4,5 However, unlike human gene symbols, animal gene symbols typically use or include lowercase letters and punctuation marks. Editors of medical publications may follow author style for animal gene symbols.
Gene terminology for the laboratory mouse (Mus musculus domesticus) and laboratory rat (Rattus norvegicus), often seen in medical publications because of the common use of those species in investigating diseases affecting humans, is prototypic of such style.
Mouse and Rat Gene Nomenclature. Mouse and rat gene nomenclature guidelines were unified in 2003 by the International Committee on Standardized Genetic Nomenclature for Mice and the Rat Genome and Nomenclature Committee.6
Mouse and rat gene symbols resemble human symbols in several respects.6,7 They are descriptive, short (preferably 3 to 5 characters), and italicized. Symbols begin with letters, not numbers. They contain roman letters in place of Greek letters and arabic numerals in place of roman numerals.
Mouse and rat gene symbols differ from human symbols in using lowercase letters. Symbols usually contain an initial capital. Capital letters within a mouse gene symbol may indicate the laboratory code or code for another species/vector (see below). A symbol with all lowercase letters (ie, no initial capital) indicates a recessive trait. Mouse and rat gene symbols may contain hyphens and other punctuation.
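The case and character rules above can be turned into a rough screening aid. The following is a minimal sketch, not an official validator: the function name `screen_mouse_symbol` and the keys it returns are hypothetical, and the authoritative check remains a database lookup. Sequence-derived symbols that begin with digits (eg, RIKEN identifiers) are outside its scope.

```python
def screen_mouse_symbol(symbol: str) -> dict:
    """Report which of the stated style rules a candidate symbol satisfies.

    Heuristic only; it does not confirm that the symbol is an approved one.
    """
    first = symbol[:1]
    letters = [ch for ch in symbol if ch.isalpha()]
    return {
        # Symbols begin with letters, not numbers.
        "starts_with_letter": first.isascii() and first.isalpha(),
        # Roman letters replace Greek letters (Greek and Coptic Unicode block).
        "no_greek_letters": not any("\u0370" <= ch <= "\u03ff" for ch in symbol),
        # The usual form has an initial capital.
        "initial_capital": first.isupper(),
        # All lowercase letters signal a recessive trait.
        "all_lowercase": bool(letters) and all(ch.islower() for ch in letters),
    }


for candidate in ("Afp", "a", "Rn5s", "5s"):
    print(candidate, screen_mouse_symbol(candidate))
```

A symbol that fails one of these checks is not necessarily wrong; the flags only indicate where a closer look at the approved nomenclature is worthwhile.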
The central source for mouse gene terms is the Mouse Genome Database (http://www.informatics.jax.org),8 and for rats, RatMap (http://ratmap.gen.gu.se) and the Rat Genome Database (http://rgd.mcw.edu).9 Gene names and symbols may be verified by means of the search features at those sites.
Style rules and conventions for mouse and rat gene symbols are shown in Tables 8 through 10. (Note: The gene descriptions in the tables that follow are based on but not identical to the approved gene names available in the Mouse Genome Informatics database, which are more complete and do not use Greek letters and other typographic variants. For instance, in searching for a term with α, one would type in “alpha.”) The Mammalian Orthology Query Form (http://www.informatics.jax.org/searches/homology_form.shtml) allows comparative searches of 20 vertebrate species. Note that a given letter or letter combination often but not always signifies a conventional usage. For instance, l at or near the end of a symbol often, but not always, indicates “like.”
Table 8. Style Rules for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)
| Mouse Gene Symbol | Mouse Gene Description | Rule Illustrated | Human Gene Symbol (When Known) |
|---|---|---|---|
| a | nonagouti | lowercase initial letter because named for mutant recessive trait | ASIP |
| Afp | α-fetoprotein | initial capital, otherwise lowercase; Greek letter changed to roman | AFP |
| B2m | β2-microglobulin | no subscript | B2M |
| Gla | α-galactosidase | Greek letter changed to roman and moved to end of symbol | GLA |
| Gt(ROSA)26Sor | gene trap, ROSA 26, Philippe Soriano | parentheses may be used | |
| Rn4.5s | 4.5S RNA | period permissible | |
| Rn5s | 5S RNA | symbol does not begin with number | RN5S1@ (@ signifies gene family; see 15.6.2, Human Gene Nomenclature) |
Table 10. Conventions for Mouse Gene Symbols Identified in Collaborative Sequencing Efforts (Examples)
| Mouse Gene Symbol | Mouse Gene Description | Convention Illustrated | Human Gene Symbol (When Available) |
|---|---|---|---|
| 0610005C13Rik | RIKEN cDNA 0610005C13 gene | RIKEN symbol assigned to sequence that does not match known genes in other species; Rik: RIKEN Institute, Japan | |
| Cdc42ep3 | CDC42 effector protein (Rho GTPase binding) 3; formerly 3200001F04Rik | RIKEN symbol changed when gene identified in another organism | CDC42EP3 |
| BC023055 | cDNA sequence BC023055 | BC indicates sequence from Mammalian Gene Collection of the National Institutes of Health | C10orf83 |
| Aldob | aldolase 2, B isoform; formerly BC016435 | Mammalian Gene Collection symbol changed when gene identified in another organism | ALDOB |
| AF179933 | cDNA sequence AF179933 | GenBank symbol for genes with no other information available in other organisms or sequencing efforts | |
| Ppt2 | palmitoyl-protein thioesterase 2; formerly AA672937 and 0610007M19Rik | GenBank sequence ID withdrawn when gene identified in another organism | PPT2 |
Mouse Alleles. A mouse allele symbol consists of a mouse gene symbol often with a superscript. As with mouse gene symbols, mouse allele symbols are italicized.
Allele symbols can be verified within the records of a mouse gene:
▪ Search for the gene symbol at http://www.informatics.jax.org/javawi2/servlet/WIFetch?page=markerQF
▪ Click on link for the gene symbol that has been located
▪ Under Phenotypes, click on the numeric link after “all phenotypic alleles”
Table 9. Conventions for Mouse Gene Symbols and Comparison With Human Gene Symbols (Examples)
| Mouse Gene Symbol | Mouse Gene Description | Convention Illustrated | Human Gene Symbol (When Available) |
|---|---|---|---|
| Brca1 | breast cancer 1 | same as human symbol except for case | BRCA1 |
| Cafq1 | caffeine metabolism QTL 1 | q: quantitative locus | |
| C4bp-ps1 | complement component 4 binding protein, pseudogene 1 | -ps: pseudogene | C4BPB |
| D10Mit1 | DNA segment, Chr 10, Massachusetts Institute of Technology 1 | symbol for DNA segment identified only in the mouse; includes laboratory code (see “Laboratory Codes”) | |
| D17H21S56 | DNA segment, Chr 17, human D21S56 | H21 indicates DNA segment resides on human chromosome 21 | D21S56 |
| G6pdx | glucose-6-phosphate dehydrogenase X-linked | similar but not identical to human gene symbol | G6PD |
| Gna-rs1 | guanine binding protein, related sequence 1 | -rs: related sequence | GNL1 |
| Gtl10 | gene trap locus 10 | Gt: gene trap | |
| Gt(ROSA)26Sor | gene trap ROSA 26, Philippe Soriano | vector in parentheses; laboratory code indicated (see “Laboratory Codes” section) | |
| H2-Aa | histocompatibility 2, class II antigen A, α | | HLA-DQA1 |
| Hbb | hemoglobin β-chain complex | same as human symbol except for case | HBB |
| Hc9 | heterochromatin, Chr 9 | Hc: heterochromatin | |
| Hras1 | Harvey rat sarcoma virus oncogene 1 | see also 15.6.3, Oncogenes and Tumor Suppressor Genes | HRAS |
| Ighmbp2 (formerly nmd) | immunoglobulin heavy chain μ binding protein 2 (formerly neuromuscular degeneration) | name change with new information about gene | IGHMBP2 |
| l17Wis9 | lethal, Chr 17, University of Wisconsin 9 | initial l: lethal | |
| Lamb1-1 | β1 laminin, subunit 1 | hyphen separates 2 adjacent numbers | LAMB1 |
| Lzp-s | P lysozyme structural | s: structural | |
| mt-Rnr1 | 12S RNA, mitochondrial | mt: mitochondrial | MT-RNR1 |
| Mcptl | mast cell protease-like | l: like | |
| Nidd1, Nidd2, Nidd3, Nidd4 | non-insulin-dependent diabetes mellitus 1, 2, 3, 4 | same stem (root) for gene families | |
| Nup160 | nucleoporin 160 | name change (formerly Gtl1-13) | NUP160 |
| Rnr13 | rRNA, chromosome 13 cluster | | |
| Tcrb | T-cell receptor β-chain | | TRB@ (formerly TCRB; @ signifies gene family; see 15.6.2, Human Gene Nomenclature) |
| Tel10p | telomeric sequence, Chr 10, centromere end | Tel: telomere; 10: Chr 10; p: short arm | |
| Tg(APOE)1Vln | transgene insertion 1, Fred Van Leuven | Tg: transgene; parenthetic material: inserted gene, in this case the human gene APOE; Vln: founder or “laboratory of” designation | |
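The two DNA segment conventions illustrated in the table (D10Mit1 and D17H21S56) follow a regular enough pattern to be split mechanically. The sketch below is illustrative only: the function name `parse_dna_segment`, the field names, and the regular expressions are assumptions inferred from those 2 examples, not patterns published by the nomenclature committees.

```python
import re

# D + mouse chromosome + H + human chromosome + S + serial, eg D17H21S56
HUMAN_DERIVED = re.compile(
    r"^D(?P<chr>\d{1,2}|X|Y)H(?P<human_chr>\d{1,2}|X|Y)S(?P<serial>\d+)$"
)
# D + mouse chromosome + laboratory code (1 to 4 letters) + serial, eg D10Mit1
MOUSE_ONLY = re.compile(
    r"^D(?P<chr>\d{1,2}|X|Y)(?P<lab>[A-Z][a-z]{0,3})(?P<serial>\d+)$"
)


def parse_dna_segment(symbol: str):
    """Return the parts of a DNA segment symbol, or None if it fits neither form."""
    for kind, pattern in (("human-derived", HUMAN_DERIVED), ("mouse-only", MOUSE_ONLY)):
        match = pattern.match(symbol)
        if match:
            return {"kind": kind, **match.groupdict()}
    return None


print(parse_dna_segment("D10Mit1"))
print(parse_dna_segment("D17H21S56"))
print(parse_dna_segment("Brca1"))  # not a DNA segment symbol
```

The human-derived pattern is tried first because a 1-letter laboratory code such as H could otherwise be confused with the H that marks a human chromosome.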
Conventions and rules for mouse allele symbols are shown in Table 11. In a phenotype expression, a superscript plus sign indicates wild-type, eg:
Nf1<sup>tm1Fcr</sup>/Nf1<sup>+</sup>
which indicates a phenotype with a mutant neurofibromatosis allele (targeted mutation 1, Frederick Cancer Research and Development Center) and the wild-type neurofibromatosis allele.
Table 11. Rules and Conventions for Mouse Allele Terms (Examples)
| Allele Symbol | Allele Name | Convention or Rule Illustrated |
|---|---|---|
| abn | abnormal | recessive trait, thus begins with lowercase; because there is no superscript indicating an allelic term, use context to clarify |
| Dbf | doublefoot | dominant trait, thus begins with capital; because there is no superscript indicating an allelic term, use context to clarify |
| Dnahc11<sup>iv</sup> | situs inversus viscerum allele of dynein, axon, heavy chain 11 gene | allele superscript designation is lowercase (recessive) |
| Ins2<sup>Akita</sup> | Akita allele of insulin 2 gene | allele superscript designation has initial capital (dominant) |
| Lama2<sup>dy-2J</sup> | dystrophia muscularis allele, Jackson 2, of α2-laminin gene (second allele discovered at the Jackson Laboratory) | laboratory code included in superscript (see “Laboratory Codes” section); hyphens used |
| Matp<sup>Uw-dbr</sup> | underwhite dominant brown alleles of membrane-associated transporter protein gene | multiple alleles separated by hyphen in superscript |
Mouse Chromosomes.
Chromosome nomenclature is similar for mice and humans (see 15.6.4, Human Chromosomes). However, in mice, rearrangement terms are capitalized. The following listing and subsequent examples are from the International Committee on Standardized Genetic Nomenclature for Mice10:
Cen: centromere
Del: deletion
Df: deficiency
Dp: duplication
Hc: pericentric heterochromatin
Hsr: homogeneous staining region
In: inversion
Is: insertion
MatDf: maternal deficiency
MatDi: maternal disomy
MatDp: maternal duplication
Ms: monosomy
Ns: nullisomy
PatDf: paternal deficiency
PatDi: paternal disomy
PatDp: paternal duplication
Rb: robertsonian translocation
T: translocation
Tc: transchromosomal
Tel: telomere
Tet: tetrasomy
Tg: transgenic insertion
Tp: transposition
Ts: trisomy
UpDf: uniparental deficiency
UpDi: uniparental disomy
UpDp: uniparental duplication
As with human chromosomes, lowercase p represents the short arm and lowercase q the long arm. When specific mouse chromosomes are referred to, the word Chromosome is capitalized (and abbreviated Chr after first mention), eg:
Human chromosome 1 shows extensive homology to several mouse chromosomes, especially Chromosome (Chr) 4 and Chr 1.
Chromosome anomaly symbols usually include a unique laboratory code (see the “Laboratory Codes” section) and a series number, eg:
In5Rk: fifth inversion found by Roderick
T37H: 37th translocation found at Harwell
Chromosome number appears in parentheses:
In(2)5Rk: inversion in Chr 2
Semicolons separate numbers of chromosomes involved in translocations:
T(4;X)37H: translocation involving Chr 4 and Chr X
Periods indicate the centromere in robertsonian translocations:
Rb(9.19)163H: robertsonian translocation involving Chr 9 and Chr 19
In insertions, the donor chromosome number comes first:
Is(7;1)40H: insertion from Chr 7 to Chr 1
For further rules and conventions for chromosomes, see the chromosome nomenclature section of the Mouse Genome Informatics website.10
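The anomaly symbols shown above share one layout: rearrangement term, optional chromosome list in parentheses, series number, and laboratory code. A minimal sketch of a parser for that layout follows. The name `parse_anomaly`, the returned field names, and the regular expression are hypothetical and cover only the forms illustrated in this section, not the full committee rules.

```python
import re

ANOMALY = re.compile(
    r"^(?P<kind>[A-Z][A-Za-z]*)"             # rearrangement term, eg In, T, Rb, Is
    r"(?:\((?P<chromosomes>[0-9XY;.]+)\))?"  # optional chromosome list in parentheses
    r"(?P<series>\d+)"                       # series number
    r"(?P<lab>[A-Z][a-z]{0,3})$"             # laboratory code, 1 to 4 letters
)


def parse_anomaly(symbol: str):
    """Split a chromosome anomaly symbol into its parts, or return None."""
    match = ANOMALY.match(symbol)
    if match is None:
        return None
    chromosomes = match.group("chromosomes")
    return {
        "kind": match.group("kind"),
        # Semicolons (translocations, insertions) and periods (robertsonian
        # translocations) both separate chromosome numbers.
        "chromosomes": re.split(r"[;.]", chromosomes) if chromosomes else [],
        "series": int(match.group("series")),
        "lab": match.group("lab"),
    }


for example in ("In5Rk", "In(2)5Rk", "T(4;X)37H", "Rb(9.19)163H", "Is(7;1)40H"):
    print(example, parse_anomaly(example))
```

For insertions, the first chromosome in the returned list is the donor, following the order stated above.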
Laboratory Codes. Laboratory registration codes appear as 1- to 4-letter symbols in animal genetic terminology, including chromosomal, DNA locus, and mouse strain nomenclature (see below). Such codes help identify specific colonies, useful in genetic studies that can extend over many generations. Laboratory codes are registered with the Institute of Laboratory Animal Resources at the National Academy of Sciences in Washington, DC, and may be located at http://dels.nas.edu/ilar_n/ilarhome.11 These codes uniquely identify an investigator, laboratory, or institution that breeds rodents or rabbits. Laboratory codes have initial capitals and appear without expansion. Examples are as follows:
Arb: Arthritis and Rheumatism Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases
Ddd: University of Durham, Drug Dependence Group
J: The Jackson Laboratory
N: National Institutes of Health
Ty: Benjamin A. Taylor, The Jackson Laboratory
Wil: Jean Wilson, University of Texas
Mouse Strains. Mouse strain names12 are registered at the Mouse Genome Informatics website (http://www.informatics.jax.org/mgihome/submissions/submissions_menu.shtml). Mouse strain names are available at http://www.informatics.jax.org/external/festing/search_form.cgi. (Rat strain names are registered at the Rat Genome Database.9)
Mouse strain names consist of capital letters or combinations of capital letters and numbers:
A
BXH
CBA
C57BL
FVB
HDA32
A few earlier strains have names that are entirely numeric, eg:
129
A substrain is indicated by a term following the strain name after a virgule, usually the laboratory registration codes (see above), eg:
129/J
A/J
atherosclerosis in CBA/J mice
FVB/N mice used as controls
A serial number may precede the laboratory code, eg, the 10 before the J in this example:
C57BL/10J
Exceptions to the initial capital after the virgule exist in the case of 2 well-known strains (not substrains) of mouse:
BALB/c
C57BR/cd
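The strain/substrain layout described above (strain name, virgule, optional serial number, laboratory code) can be sketched as a small parser. This is an illustration under stated assumptions: `parse_substrain` and its field names are hypothetical, and the pattern is inferred only from the examples in this section.

```python
import re

SUBSTRAIN = re.compile(
    r"^(?P<strain>[A-Z0-9]+)"   # strain name: capitals and/or numbers
    r"/(?P<serial>\d*)"         # optional serial number after the virgule
    r"(?P<code>[A-Za-z]{1,4})$" # laboratory code, 1 to 4 letters
)


def parse_substrain(name: str):
    """Split a name such as C57BL/10J into strain, serial number, and code."""
    match = SUBSTRAIN.match(name)
    if match is None:
        return None
    code = match.group("code")
    serial = match.group("serial")
    return {
        "strain": match.group("strain"),
        "serial": int(serial) if serial else None,
        "code": code,
        # BALB/c and C57BR/cd are strains, not substrains; the lowercase
        # term after the virgule is the documented exception.
        "lowercase_exception": code[0].islower(),
    }


for example in ("129/J", "C57BL/10J", "BALB/c", "C57BR/cd"):
    print(example, parse_substrain(example))
```

A name without a virgule (eg, CBA) returns None, since it carries no substrain designation to split.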
Many standard laboratory mouse strains are derived from crosses dating back to the early 20th century or even older lines, and the names reflect abbreviations for characteristics:
A: albino
BALB: Bagg, albino
DBA: dilute, brown, nonagouti
However, mouse strain names are not expanded.
Strain names may be abbreviated using approved abbreviations, eg:
B: C57BL
C: BALB/c
Note that some abbreviations are the same as some names of different strains (eg, the strain C and the abbreviation C), so context must clarify. Additional abbreviations are available at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.
Abbreviations and the letter X are used to indicate recombinant inbred strains (female parental strain first), eg:
CXB: BALB/c x C57BL
Capital F followed by a number in parentheses may appear after a strain designation to indicate the number of inbred generations:
F(20): 20 inbred generations
For further guidelines on mouse strain nomenclature, see the Mouse Genome Informatics website at http://www.informatics.jax.org/mgihome/nomen/strains.shtml.12
Invertebrates
Drosophila melanogaster. Gene symbols for the fruit fly Drosophila melanogaster generally have an initial capital followed by lowercase letters, or are all lowercase for recessive phenotypes. This convention is also observed for gene names. Gene symbols may include punctuation.13,14 A source for background on Drosophila gene names is FlyNome.15 Nomenclature rules and symbol search are available at FlyBase.13
| Gene Symbol | Name |
|---|---|
| Ppi | Preproinsulinlike |
| SerT | Serotonin transporter |
| su(Hw) | suppressor of Hairy wing |
| tRNA:S7:23Ea | transfer RNA:ser7:23Ea (ser7: seventh isoform of serine; 23E: map position) |
As with mouse alleles, Drosophila alleles are indicated with superscripts:
Hn<sup>r</sup>, Hn<sup>r2</sup> (Henna gene, eye color-defective alleles)
Caenorhabditis elegans. The gene symbols for this nematode (roundworm) consist of 3 lowercase letters, hyphen, arabic numeral (sometimes a decimal), and, sometimes, a roman numeral after a space14,16:
dpy-1
dpy-5 I
let-37 X
sir-2.1
Parentheses indicate mutation in the gene:
let-37(mn138)
Mutation symbols consist of 1- or 2-letter terms plus a number:
mn138
A characteristic of a mutation may be indicated by a 2-letter ending set in roman type:
hc17ts (ts: temperature sensitive)
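The symbol pattern described above lends itself to a regular expression. The following is a minimal sketch: `parse_worm_gene` and the group names are hypothetical, and placing the parenthetic mutation before the roman numeral is an assumption, since the examples here do not show both together.

```python
import re

WORM_GENE = re.compile(
    r"^(?P<gene>[a-z]{3}-\d+(?:\.\d+)?)"     # 3 lowercase letters, hyphen, numeral (eg sir-2.1)
    r"(?:\((?P<mutation>[a-z]{1,2}\d+)\))?"  # optional mutation in parentheses (eg mn138)
    r"(?: (?P<roman>[IVX]+))?$"              # optional roman numeral after a space
)


def parse_worm_gene(symbol: str):
    """Return the gene, mutation, and roman numeral parts, or None if no match."""
    match = WORM_GENE.match(symbol)
    return match.groupdict() if match else None


for example in ("dpy-1", "dpy-5 I", "let-37 X", "sir-2.1", "let-37(mn138)"):
    print(example, parse_worm_gene(example))
```

Characteristic endings such as ts are not handled, because they are distinguished typographically (roman vs italic type) rather than by characters alone.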
OMIA.
Online Mendelian Inheritance in Animals is the counterpart to Online Mendelian Inheritance in Man (OMIM; see 15.6.2, Human Gene Nomenclature).17,18
Microorganism Gene Nomenclature
Yeasts. Gene symbols for the fungus Saccharomyces cerevisiae consist of 3 capital letters plus a number (or, occasionally, a number-letter) ending19:
| Gene Symbol | Name |
|---|---|
| ACT1 | actin |
| CDC25 | adenylate cyclase regulatory protein |
| COX5A | cytochrome c oxidase chain Va |
This represents a change from earlier style in which all-lowercase symbols were used for loci named for recessive mutations (the preponderance of symbols) and all-capital symbols for loci named for dominant mutations. Allele symbols still follow the case convention (ie, capital for dominant, lowercase for recessive).
Bacterial Gene Nomenclature. Gene terms typically consist of an italicized lowercase 3-letter abbreviation often with an uppercase locus designator. The phenotype or encoded entity (eg, enzyme) is in all roman letters with an initial capital.14,20,21
| Gene Symbol | Phenotype (Explanation) |
|---|---|
| araA | AraA (L-arabinose isomerase) |
| asr | Asr (acid shock protein) |
| imp (formerly ostA) | OstA (organic solvent intolerance; imp: increased membrane permeability) |
| katE | KatE (catalase) |
| sodA | SodA (superoxide dismutase, manganese) |
| sodB | SodB (superoxide dismutase, iron) |
A number of bacterial genome databases are available on the Internet. The National Center for Biotechnology Information sponsors Entrez Genome (http://www.ncbi.nlm.nih.gov/entrez: under Search, select Gene, then search for the gene in question).
Alleles are designated with a number after the uppercase letter or following a hyphen, when not assigned to a locus. Wild-type alleles are designated with a superscript plus sign:
ara<sup>+</sup>
araA1
ara-23
sodA1
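The bacterial conventions above (gene symbol to phenotype term, and the 3 allele forms) can be sketched in a few lines. This is an illustration only: `product_name`, `parse_allele`, and the field names are hypothetical, and the input is assumed to be plain text with the superscript plus sign typed as an ordinary `+`.

```python
import re

ALLELE = re.compile(
    r"^(?P<abbr>[a-z]{3})"                    # 3-letter lowercase abbreviation
    r"(?:(?P<locus>[A-Z])(?P<n_locus>\d+)?"   # locus letter, optional allele number (araA1)
    r"|-(?P<n_free>\d+)"                      # allele not assigned to a locus (ara-23)
    r"|(?P<wild>\+))?$"                       # wild-type allele (ara+)
)


def product_name(gene_symbol: str) -> str:
    """Capitalize the first letter: araA -> AraA (set in roman, not italic)."""
    return gene_symbol[:1].upper() + gene_symbol[1:]


def parse_allele(symbol: str):
    """Split a bacterial gene or allele symbol into gene, allele number, wild-type flag."""
    match = ALLELE.match(symbol)
    if match is None:
        return None
    number = match.group("n_locus") or match.group("n_free")
    return {
        "gene": match.group("abbr") + (match.group("locus") or ""),
        "allele": int(number) if number else None,
        "wild_type": match.group("wild") is not None,
    }


print(product_name("sodB"))
for example in ("ara+", "araA1", "ara-23", "sodA1"):
    print(example, parse_allele(example))
```

Renamed genes break the simple capitalization rule: in the table above, imp (formerly ostA) is still paired with OstA, so a lookup is needed in such cases.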
Retroviral Gene Nomenclature. Human immunodeficiency virus and other retroviruses contain 3 main structural genes and a number of regulatory genes22 (see also 15.6.3, Oncogenes and Tumor Suppressor Genes):
Structural:
env: envelope gene
gag: group-specific core antigen gene
pol: polymerase gene

Regulatory:
nef: negative factor
rev: regulator of viral protein expression
tat: transactivator of viral transcription
vif: viral infectivity
vpr: viral protein R
vpu: viral protein U
vpx: viral protein X
Compare typographic style of gene names and their products (p stands for protein, gp for glycoprotein):
| Gene | Gene Product (Protein or Polypeptide) | Protein Products (Examples) |
|---|---|---|
| env | Env | gp41, gp120 |
| gag | Gag | p6, p7, p17, p24 |
| pol | Pol | p12, p32, p66/51 |
| nef | Nef | p27 |
| rev | Rev | p19 |
| tat | Tat | p14 |
| vif | Vif | p24 |
| vpr | Vpr | p15 |
| vpu | Vpu | p16 |
| vpx | Vpx | p14 |
References
1. Morse HC III. The laboratory mouse—a historical perspective. In: Foster HL, Fox F, eds. The Mouse in Biomedical Research. Vol 1. Orlando, FL: Academic Press Inc; 1981:6-10.
2. Marshall Graves JA, Wakefield MJ, Peters J, Searle AG, Womack JE, O'Brien SJ. Report of the Committee on Comparative Gene Mapping. In: Cuticchia AJ, ed. Human Gene Mapping 1994: A Compendium. Baltimore, MD: Johns Hopkins University Press; 1995:962-1016.
3. Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25-29. Also available at http://www.geneontology.org/GO_nature_genetics_2000.pdf. Accessed April 21, 2006.
4. ARKdb. http://www.thearkdb.org. Accessed April 21, 2006.
5. RatMapGroup. RATMAP: the Rat Genome Database. http://ratmap.gen.gu.se/. Accessed April 21, 2006.
6. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of genes, genetic markers, alleles, and mutations in mouse and rat. http://www.informatics.jax.org/mgihome/nomen/gene.shtml#genenom. Updated January 2005. Accessed April 21, 2006.
7. Maltais LJ, Blake JA, Chu T, Lutz CM, Eppig JT, Jackson I. Rules and guidelines for mouse gene, allele, and mutation nomenclature: a condensed version. Genomics. 2002;79(4):471-474. Also available at http://www.informatics.jax.org/mgihome/nomen/short_gene.shtml. Accessed April 21, 2006.
8. Jackson Laboratory. MGI: Mouse Genome Informatics. http://www.informatics.jax.org. Updated April 20, 2006. Accessed April 21, 2006.
9. RGD: Rat Genome Database. http://rgd.mcw.edu. Updated April 17, 2006. Accessed April 21, 2006.
10. International Committee on Standardized Genetic Nomenclature for Mice. Rules for nomenclature of chromosome aberrations. http://www.informatics.jax.org/mgihome/nomen/anomalies.shtml. Accessed April 21, 2006.
11. ILAR: Institute for Laboratory Animal Research. Laboratory Code Registry. http://dels.nas.edu/ilar_n/ilarhome. Accessed April 21, 2006.
12. International Committee on Standardized Genetic Nomenclature for Mice and Rat Genome and Nomenclature Committee. Rules for nomenclature of mouse and rat strains. http://www.informatics.jax.org/mgihome/nomen/strains.shtml. Updated January 2005. Accessed April 21, 2006.
13. FlyBase: a database of the Drosophila genome. http://flybase.net. Accessed April 21, 2006.
14. Stewart A, ed. TIG Genetic Nomenclature Guide. Tarrytown, NY: Elsevier Trends Journals; 1995.
15. FlyNome: a database of Drosophila nomenclature. http://www.flynome.com. Accessed April 21, 2006.
16. Hodgkin J. Recommended genetic nomenclature for Caenorhabditis elegans. http://elegans.swmed.edu/Genome/nomen.html.01_10_25. Accessed April 21, 2006.
17. Nicholas FW. Online Mendelian Inheritance in Animals (OMIA). http://www.angis.org.au/omia. Updated October 16, 2003. Accessed April 21, 2006.
18. Rangel P, Giovannetti J. Genomes and Databases on the Internet: A Practical Guide to Functions and Applications. Norfolk, England: Horizon Scientific Press; 2002.
19. SGD gene naming guidelines. http://www.yeastgenome.org/gene_guidelines.shtml. Accessed April 21, 2006.
20. Demerec M, Adelberg EA, Clark AJ, Hartman PE. A proposal for a uniform nomenclature in bacterial genetics. Genetics. 1966;54(1):61-76.
21. Journal of Bacteriology 2006 instructions to authors. http://jb.asm.org/misc/itoa.pdf. Accessed April 21, 2006.
22. Guatelli JC, Siliciano RF, Kuritzkes DR, Richman DD. Human immunodeficiency virus. In: Richman DD, Whitley RJ, Hayden FG, eds. Clinical Virology. 2nd ed. Washington, DC: ASM Press; 2002:685-729.