15.6.2 Human Gene Nomenclature
The International System for Human Gene Nomenclature (ISGN) was inaugurated in 19791,2 and has been continually updated. The Human Gene Mapping Nomenclature Committee, which developed the ISGN, put forth a “one human genome–one gene language” principle:
Certainly there exists a genetic and molecular basis for a single human gene language without dialects. All human nuclear genes as we know them follow the same genetic, molecular, and evolutionary principles…. Thus it is reasonable and logical to develop a standard and consolidated gene nomenclature system rather than have a human gene language based on different gene systems.3(p12)
The committee, known as the HUGO Gene Nomenclature Committee (HGNC), is 1 of 7 committees of the Human Genome Organisation (HUGO) and is “responsible for gene name validation.”4(p115) Gene names and symbols are assigned by the HGNC.5 The human genome is estimated to have approximately 30 000 genes, more than 20 000 of which are represented by active symbols,6 with the remainder to be named in a consistent fashion as genes are discovered.
▪ Gene Symbols: A gene symbol is a short term, typically 3 to 7 characters long, that conveys in abbreviated form the name or other attribute of a gene. Human gene symbols usually consist of uppercase letters and may also contain (but never begin with) numerals. Approved gene symbols do not contain Greek letters, roman numerals, superscripts, or subscripts and usually contain no punctuation. In JAMA and the Archives Journals, gene symbols are italicized, per official recommendations.7 Italicizing is a useful way to make clear that a gene, and not a similarly named entity such as a condition or product of the gene, is being discussed. Italics are not necessary in published catalogs of gene symbols.7 For style rules for gene symbols, see Table 3.
Approved symbols may represent other entities, such as chromosomal regions, certain syndromes, genes whose existence is inferred (supported by linkage analysis or association with known markers), cloned DNA segments, pseudogenes, and DNA fragments.
Within larger terms, only the gene symbol is italicized:
*ADRB2* 46G>A (not: *ADRB2 46G>A*)
*ADRB2* Gly16Arg (not: *ADRB2 Gly16Arg*)
(For an explanation of 46G>A and Gly16Arg, see “Sequence Variations, Nucleotides,” and “Sequence Variations, Amino Acids,” in 15.6.1, Nucleic Acids and Amino Acids.)
Authors are encouraged to use the most up-to-date gene symbol, which may be verified at the HGNC website in the Human Gene Nomenclature Database (Searchgenes feature),6,8,9 or Entrez Gene.10 The records available in Searchgenes contain “23 fields, with 14 links to other resources,” such as Online Mendelian Inheritance in Man (OMIM, see later in this section), LocusLink, and Swiss-Prot (see 15.6.1, Nucleic Acids and Amino Acids).9 Consistent use of the approved gene symbol provides advantages when searching for information in multiple databases.11
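The character rules for gene symbols described above can be turned into a quick automated check. The Python sketch below is illustrative only. The function name `symbol_rule_violations` and its messages are hypothetical, and it encodes only the general rules stated in this section (uppercase letters, numerals allowed but never first, no Greek letters or superscripts, usually no punctuation). It cannot detect roman numerals, it will flag sanctioned exceptions such as C11orf10 and HLA-A, and it does not consult the HGNC database, so it cannot confirm that a symbol is approved.

```python
def symbol_rule_violations(symbol: str) -> list:
    """Return the general symbol rules (as stated in this section) that a
    candidate symbol breaks. An empty list means no general rule is broken;
    it does NOT mean the symbol is HGNC approved."""
    if not symbol:
        return ["empty symbol"]
    problems = []
    # Numerals are allowed, but a symbol never begins with one.
    if symbol[0].isdigit():
        problems.append("begins with a numeral")
    # Symbols usually consist of uppercase letters (orf is a known exception).
    if any(ch.isascii() and ch.islower() for ch in symbol):
        problems.append("contains lowercase letters")
    # Greek letters, superscripts, and subscripts are all non-ASCII characters.
    if any(not ch.isascii() for ch in symbol):
        problems.append("contains non-ASCII characters")
    # Punctuation is unusual (HLA genes and the @ cluster marker are exceptions).
    if any(ch.isascii() and not ch.isalnum() for ch in symbol):
        problems.append("contains punctuation")
    return problems


for candidate in ["BRCA1", "C11orf10", "HLA-A", "PGA@"]:
    print(candidate, symbol_rule_violations(candidate))
```

A flagged symbol is therefore a prompt to verify it against the approved list, not proof of an error.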
▪ Gene Names: Genes are usually named for the molecular product of the gene, the function of the gene, or the condition associated with the gene if known. Gene names are not italicized. As shown directly below, the approved gene names, available in the above-mentioned databases, expand Greek letters and do not use subscripts, etc (so that, for instance, in using Searchgenes to find a gene name with α, one would type in “alpha”). Descriptions based on the approved gene names but styled according to the journal in question (eg, using Greek letters and subscripts) or omitting some terms from the full name are permissible in general medical journals.
approved gene name: the alpha-fetoprotein gene
description: the α-fetoprotein gene
approved gene name: the gene for beta-2-microglobulin
description: the gene for β2-microglobulin
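Because approved names spell out Greek letters, a journal-styled description has to be converted before it is used as a database search term. The following Python sketch shows one way to do that. The mapping `GREEK_NAMES` and the function `expand_greek` are hypothetical helpers, the mapping covers only a few letters, and the rule of inserting a hyphen between a spelled-out letter and a following numeral is an assumption modeled on the two examples above rather than a documented HGNC rule.

```python
import re

# Partial, illustrative mapping; extend as needed.
GREEK_NAMES = {
    "α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta",
    "ε": "epsilon", "κ": "kappa", "λ": "lambda", "μ": "mu",
}
_GREEK_RE = re.compile("[" + "".join(GREEK_NAMES) + "]")


def expand_greek(text: str) -> str:
    """Spell out Greek letters so a styled description resembles the
    approved-name form used for searching."""
    def spell_out(match):
        name = GREEK_NAMES[match.group(0)]
        following = text[match.end():match.end() + 1]
        # Assumption: "β2" becomes "beta-2", following the example above.
        return name + "-" if following.isdigit() else name
    return _GREEK_RE.sub(spell_out, text)


print(expand_greek("the α-fetoprotein gene"))
print(expand_greek("the gene for β2-microglobulin"))
```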
A number of conventions are followed when gene symbols and names are officially designated. Related genes are often assigned symbols by sequentially numbering a stem, the root symbol for the gene family:
ABC: root symbol
genes: ABCA1, ABCG4, etc
TNF: root symbol
genes: TNF, TNFAIP1, TNFAIP2, TNFAIP3, etc
Other conventions involve stereotypic abbreviations, eg, CR will often signify a “chromosome region.” (However, a given letter or letter combination does not always signify a conventional usage. For instance, L at or near the end of a symbol often, but not always, indicates “like.”) In Table 4, the conventions in column 3 reflect HGNC recommendations.7 (Note: DNA sequences are available from the Genome Database, http://www.genenames.org/.7)
Table 4. Conventions for Gene Names and Gene Symbols (Examples)
| Gene Description | Gene Symbol | Convention Illustrated |
|---|---|---|
| Angelman syndrome chromosome region | ANCR | CR: chromosome region |
| BRCA1-associated protein | BRAP | AP: associated protein |
| bromodomain containing 1 | BRD1 | D: domain-containing |
| chromosome 11 open reading frame 10 | C11orf10 | orf: lowercase exception for “open reading frame” |
| calcium modulating ligand | CAMLG | LG: ligand |
| caspase 1, 2, 3, etc, apoptosis-related cysteine protease | CASP1, CASP2, CASP3, etc | stem (CASP), sequentially numbered |
| cyclin-dependent kinase inhibitor 1B | CDKN1B | N: inhibitor |
| Cornelia de Lange syndrome 1 | CDL1 | named for condition; L at end in this case does not signify “like” |
| carpal tunnel syndrome 1 | CTS1 | named for syndrome |
| cystic fibrosis transmembrane conductance regulator | CFTR | formerly CF; name modified after discovery of gene product |
| collagen (type VI, α1), overlapping transcript 1 | COLOT1 | OT: overlapping transcript |
| DNA segment sequence | D19S1177E | D: DNA; 19: chromosome 19; S: (unique DNA) segment; E: expressed |
| Down syndrome chromosome region | DCR | CR: chromosome region |
| deafness, autosomal dominant 4 | DFNA4 | named for condition |
| DNA segment sequence | DXS522E | as above; X: X chromosome |
| DNA segment sequence | DXYS155E | as above; XY: sequence present at homologous sites on chromosomes X and Y |
| family with sequence similarity 7, member A1 | FAM7A1 | FAM: family with sequence similarity |
| fragile site, aphidicolin type, common, fra(10)(q11.2) (see also 15.6.4, Human Chromosomes) | FRA10G | FRA: fragile site; 10: chromosome 10; G: series letter |
| fragile site, folic acid type, rare, fra(X)(q28) | FRAXF | X: X chromosome; final F: series letter |
| glucose 6-phosphatase, catalytic (glycogen storage disease type I, von Gierke disease) | G6PC | C: catalytic |
| glucose-6-phosphate dehydrogenase | G6PD | named for gene product |
| glucose-6-phosphate dehydrogenase–like | G6PDL | L: “like” sequence |
| hemoglobin, α1 | HBA1 | named for gene product |
| hemoglobin, α1 pseudogene | HBAP1 | P: “pseudogene” (compare term directly above) |
| hair color 1 (brown) | HCL1 | named for characteristic |
| human immunodeficiency virus 1 enhancer binding protein 2 | HIVEP2 | P: does not always signify “pseudogene” |
| major histocompatibility complex, class I, A | HLA-A | punctuation exception for HLA genes |
| homeobox A7 | HOXA7 | HOX signifies “homeobox” gene family |
| insulinlike growth factor 2, antisense | IGF2AS | AS: antisense |
| insulin-dependent diabetes mellitus 10 | IDDM10 | a type 1 diabetes susceptibility locus, number 10 |
| interleukin 18 binding protein | IL18BP | BP: binding protein |
| insulin | INS | named for gene product |
| insulin receptor | INSR | R: receptor |
| insulin receptor–like | INSRL | R: receptor; L: like |
| loss of heterozygosity 3, chromosomal region 2, gene A | LOH3CR2A | LOH: loss of heterozygosity |
| melanoma antigen, family A, 2 | MAGEA2 | named for condition and gene product |
| mitochondrial ribosomal protein 63 | MRP63 | M: mitochondrial; RP: ribosomal protein |
| 7S mitochondrial DNA | MT7SDNA | MT: mitochondrial |
| mitochondrially encoded 12S RNA | MT-RNR1 | MT: mitochondrial, used with hyphen (punctuation exception) |
| programmed cell death 1 | PDCD1 | named for function |
| pepsinogen A gene cluster | PGA@ | @: gene family or cluster |
| renin | REN | named for gene product |
| renin binding protein | RENBP | named for gene product; BP: binding protein |
| 5S RNA, cluster 1 | RN5S1@ | @: gene family or cluster; RN: RNA |
| schwannomin interacting protein 1 | SCHIP1 | IP: interacting protein |
| T-cell, immune regulator 1 | TCIRG1 | RG: regulator |
| α2-tubulin | TUBA2 | named for gene product |
| zinc finger protein 160 | ZNF160 | initial ZNF indicates zinc finger protein |
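The DNA segment symbols in Table 4 (D19S1177E, DXS522E, DXYS155E) follow a positional pattern that can be decoded mechanically. The Python sketch below is a minimal illustration under stated assumptions: the function name `parse_dna_segment` is hypothetical, the numeric part after S is treated simply as a segment number, and acceptance of a lone Y chromosome is an assumption, since Table 4 shows only autosomal, X, and XY examples.

```python
import re

# D, then chromosome (1 or 2 digits, X, Y, or XY), then S, a number, optional E.
_SEGMENT = re.compile(r"D(XY|X|Y|[0-9]{1,2})S([0-9]+)(E?)")


def parse_dna_segment(symbol: str):
    """Decode a DNA segment symbol into its parts, or return None if the
    symbol does not follow the D...S... pattern illustrated in Table 4."""
    match = _SEGMENT.fullmatch(symbol)
    if match is None:
        return None
    chromosome, number, expressed = match.groups()
    return {
        "chromosome": chromosome,        # "19", "X", or "XY" (both X and Y)
        "number": int(number),           # segment number (assumed meaning)
        "expressed": expressed == "E",   # trailing E: expressed
    }


for symbol in ["D19S1177E", "DXS522E", "DXYS155E", "BRCA1"]:
    print(symbol, parse_dna_segment(symbol))
```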
When a gene name or symbol has been changed, both the new and former names (previous symbols) are available in gene databases.6,10 Authors should use the most up-to-date term. The previous symbol may be included parenthetically at first mention:
CYP2A6 (formerly CYP2A3)
SOD1 (formerly ALS and ALS1)
Writing About Genes and Italicizing Gene Symbols.
Observing the rule of italicizing gene symbols makes clear whether the writer is referring to a gene or to another entity that might be confused with a gene.
In any discussion of a gene, it is recommended that the approved gene symbol be mentioned at some point, preferably in the title and abstract if relevant. However, the gene symbol need not be mentioned every time the writer refers to the gene. Authors may refer to genes (or gene loci) by their official gene names or other descriptive expression. Any of these is acceptable, depending on context and syntax. Of names, descriptions, and symbols, only the gene symbol is italicized. Examples are shown below:
| Acceptable Expression | Gene Description | Gene Symbol |
|---|---|---|
| the breast and ovarian cancer susceptibility gene | breast cancer 1, early-onset gene | BRCA1 |
| the cystic fibrosis locus | cystic fibrosis transmembrane conductance regulator gene | CFTR |
| the factor VIII locus | coagulation factor VIII, procoagulant component (hemophilia A) gene | F8 |
| the hemophilia A locus | coagulation factor VIII, procoagulant component (hemophilia A) gene | F8 |
| the gene for synapsin I | synapsin I gene | SYN1 |
| the p53 gene | tumor protein p53 (Li-Fraumeni syndrome) gene | TP53 |
In the foregoing examples, the gene names and descriptions are readily distinguishable from the gene symbols. Sometimes, however, the gene symbol may be easily confused with the abbreviation for the product or condition associated with the gene unless the gene symbol is italicized; for instance:
| Gene | Potentially Confusing Nongene Term |
|---|---|
| ABO | ABO blood group system (see also 15.1, Blood Groups, Platelet Antigens, and Granulocyte Antigens) |
| APOE | apoE (apolipoprotein E) |
| EPO | erythropoietin (Epo) |
| GRIFIN | GRIFIN protein (galectin-related interfiber protein) |
| HLA-A, HLA-B, etc | HLA-A, HLA-B, etc (see also 15.8.5, Immunology, HLA/Major Histocompatibility Complex) |
| MS | multiple sclerosis (MS) |
| many hormone genes, eg, CRH, GHRH, GNRHR, PTH, TRH | hormone name abbreviations, eg, CRH, GHRH, GNRH receptor, PTH, TRH |
In some expressions, italics may be moot, for instance, if a gene is named for an enzyme it produces:
| Term | Meaning |
|---|---|
| *TH* gene | gene for tyrosine hydroxylase |
| TH gene | gene for tyrosine hydroxylase |
In other expressions, italics distinguish different meanings:
| Term | Meaning |
|---|---|
| *HD* | gene for huntingtin (protein), Huntington disease gene |
| HD | Huntington disease |
| person with *HD* | person with the *HD* gene, whether the disease-causing or normal form |
| person with HD | person with Huntington disease |
| prevalence of *HD* | prevalence of the *HD* gene |
| prevalence of HD | prevalence of Huntington disease; not necessarily equal to prevalence of the *HD* gene |
| *TH* deficiency | impaired functioning of the *TH* gene |
| TH deficiency | deficiency of the enzyme TH |
Therefore, it is best to make clear by italicizing gene symbols and through context whether the gene or another entity is being discussed.
A gene symbol should not immediately follow the term in the gene name that it might seem to abbreviate; rather, it should relate to the word gene, usually following it:
the guanylate cyclase 2D gene, GUCY2D
Not: the guanylate cyclase 2D (GUCY2D) gene
the Huntington disease gene, HD
the tyrosine hydroxylase gene, TH
The cystic fibrosis transmembrane conductance regulator gene, CFTR, is implicated in cystic fibrosis.
In the following examples, both gene aliases and approved symbols are used (see also 14.11, Abbreviations, Clinical, Technical, and Other Common Terms):
the retinal guanylate cyclase 2D (GUCY2D) gene, GUCY2D
the retinal guanylate cyclase 2D (RetGC1) gene, GUCY2D
Not: the guanylate cyclase 2D (GUCY2D) gene
the Huntington disease (HD) gene, HD
the tyrosine hydroxylase (TH) gene, TH
The cystic fibrosis (CF) transmembrane conductance regulator gene, CFTR, is implicated in CF.
In discussions of mutations, the gene symbol remains italicized; specific mutations, however, are not italicized (see “Sequence Variations, Nucleotides,” and “Sequence Variations, Amino Acids” in 15.6.1, Nucleic Acids and Amino Acids):
ADRB2 46G>A
mutation of the GUCY2D gene
mutation of GUCY2D
GUCY2D mutation
mutated GUCY2D gene
Objective: To describe the phenotype in 4 families with dominantly inherited cone-rod dystrophy, 1 with an R838C mutation and 1 with an R838H mutation in the guanylate cyclase 2D gene (GUCY2D) encoding retinal guanylate cyclase-1.
LRP5v171: valine substitution at codon 171 of the LRP5 gene
In gene mapping, when the order of genes along the chromosome is known, the genes are listed from short-arm end (pter) to the centromere (cen) or long-arm end (qter) (see 15.6.4, Human Chromosomes):
pter-ENO1-PGM1-AMY1-cen
When the order of genes along the chromosome is not known, the genes are listed alphabetically and parentheses are used:
pter-PGD-AK2-(ACTA,APOA2,REN)-qter
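The two listing conventions above (known order written in sequence, unknown order alphabetized inside parentheses) can be sketched as a small formatter. The function name `format_gene_order` and its parameters are hypothetical; the sketch assumes that genes of known position are passed as strings and that a group of genes whose relative order is unknown is passed as a set.

```python
def format_gene_order(groups, start="pter", end="qter"):
    """Build a gene-order string from short-arm end to centromere or
    long-arm end. A string is a gene of known position; a set is a group
    whose internal order is unknown, listed alphabetically in parentheses."""
    parts = [start]
    for group in groups:
        if isinstance(group, str):
            parts.append(group)
        else:
            parts.append("(" + ",".join(sorted(group)) + ")")
    parts.append(end)
    return "-".join(parts)


# Known order, ending at the centromere.
print(format_gene_order(["ENO1", "PGM1", "AMY1"], end="cen"))
# Unknown order for the last three genes.
print(format_gene_order(["PGD", "AK2", {"REN", "ACTA", "APOA2"}]))
```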
Table 5 presents gene names and symbols from fields covered elsewhere in this chapter.
Table 5. Gene Names and Symbols From Fields Covered Elsewhere in This Chapter
| Gene Symbol | Gene Description |
|---|---|
| 15.1, Blood Groups, Platelet Antigens, and Granulocyte Antigens | |
| A4GALT | α-1,4-galactosyltransferase (P blood group) |
| ABO | ABO blood group (transferase A, α-1-3-N-acetylgalactosaminyltransferase; transferase B, α-1-3-galactosyltransferase) |
| ACHE | acetylcholinesterase (Yt blood group) |
| AQP1 (was CO) | aquaporin 1 |
| ART4 (was DO) | ADP ribosyltransferase 4 (Dombrock blood group) |
| BCAM (was LU) | basal cell adhesion molecule (Lutheran blood group) |
| BSG | basigin (OK blood group) |
| C4A | complement component 4A |
| C4B | complement component 4B |
| CD44 | CD44 antigen (homing function and Indian blood group system) |
| CD151 (was MER2) | antigen identified by monoclonal antibodies 1D12, 2F7 |
| CR1 | complement component (3b/4b) receptor 1, including Knops blood group system |
| CD55 (was DAF) | CD55, decay accelerating factor (DAF) for complement (Cromer blood group system) |
| DARC (was FY) | chemokine receptor (Duffy blood group) |
| ERMAP (was SC) | erythroblast membrane–associated protein (Scianna blood group) |
| FUT1 | fucosyltransferase 1 |
| FUT3 | fucosyltransferase 3 |
| GYPA | glycophorin A (includes MN blood group) |
| GYPB | glycophorin B (includes Ss blood group) |
| GYPC | glycophorin C (Gerbich blood group) |
| GYPE | glycophorin E |
| ICAM4 | intercellular adhesion molecule 4, Landsteiner-Wiener blood group |
| KEL | Kell blood group |
| P1 | P blood group (P1 antigen) |
| RHCE | Rh blood group, CcEe antigens |
| RHD | Rh blood group, D antigen |
| SLC4A1 | solute carrier family 4, anion exchanger, member 1 (erythrocyte membrane protein band 3, Diego blood group) |
| SLC14A1 | solute carrier family 14 (urea transporter), member 1 (Kidd blood group) |
| XG | Xg blood group (pseudoautosomal boundary-divided on the X chromosome) |
| XK | Kell blood group precursor (McLeod phenotype) |
| 15.2, Cancer (See Also 15.6.3, Oncogenes and Tumor Suppressor Genes) | |
| ACTN1 | α1-actinin |
| ACTN2 | α2-actinin |
| BCL2 | B-cell/CLL lymphoma 2 |
| BCL7A | B-cell/CLL lymphoma 7A |
| CCND1 (formerly BCL1) | cyclin D1 |
| CDC2 | cell division cycle 2, G1 to S and G2 to M |
| CDK2 | cyclin-dependent kinase 2 |
| CDKN1A | cyclin-dependent kinase inhibitor 1A (p21, Cip1) |
| CTNNB1 | β1-catenin |
| MEN1 | multiple endocrine neoplasia 1 |
| RB1 | retinoblastoma 1 (including osteosarcoma) |
| RET (formerly MEN2A, MEN2B) | ret proto-oncogene (multiple endocrine neoplasia and medullary thyroid carcinoma 1, Hirschsprung disease) |
| TGFA | transforming growth factor α |
| TGFB1 | transforming growth factor β1 (Camurati-Engelmann disease) |
| TNF | tumor necrosis factor (TNF superfamily, member 2) |
| TNFRSF1A | TNF receptor superfamily, member 1A |
| TP53 | tumor protein p53 (Li-Fraumeni syndrome) |
| 15.3, Cardiology | |
| ANK2 (formerly LQT4) | ankyrin 2 (neuronal; formerly long QT syndrome 4) |
| APOA1 | apolipoprotein AI |
| APOB | apolipoprotein B |
| APOC2 | apolipoprotein CII |
| APOD | apolipoprotein D |
| APOE | apolipoprotein E |
| GPR1 | G protein–coupled receptor 1 |
| HDLBP | high-density lipoprotein-binding protein (vigilin) |
| KCNH2 (formerly LQT2) | potassium voltage-gated channel, subfamily H (eag-related), member 2 |
| KCNQ1 (formerly LQT1) | potassium voltage-gated channel, KQT-like subfamily, member 1 |
| LDLR | low-density lipoprotein receptor (familial hypercholesterolemia) |
| LPL | lipoprotein lipase |
| NOS1 | nitric oxide synthase 1 (neuronal) |
| NOS2A | nitric oxide synthase 2A (inducible, hepatocytes) |
| NOS2B | nitric oxide synthase 2B |
| NOS2C | nitric oxide synthase 2C |
| NOS3 | nitric oxide synthase 3 (endothelial cell) |
| PLAT | tissue plasminogen activator |
| SCN5A (formerly LQT3) | sodium channel, voltage-gated, type V, alpha polypeptide (long QT syndrome 3) |
| TNNC1 | troponin C, slow |
| TNNC2 | troponin C2, fast |
| TNNI1 | troponin I, skeletal, slow |
| TNNI2 | troponin I, skeletal, fast |
| TNNI3 | troponin I, cardiac |
| TNNT1 | troponin T1, skeletal, slow |
| TNNT2 | troponin T2, cardiac |
| TNNT3 | troponin T3, skeletal, fast |
| VLDLR | very-low-density lipoprotein receptor |
| 15.7, Hemostasis | |
| A2M | α2-macroglobulin |
| CALM1 | calmodulin 1 (phosphorylase kinase, δ subunit) |
| CCL5 | chemokine (C-C motif), ligand 5 |
| CLEC3B (was TNA) | C-type lectin domain family 3, member B |
| F2 | coagulation factor II (thrombin) |
| F2R | coagulation factor II (thrombin) receptor |
| F2RL1 | coagulation factor II (thrombin) receptorlike 1 |
| F3 | coagulation factor III (tissue factor, thromboplastin) |
| F5 | coagulation factor V |
| F7 | coagulation factor VII |
| F7R | coagulation factor VII regulator |
| F8 | coagulation factor VIII, procoagulant component (hemophilia A) |
| F8A1 | coagulation factor VIII associated (intronic transcript) 1 |
| F9 | coagulation factor IX |
| F10 | coagulation factor X |
| F11 | coagulation factor XI |
| F12 | coagulation factor XII |
| F13A1 | coagulation factor XIII, A1 polypeptide |
| F13A2 | coagulation factor XIII, A2 polypeptide |
| F13B | coagulation factor XIII, B polypeptide |
| FGA | fibrinogen A, α polypeptide |
| FGB | fibrinogen B, β polypeptide |
| FGG | fibrinogen, γ polypeptide |
| FGL1 | fibrinogenlike 1 |
| FGL2 | fibrinogenlike 2 |
| GP5 | glycoprotein V (platelet) |
| GP6 | glycoprotein VI (platelet) |
| GP9 | glycoprotein IX (platelet) |
| GP1BA | glycoprotein Ib (platelet), α-polypeptide |
| ICAM1 | intercellular adhesion molecule 1 (CD54) |
| ICAM2 | intercellular adhesion molecule 2 |
| ITGA1 | α1 integrin |
| ITGA2 | α2 integrin |
| ITGA2B | α2b integrin (platelet glycoprotein [Gp] IIb of IIb/IIIa complex, antigen CD41B) |
| ITGA3 | α3 integrin |
| ITGA6 | α6 integrin |
| ITGAV | αv integrin (vitronectin receptor, α polypeptide, antigen CD51) |
| ITGB1 | β1 integrin (fibronectin receptor, β polypeptide, antigen CD29) |
| ITGB3 | β3 integrin (platelet GpIIIa, antigen CD61) |
| ITPKA | inositol 1,4,5-triphosphate (IP3) A |
| KLKB1 | kallikrein B, plasma |
| KNG1 | kininogen 1 |
| NOS3 | nitric oxide synthase 3 (endothelial cell) |
| PDGFA | platelet-derived growth factor α-polypeptide |
| PDGFC | platelet-derived growth factor C |
| PDGFRA | platelet-derived growth factor receptor, α-polypeptide |
| PDGFRL | platelet-derived growth factor receptor–like |
| PECAM1 | platelet/endothelial cell adhesion molecule (CD31 antigen) |
| PLAT | plasminogen activator, tissue (tPA) |
| PLAU | plasminogen activator, urokinase (uPA) |
| PLAUR | uPA receptor |
| PLG | plasminogen |
| PLGLA1 | plasminogenlike A1 |
| PLGLB1 | plasminogenlike B1 |
| PPBP | proplatelet basic proteins (includes β-thromboglobulin) |
| PROC | protein C |
| PROS1 | protein S |
| PROSP | protein S pseudogene |
| PROZ | protein Z, vitamin K–dependent plasma glycoprotein |
| PTGDR | prostaglandin D2 receptor |
| PTGDS | prostaglandin D2 synthase |
| PTGFR | prostaglandin F receptor |
| PTGFRN | prostaglandin F2 receptor negative regulator |
| PTGIR | prostaglandin I2 (prostacyclin) receptor |
| PTGIS | prostaglandin I2 (prostacyclin) synthase |
| PTGS1 | prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclo-oxygenase) |
| SELE | E-selectin (endothelial adhesion molecule 1) |
| SELP | P-selectin |
| SERPINA1 | serine (or cysteine) proteinase inhibitor, clade A (α1-antiproteinase, antitrypsin), member 1 |
| SERPINC1 | serine (or cysteine) proteinase inhibitor, clade C (antithrombin), member 1 |
| SERPINE1 | serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1 |
| SERPINF2 | serine (or cysteine) proteinase inhibitor, clade F (α2-antiplasmin, pigment epithelium derived factor), member 2 |
| TBXA2R | thromboxane A2 receptor |
| TBXAS1 | thromboxane A synthase 1 |
| TFPI | tissue factor pathway inhibitor |
| TFPI2 | tissue factor pathway inhibitor 2 |
| THBD | thrombomodulin |
| VCAM1 | vascular cell adhesion molecule 1 |
| VWF | von Willebrand factor |
| VWFP | von Willebrand factor pseudogene |
| 15.8, Immunology | |
| 15.8.1, Chemokines | |
| CCL1 | CCL1 |
| CX3CL1 | CX3CL1 |
| CXCL1 | CXCL1 |
| PF4 | platelet factor 4 (CXCL4) |
| XCL1 | XCL1 |
| 15.8.2, CD Cell Markers | |
| CD14 | CD14 antigen |
| CD19 | CD19 antigen |
| CD1A | CD1a |
| CD3D | CD3δ |
| CD46 (was MCP) | complement regulatory protein, CD46 |
| CD55 (was DAF) | CD55, decay accelerating factor (DAF) for complement (Cromer blood group system) |
| CD6 | CD6 |
| CD79A | CD79A, Igα |
| CD97 | CD97 |
| CR1 | complement receptor type 1, CD35 |
| FCGR3A | FcγRIIIa, CD16 |
| ICAM3 | intercellular adhesion molecule 3, CD50 |
| MME | membrane metalloendopeptidase, CD10, CALLA |
| 15.8.3, Complement | |
| C1QA | C1qα |
| C1QB | C1qβ |
| C1QBP | C1qbp |
| C1QR1 | C1qR1 |
| C1R | C1r |
| C1S | C1s |
| C2 | C2 |
| C3 | C3 |
| C4A | C4a |
| C4B | C4b |
| C4BPA | C4bp-α |
| C5 | C5 |
| C5AR1 | C5aR1 |
| C6 | C6 |
| C7 | C7 |
| C8A | C8α |
| C8B | C8β |
| C9 | C9 |
| CD55 (was DAF) | CD55, DAF for complement (Cromer blood group system) |
| CFH | complement factor H |
| CFP | complement factor properdin |
| 15.8.4, Cytokines | |
| CRLF1 | cytokine receptorlike factor 1 |
| CRLF2 | cytokine receptorlike factor 2 |
| CSF1 | M-CSF |
| CSF2 | GM-CSF |
| CSF3 | G-CSF |
| CSF3R | G-CSF receptor |
| EPO | erythropoietin (Epo) |
| EPOR | Epo receptor |
| GH1 | growth hormone (GH) 1 |
| GH2 | GH 2 |
| GHR | GH receptor |
| IFNA1 | IFN-α1 |
| IFNA2 | IFN-α2 |
| IFNB1 | IFN-β1 |
| IFNG | IFN-γ |
| IFNW1 | IFN-ω |
| IL1A | IL-1α |
| IL1B | IL-1β |
| IL1R1 | IL-1RI |
| IL1R2 | IL-1RII |
| IL1RAP | IL-1R accessory protein |
| IL1RN | IL-1 receptor antagonist (IL-1ra) |
| IL2 | IL-2 |
| LEP | leptin |
| LEPR | leptin receptor |
| PRL | prolactin |
| SOCS1 | suppressor of cytokine signaling 1 |
| TGFA | transforming growth factor α (TGF-α) |
| TGFB1 | TGF-β1 (Camurati-Engelmann disease) |
| THPO | thrombopoietin |
| TNF | tumor necrosis factor (TNF superfamily member 2) |
| 15.8.5, HLA/Major Histocompatibility Complex | |
| HLA-A | HLA-A |
| HLA-B | HLA-B |
| HLA-C | HLA-C |
| HLA-DMA | HLA-DM α |
| HLA-DMB | HLA-DM β |
| HLA-DOA | HLA-DO α |
| HLA-DOB | HLA-DO β |
| HLA-DPA1 | HLA-DP α1 |
| HLA-DQA1 | HLA-DQ α1 |
| HLA-DQB1 | HLA-DQ β1 |
| HLA-DRA | HLA-DR α |
| HLA-DRB1 | HLA-DR β1 |
| HLA-E | HLA-E |
| HLA-F | HLA-F |
| HLA-G | HLA-G |
| HLA-H | HLA-H (pseudogene) |
| HLA-J | HLA-J (pseudogene) |
| 15.8.6, Immunoglobulins | |
| IGHA1 | Cα1 |
| IGHA2 | Cα2 |
| IGHD | Cδ |
| IGHD1-1 | DH1 subgroup member 1 |
| IGHE | Cε |
| IGHG1 | Cγ1 |
| IGHG2 | Cγ2 |
| IGHG3 | Cγ3 |
| IGHG4 | Cγ4 |
| IGHJ1 | JH1 |
| IGHM | IgM μ CH |
| IGHV@ | VH |
| IGHV1-2 | VH1 subgroup member 2 |
| IGHV1-18 | VH1 subgroup member 18 |
| IGKC | Cκ |
| IGKJ@ | Jκ |
| IGKJ2 | Jκ2 |
| IGKV@ | Vκ |
| IGKV1-5 | Vκ1 subgroup member 5 |
| IGLC@ | Cλ |
| IGLC1 | Cλ1 |
| IGLJ@ | Jλ |
| IGLJ1 | Jλ1 |
| IGLV@ | Vλ |
| IGLV10-54 | Vλ10 subgroup member 54 |
| 15.8.7, Lymphocytes | |
| TRAC | T-cell receptor α chain (TCRα) |
| TRBC1 | TCRβ1 |
| TRBC2 | TCRβ2 |
| TRBV10-3 | TCRβ variable 10 subgroup member 3 |
| TRGC1 | TCRγ C1 |
| TRGJ1 | TCRγ J1 |
| TRGJ2 | TCRγ J2 |
| TRDC | TCRδ C |
| 15.10, Molecular Medicine | |
| APBA1 | amyloid-β peptide precursor |
| ADIPOQ | adiponectin (Acrp30), C1Q and collagen domain containing |
| ADIPOR1 | adiponectin receptor 1 |
| ADIPOR2 | adiponectin receptor 2 |
| ACSL1 | acyl-CoA synthetase long-chain family member 1 |
| ADAMTS1 | ADAM metallopeptidase with thrombospondin type 1 motif, 1 |
| AHCY | S-adenosylhomocysteine (adoHcy) |
| AMD1 | adenosylmethionine decarboxylase 1 |
| AKT1 | v-akt murine thymoma viral oncogene homolog 1 |
| ATP1A1 | ATPase, Na+/K+ transporting, alpha 1 polypeptide |
| BPGM | 2,3-bisphosphoglycerate mutase |
| CALM1 | calmodulin 1 |
| CCAR1 | cell division cycle and apoptosis regulator 1 |
| CCPG1 | cell cycle progression 1 |
| CCRK | cell cycle–related kinase |
| CDC2 | cell division cycle 2, G1 to S and G2 to M |
| CDCA1 | cell division cycle–associated 1 |
| CDK2 | cyclin-dependent kinase (CDK) 2 |
| CDK7 | CDK-activating enzyme (CAK) cyclinH/CDK7 |
| CDKN1A | CDK inhibitor 1A (p21) |
| CDKN1C | CDK inhibitor 1C (p57) |
| CDKN2A | cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4 [p16Ink4a]) |
| COASY | coenzyme A (CoA) synthetase |
| COX4I1 | cytochrome c oxidase subunit IV isoform 1 |
| COX5B | cytochrome c oxidase subunit Vb |
| CRP | C-reactive protein, pentraxin-related |
| CYP1A2 | cytochrome P450 1A2 isozyme (CYP1A2) |
| DHFR | dihydrofolate reductase |
| DKK1 | Dickkopf homolog 1 |
| ERBB2 | v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuroblastoma-/glioblastoma-derived oncogene homolog (avian) (formerly HER2/neu) |
| FBP1 | fructose 1,6-bisphosphatase 1 |
| FDX1 | ferredoxin (Fd) 1 |
| FDX2 | Fd 2 |
| FHIT | fragile histidine triad (Fhit) gene |
| GNA12 | G protein Gα12 |
| GNG2 | Gγ2 |
| GALNT1 | GalNAc transferase 1 |
| G6PD | glucose-6-phosphate dehydrogenase |
| B3GALT1 | UDP-Gal:β-GlcNAc β-1,3-galactosyltransferase, polypeptide 1 |
| CDKN2A | CDK4 inhibitor 2A |
| GFI1 | growth factor independent 1 |
| GRB2 | growth factor receptor-bound protein 2 |
| GRIN1 | glutamate receptor, ionotropic, N-methyl-d-aspartate (NMDA) 1 |
| HBA1 | hemoglobin (Hb) α1 |
| HBB | Hbβ |
| HMGCS1 | 3-hydroxy-3-methylglutaryl CoA synthase 1 |
| IGF1 | insulinlike growth factor 1 (IGF-1) |
| IGF1R | IGF-1 receptor (IGF-R1) |
| IKBKB | IκB kinase β (IKKβ) |
| ITPKA | inositol 1,4,5-triphosphate (IP3) A |
| MNAT1 | menage a trois 1 (CAK assembly factor) |
| MB | myoglobin (Mb) |
| MCM2 | Mcm 2 minichromosome maintenance deficient 2, mitotin (Saccharomyces cerevisiae) |
| NMNAT1 | nicotinamide nucleotide adenyltransferase 1 |
| NPY | neuropeptide Y |
| NPPA | natriuretic peptide precursor α |
| OGDH | oxoglutarate (α-ketoglutarate) dehydrogenase (lipoamide) |
| PIB5PA | phosphatidylinositol 4,5-biphosphate (PIP2) 5-phosphatase A |
| PYY | peptide YY |
| RBBP4 | retinoblastoma binding protein 4 |
| RNASE1 | ribonuclease, RNase A family 1 (pancreatic) |
| SFPQ | splicing factor proline/glutamine-rich |
| SNCA | α-synuclein |
| TAF1 | TAF1 RNA polymerase II, TATA box binding protein (TBP)–associated factor |
| TBP | TATA box binding protein |
| THPO | thrombopoietin |
| TNFSF11 (alias: RANKL) | TNF (ligand) superfamily member 11 |
| TP53 | tumor protein p53 |
| UCP1 | uncoupling protein 1 (UCP-1) |
| WNT1 | wingless-type mouse mammary tumor virus (MMTV) integration site family, member 1 |
| 15.11, Neurology | |
| ACCN1 | amiloride-sensitive cation channel 1, neuronal |
| ACHE | acetylcholinesterase (Yt blood group) |
| ADORA1 | adenosine A1 receptor |
| ADRA1A | α1A-adrenergic receptor |
| ADRB1 | β1-adrenergic receptor |
| BDNF | brain-derived neurotrophic factor |
| CACNA1A | Ca2+, voltage-dependent, P/Q type, α1A subunit |
| CHRM1 | cholinergic receptor, muscarinic 1 |
| CHRNA1 | cholinergic receptor, nicotinic, α-polypeptide 1 (muscle) |
| CNTF | ciliary neurotrophic factor |
| COMT | catechol-O-methyltransferase |
| DRD1 | dopamine receptor D1 |
| EGF | epidermal growth factor |
| GABBR1 | γ-aminobutyric acid (GABA) B receptor 1 |
| GDNF | glial cell line–derived neurotrophic factor |
| GRIA1 | glutamate receptor, ionotropic, α-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) 1 |
| GRIN1 | glutamate receptor, NMDA 1 |
| HRH1 | histamine receptor 1 |
| HTR1A | serotonin (5-hydroxytryptamine) receptor 1A |
| ITPKA | inositol 1,4,5-triphosphate (IP3) A |
| KCNJ3 (formerly GIRK1) | potassium inwardly rectifying channel, subfamily J, member 3 |
| MAOA | monoamine oxidase A |
| NGFB | nerve growth factor β-polypeptide |
| NGFR | nerve growth factor receptor |
| NMB | neuromedin B |
| NOS1 | nitric oxide synthase 1 (neuronal) |
| NPY | neuropeptide Y |
| NPY1R | neuropeptide Y receptor Y1 |
| NRTN | neurturin |
| NTF3 | neurotrophin 3 |
| NTS | neurotensin |
| NTSR1 | neurotensin receptor 1 |
| OPRD1 | opioid δ receptor |
| OPRK1 | opioid κ receptor |
| OPRM1 | opioid μ receptor |
| OPRS1 | opioid receptor σ1 |
| PCP2 | Purkinje cell protein 2 |
| SLC1A1 (formerly EAAT3) | solute carrier family 1 |
| SLC18A1 | solute carrier family 18 (vesicular monoamine), member 1 |
| SNAP25 | synaptosomal-associated protein, 25 kDa |
| SNCA | α-synuclein |
| TAC1 | tachykinin, precursor 1 (substance K, substance P, neurokinin 1, neurokinin 2, neuromedin L, neurokinin α, neuropeptide K, neuropeptide γ) |
| TAC3 | tachykinin 3 (neuromedin K, neurokinin β) |
| TRPA1 | transient receptor potential cation channel, subfamily A, member 1 |
| TSNARE1 | t-SNARE domain containing 1 [see 15.11, Neurology, for expansion] |
| VAMP1 | vesicle-associated membrane protein 1 (synaptobrevin 1) |
| AAVS1 | adeno-associated virus integration site 1 |
| BNIP1 | BCL2/adenovirus E1B 19kDa interacting protein 1 |
| CR2 | complement component (3d/Epstein-Barr virus) receptor 2 |
| CXADR | coxsackievirus and adenovirus receptor |
| CXB3S | coxsackievirus B3 sensitivity |
| E11S | echovirus (serotypes 4, 6, 11, 19) sensitivity |
| EBI2 | Epstein-Barr virus–induced gene 2 |
| EBVM1 | Epstein-Barr virus modification site 1 |
| EBVS1 | Epstein-Barr virus insertion site 1 |
| HAVCR1 | hepatitis A virus cellular receptor 1 |
| HBXAP | hepatitis B virus X-associated protein |
| HBXIP | hepatitis B virus X-interacting protein |
| HCVS | human coronavirus sensitivity |
| HIVE1 | human immunodeficiency virus 1 (HIV-1) expression (elevated) 1 |
| HPV6AI1 | human papillomavirus type 6a integration site 1 |
| HTLF | human T-cell leukemia virus enhancer factor |
| HV1S | herpes simplex virus type 1 sensitivity |
| ICAM1 | intercellular adhesion molecule 1 (CD54), human rhinovirus receptor |
| MX1 | myxovirus (influenza virus) resistance 1 |
| PVR | poliovirus receptor |
| PRND | prion protein 2 (dublet) |
| PRNP | PrP27-30 (Creutzfeldt-Jakob disease, Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia) |
| PRNPIP | prion protein interacting protein |
| PRNT | prion protein testis specific |
Alleles.
Alleles denote alternative forms of a gene. Alleles are often characterized by particular variant sequences (mutations). For variant sequence nomenclature, see “Sequence Variations, Nucleotides,” and “Sequence Variations, Amino Acids,” in 15.6.1, Nucleic Acids and Amino Acids.
Because alleles are alternative forms of a particular gene, they are expressed by means of both the gene name or symbol and an appendage that indicates the specific allele.
Classically, allele symbols consist of the gene symbol plus an asterisk plus the italicized allele designation,7 eg:
HBB*S: S allele of the HBB gene
As with gene terms, Greek letters are changed to Latin letters in allele terms:
APOE*E4: allele producing the ε4 type of apolipoprotein E
If clear in context, the allele symbol may be used in a shorthand form that omits the gene symbol and includes only the asterisk and the allele designation that follows, eg:
*S
*E4
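The classic composition rule (gene symbol, asterisk, allele designation) and its shorthand form can be sketched in Python. This is only an illustration under stated assumptions: the function names `split_allele_symbol` and `shorthand` are hypothetical and not part of any nomenclature guideline, and typography (italics) is not modeled.

```python
from typing import Tuple


def split_allele_symbol(symbol: str) -> Tuple[str, str]:
    """Split a full allele symbol such as 'HBB*S' into (gene, allele)."""
    gene, sep, allele = symbol.partition("*")
    if not sep or not gene or not allele:
        raise ValueError(f"not a full allele symbol: {symbol!r}")
    return gene, allele


def shorthand(symbol: str) -> str:
    """Return the form that omits the gene symbol, keeping the asterisk."""
    _, allele = split_allele_symbol(symbol)
    return "*" + allele


print(split_allele_symbol("HBB*S"))  # ('HBB', 'S')
print(shorthand("APOE*E4"))          # *E4
```

The sketch covers only the general shorthand; it does not handle shortened forms that retain part of the gene name.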
In the case of alleles of the major histocompatibility locus, which are not italicized (see 15.8.5, Immunology, HLA/Major Histocompatibility Complex), a portion of the gene name is usually included in the shortened form:
| Full Name | Shortened Form |
|---|---|
| HLA-DRB1*0301 | DRB1*0301 |
In practice, common or trivial names for alleles, which take various forms, are used. The same allele is often expressed in different ways that diverge from the recommended nomenclature. For example:
s: short allele of serotonin transporter gene (SLC6A4)
l: long allele of SLC6A4
As another example of common allele names, the following expressions are all used for APOE*E4; follow author preference:
ε4 allele
epsilon 4 allele
E4 allele
APOE*4
apo e4
APOEE4
A system of nomenclature that takes evolutionary divergence into account has been proposed for alleles.12 Stylistically, it is consistent with the above system of nomenclature, ie, asterisk followed by italicized alphanumeric allele designator. Examples (from Nebert12):
NAT2*4
*1A1
*3A3
*7A28T17L47B88
Genotype and Phenotype Terminology
The genotype comprises the set of alleles in an individual. Because individuals almost always have 2 copies of each autosome (nonsex chromosome; see 15.6.4, Human Chromosomes), they have 2 alleles, which may be the same or different, for each autosomal gene.
The simplest genotype term for an individual would describe 1 gene and consist of the names of 2 alleles. Larger genotypes would contain 2 or more allele symbol pairs.
As originally formulated in ISGN, allele groupings may be indicated by placement above and below a horizontal line or on the line. As seen in the following examples (from Shows et al2,3), such placement, as well as order, spacing, and punctuation marks (virgules [/], semicolons, spaces, and commas), has specific meanings.
Alleles of the same gene are indicated by placement above and below a horizontal line or with a virgule:
In theoretical discussions when a single letter is substituted for the allele symbol, the line or virgule may be dispensed with:
AA
Aa
aa
ss
ll
sl
Semicolons separate pairs of alleles at unlinked loci:
ADA*1/ADA*2; ADH1*1/ADH1*1; AMY1*A/AMY1*B
or
ADA*1/*2; ADH1*1/*1; AMY1*A/*B
A single space separates alleles together on the same chromosome from alleles together on another chromosome (phase known):
AMY1*A PGM1*2/AMY1*B PGM1*1
Commas indicate that alleles above and below the line (or on either side of the virgule) are on the same chromosome pair, but not on which chromosome of the pair specifically (phase unknown):
PGM1*1/PGM1*2, AMY1*A/AMY1*B
A special form for hemizygous males is
G6PD*A/Y
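The punctuation rules above for the virgule form can be illustrated with a small Python parser. This is a minimal sketch, not an official tool: the names `expand`, `parse_pair`, and `parse_genotype` and the output layout are assumptions made for illustration, and the horizontal-line form is not handled.

```python
from typing import Dict, List


def expand(allele: str, gene: str) -> str:
    """Expand shorthand such as '*2' to a full symbol using the given gene."""
    return gene + allele if allele.startswith("*") else allele


def parse_pair(pair: str) -> List[str]:
    """Parse 'ADA*1/ADA*2' or 'ADA*1/*2' into two allele terms."""
    first, second = (part.strip() for part in pair.split("/"))
    gene = first.split("*")[0]
    return [first, expand(second, gene)]


def parse_genotype(text: str) -> List[Dict[str, object]]:
    """Return one record per semicolon-separated (unlinked) group."""
    groups: List[Dict[str, object]] = []
    for group in text.split(";"):
        group = group.strip()
        if "," in group:
            # Commas: same chromosome pair, phase unknown.
            pairs = [parse_pair(p) for p in group.split(",")]
            groups.append({"phase": "unknown", "pairs": pairs})
        elif " " in group:
            # Spaces: alleles listed together are on the same chromosome.
            left, right = group.split("/")
            groups.append({"phase": "known",
                           "chromosomes": [left.split(), right.split()]})
        else:
            # A single pair of alleles for one gene.
            groups.append({"phase": "not applicable",
                           "pairs": [parse_pair(group)]})
    return groups


for record in parse_genotype("ADA*1/*2; ADH1*1/*1; AMY1*A/*B"):
    print(record["pairs"])
print(parse_genotype("AMY1*A PGM1*2/AMY1*B PGM1*1")[0]["chromosomes"])
print(parse_pair("G6PD*A/Y"))  # ['G6PD*A', 'Y']
```

In this sketch the hemizygous form is returned as an ordinary pair whose second member is the literal `Y`; a stricter parser might treat it as a separate case.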
When genotype is being expressed in terms of nucleotides (eg, a polymorphism), italics and other punctuation are not needed (see also 15.6.1, Nucleic Acids and Amino Acids):
MTHFR677 TT genotype
CC genotype
the “long/short” (5HTTLPR) polymorphism in SLC6A4
(LPR: length polymorphism region)
When the subject is being described in terms of the 2 possible amino acids at 1 position in the protein owing to a single nucleotide polymorphism (nonsynonymous mutation), the corresponding amino acids are separated by a virgule (see also 15.6.1, Nucleic Acids and Amino Acids):
Val/Val (homozygous)
Met/Val (heterozygous)
Met/Met (homozygous)
Such terms should be explained at first mention with the amino acid terms expanded:
the common methionine/valine (Met/Val) polymorphism at codon 129
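As a small illustration of the virgule-separated amino acid terms above, the following Python sketch labels a pair as homozygous or heterozygous. The function name `zygosity` is hypothetical, and the sketch assumes the term contains exactly one virgule.

```python
def zygosity(genotype: str) -> str:
    """Classify a virgule-separated pair such as 'Met/Val'."""
    first, second = (part.strip() for part in genotype.split("/"))
    return "homozygous" if first == second else "heterozygous"


for term in ("Val/Val", "Met/Val", "Met/Met"):
    print(term, zygosity(term))
```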
The virgule is not needed in expressions such as the following:
α1-antitrypsin MZ heterozygotes
individuals with the ZZ phenotype
The phenotype is the collection of traits in an individual resulting from his or her genotype. When phenotypes are expressed in terms of the specific alleles, the phenotype term derives from the genotype term, but no italics are used, and, instead of asterisks, spaces are used. Genotypes usually contain pairs of symbols, while phenotypes contain single symbols. The following examples are from Shows et al3:
| Genotype | Phenotype |
|---|---|
| ADA*1/ADA*1 | ADA 1 |
| ADA*1/ADA*2 | ADA 1–2 |
| C2*C/C2*QO | C2 C,QO |
| HBB*A/HBB*6V | HBB A,S [traditional, Hb A/S] |
| ABO*A1/ABO*O | ABO A1 |
| CFTR*N/CFTR*R | CFTR N |
| G6PD*A/Y | G6PD A |
| NAT2*4/*4 | rapid acetylator |
| CYP2D6*4A/*5 | poor metabolizer |
OMIM
Online Mendelian Inheritance in Man (OMIM) is a database of genetic syndromes.13 The site is located at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM.
When a specific syndrome is mentioned, it is helpful to include the OMIM number:
bronchomalacia (Online Mendelian Inheritance in Man [OMIM] 211450)
DiGeorge syndrome (OMIM #188400)
An explanation of the symbols that precede many OMIM numbers (eg, #, *, or %) is found in the OMIM frequently asked questions (FAQs) site, http://www.ncbi.nlm.nih.gov/Omim/omimfaq.html#numbering_system, and in Hamosh et al.13
References
1. Klinger HP. Progress in nomenclature and symbols for cytogenetics and somatic-cell genetics. Ann Intern Med. 1979;91(3):487-488.
2. Shows TB, Alper CA, Bootsma D, et al. International system for human gene nomenclature (1979). Cytogenet Cell Genet. 1979;25(1-4):96-116.
3. Shows TB, McAlpine PJ, Boucheix C, et al. Guidelines for human gene nomenclature: an international system for human gene nomenclature (ISGN, HGM9). Cytogenet Cell Genet. 1987;46(1-4):11-28.
4. Rangel P, Giovannetti J. Genomes and Databases on the Internet: A Practical Guide to Functions and Applications. Norfolk, England: Horizon Scientific Press; 2002.
5. HUGO Gene Nomenclature Committee website. http://www.gene.ucl.ac.uk/nomenclature/. Updated March 29, 2006. Accessed April 21, 2006.
6. Searchgenes. Human Gene Nomenclature Database Search Engine. http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl. Updated April 21, 2006. Accessed April 21, 2006.
7. Wain HM, Bruford EA, Lovering RC, Lush MJ, Wright MW, Povey S. Guidelines for human gene nomenclature. Genomics. 2002;79(4):464-470. Also available at http://www.gene.ucl.ac.uk/nomenclature/guidelines.html. Updated April 20, 2006. Accessed April 21, 2006.
8. Wain HM, Lush M, Ducluzeau F, Povey S. Genew: the Human Gene Nomenclature Database. Nucleic Acids Res. 2002;30(1):169-171.
9. Wain HM, Lush MJ, Ducluzeau F, Khodiyar VK, Povey S. Genew: the Human Gene Nomenclature Database, 2004 updates. Nucleic Acids Res. 2004;32(database issue):D255-D257. doi:10.1093/nar/gkh072.
10. Entrez Gene. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene. Accessed April 21, 2006.
11. HGNC FAQs. http://www.gene.ucl.ac.uk/nomenclature/information/FAQs.html. Updated April 20, 2006. Accessed April 24, 2006.
12. Nebert DW. Proposal for an allele nomenclature system based on the evolutionary divergence of haplotypes. Hum Mutat. 2002;20(6):463-472.
13. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33(database issue):D514-D517. doi:10.1093/nar/gki033.