Spoken language, the ability to deliver a high-speed string of vocal signals, is considered one of the important differences between humans and other animals, and especially between humans and other large primates. It is unlikely that this complex activity resides in a single gene, but recently one gene associated with language skills was identified by studying a large family in which some members had difficulty both in producing and in processing speech. The gene was localized to chromosome 7 and subsequently identified as the FOXP2 gene. FOXP2 is one of a family of proteins containing the "forkhead" box, a broadly conserved DNA-binding domain. These proteins act as developmental regulators. In mice, the FOXP2 gene is active during the development of the cerebral cortex; presumably its action is similar in humans. In 2002, studies comparing the human FOXP2 gene with the equivalent gene in other animals showed that the human gene varies little from that in other mammals. In a sequence of 715 amino acids, the human sequence differs from the mouse sequence by only a few amino acids, and of these few, two amino acids, N (asparagine)-303 and S (serine)-325, are unique to humans.
References:
Enard, et al., 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature (advance online publication, 14 August 2002).
Pinker, Steven, 2001. Talk of genetics and vice versa. Nature (News and Views, 4 October 2001) 413: 455-456.
FOXP2 Protein Sequence
Human (Homo sapiens)
< >=zinc finger domain
QQQQQ = polyQ tracts
[ ]=forkhead domain
{ }=unique amino acids in human sequence
MMQESATETISNSSMNQNGMSTLSSQLDAGSRDGRSSGDTSSEVSTVELL
HLQQQQALQAARQLLLQQQTSGLKSPKSSDKQRPLQVPVSVAMMTPQVIT
PQQMQQILQQQVLSPQQLQALLQQQQAVMLQQQQLQEFYKKQQEQLHLQL
LQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQHPGKQAKE
QQQQQQQQQQLAAQQLVFQQQLLQMQQLQQQQHLLSLQRQGLISIPPG
QAALPVQSLPQAGLSPAEIQQLWKEVTGV
HSMEDNGIKHGGLDLTTNNSSSTTSS
{N}TSKASPPITHHSIVNGQSSVL{S}
ARRDSSSHEETGASHTLYGH
<GVCKWPGCESICEDFGQFLKHLNNEH>
ALDDRSTAQCRVQMQVVQQLEIQLSKERERLQAMMTHLHMRPSEPKPSPK
PLNLVSSVTMSKNMLETSPQSLPQTPTTPTAPVTPITQGPSVITPASVPN
VGAIRRRHSDKYNIPMSSEIAPNYEFYKNADV
[RPPFTYATLIRQAIMESSDRQLTLNEIYSWFTRTFAYFRRNAATWK
NAVRHNLSLHKCFVRVENVKGAVWTVDEVEYQKRRSQKITGSPTL]
VKNIPTSLGYGAALNASLQAALAESSLPLLSNPGLINNASSGLLQAVHED
LNGSLDHIDSNGNSSPGCSPQPHIHSIHVKEEPVIAEDEDCPMSLVTTAN
HSPELEDDREIEEEPLSEDLE
Notes on the music.
This long sequence is played through only once, with the different protein domains marked by different combinations of instruments. It begins with a duet between the low (low solubility) and high (high solubility) voices of a string chamber group. The poly-Q tracts are identified by long sustained tones accompanied by the encoding DNA sequence: a vibraphone sounds the repeating CAG and CAA DNA codons. Shortly after the end of the poly-Q region is the part of the protein in which the two unique amino acids are found. This region is marked by the entrance of human voices. The zinc finger region follows, played by chimes, and then the chamber duet returns. The "forkhead" domain is announced by the return of the human choir and is again accompanied by the vibraphone playing the underlying DNA. The choir fades, and the chamber duet then plays to the end of the piece.
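
For readers who want to locate the marked regions themselves, the legend in the sequence listing can be read mechanically. The following is only a minimal sketch in Python, not the method used to produce the music. The function names parse_annotated and poly_q_tracts are hypothetical, the five-residue threshold for a poly-Q tract is an assumption taken from the "QQQQQ" entry in the legend, and the demo string is a shortened, made-up fragment rather than the real protein.

```python
import re

# Marker characters taken from the legend of the sequence listing.
OPENERS = {"<": "zinc finger", "[": "forkhead"}
CLOSERS = {">": "zinc finger", "]": "forkhead"}


def parse_annotated(text):
    """Strip the legend markers from an annotated sequence.

    Returns (plain_sequence, domains, unique_positions), where domains maps
    a domain name to a (start, end) pair of 0-based, end-exclusive indices
    into plain_sequence, and unique_positions lists the 0-based indices of
    residues that were wrapped in { }.
    """
    residues = []
    domains = {}
    open_at = {}
    unique_positions = []
    in_unique = False
    for ch in text:
        if ch.isspace():
            continue  # line breaks in the listing are not part of the protein
        if ch in OPENERS:
            open_at[OPENERS[ch]] = len(residues)
        elif ch in CLOSERS:
            name = CLOSERS[ch]
            domains[name] = (open_at.pop(name), len(residues))
        elif ch == "{":
            in_unique = True
        elif ch == "}":
            in_unique = False
        else:
            if in_unique:
                unique_positions.append(len(residues))
            residues.append(ch)
    return "".join(residues), domains, unique_positions


def poly_q_tracts(sequence, min_len=5):
    """Find runs of at least min_len consecutive Q residues (assumed threshold)."""
    pattern = re.compile("Q{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


# Made-up, shortened fragment that only imitates the marker layout.
demo = "HLQQQQQQLAA{N}TSK{S}ARR<GVCKW>ALD[RPPF]VKN"
plain, domains, unique_positions = parse_annotated(demo)
print(plain)
print(domains)
print(unique_positions)
print(poly_q_tracts(plain))
```

Pasting the full annotated listing into parse_annotated in place of the demo string would give the index ranges of the zinc finger and forkhead domains and the positions of the two bracketed residues. Note that the indices are 0-based, so they are one less than conventional residue numbers.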