Huntingtin
(Huntington Disease Protein)

Huntington's disease is a human neurodegenerative disease characterized by uncontrolled movements and progressive dementia. The condition is caused by a dominant allele of a gene localized to human chromosome 4, and symptoms usually do not appear until middle age. Huntington's disease is one of several known human disorders associated with trinucleotide repeats, in this case a repeated CAG codon that produces a PolyQ (repeated glutamine residues) region in the protein. In the music, you will hear the sequence seem to stall temporarily in this repeated region. This region of the gene is susceptible to expansion that can increase the number of glutamines to 100 or more, and the severity of the disease increases with the number of repeated glutamines. The huntingtin protein also contains a number of HEAT repeats; the acronym comes from the four proteins in which these repeated sequences were first described (Huntingtin, Elongation factor 3, the A subunit of protein phosphatase 2A, and the TOR1 kinase). Each HEAT repeat folds into a pair of helices joined by a short turn (a helix-turn-helix arrangement), and the repeats may serve as docking points for other proteins. The huntingtin protein contains 10 copies of this roughly 40-residue repeat.
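The link between the repeated CAG codon and the PolyQ region can be illustrated with a short Python sketch. The toy gene strings, the reduced codon table, and the function names `translate` and `longest_run` are assumptions made for illustration only; they are not the real huntingtin coding sequence or any published tool.

```python
import re

# Reduced codon table covering only the codons used in this sketch.
CODON_TO_RESIDUE = {"ATG": "M", "CAG": "Q", "CAA": "Q", "CCG": "P", "CCA": "P"}

def translate(dna):
    """Translate a DNA string codon by codon using the reduced table above."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TO_RESIDUE[c] for c in codons)

def longest_run(protein, residue="Q"):
    """Return (start_index, length) of the longest uninterrupted run of residue."""
    runs = re.finditer(f"{re.escape(residue)}+", protein)
    best = max(runs, key=lambda m: len(m.group()), default=None)
    return (best.start(), len(best.group())) if best else (-1, 0)

# Hypothetical toy genes: a start codon, a CAG tract, then a few proline codons.
toy_gene = "ATG" + "CAG" * 20 + "CCG" * 3
expanded_gene = "ATG" + "CAG" * 100 + "CCG" * 3

print(longest_run(translate(toy_gene)))       # (1, 20)
print(longest_run(translate(expanded_gene)))  # (1, 100)
```

Expanding the CAG tract in the DNA lengthens the glutamine run in the protein one residue per added codon, which is the stretch that sounds like a stall in the music.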

Huntingtin Protein Sequence
PolyQ region
PolyP region
(HEAT repeat regions)
Shorter music sample ends with this section

MATLEKLMKAFESLKSF

QQQQQQQQQQQQQQQQQQQQQQQ

PPPPPPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPPGP

AVAEEPLHRPKKELSATKKDRVNHCLTICENIVAQSVRNS
PEFQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKAL
MDSNLPRLQLELYKEIKKNGAPRSLRAALWRFAELAHLVR
PQKC

(RPYLVNLLPCLTRTSKRPEESVQETLAAAVPKIMASFGN)
FAN
(DNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRR)
TQYFY
(SWLLNVLLGLLVPVEDEHSTLLILGVLLTLRYLVPLLQQ)

QVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYELTLHHTQHQDHNVVTGALELLQQLFRTP
PPELLQTLTAVGGIGQLTAAKEESGGRSRSGSIVELIAGGGSSCSPVLSRKQKGKVLLGE
EEALEDDSESRSDVSSSALTASVK
DEISGELAASSGVSTPGSAGHDIITEQPRSQHTLQA
DSVDLASCDLTSSATDGDEEDILSHSSSQVSAVPSDPAMDLNDGTQASSPISDSSQTTTE
GPDSAVTPSDSSEIVLDGTDNQYLGLQIGQPQDEDEEATGILPDEASEAFRNSSMALQQA
HLLKNMSHCRQPSDSSVDKFVLRDEATEPGDQENKPCRIKGDIGQSTDDDSAPLVHCVRL
LSASFLLTGGKNVLVPDRDVRVSVKALALSCVGAAVALHPESFFSKLYKVPLDTT

(EYPEEQYVSDILNYIDHGDPQVRGATAILCGTLICSILS)
RSRFHVGDWMGTIRTLTGN

(TFSLADCIPLLRKTLKDESSVTCKLACTAVRNCVMSLCS)
(SSYSELGLQLIIDVLTLRNSSYWLVRTELLETLAEIDFR)
LVSFLEAKAENLHRGAHHYTGL

(LKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFY)

KCDQGQADPVVAVARDQSSVYLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTMEN
NLSRVIAAVSHELITSTTRALTFGCCEALCLLSTAFPVCIWSLGWHCGVPPLSASDESRK
SCTVGMATMILTLLSSAWFPLDLSAHQDALILAGNLLAASAPKSLRSSWASEEEANPAAT
KQEEVWPALGDRALVPMVEQLFSHLLKVINICAHVLDDVAPGPAIKAALPSLTNPPSLSP
IRRKGKEKEPGEQASVPLSPKKGSEASAASRQSDTSGPVTTSKSSSLGSFYHLPSYLKLH
DVLKATHANYKVTLDLQNSTEKFGGFLRSALDVLSQILELATLQDIGKCVEEILGYLKSC
FSREPMMATVCVQQLLKTLFGTNLASQFDGLSSNPSKSQGRAQRLGSSSVRPGLYHYCFM
APYTHFTQALADASLRNMVQAEQENDTSGWFDVLQKVSTQLKTNLTSVTKNRADKNAIHN
HIRLFEPLVIKALKQYTTTTCVQLQKQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFE
YIEVGQFRESEAIIPNIFFFLVLLSYERYHSKQIIGIPKIIQLCDGIMASG

(RKAVTHAIPALQPIVHDLFVLRGTNKADAGKELETQKEVVVS)
MLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLS

(RQIADIILPMLAKQQMHIDSHEALGVLNTLFEILAPSSL)
RPVDMLLRSMFVTPNTMASVS

(TVQLWISGILAILRVLISQSTEDIVLSRIQELSFSPYLIS)

CTVINRLRDGDSTSTLEEHSEGKQIKNLPEETFSRFLLQLVGILLEDIVTKQLKVEMSEQ
QHTFYCQELGTLLMCLIHIFKSGMFRRITAAATRLFRSDGCGGSFYTLDSLNLRARSMIT
THPALVLLWCQILLLVNHTDYRWWAEVQQTPKRHSLSSTKLLSPQMSGEEEDSDLAAKLG
MCNREIVRRGALILFCDYVCQNLHDSEHLTWLIVNHIQDLISLSHEPPVQDFISAVHRNS
AASGLFIQAIQSRCENLSTPTMLKKTLQCLEGIHLSQSGAVLTLYVDRLLCTPFRVLARM
VDILACRRVEMLLAANLQSSMAQLPMEELNRIQEYLQSSGLAQRHQRLYSLLDRFRLSTM
QDSLSPSPPVSSHPLDGDGHVSLETVSPDKDWYVHLVKSQCWTRSDSALLEGAELVNRIP
AEDMNAFMMNSEFNLSLLAPCLSLGMSEISGGQKSALFEAAREVTLARVSGTVQQLPAVH
HVFQPELPAEPAAYWSKLNDLFGDAALYQSLPTLARALAQYLVVVSKLPSHLHLPPEKEK
DIVKFVVATLEALSWHLIHEQIPLSLDLQAGLDCCCLALQLPGLWSVVSSTEFVTHACSL
IYCVHFILEAVAVQPGEQLLSPERRTNTPKAISEEEEEVDPNTQNPKYITAACEMVAEMV
ESLQSVLALGHKRNSGVPAFLTPLLRNIIISLARLPLVNSYTRVPPLVWKLGWSPKPGGD
FGTAFPEIPVEFLQEKEVFKEFIYRINTLGWTSRTQFEETWATLLGVLVTQPLVMEQEES
PPEEDTERTQINVLAVQAITSLVLSAMTVPVAGNPAVSCLEQQPRNKPLKALDTRFGRKL
SIIRGIVEQEIQAMVSKRENIATHHLYQAWDPVPSLSPATTGALISHEKLLLQINPEREL
GSMSYKLGQVSIHSVWLGNSITPLREEEWDEEEEEEADAPAPSSPPTSPVNSRKHRAGVD
IHSCSQFLLELYSRWILPSSSARRTPAILISEVVRSLLVVSDLFTERNQFELMYVTLTEL
RRVHPSEDEILAQYLVPATCKAAAVLGMDKAVAEPVSRLLESTLRSSHLPSRVGALHGVL
YVLECDLLDDTAKQLIPVISDYLLSNLKGIAHCVNIHSQQHVLVMCATAFYLIENYPLDV
GPEFSASIIQMCGVMLSGSEESTPSIIYHCALRGLERLLLSEQLSRLDAESLVKLSVDRV
NVHSPHRAMAALGLMLTCMYTGKEKVSPGRTSDPNPAAPDSESVIVAMERVSVLFDRIRK
GFPCEARVVARILPQFLDDFFPPQDIMNKVIGEFLSNQQPYPQFMATVVYKVFQTLHSTG
QSSMVRDWVMLSLSNFTQRAPVAMATWSLSCFFVSASTSPWVAAILPHVISRMGKLEQVD
VNLFCLVATDFYRHQIEEELDRRAFQSVLEVVAAPGSPYHRLLTCLRNVHKVTTC
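Because the HEAT repeat regions are marked with parentheses in the listing above, they can be pulled out mechanically. The following Python sketch assumes that convention; the function names `heat_segments` and `plain_sequence` are hypothetical, and the excerpt is the first three marked regions copied from the listing.

```python
import re

def heat_segments(annotated):
    """Return the residue strings enclosed in parentheses, in order of appearance."""
    text = "".join(annotated.split())  # drop line breaks and spaces
    return re.findall(r"\(([A-Z]+)\)", text)

def plain_sequence(annotated):
    """Return the sequence with parentheses and whitespace stripped out."""
    return re.sub(r"[^A-Z]", "", annotated)

# Excerpt from the annotated listing: three marked regions and the
# unmarked residues between them.
excerpt = """
(RPYLVNLLPCLTRTSKRPEESVQETLAAAVPKIMASFGN)
FAN
(DNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRR)
TQYFY
(SWLLNVLLGLLVPVEDEHSTLLILGVLLTLRYLVPLLQQ)
"""

segments = heat_segments(excerpt)
print(len(segments))               # number of marked regions in the excerpt
print([len(s) for s in segments])  # length of each marked region
print(len(plain_sequence(excerpt)))
```

Running the same functions over the full listing would give the count and lengths of all marked regions, which can be compared with the figure of 10 repeats of about 40 residues stated in the introduction.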

Huntingtin Protein: PolyQ region and HEAT repeats

Huntingtin Protein Full Sequence

Sources:

Sequence information: SwissProt
http://ca.expasy.org/cgi-bin/get-sprot-entry?P42858

HEAT repeat reference: European Bioinformatics Institute
http://www.ebi.ac.uk/proteome/index.html?http://www.ebi.ac.uk/proteome/ANASP/interpro/top15r.html