Coronavirus SARS 3C-like Protease

The coronavirus that causes Severe Acute Respiratory Syndrome (SARS) has an RNA genome that encodes a large polyprotein made up of multiple subunits. Included in this polyprotein are two proteases that cleave it into its subunits, including the viral replicase, the coat protein and the "crown" of spikes that gives the coronaviruses their name.

Photomicrograph from a SARS Tutorial at the University of Leicester.

The sequence below is that of the 3C-like protease. Although the SARS protease has not been publicly annotated, other 3C proteases are composed primarily of beta-strands, like this 3C protease of a rhinovirus. The letters in the sequence are grouped according to the sections assigned to different instrumental ensembles in the musical translation; the groups are not intended to represent functional domains of the protein. The viral proteases have recently been identified as a possible target for development of a drug or vaccine against SARS.




QSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLW
LDDTVYCPRHVICTAEDMLPNYEDLLIRKSN
HSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSN
PKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCA
MRPNHTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMEL
PTGVHAGTDLEGKFYGPFVDRQTA
QAAGTDTTITLNVLAWLYAAVINGDRWFLNRF
TTTLNDFNLVAMKYNYEPLTQDHVDILGPLSA
QTGIAVLDMCAALKELLQNGMNGRT
ILGSTILEDEFTPFDVVRQCSGVTFQG

Link to the Music: SARS 3C-like Protease

Notes on the music.

This piece is a single read-through of the sequence of the 3C-like protease of Coronavirus SARS. Because the protein has not been publicly annotated, the segment played by each set of instruments was chosen by letting a "theme" play through before switching to a new set of voices. These were musical rather than biological choices and are not meant to represent functional domains of the protein. However, as in previous pieces, the pitch assigned to each amino acid is a function of its water-solubility, with lower pitches representing the less water-soluble amino acids.
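The solubility-to-pitch idea can be illustrated with a minimal sketch. It is not the method used for the piece: it assumes the Kyte-Doolittle hydropathy scale as a stand-in for "water-solubility", and the MIDI note range (LOW_NOTE, HIGH_NOTE), the function names and the use of a voice index per sequence segment are all hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy values: higher = more hydrophobic (less water-soluble).
# Assumption: the original piece may well use a different solubility scale.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

LOW_NOTE = 48   # hypothetical lowest MIDI note
HIGH_NOTE = 84  # hypothetical highest MIDI note


def residue_to_pitch(residue):
    """Map a one-letter amino acid code to a MIDI note number.

    The least water-soluble residue gets LOW_NOTE and the most
    water-soluble residue gets HIGH_NOTE; others are scaled linearly.
    """
    h = KD_HYDROPATHY[residue]
    h_min = min(KD_HYDROPATHY.values())
    h_max = max(KD_HYDROPATHY.values())
    # 0.0 for the most hydrophobic residue, 1.0 for the most hydrophilic.
    fraction = (h_max - h) / (h_max - h_min)
    return LOW_NOTE + round(fraction * (HIGH_NOTE - LOW_NOTE))


def segments_to_notes(segments):
    """Return (voice_index, residue, pitch) tuples in reading order.

    Each segment is given its own voice index, mirroring the idea that
    each group of letters is played by a different set of instruments.
    """
    notes = []
    for voice, segment in enumerate(segments):
        for residue in segment:
            notes.append((voice, residue, residue_to_pitch(residue)))
    return notes


# Usage: the first two segments of the sequence shown on this page.
first_segments = [
    "QSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLW",
    "LDDTVYCPRHVICTAEDMLPNYEDLLIRKSN",
]
for voice, residue, pitch in segments_to_notes(first_segments)[:5]:
    print(voice, residue, pitch)
```

A linear scaling is only one option; rounding to the notes of a particular musical scale, or assigning rhythm, would be separate decisions that this sketch does not attempt.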