Immense chemical names

Q From Michael Snyder and Phil Glatz: I recently read of a chemical term which boasted an incredible 1,000+ letters. According to the brief piece, the word has appeared only once or twice in journals. The article went on to point out that such words can be constructed exactly as one constructs molecular compounds. I’d still like to know what it is.

A It’s indeed possible to create words as long as you like for complex compounds such as proteins, which consist of large numbers of amino acids joined together. You just add the names of the amino acids one after another until you run out of compound or, more probably, time and patience. The longest one I’ve seen in print is this, which makes even supercalifragilisticexpialidocious look tame:

methionylglutaminylarginyltyrosylglutamylserylleucyl phenylalanylalanylglutaminylleucyllysylglutamylarginyl lysylglutamylglycylalanylphenylalanylvalylprolylphenyl alanylvalylthreonylleucylglycylaspartylprolylglycylisol eucylglutamylglutaminylserylleucyllysylisoleucylaspartyl threonylleucylisoleucylglutamylalanylglycylalanylaspartyl alanylleucylglutamylleucylglycylisoleucylprolylphenyl alanylserylaspartylprolylleucylalanylaspartylglycylprolyl threonylisoleucylglutaminylasparaginylalanylthreonylleucyl arginylalanylphenylalanylalanylalanylglycylvalylthreonyl prolylalanylglutaminylcysteinylphenylalanylglutamyl methionylleucylalanylleucylisoleucylarginylglutaminyllysyl histidylprolylthreonylisoleucylprolylisoleucylglycylleucyl leucylmethionyltyrosylalanylasparaginylleucylvalylphenyl alanylasparaginyllysylglycylisoleucylaspartylglutamylphenyl alanyltyrosylalanylglutaminylcysteinylglutamyllysylvalyl glycylvalylaspartylserylvalylleucylvalylalanylaspartylvalyl prolylvalylglutaminylglutamylserylalanylprolylphenylalanyl arginylglutaminylalanylalanylleucylarginylhistidylasparaginyl valylalanylprolylisoleucylphenylalanylisoleucylcysteinyl prolylprolylaspartylalanylaspartylaspartylaspartylleucyl leucylarginylglutaminylisoleucylalanylseryltyrosylglycyl arginylglycyltyrosylthreonyltyrosylleucylleucylserylarginyl alanylglycylvalylthreonylglycylalanylglutamylasparaginyl arginylalanylalanylleucylprolylleucylasparaginylhistidyl leucylvalylalanyllysylleucyllysylglutamyltyrosylasparaginyl alanylalanylprolylprolylleucylglutaminylglycylphenylalanyl glycylisoleucylserylalanylprolylaspartylglutaminylvalyllysyl alanylalanylisoleucylaspartylalanylglycylalanylalanylglycyl alanylisoleucylserylglycylserylalanylisoleucylvalyllysylisol eucylisoleucylglutamylglutaminylhistidylasparaginylisoleucyl glutamylprolylglutamyllysylmethionylleucylalanylalanylleucyl lysylvalylphenylalanylvalylglutaminylprolylmethionyllysyl alanylalanylthreonylarginylserine.

This is the full name, 1,913 characters long, of tryptophan synthetase, a protein made up of 267 amino acids. I extracted this monster from The Word Lover’s Dictionary by Josefa Heifetz, but it is also cited in Mrs Byrne’s Dictionary of Unusual, Obscure, and Preposterous Words by the same author. If you want to break it down into its components, it consists of many repetitions of the adjectival forms of the names of amino acids, such as alanyl, methionyl, threonyl, and valyl, all of which end in yl, with one instance of serine at the end.
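For anyone who would like to experiment, the way such a name is put together can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure Chemical Abstracts used: the table lists yl forms that occur in the name above, the function names peptide_name and split_name are hypothetical ones chosen for this sketch, and the rule assumed is the simple one described here, in which every amino acid except the last takes its yl form.

```python
# The yl forms that occur in the long name quoted above, keyed by the
# ordinary name of each amino acid. (Illustrative table, not a complete
# chemical nomenclature.)
YL_FORMS = {
    "alanine": "alanyl",
    "arginine": "arginyl",
    "asparagine": "asparaginyl",
    "aspartic acid": "aspartyl",
    "cysteine": "cysteinyl",
    "glutamic acid": "glutamyl",
    "glutamine": "glutaminyl",
    "glycine": "glycyl",
    "histidine": "histidyl",
    "isoleucine": "isoleucyl",
    "leucine": "leucyl",
    "lysine": "lysyl",
    "methionine": "methionyl",
    "phenylalanine": "phenylalanyl",
    "proline": "prolyl",
    "serine": "seryl",
    "threonine": "threonyl",
    "tyrosine": "tyrosyl",
    "valine": "valyl",
}


def peptide_name(residues):
    """Build one long name: yl forms for all residues but the last."""
    if not residues:
        raise ValueError("need at least one amino acid")
    *body, last = residues
    return "".join(YL_FORMS[r] for r in body) + last


def split_name(name):
    """Break a long name back into its yl pieces plus the final residue."""
    # Drop the spaces inserted for line wrapping and any closing full stop.
    name = "".join(name.split()).rstrip(".")
    forms = sorted(YL_FORMS.values(), key=len, reverse=True)
    parts = []
    pos = 0
    while pos < len(name):
        for form in forms:
            if name.startswith(form, pos):
                parts.append(form)
                pos += len(form)
                break
        else:
            # Nothing matched, so whatever is left is the final residue.
            parts.append(name[pos:])
            break
    return parts


if __name__ == "__main__":
    short = peptide_name(["methionine", "glutamine", "arginine", "tyrosine"])
    print(short)              # methionylglutaminylarginyltyrosine
    print(split_name(short))  # ['methionyl', 'glutaminyl', 'arginyl', 'tyrosine']
```

Running split_name on the full name above and counting the pieces would be one way to compare it with the quoted figure of 267 amino acids; the sketch strips the spaces that were added only so the name could wrap across lines.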

After this piece originally appeared, Alan Wachtel wrote from California to tell me that this word was first printed in the journal Chemical Abstracts in the 1960s. He commented: “At one time, proteins whose structure was known were named just as you described, by the sequence of amino acids composing them. In the 1960s, when techniques for sequencing long proteins were developed, this rule began to generate extremely long chemical names. ... As longer and longer proteins were analyzed, this naming convention quickly grew unmanageable, and Chemical Abstracts reverted to calling these proteins by descriptive names. I think the 1,913-letter chemical name for tryptophan synthetase that you cited must have been the longest term published before the rule was modified”.

The results of such amalgamations can be as big as you like, but they are extremely difficult to parse and comprehend as other than extended chemical formulae. It was the German influence on chemical matters in the nineteenth century which left us a legacy in which we tend to record:

dichlorodiphenyltrichloroethane,
dimethylsulphoniopropionate,
octamethylcyclotetrasiloxane, and
ribulosebisphosphatecarboxylaseoxygenase

as long strings of characters, though it is usual these days to break them into more manageable sections, or to use abbreviations (for example, the first of these is better known as DDT and the last is usually referred to as RUBISCO, a crucial enzyme for life on Earth that catalyses the first major step of carbon fixation in photosynthesis).

Copyright © Michael Quinion, 1996–. All rights reserved.

Page created 18 Dec 1999; Last updated 08 Jan 2000