
Immense chemical names

Q From Michael Snyder and Phil Glatz: I recently read of a chemical term which boasted an incredible 1,000+ letters. According to the brief piece, the word has appeared only once or twice in journals. The article went on to point out that such words can be constructed exactly as one constructs molecular compounds. I’d still like to know what it is.

A It’s indeed possible to create words as long as you like for complex compounds such as proteins, which consist of large numbers of amino acids joined together. You just add the names of the amino acids one after another until you run out of compound or, more probably, time and patience. The longest one I’ve seen in print is this, which makes even supercalifragilisticexpialidocious look tame:

methionylglutaminylarginyltyrosylglutamylserylleucyl phenylalanylalanylglutaminylleucyllysylglutamylarginyl lysylglutamylglycylalanylphenylalanylvalylprolylphenyl alanylvalylthreonylleucylglycylaspartylprolylglycylisol eucylglutamylglutaminylserylleucyllysylisoleucylaspartyl threonylleucylisoleucylglutamylalanylglycylalanylaspartyl alanylleucylglutamylleucylglycylisoleucylprolylphenyl alanylserylaspartylprolylleucylalanylaspartylglycylprolyl threonylisoleucylglutaminylasparaginylalanylthreonylleucyl arginylalanylphenylalanylalanylalanylglycylvalylthreonyl prolylalanylglutaminylcysteinylphenylalanylglutamyl methionylleucylalanylleucylisoleucylarginylglutaminyllysyl histidylprolylthreonylisoleucylprolylisoleucylglycylleucyl leucylmethionyltyrosylalanylasparaginylleucylvalylphenyl alanylasparaginyllysylglycylisoleucylaspartylglutamylphenyl alanyltyrosylalanylglutaminylcysteinylglutamyllysylvalyl glycylvalylaspartylserylvalylleucylvalylalanylaspartylvalyl prolylvalylglutaminylglutamylserylalanylprolylphenylalanyl arginylglutaminylalanylalanylleucylarginylhistidylasparaginyl valylalanylprolylisoleucylphenylalanylisoleucylcysteinyl prolylprolylaspartylalanylaspartylaspartylaspartylleucyl leucylarginylglutaminylisoleucylalanylseryltyrosylglycyl arginylglycyltyrosylthreonyltyrosylleucylleucylserylarginyl alanylglycylvalylthreonylglycylalanylglutamylasparaginyl arginylalanylalanylleucylprolylleucylasparaginylhistidyl leucylvalylalanyllysylleucyllysylglutamyltyrosylasparaginyl alanylalanylprolylprolylleucylglutaminylglycylphenylalanyl glycylisoleucylserylalanylprolylaspartylglutaminylvalyllysyl alanylalanylisoleucylaspartylalanylglycylalanylalanylglycyl alanylisoleucylserylglycylserylalanylisoleucylvalyllysylisol eucylisoleucylglutamylglutaminylhistidylasparaginylisoleucyl glutamylprolylglutamyllysylmethionylleucylalanylalanylleucyl lysylvalylphenylalanylvalylglutaminylprolylmethionyllysyl alanylalanylthreonylarginylserine.

This is the full name, 1,913 characters long, for tryptophan synthetase, a protein made up of 267 amino acids. I extracted this monster from The Word Lover’s Dictionary by Josefa Heifetz, but it is also cited in Mrs Byrne’s Dictionary of Unusual, Obscure, and Preposterous Words by the same author. If you want to break it down into its components, it consists of many repetitions of the adjectival forms of the names of amino acids, such as alanyl, methionyl, threonyl, and valyl, all of which end in -yl, with one instance of the plain name serine at the end.
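For anyone who would rather let a computer do the gluing, the recipe can be sketched in a few lines of Python. This is only an illustration of the rule described here, not the procedure Chemical Abstracts used: the table of three-letter codes and the function names build_name and split_name are my own choices for the example, and only the twenty standard amino acids are covered.

```python
# Three-letter code -> (form used inside the chain, name of the free amino acid).
# The table and both function names are illustrative assumptions for this sketch.
AMINO_ACIDS = {
    "Ala": ("alanyl", "alanine"),
    "Arg": ("arginyl", "arginine"),
    "Asn": ("asparaginyl", "asparagine"),
    "Asp": ("aspartyl", "aspartic acid"),
    "Cys": ("cysteinyl", "cysteine"),
    "Gln": ("glutaminyl", "glutamine"),
    "Glu": ("glutamyl", "glutamic acid"),
    "Gly": ("glycyl", "glycine"),
    "His": ("histidyl", "histidine"),
    "Ile": ("isoleucyl", "isoleucine"),
    "Leu": ("leucyl", "leucine"),
    "Lys": ("lysyl", "lysine"),
    "Met": ("methionyl", "methionine"),
    "Phe": ("phenylalanyl", "phenylalanine"),
    "Pro": ("prolyl", "proline"),
    "Ser": ("seryl", "serine"),
    "Thr": ("threonyl", "threonine"),
    "Trp": ("tryptophyl", "tryptophan"),
    "Tyr": ("tyrosyl", "tyrosine"),
    "Val": ("valyl", "valine"),
}

# Reverse lookups used when taking a name apart again.
YL_FORMS = {yl: code for code, (yl, _) in AMINO_ACIDS.items()}
FREE_NAMES = {free.replace(" ", ""): code for code, (_, free) in AMINO_ACIDS.items()}


def build_name(residues):
    """Join the -yl forms of every residue but the last, then add the last one's plain name."""
    if not residues:
        raise ValueError("need at least one amino acid")
    *chain, last = residues
    return "".join(AMINO_ACIDS[code][0] for code in chain) + AMINO_ACIDS[last][1]


def split_name(name):
    """Break a name back into three-letter codes, reading from left to right."""
    # Printed versions contain spaces for line wrapping and a final full stop.
    text = "".join(name.split()).rstrip(".").lower()
    codes = []
    pos = 0
    while pos < len(text):
        rest = text[pos:]
        if rest in FREE_NAMES:  # the final, unmodified amino acid name
            codes.append(FREE_NAMES[rest])
            break
        for yl, code in YL_FORMS.items():
            if rest.startswith(yl):
                codes.append(code)
                pos += len(yl)
                break
        else:
            raise ValueError(f"unrecognised fragment at position {pos}: {rest[:15]!r}")
    return codes


print(build_name(["Met", "Gln", "Arg", "Ser"]))
print(split_name("methionylglutaminyl arginylserine."))
```

Reading from left to right works here because none of the -yl forms in the table is the beginning of another, so each step has only one possible match. In principle the same function could be fed a name as it is printed, spaces and all, but I have only tried it on short examples like the one shown.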

After this piece originally appeared, Alan Wachtel wrote from California to tell me that this word was first printed in the journal Chemical Abstracts in the 1960s. He commented: “At one time, proteins whose structure was known were named just as you described, by the sequence of amino acids composing them. In the 1960s, when techniques for sequencing long proteins were developed, this rule began to generate extremely long chemical names. ... As longer and longer proteins were analyzed, this naming convention quickly grew unmanageable, and Chemical Abstracts reverted to calling these proteins by descriptive names. I think the 1,913-letter chemical name for tryptophan synthetase that you cited must have been the longest term published before the rule was modified”.

The results of such amalgamations can be as big as you like, but they are extremely difficult to parse and comprehend as other than extended chemical formulae. It was the German influence on chemical matters in the nineteenth century which left us a legacy in which we tend to record:

dichlorodiphenyltrichloroethane,
dimethylsulphoniopropionate,
octamethylcyclotetrasiloxane, and
ribulosebisphosphatecarboxylaseoxygenase

as long strings of characters, though it is usual these days to break them into more manageable sections, or use abbreviations (for example, the first of these is better known as DDT and the last is usually referred to as RUBISCO, a crucial enzyme for life on Earth that catalyses the first major step of carbon fixation in photosynthesis).


Copyright © Michael Quinion, 1996–. All rights reserved.
Page created 18 Dec. 1999
Last updated: 8 Jan. 2000



This page URL: http://www.worldwidewords.org/qa/qa-imm1.htm