The Longest Now

On disambiguation and The Atomization of Meaning
Thursday June 25th 2009, 7:46 pm
Filed under: Glory, glory, glory,metrics,Uncategorized,wikipedia

Disambiguate has been a somewhat obscure term for ‘specify’ for ages.  And the noun form, disambiguation, has been used even more sparingly.  At some point in the last century, perhaps in the 1950s, it became a popular term in computational linguistics.   And before that it was basically only used by one person, writing about logic and semantics in the early 19th century.  All of this sprang to my mind because of the tremendous popularity of the word in and through Wikipedia.  In the encyclopedia, it is the canonical way to describe the clarification of an ambiguous term, the indication of type used to specify the context of an article title.

Google n-grams and other public domain searches suggest disambiguation was not popular at all before the 50s.  It is used in quotes in a 1954 federal court case, expressly referencing the earlier work of the one philosopher and author who consciously used it for a specific purpose: Jeremy Bentham.  But who introduced it into the jargon of linguistics?  And to the original point, who introduced it to Wikipedia?


The word’s recent history touches on Rush, Nirvana, Invictus, Larry, and Magnus… and started with a policy page on Naming conventions/Disambiguating.

Earlier uses of the term in the 50s are in the context of ‘disambiguation programs’ and ‘automatic word disambiguation’. Then by 1960 comes Dwight Bolinger, using it boldly and provocatively. “understanding presupposes disambiguation.  Disambiguation presupposes the processes that make it possible”. This was picked up by literary critics in France, and by other linguists such as Anthony Oettinger writing about automatic translation.

Certainly the Wikipedia usage was guided by the linguistic usage before it:  “word sense disambiguation” and disambiguation in semantic analysis were all the rage across linguistics in the 1990s, as the term had moved out of computational linguistics into the field’s mainstream.

The dominance of the current year’s Internet makes finding the original on-wiki use easier in some ways and harder in others. We can track revisions of most Wikipedia pages, but the use of this term predates the creation, in August 2001, of the new software to preserve all revisions, so some guesswork is required there.

Sometime around March 20, 2001, the issue of distinguishing between two similarly-named Wikipedia articles comes up. User:Invictus (NB: no userpages back then!) creates the article [[RushBand]], following the earlier model of [[NirvanaBand]]. And who should respond with a philosophical note on the right way to disambiguate but Larry Sanger, starting the page Naming_conventions/Disambiguating. Within eight months, Magnus Manske had written new code allowing the use of parentheses in article titles, and the use of the term disambiguation to describe appending a parenthetical clarifier at the end of an article name, or listing a number of similarly-named titles, had taken off.

I like this coinage better than the alternative suggested for those lists:  ‘jump pages’.  

Since first writing this essay, I noticed that in Robert McHenry’s brief history of Britannica Online, he notes that disambiguation was already used for word-sense clarification in searching among encyclopedia editors at this time (presumably back in 1994); so perhaps it was a commonplace among editors offline as well.

At any rate, here is the original quotation from Bentham’s papers laying out the place of Disambiguation in the hierarchy of Exposition. This is from George Bentham’s 1827 work Outline of a New System of Logic, in which he reviews his uncle Jeremy’s papers and a recent set of writings on logic by one Dr. Whately.

He diagrams the elder Bentham’s 12 modes of exposition (physical designation, translation, etymologization, definition, individuation, paraphrasis, archetypation, description, parallelism including antithesis, enumeration, exemplification, and illustration), and says:

If exposition be considered with respect to its immediate object, it may be divided into Onomatopoea, or the giving a new name to an idea, and into Exposition of existing words.

In following the same principle of division, exposition of existing words may be subdivided into the following operations:  —
1. Substitution of a new sense to the one in which a word has already been used, an operation resembling onomatopoea, but attended with much more practical inconvenience, excepting where the use of the word in its old sense be at once disadvantageous, and of rare occurrence.
2. Elucidation—where the object is to give clearness to an obscure term.
3. Disambiguation—where it is to fix the sense of an ambiguous term. This operation has been termed distinction by some Logicians, and erroneously reckoned as a species of division.
4. Ampliation—where it is to extend the sense of a term.
5. Restriction—where it is to restrict the sense of a term.

Let us take pains never to erroneously reckon dabbing a species of division!

As an extra special nostalgia bonus: note the link-preserving awesomeness buried in the first post to wikipedia-l: that link still works, through a dozen TLD, domain, software and naming changes. (Though note Jacob’s comment: it was broken for 30 months.)

2 Comments so far

Re: that link, you’ll note that the BrilliantProse page was added back to Wikipedia in Jan 2009, and made to redirect, precisely so that the link in that email would again work. Edit summary: “(redirect with historical purposes” … so not quite as impressive a feat, but still, pretty neat.

Comment by Jacob Rus 07.01.09 @ 8:41 pm

Well, it was deleted only on June 28, 2006:

Comment by Nemo 08.01.09 @ 4:35 am
