Grammar Translation Machine in Researchers Will Fix Try

The
makers of a University of Southern California computer translation system
consistently rated among the world’s best are teaching their
software something new: English grammar.

Most modern "machine translation" systems, including the highly
rated one created by USC’s Information Sciences Institute, rely on brute
force correlation of vast bodies of pre-translated text from such sources
as newspapers that publish in multiple languages.

Software matches up phrases that consistently show up in parallel fashion
— the English "my brother's pants" and the Spanish "los pantalones
de mi hermano" — and then uses these matches to piece together translations
of new material.
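As a toy illustration of that brute-force correlation idea, here is a minimal sketch in Python. This is not USC's actual system; the tiny parallel corpus, the phrase lengths, and the scoring (raw co-occurrence counts) are all invented for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical parallel corpus: aligned (English, Spanish) sentence pairs,
# standing in for the multilingual newspaper text real systems train on.
parallel = [
    ("my brother's pants", "los pantalones de mi hermano"),
    ("my brother's house", "la casa de mi hermano"),
    ("my sister's pants", "los pantalones de mi hermana"),
]

def ngrams(words, n):
    """All contiguous phrases of length n from a word list."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Count how often each (English phrase, Spanish phrase) pair shows up
# together in an aligned sentence pair.
counts = Counter()
for en, es in parallel:
    en_words, es_words = en.split(), es.split()
    for n, m in product(range(1, 3), range(1, 4)):
        for pair in product(ngrams(en_words, n), ngrams(es_words, m)):
            counts[pair] += 1

def best_translation(phrase):
    """Return the Spanish phrase most often aligned with `phrase`, if any."""
    candidates = {es: c for (en, es), c in counts.items() if en == phrase}
    return max(candidates, key=candidates.get) if candidates else None
```

A real system would use alignment models and probability estimates rather than raw counts, but the gist is the same: phrases that consistently co-occur across translated sentence pairs become the building blocks for translating new material.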

Quite apart from the mysteries of my brother's pants,
human grammar is one of the gnarliest knots of logic and lack of logic
ever to be conceived by the human brain. The basic problem is that the
meaning of any group of words is determined not only by the words themselves,
but also by the underlying relations between the words, by the situation
in which they were strung together, by the personality and mood of the
speakers, and by previous conversations or knowledge on the part of the participants.

For example, last Saturday we were waiting for a tennis court at the “Just Don’t Suck Tennis Club” and noticed one of our co-conspirators reading the New York Times. Asking for a section, we were disappointed to discover that it was from Thursday, three days old. The next day, Sunday, we arrived early to find the same friend again reading the Times. “Still working on Thursday’s?” I asked, nodding at the paper.

“Yeah,” he answered, surprised, “8:30 to 4, same as ever.”

A really smart guy at MIT named Noam Chomsky wrote
about something he called "deep grammar" more than 25 years ago, and if we
could figure that one out, true machine translation would
be possible, but so far not even Deep Blue can grok Deep Grammar.

Good machine translation is a sort of Holy Grail
for one whole school of cybernetics. If achieved, it would have profound
implications for how we relate to information and each other. Imagine
all of the resources of the Internet available instantly in whatever
language you are most comfortable with; contextual indexing of all podcasts,
soundtracks, webcasts and conference recordings; HAL-like computers that
can truly understand natural speech and talk back intelligently; even
the Star Trek-like handheld universal translator.

So it is not surprising that, as we write, some of the
smartest scientists on the planet are working on this nut, or that millions
of dollars and mondo supercomputing resources are committed to the effort. The
study in this story, a $285,000 effort called the Advanced Language
Modeling for Machine Translation project at USC, is a tiny part of
that work, but the article includes a pretty good review of the state
of the overall effort.

from a University of Southern California press release
