The Longest Now


On Transcription [ the Joys and Sorrows of ]
Sunday February 27th 2005, 3:15 pm

First, here is a quick overview of the professional webcred
transcripts, as initially received.  (By the time you read this,
they are probably already cleaned up, so this is a moot point.
This is presented for transcription geeking value only.)  More
detailed commentary and examples at the end.

[Now I really want to interview someone who runs a transcription
business… what kind of special effort do they make with celebrities’
names and quotes?  What other VIP services are available?  What kind
of insurance do they need, and what happens when you transcribe slander?
How hard is it to produce a perfect real-time transcript, and how close
to “real-time” can one get?  My private hunch: as close as
needed to exactly real time, including ‘post’-processing.]


The good:
  1. The transcripts are beautiful.  In contrast to some gov
    and court transcripts I have seen, they are well laid out and easy to scan.
  2. The English is clean.
    Interruptions and stuttering are cleanly handled, and most sessions
    have useful sentence and paragraph breaks (even when the speaker was
    rambling).
  3. They get most names, proper nouns, organizations, and technology
    references right.  More than I would get without a list of names
    in front of me.
  4. Their attention to small connecting words and under-the-breath
    comments is generally excellent.  Overall, their
    accuracy is fabulous [which is, as I’m sure you all know by now, the Official Word of 2005], some 99.5%.
  5. This was done quickly: a ~1 week turnaround for 15 hours of dense audio.


The bad:
  1. They are inconsistent.  In some places, [inaudible] is used and
    every speaker gets his/her own line and paragraph.  In others, just a
    few dashes or underscores “___” are used to indicate something
    inaudible.  Full names are used in some places, and not in others,
    at times ambiguously.  Some sessions have poor formatting and
    paragraph breaks; the transcription of podcasting audio clips, for
    instance.
  2. They get many names, proper nouns, organizations, and technology
    references wrong.  They should have lists of these terms in front
    of them, and should ask for what they don’t have.  This would help
    them spell podcasting without a space, write “blogosphere” without a
    cap, and remember that yes, Jon Bonn

