Professional webcred transcripts have come back from distant lands. I am awfully excited about this; unreasonably excited, even. I have a primal affinity for clean, freshly-minted transcripts, particularly modern ones in all their searchable glory.
I’ll spare most of you my full review of the transcripts, but here is a quick overview:
- The transcripts are beautiful.
- The English is clean.
- They get most names, proper nouns, organizations, and technology references right.
- Their attention to small connecting words and under-the-breath comments is excellent, some 99.5% capture.
- They were fast: 1 week turnaround for 15 hours of dense audio.
- They are inconsistent: [inaudible] usage, accuracy of names (full names in some places, and not in others), session formatting, paragraph breaks. The transcription of podcasting audio clips was super-shaky.
- They get names, proper nouns, organizations, and technology references wrong. They should have lists of these terms to avoid that…
- The transcript is wrong or confused in rare places.
- Many passages were misattributed.
- They get key names wrong; not just one or two. They inconsistently attribute the same speaker’s voice to more than one name. They get a name right on one line, only to misspell it (apparently using audio-to-text software – “I am Hussain Direction”) on another.
I imagine the first two ugly problems are in part because many different people transcribe any given session… but names are important enough to merit separate passes just to get them right.
People always tell me I’m too critical ^_^. Don’t get me wrong; I’m immensely grateful that such services exist, glad this was done for webcred, and hope we will do the same for Wikimania. Rest assured that this criticism is presented with love and a sense of kinship. I also enjoyed working up verbatim transcripts of two of the sessions. For comparison, here are my transcript of Friday morning and the professional version.