Friday, March 5, 2010

YouTube videos automatically captioned in English

If you're learning English and wondering where you can get videos with subtitles, look no further than your friendly neighborhood YouTube. Google is launching new features on YouTube that should make good language-learning resources for those whose target language is English. In particular:
  1. For selected English video content, YouTube is implementing automatic captioning.

  2. For all English video content, YouTube is implementing the ability for video owners to automatically caption their videos by simply uploading a transcript (YouTube will do all the work of putting the captions in the right place).
Uploaded transcripts should in theory be perfect, but how about auto captioning? According to Google:
The captions will not always be perfect … but even when they're off, they can still be helpful—and the technology will continue to improve with time.
Although I haven't tried it out, it seems that the owner of a video can edit the caption files, so there appears to be a way to correct faulty machine captioning. It'd also be great if this were opened up to crowdsourcing in some way, but that doesn't seem to be available at this point.

For you English learners, consider yourselves lucky to have this tool available to you. For the rest of us, let's keep an eye out for Google expanding this to other languages.

Automatic captions in YouTube [The Official Google Blog]
YouTube Expands Auto-Captioning Program [WebProNews]
YouTube Launches Auto-Captions For All Videos [TechCrunch]


  1. I saw the news about this a few months ago and have been eagerly awaiting mass transcription of videos, but YouTube has been slow in rolling it out. You can, however, find just the videos with transcripts turned on by running a search and then changing the Search Options. I wrote up instructions on how to find them on my blog a while back: Finding videos with captions for listening practice

    A couple other sites that have captioned videos or transcripts are Speakertext and DotSub.

  2. The transcriptions can be shockingly bad, but this shouldn't be a surprise to anyone who has tried using dictation software on their home computer. Dictation software is great... if you've got a high-spec voice recognition headset. If not, it messes up words, interprets every breath as a word, and so on.

    Incidentally, this is why the voice recognition in language-learning packages is so inadequate -- it's being sold to people without voice recognition headsets, so it has to be possible to get a green light without the computer really understanding what you've said.

    By analogy to painting, it accepts a blur in the rough form of a person as a person.
