Ask The Experts

How To Transcribe A YouTube Video

Matthew Hughes 27-08-2015

Satyannarayana Asks:

I have some university lectures saved on YouTube. How can I convert these into text?

Matthew’s Reply:

So, this is a really interesting question.

We’ve looked at how to download videos from YouTube before. We’ve dissected all the ways you can convert a video file to an MP3. But we’ve never looked at how to convert a YouTube video to text.

It turns out, it’s surprisingly easy, with a couple of caveats. Here’s how to do it in the browser, on your computer, and with the help of someone else.

The Firebug Way

This approach requires that you use the Firefox browser. If you haven’t already got it, download it. If you haven’t used it in a while, you should update it. This approach was tested with the latest version of Firefox (40.0), on OS X 10.10.5.

Then, download and install Firebug. For those who haven’t come across this before, this is a tool frequently employed by developers when creating websites. It allows them to dynamically tweak the design, markup and structure of a webpage, test JavaScript code, and debug any problems. But it’s also got a number of uses outside of web development. Don’t worry if you’re not a coder. You can still follow along with this tutorial.


It’s worth pointing out that there’s a version of Firebug for Chrome, IE, Opera and Safari. This spin, called Firebug Lite, doesn’t work with this tutorial. You have to use Firefox.

Once it’s successfully installed, open it and click ‘Net’. By default, the Net panel is turned off, so you’ll have to activate it.


Then head to the YouTube video you wish to transcribe. Click on CC, and pause the video.


YouTube can also translate captions in real time, although the accuracy isn’t great. If you wish to get a transcription in a foreign language, click the gear icon, then “Subtitles”, select “Translate”, and choose your language.


Back in the Net tab, you’re going to need to search for “timedtext” (one word, as it appears in the address of the caption request).


Once you’ve found it, click it to expand the entry, then select the “Response” tab. This will contain the entirety of your transcription in an XML format.


Select it, copy it, and paste it into your favorite text editor. Then get prepared to do some serious tidying up. The YouTube auto-transcriber is questionable at best, and in all of my tests, it produced some pretty strange stuff.
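If you’d rather not strip the markup by hand, a short script can do the first pass of tidying for you. What follows is only a minimal sketch in Python. It assumes the XML you copied from the Net panel is a single `<transcript>` element wrapping a series of `<text start="..." dur="...">` entries, which may not match what YouTube serves you. The function name `captions_to_text` and the sample data are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET


def captions_to_text(xml_string):
    """Return the caption text from a timed-text XML string, one caption per line."""
    root = ET.fromstring(xml_string)
    lines = []
    # Assumption: every caption lives in a <text> element somewhere under the root.
    for node in root.iter("text"):
        raw = "".join(node.itertext())
        # Captions can be escaped twice (e.g. &amp;#39;), so unescape once more,
        # then collapse stray line breaks and repeated spaces.
        cleaned = " ".join(html.unescape(raw).split())
        if cleaned:
            lines.append(cleaned)
    return "\n".join(lines)


# Made-up sample in the assumed format, not real YouTube output.
SAMPLE = (
    '<transcript>'
    '<text start="0.0" dur="2.5">welcome to today&amp;#39;s lecture</text>'
    '<text start="2.5" dur="3.0">we will cover\nlinear algebra</text>'
    '</transcript>'
)

if __name__ == "__main__":
    print(captions_to_text(SAMPLE))
```

To use it on your own lecture, save the copied XML to a file, read it in, and pass the contents to `captions_to_text`. The script only removes the tags and timing attributes. It won’t fix the words the auto-transcriber got wrong.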


You said you’re planning to use it on lectures, however. These tend to be recorded in a less noisy environment, and may therefore produce better results. As always, your mileage will vary.

Don’t forget, some lectures come with pre-written subtitles. This means you don’t have to depend on the ones auto-generated by YouTube. You can use this method to gain access to them.

With Express Scribe Free

I feel the Firebug approach is the best one. It’s free, and despite some dubious transcriptions, it works. It’s certainly not the only way, though.

There are also some free packages that make it easy to transcribe audio files, either by hand or using the built-in speech recognition software in Microsoft Windows. One of the best I’ve come across is Express Scribe Free, available as a free download for OS X and Windows.

This is a professional-quality software package, used by people who actually work as transcribers. If you get frustrated with the quality of YouTube’s automatic captions and want to actually manually transcribe your own lectures, this is for you.

There’s only one prerequisite: you will need to convert your YouTube video to an MP3. Then, you’re ready to start transcribing.

The version for OS X isn’t too dissimilar to the Windows one. It allows you to drop an audio file in, and control it in a way that makes it easy for you to accurately take a record of what’s being said. But there’s one downside: it doesn’t allow you to use OS X’s built-in voice recognition software.
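For the MP3 conversion step mentioned above, one possible route (my assumption, not the method from the linked article) is the free command-line tool ffmpeg. The Python sketch below only builds and runs an ffmpeg command. It assumes ffmpeg is installed and on your PATH, that you have already downloaded the lecture as a video file, and the file name `lecture.mp4` and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_mp3_command(video_path, bitrate="128k"):
    """Build an ffmpeg command that extracts the audio track as an MP3."""
    source = Path(video_path)
    target = source.with_suffix(".mp3")  # e.g. lecture.mp4 -> lecture.mp3
    return [
        "ffmpeg",
        "-i", str(source),  # input video file
        "-vn",              # leave out the video stream
        "-b:a", bitrate,    # audio bitrate for the MP3
        str(target),
    ]


def convert_to_mp3(video_path):
    """Run the conversion; raises an error if ffmpeg fails or is missing."""
    subprocess.run(build_mp3_command(video_path), check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so you can inspect it first.
    print(" ".join(build_mp3_command("lecture.mp4")))
```

Once you’re happy with the printed command, call `convert_to_mp3("lecture.mp4")` and drop the resulting MP3 into Express Scribe.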


The Windows version does let you use the built-in voice recognition, but don’t expect much from it. It’s still very much mistake-prone. For more information, check out Ryan Dube’s detailed run-down of Express Scribe.

Pay Someone To Do It For You

Of course, there’s also a third option.

Depending on your budget, you could get someone to transcribe your lectures for you. This doesn’t necessarily have to be expensive. On Fiverr (a popular services marketplace where tasks start at $5), there are 458 different vendors of transcription services.


Some of the most highly rated of these offer 10 minutes of transcription for the bottom rate of one Abraham Lincoln ($5). With Fiverr, though, you sacrifice speed for price. If you pay the rock-bottom price, you can expect to wait as long as two weeks to get your work done, although you can pay extra to get the task rushed.


Alternatively, there are the likes of PeoplePerHour and Elance. Fellow MakeUseOf writer Harry Guinness depends on the latter for the transcription of his interviews:

“It’s easy to find people online through sites like who offer transcription services. Simply post your job to the site and they’ll make a pitch saying how much they can do it for.

When you’re posting the job you want to be clear with exactly what you need. Link to the video and ask them to transcribe the first 15 seconds in their reply. I’d pick the cheapest person whose profile looks good and whose transcription is accurate.

I’ve found that I pay about $20 for an hour or two’s transcription. However, I normally need it done on the hurry up. If you can afford to wait, or you have a lot that needs doing, I’d expect you to be able to find a competent person who’ll do it for around $10 an hour.”
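To get a rough feel for what the prices quoted above mean for a whole lecture, you can run the numbers. The sketch below is a back-of-the-envelope estimate in Python, not a quote from any marketplace. It assumes Fiverr gigs are sold in fixed 10-minute blocks at $5 each, that the $10 rate is per hour of audio, and that your lecture is 50 minutes long. The function names are hypothetical.

```python
import math


def fiverr_cost(minutes, price_per_gig=5.0, minutes_per_gig=10):
    """Cost when each gig covers a fixed block of minutes (part blocks round up)."""
    return math.ceil(minutes / minutes_per_gig) * price_per_gig


def hourly_cost(minutes, rate_per_hour=10.0):
    """Cost when a freelancer charges per hour of audio."""
    return minutes / 60 * rate_per_hour


if __name__ == "__main__":
    lecture_minutes = 50  # hypothetical lecture length
    print("Fiverr estimate:", fiverr_cost(lecture_minutes))
    print("Hourly estimate:", hourly_cost(lecture_minutes))
```

Under these assumptions, the per-block pricing adds up quickly for long recordings, so it’s worth pricing a full series of lectures both ways before you commit.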

It also goes without saying that transcribing work is a great way to earn money online.

Lower Your Expectations

There are a number of ways to transcribe audio recordings. But the question is whether they’re any good.

I found that the auto-captions on YouTube were simply not good enough. They produced far too many mistakes, often rendering the resulting text unintelligible. Ryan was similarly unimpressed with Microsoft’s built-in speech recognition software.

If you want an accurate transcription, then you’re going to have to make some pretty steep compromises. Either you hire someone to do it for you, which can be expensive, or you do it yourself, which is time-consuming. The choice is yours.

Image Credits: letters falling by Creativa Images via Shutterstock


  1. Emily
    July 23, 2019 at 2:57 am

    nice one

  2. Sam Young
    April 24, 2018 at 2:25 am

    Hi there, have you got an update for 2018?

  3. Sutton Turner
    April 9, 2018 at 2:48 pm

    I like how you talked about how transcribing a youtube production for a lecture can help with a less noisy environment. I have a lecture coming up and we are supposed to watch a youtube video. Thanks for the option on how to transcribe a video.

  4. KJ
    September 8, 2016 at 4:12 pm

    This worked very well for me for videos of a lecture series. I copied the text from the 'response' tab into excel, and that magically got rid of all the tags etc, leaving only the key text behind.

  5. MS
    June 11, 2016 at 1:26 am

    You have an error in the text. You wrote "Request" but you meant "Response". Also, "the dropdown" is misleading. Perhaps "tab" would be more accurate.

    Thank you very much for this tutorial.

    • Bethy
      October 11, 2016 at 8:33 pm

      This comment helped immensely. Thank you!

  6. Ernest
    March 20, 2016 at 3:05 pm

    Given how painful transcribing audio is, people repeatedly ask us why there is still no software that can automatically take an audio and spit out its transcribed text with good accuracy. Now, it’s not entirely true that there is no such software - there are many, but they don’t help in transcribing real-world audio which typically involves handling multiple voices and all kinds of background noises.

    Everybody has got their own style of speaking

    Training a machine to recognize human voice has proven to be very difficult due to the variations in how people speak a particular language. Despite being the most widely spoken language, English itself sounds considerably different in various parts of the world.

    Even if everyone spoke a language the exact same way, there is still the added difficulty of training the system for different voices - from young to old to male to female to hoarse to soft to - you get the drift. Even the same person tends to speak differently in different situations, for example, during a moment of excitement or a bout of cold. Let’s not also forget that some people speak faster, while others speak slower, with lots of ums, ers and uhs, which aren’t even part of any spoken language! Arriving at a speech model that can handle all these variations (like humans do) is really tricky.

    So, are we stuck?

    Based upon what I've read, and my experience, it takes a minimum of 4-5 hours to transcribe 1 hour of digitally recorded interviews. Now the real question is, what's your time worth?

    The transcription process takes precision and accuracy, commitment and dedication, focus and patience.

    My suggestion is to use one of transcription services like

    They are used by Harvard and Stanford. The fact that such top universities trust them with their transcripts speaks to their quality.

  7. glykeria
    March 17, 2016 at 3:42 pm

    What are the options for a paid software? I am looking to get something like that for my company. It would be great if we can have subtitles in our video but we would rather buy a software

  8. AN
    February 20, 2016 at 6:47 pm

    Mike, this is absolutely fantastic. It's like you popped open my brain and saw what I need. Great indeed.

  9. Kate Toon Copywriter
    January 8, 2016 at 10:20 pm

    Thanks for the recommendation of Express Scribe Free Transcription. I'm trying it out right now.

  10. Anonymous
    September 3, 2015 at 4:15 am

    Might I suggest

    It's a site I built for editing the automatic captions that YouTube produces for videos.

    Also, if you go to the Transcription Pad available from the home page, you can download a copy of the automatic captions to your computer.

    Questions, feel free to contact me.


    • Mihir Patkar
      September 3, 2015 at 7:30 am

      This is really cool! Thanks Mike. We usually don't allow self-promotional comments, but this one is extremely useful and on topic so I'm going to keep it in.

    • Borino
      October 17, 2016 at 7:11 pm

      Very helpful link, Thanks!