How to Convert A PDF to Text With Text Extractor

So you have some important data all caught up in a PDF file. A PDF is a document that has been committed to its format and most likely cannot be opened for editing or copying unless the author has allowed that.

How do you convert that PDF to text? You could print it and try to scan it back into your machine using free OCR software, or you could grab this awesome little application called PDF Text Extractor.

We have covered several applications for retrieving data from the confines of a PDF document, but this application concentrates on the text. If that is what you need, your text and nothing but the text, then this is the program for you.

Let’s take a look at it and how it can convert a PDF to text.

I started by downloading and installing the program. It was a quick process and the MSI installer file was about 1.13 megabytes. When I ran the application this is what I saw:

[Screenshot: the PDF Text Extractor main window]


This is a really easy layout and graphical user interface: simply select the PDF you would like to convert to plain text, then select the folder you want the converted plain text file written to. You can type in a full path for each or hit the button next to the blank field to browse.

When you are ready your window will look something like this:

[Screenshot: the window with the source PDF and output folder filled in]

Our PDF document contains the text that we need to reformat and put into a manual. That file looks like this before we begin:

[Screenshot: the original PDF document]

Seven full pages of text, text and more text. That would be a lot of typing for poor Betty, our department's secretary. So I went Googling and found our little application. I fired it up and hit that magic convert button. It literally took 2 seconds, and then a window popped up saying the conversion was complete:

[Screenshot: the conversion complete message]

I went looking in my d:\ drive for the file, and realized I had no idea what it was called or what the extension would be. I sorted the files by date created and found what I was looking for:

[Screenshot: the new text file in the d:\ drive listing]
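If you would rather not sort the folder by hand, the same "find the newest file" trick can be sketched in a few lines of Python. This is only an illustration and not part of PDF Text Extractor: the function name `newest_text_file` is made up for this example, and it sorts by last-modified time, which for a freshly written file should point at the same one.

```python
from pathlib import Path
from typing import Optional


def newest_text_file(folder: str) -> Optional[Path]:
    """Return the most recently modified .txt file in folder, or None if there is none."""
    candidates = [p for p in Path(folder).glob("*.txt") if p.is_file()]
    if not candidates:
        return None
    # st_mtime is the last-modified timestamp; the file that was just
    # written will have the largest value.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # "D:/" is only an example; use whichever output folder you selected.
    print(newest_text_file("D:/"))
```

Point it at the output folder you chose in the application and it prints the path of the most recently written text file.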

I opened the text file up and, as you can see above, the file's name seems to always be TextFile.txt. The contents of the text file were just that... all of the text that I needed! Yeah!

This is what I saw:

[Screenshot: the extracted text file]

My formatting was not 100% there, but all of the glorious text was, and now it is just a matter of copying, pasting and formatting. Nowhere near as big of a job as it was before! And we do not need Adobe Acrobat Reader, Writer or any other nonsense on our machine. Just this little application and a PDF file. You can open the text file in Word, Notepad++, Wordpad or whatever your favorite editor is.
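Some of that reformatting can be scripted. The sketch below assumes the most common kind of damage in extracted text, which is paragraphs broken into short hard-wrapped lines with stray runs of spaces; your file may look different, so treat it as a starting point. The function name `tidy_extracted_text` is made up for this example.

```python
import re


def tidy_extracted_text(raw: str) -> str:
    """Rejoin hard-wrapped lines into paragraphs and collapse stray whitespace."""
    raw = raw.replace("\r\n", "\n")  # normalise Windows line endings
    # One or more blank lines is treated as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", raw)
    cleaned = []
    for para in paragraphs:
        # Join the wrapped lines of a paragraph and squeeze repeated spaces.
        text = " ".join(para.split())
        if text:
            cleaned.append(text)
    return "\n\n".join(cleaned)
```

To use it, read TextFile.txt into a string, pass the string through the function, and write the result to a new file so the original extraction stays untouched.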

How do you extract text from a PDF? We would love to hear about it in the comments!

Download: Text Extractor

Karl L. Gechlik

Karl L. Gechlik here from AskTheAdmin.com, doing a weekly guest blogging spot for our newfound friends at MakeUseOf.com. I run my own consulting company, manage AskTheAdmin.com and work a full 9 to 5 job on Wall Street as a System Administrator.


  • smb

    I just choose “Save as Text” from the file menu in Adobe Reader.

    • http://www.asktheadmin.com Karl Gechlik

      Not if the author has disabled it – then it will be greyed out and you will be out of luck.

  • Alex

    generated an empty txt file of 1 KB

    • http://www.asktheadmin.com Karl Gechlik

      Was there text in the original PDF? What Operating System are you using and did you get any errors?

  • Alex

    A book! Windows XP, no error... process completed.

  • Brian

    I’ve found this website to be a pretty good way of converting pdf to text. Sure you have to upload it and then they e-mail it to you, but it works. It converts pdf to .doc or .rtf and it’s free. http://www.pdftoword.com/default.aspx

  • kye

    http://www.hellopdf.com is simply the best…

  • Satrianez

    You guys should have put warnings… or found a better program!
    Warnings for the fact this little proggie asked me to update my .NET Framework and sent me to install the hefty 231MB .NET Framework 3.5 SP1 which took a good 20 minutes to install and soiled my Firefox with the nasty uninstallable Microsoft .NET Framework Assistant addon!!!

    I then had to search the net to find a tool to remove it… here it is for the people in trouble with it -> http://bit.ly/removedotnetfirefoxaddon

    Now, the PDF Text Extractor… it works BUT it’s not even multi-language!
    I tried converting a PDF in French and none of the accented letters were converted properly… é became 351 and so on.

    Verdict: not chuffed! (not to mention the time I wasted)

    I really like makeuseof.com but this article should have warned users of those possible problems.

    A quick search for an alternative pointed me to http://www.convertpdftotext.net which worked wonders (don’t know about protected PDFs though) with the accents and would have been a nice add on to the article and would have avoided me wasting so much time and energy for such poor results.

    • http://www.asktheadmin.com Karl Gechlik

      I already had .net 3.5 sp1 on my machine and was not required to download anything. It prompted me to make sure I was up to date – but that was it. I have since tried it on another machine and could not replicate your issues. I am sorry you had problems but I would bet it was something limited to your system.

    • http://www.wghartford.com/ Beth_in_PEI

      For years I’ve used the free tools of http://www.software995.com/ without any problems. Having the ability to manipulate PDFs totally (extract/delete/combine individual pages from any PDF files, convert to TXT or JPG, append a signature etc) is so useful; admittedly it’s nagware but that’s a minor nuisance considering the breadth of its features.

  • http://www.ilovefreesoftware.com Ishan@ILoveFreeSoftware

    This is a nice little software.

  • melvin James

    I’m using AnyBizSoft PDF to Word Converter to convert PDF to Word, it works well for me so far. But an OCR converter is always needed. Very nice.

  • dhavala

    It is asking me to install .NET platform 3.5 (I have Windows XP SP3). I refused. This program doesn’t belong on MUO (I really like the web site and found plenty of good programs!!) and I concur with Satrianez. The writer should have warned of the .NET issue… at least after Satrianez raised it.

  • http://www.simpopdf.com bhupi

    Yes, this is really a nice idea to convert PDF to text. I always use Simpo PDF to Word to convert my PDFs and never knew about this one. Very nice!