
Have you ever stared blankly at a block of text wondering what language it was? With the Internet as powerful as it is nowadays, you can translate almost any language to any other language lickety-split. But there is one caveat: you need to know what language it is to start with. So how can you identify the language of a text?

I use Yahoo’s Babel Fish almost daily to translate text or web pages. But if I don’t know what language it is to begin with, I am out of luck. Plain and simple. I have tried many things to identify the language of a text over the years. I have Googled individual words or tried looking them up in multi-language dictionaries, but the tool below is hands down as simple as it can get.

If I were Yahoo, I would look into buying this technology ASAP!

Alrighty then, let’s check out how it works. The software is called PolyGlot 3000. A very nifty name I might add – meaning multiple languages.

This language identifier application is only 2.2MB in size and runs on Windows 95, 98, ME, NT, 2000, XP or 2003.

Simply fire it up when it finishes downloading and you will see this:


[Screenshot: Polyglot language identifier]

It looks pretty straightforward. You type in or paste the text that you want to recognize, hit that magic “Recognize Language” button or the F9 hot key, and bingo bango, your language is recognized:

[Screenshot: language recognition tool]

It came back with an answer super quick, and the answer was correct, with a reported accuracy of 62%. Not bad for a few little sentences. Let’s try another language. Do you know what it is?

[Screenshot: language recognition]

I hit F9 and Polyglot not only knows that it is Russian, it is 100% sure of it and even narrows it down to a specific dialect, Pre-Reform.

This is pretty damn impressive. There is only one real preference or option you can modify: the number of languages it compares your text or document against. Let’s take a look at it:

By selecting fewer languages you can speed things up a bit, although in all my tests I never had to wait more than 30 seconds.

But I guess if you have wild foreigners yelling at you, time is probably of the essence :)
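For the curious, here is a rough idea of how a tool like this can come up with a language and a percentage. This is only a toy sketch, not how Polyglot 3000 actually works (its method is not described here). It assumes a tiny hand-made list of common words per language, and the names `STOPWORDS` and `identify_language` are made up for this example. The optional `candidates` argument mimics the one setting mentioned above: the fewer languages you compare against, the less work there is to do.

```python
import re

# Tiny hand-picked lists of common words (an assumption for illustration).
# A real identifier would use far richer statistics for hundreds of languages.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "that"},
    "french": {"le", "la", "et", "est", "de", "les", "un", "une"},
    "spanish": {"el", "la", "y", "es", "de", "los", "un", "que"},
    "german": {"der", "die", "und", "ist", "das", "ein", "nicht", "zu"},
}


def identify_language(text, candidates=None):
    """Return (language, percent) for the best-scoring candidate language.

    percent is the share of all matched common words that belong to the
    winning language, so it is a rough confidence and not a true accuracy.
    """
    words = re.findall(r"\w+", text.lower())
    # Compare against every known language unless a shorter list is given.
    languages = candidates if candidates is not None else list(STOPWORDS)
    # Count how many words of the text are common words of each language.
    hits = {lang: sum(w in STOPWORDS[lang] for w in words) for lang in languages}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0  # nothing matched, so no guess
    best = max(hits, key=hits.get)
    return best, round(100.0 * hits[best] / total, 1)


sample = "le chat est sur la table et le chien est dans le jardin"
print(identify_language(sample))                         # ('french', 87.5)
print(identify_language(sample, ["french", "english"]))  # ('french', 100.0)
```

Notice that the first call is not 100% sure because "la" is also a common Spanish word. Short texts give a scorer like this very little to go on, which may be one reason a few little sentences come back with a lower percentage.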

According to their website over 400 languages are supported:

The current version of Polyglot 3000 distinguishes 474 languages and dialects. This is the biggest number of recognized languages for a language identification software to date.

Among the more than 400 supported languages only about 110 languages can be called popular. The others are very rare or even already extinct.

One of the rarest and, unfortunately, dying languages is Pipil. In 1970 there were about 40 people who spoke it. Now only about 20 remain.

Another rare language is Yukaghir, which about 170 people speak. The Yukaghir live in the northeast of Russia, in the Republic of Yakutia, above the Arctic Circle. One of the developers of Polyglot 3000 lived near that region for a while.

In this list you can see all supported languages. Some languages have several possible names which differ in spelling, but coincide in pronunciation. In the given list all variants of the name are listed wherever possible.

Do you use something similar? How do you get your translating or language distinguishing on? Let us know in the comments kiddies!

  1. VivekM
    February 8, 2009 at 6:05 am

    You really seem to be living in a parallel universe!
    Google's auto language detector has worked for me for Korean, Chinese, Japanese, Polish, French and Spanish. You don't even need to go to the translate.google.com page. Just save one of these as your bookmarks - http://translate.google.com/translate_buttons

    Google's auto-detect works like it's supposed to!

  2. sumit
    February 6, 2009 at 8:15 pm

    google's auto language detector works perfectly as well

    • Karl L. Gechlik
      February 7, 2009 at 8:44 am

      What language did you try it with and how much text did you use?

  3. Karl L. Gechlik
    February 6, 2009 at 1:04 pm

    Thanks guys. Yes there are loads of programs/sites that translate text but how do you know what the language is that you are translating from? I have tried Google's auto detect feature and it is not very good nor does it give the % of it being correct.

  4. mamassis kostis
    February 6, 2009 at 11:48 am

    i think nicetranslator.com does this as well. I like their service (no affiliation with them in any way).

  5. Daniel
    February 6, 2009 at 11:22 am

    For translating needs, i use qtl extension for Firefox and Writer's Tools for OpenOffice.

  6. mark
    February 6, 2009 at 10:36 am

    google has built in language recognition. in google translate select auto detect

    • Karl L. Gechlik
      February 6, 2009 at 11:05 am

      I should have addressed this in the article - Yes Google has this feature but No it does not work well.

      http://translate.google.com/translate_t?hl=en#auto|en|%D0%A0%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B9%20%D1%8F%D0%B7%D1%8B%D0%BA%20Russkiy%20yazyk

      I ran that russian text through the google translator, I selected detect as the language type and it returned me no translation and an assumption that it was in Albanian...

      Google translator FAIL!

  7. Vanatic
    February 6, 2009 at 3:12 pm

    Not necessary!
    USE: frengly.com/ !!
    It got auto-detection!

    or as "mamassis kostis" already said:

    nicetraslator.com

    Keep goin'! - MakeUseOf.com is incredible good!

    Regards
