60 Million Captchas a Day

Products by Dave Naffziger on May 25, 2007 at 5:37 pm

That’s how many captchas are filled out every day by people proving themselves to be human.

Enter reCAPTCHA. They have figured out a way to make all of this captcha solving useful: digitizing books.

Each reCAPTCHA captcha has two words:

  1. An unidentified word from a scanned book
  2. A known word

So, you get to help digitize the world’s information by doing something that machines can’t. Assume an average book is 80K words (200 pages at 400 words per page) and that OCR is 95% accurate: that leaves about 4K words per book that OCR gets wrong, so 4K captchas mean 1 book is digitized. At 60 million captchas a day, that creates a potential of 15K books a day.
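
Here is that back-of-the-envelope math as a quick Python sketch. The variable names are mine, and it assumes each solved captcha settles exactly one word that OCR missed, which is probably optimistic, so treat the result as a rough upper bound rather than a real throughput number.

```python
# Rough figures from the post; every one of them is an estimate.
CAPTCHAS_PER_DAY = 60_000_000
PAGES_PER_BOOK = 200
WORDS_PER_PAGE = 400
OCR_ACCURACY_PERCENT = 95

# Total words in an average book.
words_per_book = PAGES_PER_BOOK * WORDS_PER_PAGE

# Words OCR gets wrong; assumed here to need one solved captcha each.
unread_words_per_book = words_per_book * (100 - OCR_ACCURACY_PERCENT) // 100

# Books that a day's worth of captchas could finish under that assumption.
books_per_day = CAPTCHAS_PER_DAY // unread_words_per_book

print(words_per_book, unread_words_per_book, books_per_day)
```

Change the accuracy or the book length and the books-per-day number moves a lot, so the 15K figure is only as good as those guesses.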

Cool. It kind of reminds me of the first SETI screensavers.

I’ve installed it on my comment form.

  • Nik

    I have been reading about this for some time now. But this is the simplest explanation of what it’s about. Now I get it! What a great idea…

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.