ASP.NET Captcha – Image Generation

ASP.NET Team Blog
04 March 2010

Last week, you saw a sneak peek of the new ASP.NET Captcha control that’s coming in the DXperience v2010.1 release. In this follow-up post, you’ll learn more about the ASPxCaptcha control, including:

  • how it creates the Captcha images
  • why ASPxCaptcha doesn’t use background noise
  • how ASPxCaptcha performed in an OCR test versus a competitor’s Captcha control

Guidelines And Character Set

While designing the ASPxCaptcha, our team reviewed published guidelines and recommendations, including:

"Strong CAPTCHA Guidelines by Jonathan Wilkins" [http://www.scribd.com/doc/24497942/Strong-CAPTCHA-Guidelines-v1-2]

Using one of the article’s recommendations, the ASPxCaptcha uses a default character set which excludes symbols that are hard for the end user to recognize. However, you can still define your own set of characters and length using the ASPxCaptcha’s properties.

Figure: the ASPxCaptcha CharacterSet property.
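To make the idea of a restricted character set concrete, here is a minimal Python sketch. It is not the ASPxCaptcha implementation, and this post does not list the control's actual default set; the excluded look-alike characters and the helper names `build_charset` and `generate_code` are assumptions chosen for illustration.

```python
import secrets
import string

# Assumption: a hand-picked list of look-alike characters. The real
# ASPxCaptcha default character set is not given in this post.
AMBIGUOUS = set("0O1IL5S2Z8B")


def build_charset(exclude=AMBIGUOUS):
    """Return uppercase letters and digits minus the excluded characters."""
    candidates = string.ascii_uppercase + string.digits
    return "".join(ch for ch in candidates if ch not in exclude)


def generate_code(length=5, charset=None):
    """Pick `length` characters with a cryptographically secure generator."""
    charset = charset or build_charset()
    return "".join(secrets.choice(charset) for _ in range(length))


if __name__ == "__main__":
    print(build_charset())
    print(generate_code())
```

Passing your own `charset` and `length` mirrors the idea of overriding the defaults through the control's properties.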

Algorithm Challenge

The primary goal of a Captcha control is to produce images that are easy for people to decipher but difficult for machines. The problem of “machine recognition” divides into three parts:

  1. Pre-processing - removal of the background and noise
  2. Segmentation - selection of regions in the original image that contain the individual characters
  3. Classification - identification of the characters in each region

The first and third problems are easily solved by most modern, publicly available optical character recognition (OCR) software.

However, segmentation seems to be the one step that spammers cannot yet crack easily. So far, there is no universal, trivial algorithm for this step of machine Captcha solving. Segmentation requires more research and computing power from the spammers. And luckily for us, most spammers either do not have this computing power or do not want to invest in it.
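To illustrate why segmentation depends on the gaps between characters, here is a minimal Python sketch of one simple, well-known approach: splitting a binary image at empty columns (vertical projection). It is not taken from any particular OCR product, and the tiny `SPACED` and `TOUCHING` images are made-up examples.

```python
def column_has_ink(image, x):
    """True if any row has a dark pixel ('#') in column x."""
    return any(row[x] == "#" for row in image)


def segment_by_projection(image):
    """Split a binary image into (start, end) column ranges.

    Ranges are separated by columns that contain no ink at all.
    """
    width = len(image[0])
    segments = []
    start = None
    for x in range(width):
        if column_has_ink(image, x):
            if start is None:
                start = x              # a new character region begins
        elif start is not None:
            segments.append((start, x))  # an empty column closes the region
            start = None
    if start is not None:
        segments.append((start, width))
    return segments


# Three shapes separated by empty columns.
SPACED = [
    "##.#.##",
    "#..#.#.",
    "##.#.##",
]

# The same shapes with the empty columns removed.
TOUCHING = [
    "#####",
    "#.##.",
    "#####",
]

if __name__ == "__main__":
    print(segment_by_projection(SPACED))    # [(0, 2), (3, 4), (5, 7)]
    print(segment_by_projection(TOUCHING))  # [(0, 5)]
```

With gaps, the split yields one region per character. Without them, the whole word comes back as a single region, and a classifier has nothing clean to work on.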

Revealing Microsoft Study

Don’t believe me? Check out this study conducted by Microsoft Research, which confirms these findings:

"Building Segmentation Based Human-Friendly Human Interaction Proofs (HIPS)" - [http://research.microsoft.com/en-us/um/people/kumarc/pubs/chellapilla_hip05.pdf]

The research reveals some very unexpected results.

According to the Microsoft research, machines recognize skewed, noisy images much better than humans do. So, we can conclude that noise and strong distortions are not very effective methods of protection. In fact, noise in the images hampers recognition for end users.

This is why the ASPxCaptcha does not use noise in the images. Instead, the image generation algorithm focuses on making segmentation harder, specifically by removing the gaps between the characters.
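As a rough illustration of gap removal, the Python sketch below slides each glyph left until its ink touches the ink already placed, so that no empty column separates the characters. This is an assumption about how such a step could look on tiny bitmaps, not the ASPxCaptcha algorithm (which, per this post, works with vector fonts). The names `glyph_pixels`, `touching_offset`, `pack_glyphs`, and the two sample glyphs are hypothetical.

```python
def glyph_pixels(glyph):
    """Convert rows of '#'/'.' into a set of (x, y) ink coordinates."""
    return {(x, y)
            for y, row in enumerate(glyph)
            for x, ch in enumerate(row) if ch == "#"}


def touching_offset(placed, pixels, height):
    """Smallest x shift that puts `pixels` right of `placed` without overlap."""
    offsets = []
    for y in range(height):
        left = [x for x, py in placed if py == y]
        right = [x for x, py in pixels if py == y]
        if left and right:
            # In this row the new ink must start just past the old ink.
            offsets.append(max(left) + 1 - min(right))
    if not offsets:
        # No row contains ink from both: fall back to abutting bounding boxes.
        return max(x for x, _ in placed) + 1 - min(x for x, _ in pixels)
    return max(offsets)


def render(pixels, height):
    """Turn a set of ink coordinates back into rows of '#'/'.'."""
    width = max(x for x, _ in pixels) + 1
    return ["".join("#" if (x, y) in pixels else "." for x in range(width))
            for y in range(height)]


def pack_glyphs(glyphs):
    """Place equally tall glyphs left to right with the gaps removed."""
    height = len(glyphs[0])
    placed = set()
    for glyph in glyphs:
        pixels = glyph_pixels(glyph)
        if placed:
            offset = touching_offset(placed, pixels, height)
            pixels = {(x + offset, y) for x, y in pixels}
        placed |= pixels
    return render(placed, height)


GLYPH_A = ["###", "#..", "#.."]
GLYPH_B = ["..#", ".##", "###"]

if __name__ == "__main__":
    for row in pack_glyphs([GLYPH_A, GLYPH_B]):
        print(row)
```

Placed side by side, the two sample glyphs would span six columns. Packed, they span four, and every column contains ink, so a split at empty columns finds no boundary between them.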

OCR Test

And the results are easy to verify. Using a moderately priced OCR product (for example, FineReader), we found that none of the ASPxCaptcha images were recognized!

For comparison, we ran the same test on one of our competitor’s Captcha controls. The OCR recognized the competitor’s Captcha about 90% of the time.

So What Are The Big Companies Using?

Giants like Google generate their Captcha images with an approach similar to the one the ASPxCaptcha uses.

Figure: a Google Captcha compared with an ASPxCaptcha in Google style.

Vector Not Bitmap Fonts

The ASPxCaptcha uses vector fonts for characters instead of bitmap fonts.

Why does this matter to you? Because it gives you more customization options with the ASPxCaptcha!

All other implementations that we examined use bitmap fonts when removing the gaps between characters. This approach is easier to implement, but text rendered for one image size cannot be fit into a bigger image without a loss in quality. Therefore, these other Captchas are not as customizable as the ASPxCaptcha, which allows developers to customize its image size.

Anti-Aliasing FTW!

To give you better-looking images, our ASP.NET team went the extra mile and developed a better way to render the Captcha image. Specifically, they developed a modified bilinear filter for anti-aliasing. This filter makes the image smoother and avoids the pixel staircase effect during skewing.

“Anti-aliasing is the technique of minimizing the distortion artifacts known as aliasing when representing a high-resolution signal at a lower resolution.” [Wikipedia]

This type of filtering is used for rendering textures in computer graphics. For example, when you are close to a wall, you see a blurred texture, not large square pixels. And this anti-aliasing also works very well for the ASPxCaptcha.
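For readers unfamiliar with bilinear filtering, here is a minimal Python sketch of the standard, unmodified technique applied to a horizontal skew. The DevExpress modification is not described in this post, so nothing below should be read as their filter; the function names and the tiny grayscale image are assumptions for illustration.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a grayscale image (rows of floats) at a fractional position."""
    height, width = len(image), len(image[0])
    # Clamp the position to the image so the four neighbours always exist.
    x = min(max(x, 0.0), width - 1)
    y = min(max(y, 0.0), height - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on the two rows, then blend those results vertically.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def shear_horizontally(image, factor):
    """Skew the image: row y is shifted right by factor * y pixels."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            src_x = x - factor * y  # where this output pixel comes from
            inside = 0 <= src_x <= width - 1
            row.append(bilinear_sample(image, src_x, y) if inside else 0.0)
        out.append(row)
    return out


# A hard vertical edge: 0.0 is background, 1.0 is ink.
EDGE = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

if __name__ == "__main__":
    for row in shear_horizontally(EDGE, 0.5):
        print(row)
```

With a shift of half a pixel, the second row of the edge gets an intermediate gray value instead of jumping straight from background to ink. That intermediate value is what hides the staircase.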

Coming in DXperience v2010.1

ASPxCaptcha will be available in DXperience v2010.1, which should be released around April.

DXperience? What's That?

DXperience is the .NET developer's secret weapon. Get full access to a complete suite of professional components that let you instantly drop in new features, designer styles and fast performance for your applications. Try a fully-functional version of DXperience for free now: http://www.devexpress.com/Downloads/NET/
