Spam Bots and CAPTCHAs
Stories from the web spam fighters
Tue, 19 Oct 2010 10:53:48 +0000

“Raed taht” doesn’t work Tue, 25 Mar 2008 05:23:02 +0000 olpa

Acodricng to raserech at Cmaribgde uitisnevry, it dsoen’t mtaetr waht oredr the lteerts in a wrod are …

Can we use word mangling in captcha software? Unfortunately, no. According to the report of Dmytry Lavrov, “in many cases computer reads taht btteer tahn hmaun cluod”.

Trackback Confirmation plugin for WordPress Mon, 17 Mar 2008 03:35:23 +0000 olpa Finally, I’ve released yet another anti-spam plugin for WordPress, but a unique one. The formal description:

Publishing of a trackback or a pingback is postponed until someone, usually the trackback author, approves it. Trackbacks that are not approved within 20 hours are automatically deleted. This stops spam from bots and allows trackbacks from real humans.

A better description is probably a use case:

* Alice has Trackback Confirmation installed.
* Bob has written a blog post with a link to Alice’s blog post.
* Bob checks whether a trackback has appeared on Alice’s blog.
* No, the trackback hasn’t appeared. But Bob notices the link “approve trackbacks”.
* He follows the link, finds his trackback and approves it.
* Now Alice’s blog links to Bob’s post.

It’s just a web 2.0 style of trackback moderation. Comments are approved not by the blog’s author, but by the blog’s visitors.
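
The workflow above can be sketched in a few lines. This is only an illustration of the idea, not the plugin's actual code: the class and method names (`TrackbackQueue`, `receive`, `approve`, `purge_expired`) and the example URLs are made up, and an in-memory dictionary stands in for the blog's database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Trackbacks not approved within this window are deleted (20 hours, per the description).
APPROVAL_WINDOW = timedelta(hours=20)


@dataclass
class Trackback:
    source_url: str
    received_at: datetime
    approved: bool = False


@dataclass
class TrackbackQueue:
    """Hypothetical in-memory store; a real plugin would use the blog's database."""
    items: dict = field(default_factory=dict)

    def receive(self, source_url, now):
        # A new trackback is stored but not published yet.
        self.items[source_url] = Trackback(source_url, now)

    def approve(self, source_url, now):
        # Any visitor (usually the trackback's author) may approve a pending trackback.
        self.purge_expired(now)
        trackback = self.items.get(source_url)
        if trackback is None:
            return False
        trackback.approved = True
        return True

    def purge_expired(self, now):
        # Drop unapproved trackbacks that are older than the approval window.
        self.items = {
            url: tb for url, tb in self.items.items()
            if tb.approved or now - tb.received_at <= APPROVAL_WINDOW
        }

    def published(self):
        # Only approved trackbacks are shown on the blog.
        return [tb.source_url for tb in self.items.values() if tb.approved]


queue = TrackbackQueue()
start = datetime(2008, 3, 17, 12, 0)
queue.receive("http://bob.example/post", start)      # Bob's trackback
queue.receive("http://spam.example/pills", start)    # a bot's trackback
queue.approve("http://bob.example/post", start + timedelta(hours=1))
queue.purge_expired(start + timedelta(hours=21))
print(queue.published())  # ['http://bob.example/post']
```

The bot never comes back to approve its trackback, so it silently expires; Bob's survives because a human clicked the link.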


How do you spell “CAPTCHA”? Tue, 04 Mar 2008 05:24:32 +0000 olpa Once, browsing the server’s log file, I noticed a visit from Google. Someone had mistyped the word “captcha” as “capcha”, and by accident I had mistyped this word in a post too. So I decided to look at the popularity of possible misspellings.

* captcha — about 46,700,000
* capcha — about 579,000 (Google suggests the correct spelling.)
* captca — about 3,300 (Google suggests the correct spelling.)
* kaptcha — about 3,920 (the project name)
* kapcha — about 16,600 (the project name, not related to captchas)
* kaptca — about 228 (Mostly login names. Google suggests the correct spelling.)

According to Google, “capcha” gets about 1.2% as many hits as “captcha”, so roughly 1.2% of people misspell the word this way. The other possible misspellings are not popular.

The second experiment was with Technorati. It returned 5,378 results for “captcha” and 86 results for “capcha”. Surprisingly, the percentage is higher: 1.6%.
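
For the record, both percentages are simply the misspelled hit count divided by the correct one. A quick check, using the hit counts quoted above (the function name `share_percent` is just for illustration):

```python
def share_percent(misspelled_hits, correct_hits):
    # Misspelled hits as a percentage of hits for the correct spelling.
    return 100 * misspelled_hits / correct_hits


google = share_percent(579_000, 46_700_000)
technorati = share_percent(86, 5_378)
print(f"Google: {google:.1f}%, Technorati: {technorati:.1f}%")
# Google: 1.2%, Technorati: 1.6%
```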

Disclaimer: Akismet is good Thu, 28 Feb 2008 04:38:29 +0000 olpa One might conclude that I hate Akismet. That’s not so. Akismet is a great system developed by great programmers, and I highly respect it.

However, I think that the idea of a centralized database doesn’t work. From time to time, I write about its problems in this blog. And obviously, I refer to Akismet, as it is the main player in this field.

Most likely, when I write “Akismet”, I’m just using the short word instead of the long phrase “the idea of a centralized database to filter comment and trackback spam”, and I don’t mean to offend the Akismet developers.

WordPress spam - III. Trackbacks. Tue, 26 Feb 2008 04:12:46 +0000 olpa From Wikipedia:

A Trackback is a method for Web authors to request notification when somebody links to one of their documents. … Some individuals or companies have abused the TrackBack feature to insert spam links on some blogs. This is similar to comment spam but avoids some of the safeguards designed to stop the latter practice. As a result, TrackBack spam filters similar to those implemented against comment spam now exist in many weblog publishing systems. Many blogs have stopped using trackbacks because dealing with spam became too burdensome.

I’ve already published an idea of how to resurrect trackbacks: trackbacks should be performed through an intermediary, not directly. (For details, read this post, “decline and fall of the trackbacks; rise and resurrection of the trackbacks”.) Unfortunately, such protection depends on a third party.

And here is yet another idea, which doesn’t require an external service.


WordPress spam - II. Types of spam. Wed, 13 Feb 2008 05:40:11 +0000 olpa I know of 3 types of spam in WordPress, and I speculate about one more:

* comment spam
* trackback (or pingback) spam
* registration spam
* plugin spam


Stop WordPress spam - I Wed, 06 Feb 2008 05:12:30 +0000 olpa As the most popular blogging platform, WordPress is a target for spammers. If you are an average blogger, your everyday job is to delete spam from your blog, even if you have an anti-spam tool installed.

The default antispam weapon is Akismet. But I dislike it. More rambling will follow later; for now, just google for “akismet sucks”.

What I develop, use myself and highly recommend to everyone is …


Too much good is also bad Wed, 30 Jan 2008 04:16:59 +0000 olpa Textual Confirmation (TC), a simple but very effective phpBB antispam tool, asks each newly registering user a question. If the answer is wrong, TC rejects the registration.

How many questions do you need for the best protection? Hard to say, but definitely not 50.

Sooner or later, a cheap outsourced monkey answers some of your questions and adds the answers to the spammer’s database. As a countermeasure, you need to change your questions. When you have 50 questions, it’s a tedious task.

In my opinion, 2 or 3 questions are enough.
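
To show how little machinery this needs, here is a rough sketch of the idea in Python. It is not the Textual Confirmation code (TC is a phpBB mod); the questions, answers and function names are invented for the example.

```python
import random

# A small pool: 2 or 3 questions are easy to rotate once spammers learn the answers.
# The questions and answers here are invented examples.
QUESTIONS = {
    "What color is the sky on a clear day?": "blue",
    "How many legs does a cat have? (digit)": "4",
    "Type the word 'forum' backwards.": "murof",
}


def pick_question(rng=random):
    # Show a random question from the pool on the registration form.
    return rng.choice(list(QUESTIONS))


def check_answer(question, answer):
    # Reject the registration unless the answer matches (case and spaces ignored).
    expected = QUESTIONS.get(question)
    return expected is not None and answer.strip().lower() == expected


def replace_question(old_question, new_question, new_answer):
    # Countermeasure once an answer has leaked into a spammer's database.
    QUESTIONS.pop(old_question, None)
    QUESTIONS[new_question] = new_answer.strip().lower()


question = pick_question()
print(question, check_answer(question, "definitely wrong"))
```

With a pool this small, `replace_question` is a one-minute job; with 50 questions it becomes the tedious task described above.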

Satisfactory recognition rate Thu, 24 Jan 2008 04:44:11 +0000 olpa The “Yahoo CAPTCHA is broken” press release reveals some numbers. I’m highlighting them:

It’s not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100 000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA.

* 15% recognition rate is enough
* 1 cent per 1 CAPTCHA when using monkeys
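
Putting the quoted numbers together gives a feeling for the economics. The calculation below is my own back-of-the-envelope arithmetic, not something stated in the press release; it assumes every one of the 100 000 daily tries is an independent attempt.

```python
TRIES_PER_DAY = 100_000          # from the quote
RECOGNITION_RATE_PERCENT = 15    # from the quote
HUMAN_COST_CENTS = 1             # price of one CAPTCHA solved by a human, from the quote

# CAPTCHAs the software solves per day at a 15% success rate.
solved_per_day = TRIES_PER_DAY * RECOGNITION_RATE_PERCENT // 100

# What the same number of solved CAPTCHAs would cost if humans did the work.
human_cost_dollars = solved_per_day * HUMAN_COST_CENTS / 100

print(solved_per_day, human_cost_dollars)  # 15000 150.0
```

In other words, a recognizer that fails 85% of the time still does the daily work of about 150 dollars' worth of monkeys.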

Yahoo CAPTCHA is broken Tue, 22 Jan 2008 04:37:29 +0000 olpa According to the hmm… press-release (formally, it’s a blog entry, but the style is very press-releasish), the Yahoo CAPTCHA is broken.