10 Characters for Your WordPress Blacklist
Quick WordPress tip for easily and quietly blocking a ton of comment spam. Akismet and other programs are good at catching most spam, but every now and then a bunch of weird, foreign-language spam will sneak past the filters and post live to your site. Here’s a good example of the kind of stuff that’s easy to block:
This type of spam hits in waves, with similar character patterns running throughout each batch. So you’ll see a bunch of nonsensical spam comments that vary in IP, name, email address, and so on. If other spam mechanisms fail, using WordPress’ built-in anti-spam functionality is a great way to immunize against junk like this:
We can stop that sort of garbage from scaring away visitors by adding a few lines to your Comment Moderation or Comment Blacklist (both located in your Discussion Settings). Simply add these codes to either list.
The beauty of this technique is its simplicity. WordPress uses regular expressions to scan comments for any of these characters. The comments aren’t deleted, so there’s no real risk, and the chances of someone actually using one of these characters in a real comment are slim to none. What WordPress does with matching comments depends on where you put the list:
- Adding the characters to the Comment Moderation list sends matching comments to the Moderation queue.
- Adding them to the Comment Blacklist flags matching comments as spam and sends them to the Spam bin.
It’s probably safest to add these characters to your Moderation list just in case anything worthwhile happens to show up (it won’t). Once you Save your changes, forget about it. Just monitor (or don’t) your comments as usual and let WordPress’ built-in anti-spam skillz do the work.
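To make the difference between the two lists concrete, here is a minimal Python sketch of the matching logic described above. It is a simplified model, not WordPress’ actual code (which is PHP): the function names, the field names, and the "pass" outcome are assumptions for illustration, and the sample entry is a placeholder character, not one of the ten from this post.

```python
import re

def first_match(comment_fields, list_text):
    """Return the first list entry found in any comment field, or None.

    Each non-empty line of list_text is one entry. Entries are escaped,
    so they match literally, and the search is case-insensitive.
    """
    for entry in list_text.splitlines():
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines in the list
        pattern = re.compile(re.escape(entry), re.IGNORECASE)
        for value in comment_fields.values():
            if pattern.search(value):
                return entry
    return None

def triage(comment_fields, moderation_list, blacklist):
    """Decide where a comment goes based on which list it matches."""
    if first_match(comment_fields, blacklist) is not None:
        return "spam"        # Blacklist hit: flagged as spam
    if first_match(comment_fields, moderation_list) is not None:
        return "moderation"  # Moderation hit: held for review
    return "pass"            # no hit: falls through to other checks

# Placeholder entry (Cyrillic letter zhe), written as an escape sequence
# so it survives copy and paste. Hypothetical, not from the original list.
sample_list = "\u0436"
comment = {"author": "someone", "content": "text with \u0436 in it"}
print(triage(comment, sample_list, ""))  # entry is only in the Moderation list
print(triage(comment, "", sample_list))  # entry is only in the Blacklist
```

The same entry produces a different outcome depending only on which list holds it, which is why starting with the Moderation list is the lower-risk choice.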
Exceptions
Although this is an elegant and effective technique, you may want to skip it if either of the following applies:
- You have trackbacks/pingbacks enabled and displaying on your site
- You allow comments in languages that use any of the blocked glyphs
Otherwise, the list makes an excellent addition to any anti-spam strategy. Especially if you are only using Akismet, this is a great way to further improve the overall security and integrity of your site. For more information and more extensive WordPress blacklists, check these:
Note: To suggest additional characters in the comments, remember to wrap each one with a <code> tag. Thanks :)
28 responses to “10 Characters for Your WordPress Blacklist”
Comments like that land in my moderation queue even without adding those characters, and Akismet gets them anyway. Nice tip though, if anyone has problems with spam like that. (I always check “Comment author must have a previously approved comment” so this kind of spam never gets through)
Great trick to limit the spammers, am seeing more and more spam every day from that corner of the world.
I should make it clear that this technique is aimed at stopping specific types of spam by selectively choosing a few characters to block. It’s not aimed at any particular place or language, just whatever happens to be showing up in the comment box. Lately it’s been a lot of Cyrillic-based spam, but the technique may be generalized to block spam from any language.
That said, thanks for the comment! :)
The only downside to this is if it’s actually a legitimate comment. I have seen one comment come in like this (Akismet held it for moderation), but it didn’t seem to have a harmful backlink and when I ran the comment through the Google translator, it seemed legitimate. So my concern would be looking like an a$$hat to legitimate visitors from other countries.
Yes, good point – and addressed in the “exceptions” section:
The other exception would be for trackbacks/pingbacks, which usually are the only place that other languages appear on blogs. Either way, it’s ultimately up to the Admin to use their own judgment and not look like an ignoramus to people trying to leave feedback in some other language.
Very timely–I had just been trying to figure out how to stop the Cyrillic spam that Akismet kept missing. Thanks! I asked Akismet if there was anything I could do and the best they could come up with was to say they were working on it for a future version.
Nice idea, but. I found that “The Invisible Captcha” does the same job as Akismet for FREE, yes FREE, and it really does work!
it is no.1 on my top wordpress plug-ins:
http://asummersdesign.com/blog/wordpress/the-10-most-essential-wordpress-pluggins-2
And what if I sign my name in Cyrillic and I’m not a prostitute from Kiev????
No doubt it’s a nice way to block foreign spam, but if I want to leave a comment, say, in Russian, that’s gonna be spam. Sad.
btw, you may also consider blocking the “?” and “?” characters (these are Cyrillic), they are one of the most frequent in Russian.
And yes, I use Akismet to block tons of Cyrillic spam. It works.
Thanks for the tip on the “a” and “e” – will experiment and update the list.
Re: “Sad.” If someone sends you an email in some foreign language, do you take it seriously? I usually bin those ones.
Well yes, that’s right. But I try to Google translate it first )
I’ve read the comments, and people complain about Akismet missing Cyrillic spam. I thought Akismet was building some kind of shared spam database; on Russian blogs it blocks all the spam, Cyrillic or not.
And one more thing on Cyrillic spam.
Based on an analysis of unique words in random texts (that’s a rough estimate), the ten most frequent letters in Russian (Ukrainian and others) are:
? — 9.36%
? — 8.40%
? — 8.08%
? — 6.91%
? — 6.12%
? — 5.67%
? — 5.49%
? — 5.30%
? — 5.00%
? — 4.67%
But even this technique is not 100% bombproof. Spammers often swap Russian letters for look-alike English ones to confuse spam filters, so blocking Cyrillic “?”, “?”, “?”, “?”, “?”, “?”, “c” may not help.
Good luck with fighting spam )
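The look-alike substitution mentioned in the comment above can be illustrated with a short Python sketch. It shows why blocking a single Cyrillic character misses a word where that character has been swapped for its Latin twin, and how a mixed-script check would still notice it. This is an illustration only, not something WordPress does: the function names are hypothetical, and the script is guessed from the first word of each character’s Unicode name.

```python
import unicodedata

def scripts_in(word):
    """Return the set of scripts used by the letters of a word.

    The script is approximated by the first word of the Unicode
    character name, e.g. "LATIN" or "CYRILLIC".
    """
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts

def is_mixed_script(word):
    """True if the letters of a word come from more than one script."""
    return len(scripts_in(word)) > 1

# A Russian word written as escape sequences so the letters survive
# copy and paste; the second variant swaps Cyrillic "ie" (U+0435)
# for the visually identical Latin "e".
genuine = "\u043f\u0440\u0438\u0432\u0435\u0442"
disguised = "\u043f\u0440\u0438\u0432" + "e" + "\u0442"

blocked = "\u0435"  # hypothetical list entry: Cyrillic "ie"
print(blocked in genuine)          # the blocked character is present
print(blocked in disguised)        # the swap removed it
print(is_mixed_script(disguised))  # but the word now mixes two scripts
```

A check like this would also flag legitimate text that mixes scripts, so it is best read as a demonstration of the problem rather than a drop-in filter.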
Editor’s note: looks like the characters were lost during the database merge slash clean-up.