
10 Characters for Your WordPress Blacklist

Quick WordPress tip for easily and quietly blocking a ton of comment spam. Akismet and other programs are good at catching most spam, but every now and then a bunch of weird, foreign-language spam will sneak past the filters and post live to your site. Here’s a good example of the kind of stuff that’s easy to block:

[ Screenshot: Comment Spam in Moderation ]

This type of spam hits in waves, with similar character patterns running throughout each batch. So you’ll see a bunch of nonsensical spam comments that vary in IP, name, email address, and so on. If other spam mechanisms fail, using WordPress’ built-in anti-spam functionality is a great way to immunize against junk like this:

[ Screenshot: Comment Spam in Moderation ]

You can stop that sort of garbage from scaring away visitors by adding a few lines to your Comment Moderation or Comment Blacklist (both located in your Discussion Settings). Simply add the characters to either list.

The beauty of this technique is its simplicity. WordPress uses regular expressions to scan comments for any of these characters. The comments aren’t deleted, so there’s no real risk, and the chances of someone actually using one of these characters in a real comment are slim to none. What WordPress does with matching comments depends on where you put the list:

  • Adding the characters to the Comment Moderation list sends matching comments to the Moderation queue.
  • Adding them to the Comment Blacklist flags matching comments as spam and sends them to the Spam bin.
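
To make the matching behavior concrete, here is a minimal sketch in Python of the general idea. It is not WordPress’ actual code. It assumes each list is stored as plain text with one entry per line, and that an entry counts as a match if it appears anywhere in a comment field, ignoring case. The function names and the sample characters are made up for illustration; the sample characters are not the original list from this post.

```python
import re


def parse_list(raw):
    """Split a settings field into entries, one per line, skipping blank lines."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def matches_any(entries, fields):
    """Return True if any entry occurs anywhere in any of the comment fields."""
    for entry in entries:
        # Escape the entry so it is matched literally, and ignore case.
        pattern = re.compile(re.escape(entry), re.IGNORECASE)
        if any(pattern.search(field) for field in fields):
            return True
    return False


def classify(fields, moderation_raw, blacklist_raw):
    """Return 'spam', 'moderation', or 'ok' for a list of comment fields."""
    # The blacklist is checked first because it is the stricter of the two.
    if matches_any(parse_list(blacklist_raw), fields):
        return "spam"
    if matches_any(parse_list(moderation_raw), fields):
        return "moderation"
    return "ok"


# Hypothetical entries, chosen only for this example.
sample_list = "ж\nя"

comment = ["Bob", "bob@example.com", "", "Привет, я тут"]
print(classify(comment, sample_list, ""))  # entries in the Moderation list
print(classify(comment, "", sample_list))  # same entries in the Blacklist
```

The same entries produce a different outcome depending on which list holds them, which is why the Moderation list is the lower-risk place to start.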

It’s probably safest to add these characters to your Moderation list just in case anything worthwhile happens to show up (it won’t). Once you Save your changes, forget about it. Just monitor (or don’t) your comments as usual and let WordPress’ built-in anti-spam skillz do the work.

Exceptions

Although this is an elegant and effective technique, you may want to skip it if either of the following applies:

  • You have trackbacks/pingbacks enabled and displaying on your site
  • You allow comments in languages that use any of the blocked glyphs

Otherwise, the list makes an excellent addition to any anti-spam strategy. Especially if you are only using Akismet, this is a great way to further improve the overall security and integrity of your site. For more information and more extensive WordPress blacklists, check these:

Note: To suggest additional characters in the comments, remember to wrap each one with a <code> tag. Thanks :)

About the Author
Jeff Starr = Creative thinker. Passionate about free and open Web.

28 responses to “10 Characters for Your WordPress Blacklist”

  1. Comments like that land in my moderation queue even without adding those characters, and Akismet gets them anyway. Nice tip though, if anyone has problems with spam like that. (I always check “Comment author must have a previously approved comment” so this kind of spam never gets through.)

  2. Great trick to limit the spammers, am seeing more and more spam every day from that corner of the world.

    • Jeff Starr 2011/04/26 9:30 am

      I should make it clear that this technique is aimed at stopping specific types of spam by selectively choosing a few characters to block. It’s not aimed at any particular place or language, just whatever happens to be showing up in the comment box. Lately it’s been a lot of Cyrillic-based spam, but the technique may be generalized to block spam from any language.

      That said, thanks for the comment! :)

  3. The only downside to this is if it’s actually a legitimate comment. I have seen one comment come in like this (Akismet held it for moderation), but it didn’t seem to have a harmful backlink and when I ran the comment through the Google translator, it seemed legitimate. So my concern would be looking like an a$$hat to legitimate visitors from other countries.

    • Yes, good point – and addressed in the “exceptions” section:

      You may want to skip using if … you allow comments in languages that use any of the blocked glyphs

      The other exception would be for trackbacks/pingbacks, which usually are the only place that other languages appear on blogs. Either way, it’s ultimately up to the Admin to use their own judgment and not look like an ignoramus to people trying to leave feedback in some other language.

  4. Very timely–I had just been trying to figure out how to stop the Cyrillic spam that Akismet kept missing. Thanks! I asked Akismet if there was anything I could do and the best they could come up with was to say they were working on it for a future version.

  5. Aaron Summers 2011/04/26 1:29 pm

    Nice idea, but I found that “The Invisible Captcha” does the same job as Akismet for FREE, yes FREE, and it really does work!

    It is no. 1 on my list of top WordPress plug-ins:

    http://asummersdesign.com/blog/wordpress/the-10-most-essential-wordpress-pluggins-2

  6. And what if I sign my name in Cyrillic and I’m not a prostitute from Kiev????

  7. No doubt it’s a nice way to block foreign spam, but if I want to leave a comment, say, in Russian, that’s gonna be spam. Sad.

    btw, you may also consider blocking the “а” and “е” characters (these are Cyrillic); they are among the most frequent in Russian.

    • And yes, I use Akismet to block tons of Cyrillic spam. It works.

    • Jeff Starr 2011/04/26 1:24 pm

      Thanks for the tip on the “a” and “e” – will experiment and update the list.

      Re: “Sad.” If someone sends you an email in some foreign language, do you take it seriously? I usually bin those ones.

      • Well yes, that’s right. But I try to Google translate it first )

        I’ve read comments where people complain about Akismet missing Cyrillic spam. I thought Akismet was building some kind of shared spam database; on a Russian blog it blocks all the spam, Cyrillic and not.

      • And one more thing on Cyrillic spam.
        Based on an analysis of unique words in random texts (that’s a rough estimate), the ten most frequent letters in Russian (Ukrainian and others) are:

        ? — 9.36%
        ? — 8.40%
        ? — 8.08%
        ? — 6.91%
        ? — 6.12%
        ? — 5.67%
        ? — 5.49%
        ? — 5.30%
        ? — 5.00%
        ? — 4.67%

        But even this technique is not 100% bombproof. Spammers often swap Russian letters for look-alike English ones to throw spam filters off, so blocking Cyrillic “?”, “?”, “?”, “?”, “?”, “?”, “c” may not help.

        Good luck with fighting spam )

        Editor’s note: looks like the characters were lost during the database merge/clean-up.

Welcome
Perishable Press is operated by Jeff Starr, a professional web developer and book author with two decades of experience. Here you will find posts about web development, WordPress, security, and more »