Friday, September 11, 2009

The Solution for: Too many spam comments in WordPress, even more than Akismet can process

Today I helped a client get rid of 400,000 spam comments.

This is a ridiculous amount, but I can share how we cleaned it up quickly and effectively.

First, you should know that if you follow this post as a solution, the worst that can happen is that you completely screw up your site forever. You're responsible for your own site's fate, not my blog post, so make sure you understand how what you are doing affects your site's database.

Secondly, it is important to note that this was a special situation where Akismet could not process all of the comments; it would just stall out.

It is also important to note that I don't know enough about Akismet's limitations to understand why it could not process 200-400k comments.

Also, before moving on to the solution: do not continue until you have made a full backup of your database. If you have this many spam comments, please be patient with the backup; it will take a very, very long time to dump the SQL.
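
As an extra safety net on top of the full dump, you could also keep a copy of just the comments table inside the database. This is only a sketch, not something we did on the client's site: the name wp_comments_backup is made up, and it assumes the default wp_ table prefix and enough free disk space for a second copy of the table.

```sql
-- Create an empty table with the same columns and indexes as wp_comments.
-- wp_comments_backup is a hypothetical name; use any name that is not taken.
CREATE TABLE `wp_comments_backup` LIKE `wp_comments`;

-- Copy every row across. With hundreds of thousands of comments this can take a while.
INSERT INTO `wp_comments_backup` SELECT * FROM `wp_comments`;

-- Sanity check: both counts should match before you delete anything.
SELECT
  (SELECT COUNT(*) FROM `wp_comments`) AS original_rows,
  (SELECT COUNT(*) FROM `wp_comments_backup`) AS backup_rows;
```

This does not replace the full backup. It just gives you a quick way to look up or restore a comment you deleted by mistake.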

So to the solution:

We looked for common strings in the comment content. Here are a few:

buy
casino
win

Basically, you need to find words that repeat over and over in the spam content.
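
One way to check a candidate word before deleting anything is to count how many comments contain it and read a few of the matches. The queries below are a sketch that assumes the default wp_ table prefix and uses casino as the candidate; swap in your own string.

```sql
-- How many comments contain the candidate string?
SELECT COUNT(*)
FROM `wp_comments`
WHERE `comment_content` REGEXP 'casino';

-- Read a handful of matches to make sure they really are spam.
SELECT `comment_ID`, `comment_author`, LEFT(`comment_content`, 80) AS preview
FROM `wp_comments`
WHERE `comment_content` REGEXP 'casino'
LIMIT 20;
```

If the sample turns up real reader comments, pick a more specific string.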

Then we ran this SQL statement for every common spam string we found:

delete from `wp_comments` where `comment_content` regexp 'SPAMSTRING';

For example, you might use the string buy to single out a bunch of them. But be careful: if your readers use the word buy a lot, don't do this! Use something unique to the spam. If your readers never use the word buy, or maybe one did, then let that 1 comment go for the greater good and do this:

delete from `wp_comments` where `comment_content` regexp 'buy';

On my client's site, this got rid of 40k comments.
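
A short string like win also matches inside longer words such as window or winter, because regexp matches anywhere in the text. One way to tighten this, which we did not need on the client's site, is to match whole words only. The sketch below assumes the default wp_ table prefix. Which word-boundary syntax works depends on your MySQL version, so try it with a count first.

```sql
-- Older MySQL (before 8.0.4): [[:<:]] and [[:>:]] mark the start and end of a word.
SELECT COUNT(*)
FROM `wp_comments`
WHERE `comment_content` REGEXP '[[:<:]]win[[:>:]]';

-- MySQL 8.0.4 and later use a different regex engine, where \b is the word boundary
-- (the backslash is doubled inside the SQL string).
SELECT COUNT(*)
FROM `wp_comments`
WHERE `comment_content` REGEXP '\\bwin\\b';
```

Once the count looks right, run the delete with the same where clause.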

After repeating the delete with a few different strings, we had the comments down to 400 and could finally run Akismet.

Let me know if this post is helpful with your comments.