It’s common knowledge that having a form on your website is an open invitation for spam submissions. Unfortunately, there’s no single tried-and-true way to block bots from your form. Instead, there are several different anti-spam measures, each with its own pros and cons. We have to walk the line between our convenience (blocking as many spam submissions as possible) and the user’s convenience (not frustrating them so much that they abandon the form altogether).

Today we’ll look at the different ways of stopping spam, how effective each one is, and how annoying it is for users. I’ll also point you in the direction of implementing each of these.

Different Ways to Catch Spam

Google reCAPTCHA Options

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is the most popular choice for stopping spam. It requires the user to look at a generated grouping of letters and numbers and type it in perfectly before the form will submit. The original CAPTCHA was created in the early 2000s and was pretty un-user-friendly, as the letters and images were scrambled and edited pretty heavily to “trick” bots.

[Image: unreadable CAPTCHA, version 1]

Even people with perfect eyesight had problems figuring out what the scrambled letters and numbers were. That led a lot of users to abandon the form, while spammers caught on and got around it pretty easily.

Advancements in AI have all but guaranteed that bots would eventually find ways around CAPTCHA, leaving users with the same spam problem they would have without it.

It also failed to account for human annoyance.

-Gravity Forms

reCAPTCHA sought to make this a bit easier for humans by showing images of real text that you then had to type in. Unfortunately, it still suffered from many of the same problems as the original CAPTCHA, often producing words that were hard to read.

[Image: screenshot of reCAPTCHA version 2]

Google bought reCAPTCHA and quickly began improving it.

reCAPTCHA Version 2

reCAPTCHA version 2 is currently the most popular way of catching spam on forms. It is a lot easier for users and is often just a checkbox asking if you are a robot. While this seems like it wouldn’t actually fool a computer system, the real work happens behind the form: Google monitors your activity and IP address, and if it deems you a normal user, you get a plain checkbox.

[Image: Google reCAPTCHA v3 example]

However, if you are deemed “suspicious” (as I personally often am, probably due to my frequent use of a VPN), Google gives you something a bit more challenging, and more annoying. Instead of a simple checkbox, you have to solve a puzzle, usually one that involves choosing the requested pictures from a grid. This not only takes up your users’ time, but can also turn into a never-ending loop if the API messes up. (I once completed 10 of these in a row before giving up and moving on.)

[Image: example of a Google reCAPTCHA version 2 puzzle]

If you’re using something like Contact Form 7 or Gravity Forms, you just have to enter your reCAPTCHA API keys in the appropriate places. Implementing reCAPTCHA version 2 on custom-coded forms requires some knowledge of code. You can see an introduction here.
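To give a feel for what the custom-coded route involves, here is a minimal sketch of the server-side half in Python. It assumes the widget has already added its token to the submitted form (the `g-recaptcha-response` field) and that you send it, along with your secret key, to Google’s siteverify endpoint. The function names and the injectable `post` argument are my own illustrative choices, not part of Google’s API.

```python
import json
import urllib.parse
import urllib.request

# Google's token verification endpoint
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def http_post(url, fields):
    """POST form-encoded fields and return the decoded JSON reply."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=10) as reply:
        return json.loads(reply.read().decode("utf-8"))


def is_human(token, secret_key, post=http_post):
    """Return True only if Google confirms the token from the form.

    `post` is injectable (an assumption of this sketch) so the check
    can be tried out without making a real network call.
    """
    if not token:
        return False  # the widget was never completed
    result = post(VERIFY_URL, {"secret": secret_key, "response": token})
    return bool(result.get("success"))
```

In a real handler you would call `is_human(form["g-recaptcha-response"], YOUR_SECRET_KEY)` before processing the submission, where `YOUR_SECRET_KEY` stands in for the key Google issued for your site.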

reCAPTCHA Version 3

The newest version of Google’s reCAPTCHA is completely invisible. It once again works by giving the user a score, and if you pass the test as “human” you don’t see anything at all in the form, no checkboxes or otherwise. Based on this score, you can take different actions depending on what type of form the user is trying to fill out. For example, if it’s a login form and the user gets a low score, you can require them to use 2-factor authentication.
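The “different actions based on the score” idea can be sketched in a few lines of Python. This assumes you have already verified the token server-side and have the parsed result, which carries a `success` flag and a `score` between 0.0 (likely a bot) and 1.0 (likely a human). The 0.5 threshold, the form types, and the action names are all assumptions for illustration; tune them to your own traffic.

```python
def choose_action(result, form_type, threshold=0.5):
    """Map a parsed reCAPTCHA v3 verification result to a next step.

    The action names returned here are hypothetical labels for this
    sketch, not values defined by reCAPTCHA.
    """
    if not result.get("success"):
        return "reject"  # token missing, expired, or invalid

    score = float(result.get("score", 0.0))
    if score >= threshold:
        return "allow"  # looks human: let the form through untouched

    # Low score: pick a response that suits the kind of form.
    if form_type == "login":
        return "require_2fa"
    return "hold_for_review"
```

A login form with a low score gets sent to 2-factor authentication, while a low-scoring contact form is held for a human to review rather than silently dropped.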


Honeypot Invisible Filtering

The “Honeypot” method of spam filtering is one of the least obtrusive ways of stopping spam, but depending on how you implement it, it may not be as reliable as Google reCAPTCHA at catching bots. The Honeypot method is used pretty successfully by Mailchimp and entails creating a hidden input field. A regular user won’t know it’s there and will therefore not fill it out. A bot, however, only parses the HTML, so it doesn’t realize that the field is hidden and fills it out. Therefore, if you discard the submissions that have this hidden field filled out, you filter out spam submissions.

One of the downsides to this method is that it has some accessibility issues. You can’t use display:none on the field because bots have gotten smart enough to realize this means it’s hidden. Therefore, the best method is to give that input a position of absolute and a large negative left position. Screen readers, however, can still read these inputs, so make sure you label the input clearly, telling users to leave it blank. Otherwise you could lose legitimate form submissions.

Honeypot implementations require some basic knowledge of coding. You can find a great tutorial here.
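As a rough illustration of the idea, here is a minimal Python sketch. The field name `website`, the label text, and the helper name are hypothetical choices of mine. The markup positions the decoy field off-screen (absolute, with a large negative left) instead of using display:none, and the server-side check simply looks at whether that field came back filled in.

```python
HONEYPOT_FIELD = "website"  # hypothetical name; pick something a bot would want to fill

# Markup for the decoy field: moved off-screen rather than display:none,
# and labelled so screen reader users know to leave it empty.
HONEYPOT_HTML = (
    '<div style="position:absolute; left:-9999px;">'
    '<label for="website">Leave this field blank</label>'
    '<input type="text" id="website" name="website">'
    "</div>"
)


def looks_like_bot(form):
    """Return True if the hidden decoy field was filled in."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

For example, `looks_like_bot({"name": "Ann", "website": ""})` is false, so the submission goes through, while any submission with text in `website` gets discarded.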

Math & Word Problems

Math and word problems are simpler versions of CAPTCHA that don’t change with the user. These are easily implemented with some basic code knowledge since they’re just a regular input field that will accept only one answer. If the answer is wrong, the form doesn’t submit.

[Image: math problem CAPTCHA example]

[Image: word problem CAPTCHA example]

These are less common these days, but pretty effective at stopping bots. One of the downsides, however, is that you have to come up with a question simple enough that any user will be able to figure it out. Make the question too hard and you risk losing submissions.
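Here is a minimal sketch of such a check in Python, assuming a single fixed question shown next to a regular text input. The question, the accepted answers, and the function name are made up for illustration. I’ve chosen to accept both the digit and the spelled-out word, since being too strict about formatting is another way to lose real users.

```python
# The fixed question displayed beside the input (hypothetical example)
QUESTION = "What is 3 + 4?"

# Both spellings count as the one correct answer
ACCEPTED_ANSWERS = {"7", "seven"}


def answer_is_correct(submitted):
    """Return True if the user's answer matches, ignoring case and spacing."""
    return submitted.strip().lower() in ACCEPTED_ANSWERS
```

If `answer_is_correct` returns false, the form doesn’t submit and you show the question again.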



IP Blockers

IP-blocking services are another invisible anti-spam measure: they maintain a list of known spam IPs and block them. These services shouldn’t be used alone, but in addition to one of the methods above.

The filter works by combining information about spam captured on all participating sites, and then using those spam rules to block future spam.

-Wikipedia
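Real services keep their lists up to date across many sites, but the basic check can be sketched in Python with the standard `ipaddress` module. The addresses below come from ranges reserved for documentation, and the list and function name are placeholders of my own. Treating an unparseable address as blocked is a design choice for this sketch, not a rule.

```python
import ipaddress

# Placeholder blocklist using documentation-only address ranges
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # a whole range
    ipaddress.ip_network("198.51.100.7/32"),  # a single address
]


def is_blocked(remote_addr):
    """Return True if the submitter's IP falls inside a blocked network."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        return True  # not a valid IP at all: treat as suspicious
    return any(address in network for network in BLOCKED_NETWORKS)
```

You would run this before any of the other checks, passing in the address your web framework reports for the request.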


Takeaway

Unfortunately, it’s impossible to block 100% of spam, and not all spam comes from bots! Some comes from real humans and/or hackers hired cheaply to bypass spam blockers. The key to choosing the right anti-spam tool is finding a balance between stopping spam and not frustrating users: would you rather receive a few extra spam messages or turn away 10 potential customers?

This is one problem that a lot of people have encountered. As pst points out, the bot can just submit information directly to the server, bypassing the javascript (see simple utilities like cURL and Postman). Many bots are capable of consuming and interacting with the javascript now. Hari krishnan points out the use of captcha, the most prevalent and successful of which (to my knowledge) is reCaptcha. But captchas have their problems and are discouraged by the World-Wide Web Consortium, mostly for reasons of ineffectiveness and inaccessibility.

And lest we forget, an attacker can always deploy human intelligence to defeat a captcha. There are stories of attackers paying for people to crack captchas for spamming purposes without the workers realizing they’re participating in illegal activities. Amazon offers a service called Mechanical Turk that tackles things like this. Amazon would strenuously object if you were to use their service for malicious purposes, and it has the downside of costing money and creating a paper trail. However, there are more erhm providers out there who would harbor no such objections.

Patrick M, on StackOverflow

Resources