Comments on 7 simple bot detection methods that won’t inconvenience users

Martin

I laughed out loud at "swap the name attributes in the name and email fields". So devious! The #hash method and the Origin header were entirely new to me. It sounds like they should work, though.

cweiske

I recently added two hidden honeypot fields to a website signup form, which dropped the spam rate from 30/day to 0:

- Hidden field with name my_name that has an empty value. The form gets rejected when it's filled. The field has been made invisible by CSS.

- Hidden field with an encrypted token containing the page load timestamp. When the form submission is made within 4 seconds, it gets silently rejected.
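The two checks in the comment above could be sketched roughly as follows. This is only an illustration, not cweiske's actual code: it swaps the encrypted token for an HMAC-signed timestamp (simpler, and still tamper-evident), and the names `SECRET_KEY`, `make_token`, `is_probably_bot`, and the `token` field name are assumptions. Only the `my_name` field and the 4-second threshold come from the comment.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MIN_SECONDS = 4            # threshold mentioned in the comment above


def _sign(ts):
    """Sign the timestamp string so clients cannot forge it."""
    return hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()


def make_token(now=None):
    """Return 'timestamp:signature' to embed in a hidden form field."""
    ts = str(int(time.time() if now is None else now))
    return f"{ts}:{_sign(ts)}"


def is_probably_bot(form, now=None):
    """Return True if the honeypot is filled or the form came back too fast."""
    # Honeypot: humans never see this field, so any value is suspicious.
    if form.get("my_name", ""):
        return True
    ts, _, sig = form.get("token", "").partition(":")
    # Reject missing or tampered tokens.
    if not ts.isdigit() or not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return True
    elapsed = (time.time() if now is None else now) - int(ts)
    return elapsed < MIN_SECONDS
```

A caller would put `make_token()` into the hidden field when rendering the form and silently drop any submission for which `is_probably_bot(form)` returns True.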

James Celery

Bravo! They're all easy to bypass. However, I agree that this should be enough to block the endless flow of low-effort, high-volume spam operations. Your site will require some effort to spam after implementing these protections. The economics of spam operations mean it won't be worth adapting the bots to get past your countermeasures.

rightonthelips

I freaking hate CAPTCHAs! The image recognition ones are so annoying and vague! Don't use it on your site!! Try anything or everything before resorting to those pests!

superkuh

I tried to post a comment in my normal browser and your faulty browser version detection system blocked me with, "Ouch. Validation error: Your web browser submitted the form from an unexpected source (e.g. proxied), or your web browser is too old or misbehaved. You can go back, correct the error, and restore or resubmit your comment." My browser is only a couple of months old. Your detection system is probably thinking it's Firefox 29, but it's Palemoon 29, which is quite new. I've now switched to a copy of modern-ish Firefox (97) to see if I can get through. But most users will just walk away after such inconvenience.

And even if you can get your browser version detector working right, which you didn't, GitHub's survey of *developers* who use *GitHub*, a site actively hostile to anything but bleeding-edge browsers, does not even come close to reflecting the real browser version distribution on the web.

Don't block old browsers. It does inconvenience users.

Dave

Thanks! Some of these are really neat ideas. I've already used some of these to protect contact forms on websites I've built for clients. I'll make sure to use all of them going forward!

> - Hidden field with name my_name that has an empty value. The form gets rejected when it's filled. The field has been made invisible by CSS.

cweiske, that might confuse screen readers and other accessibility technologies. It’s absolutely possible to do it right, but I’ve found a type=hidden field gets the job done without ever getting in the visitors’ way.

> - Hidden field with an encrypted token containing the page load timestamp.

cweiske, that helps prevent replay attacks, but it requires each page view to be generated on the fly. I rely too much on caching to adopt this method.

> Your detection system is probably thinking it's Firefox 29, but it's Palemoon 29, which is quite new

superkuh, Pale Moon 29 identifies itself as Firefox 68 (from 2019). (The version check cap is currently at Firefox 75.) My logs indicate that 99.89% of visitors to the blog use a supported browser.

The error message you quote indicates that your browser doesn’t send the expected Origin or Host request headers. Firefox added support for the Origin header in version 70.
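For readers wondering what such a check might look like, here is a minimal sketch. It is not the blog's actual implementation; the function name `origin_matches_host` is made up, and it assumes the headers arrive as a plain dict with canonical key casing.

```python
from urllib.parse import urlsplit


def origin_matches_host(headers):
    """Return True when the Origin header names the same host as Host."""
    origin = headers.get("Origin")
    host = headers.get("Host")
    # A missing header is treated as a failed check.
    if not origin or not host:
        return False
    # Origin looks like "https://example.com"; Host is just "example.com".
    return urlsplit(origin).netloc.lower() == host.lower()
```

A form handler built this way rejects a POST whenever the function returns False, which is also why a browser that omits the Origin header on form submissions gets turned away.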

>> - Hidden field with name my_name that has an empty value. The form gets rejected when it's filled. The field has been made invisible by CSS.

> cweiske, that might confuse screen readers and other accessibility technologies. It’s absolutely possible to do it right, but I’ve found a type=hidden field gets the job done without ever getting in the visitors’ way.

If you hide stuff with display: none or visibility: hidden, screen readers shouldn't be able to access it.

Be careful, the labels in this form are inverted too! This means that if you click on "name" you'll be directed to the email box. Also, if you were using a screen reader you would be instructed to input your e-mail in the name box and your name in the e-mail box (that's what labels do, too, they are used to label form fields when tabbing or when navigating with hotkeys).

> Be careful, the labels in this form are inverted too!

Good catch, thanks! That wasn’t my intention.

Now Firefox autofill suggests entering my email into the "name" box and name - into the "email". Konfusing!

I also used a version of cweiske's idea of a field which must be left empty - it was hidden via CSS, and for CSS-less browsers had a label "Please leave this field blank". Also it had name="email" and naturally, all bots submitted a proper email there!

> The error message you quote indicates that your browser doesn’t send the expected Origin or Host request headers.

And now you know that that does not actually indicate an old, proxied, or misbehaved browser. It's valid behavior from a modern, up to date, browser. But you're going to continue to block it because you only care about corporate browsers? Not cool for a personal site but fine for a commercial one I guess. Good to know what type ctrl.blog is.

> It's valid behavior from a modern, up to date, browser.

superkuh, Pale Moon is neither a modern nor an up-to-date browser. It’s based on a three-year-old version of Firefox. That’s ancient times from a technology perspective. The web is constantly evolving.

Pale Moon can’t keep up with the evolving standards of the web platform because it’s essentially developed by one guy. No one person can keep up with the workload required to maintain a modern web browser.

Got it. Only people with browsers from megacorps get to comment on your sites and you recommend others behave the same. Independent browsers, which are up to date and are secure (because they don't try to be entire OSes) aren't allowed because they don't implement trivial little things that you don't even use. Not a great look.

> Now Firefox autofill suggests entering my email into the "name" box and name - into the "email".

That’s odd. The major browsers have supported the autocomplete=name|email attribute values for some years now. It works as expected when I test it in a clean profile in Chrome, Firefox, and Safari.

tauin

Anyone who uses Pale Moon is actively putting themselves at a security risk. It should NEVER be used.

John Jat

SSL cipher profiling works by inspecting the Client Hello packet that initiates the SSL connection. This packet contains a list of ciphers supported by the client. The list depends on the application's configuration and SSL library, and it is distinct in a great many cases.

Using SSL fingerprinting, you can build a list such as this one: https://github.com/LeeBrotherston/tls-fingerprinting/blob/master/fingerprints/fingerprints.json

This lets you nuke a great deal of undesirable traffic right out of the gate. It also lets you confirm that the User-Agent matches the expected SSL fingerprint; many bots throw a different User-Agent at the server with every request but present an identical SSL fingerprint. Sophisticated bot operators are aware of SSL fingerprinting and will evade detection, but they aren't likely to be hammering your forms. Keep an eye out for randomized cipher lists and end hashes; these are sus.
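One part of this, spotting a single fingerprint that shows up with many different User-Agents, could be sketched like so. The sketch treats the fingerprint as an opaque string that some other layer (for example a TLS-terminating proxy) has already computed; the class name `FingerprintTracker` and the threshold of 3 are made up for illustration.

```python
from collections import defaultdict

MAX_AGENTS_PER_FINGERPRINT = 3  # hypothetical threshold; tune for your traffic


class FingerprintTracker:
    """Remember which User-Agent strings were seen with each fingerprint."""

    def __init__(self):
        self._agents = defaultdict(set)

    def observe(self, fingerprint, user_agent):
        """Record a request; return True once the fingerprint looks suspicious."""
        agents = self._agents[fingerprint]
        agents.add(user_agent)
        # A real client rarely changes its User-Agent between requests.
        return len(agents) > MAX_AGENTS_PER_FINGERPRINT
```

Calling `observe()` once per request is enough. A long-running server would also want to expire old entries, which this sketch leaves out.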