Started by vbgamer45 · Read 1107 times

Tips for Bots

Use Cloudflare for geo-blocking countries/ASNs; it works great. You can also challenge users instead of blocking them if you are concerned about false positives.
Code:
(ip.src.country eq "CN") or (ip.src.country eq "HK") or (ip.src.country eq "VN") or (ip.src.country eq "BR") or (ip.src.country eq "AR") or (ip.src.country eq "EC") or (ip.src.country eq "UY") or (ip.src.country eq "IR") or (ip.src.country eq "SG") or (ip.src.country eq "IQ") or (ip.src.country eq "BD") or (ip.src.country eq "VE") or (ip.src.country eq "CL") or (ip.src.country eq "PY") or (ip.src.country eq "MX") or (ip.src.country eq "PA") or (ip.src.country eq "BG") or (ip.src.asnum eq 136907) or (ip.src.country eq "SN")

Block old Chrome versions (or challenge them if using Cloudflare), and block empty user agents. For Apache, add the following to httpd.conf:
Code:
# Block empty user agents
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$ [NC]
RewriteRule .* - [F,L]

# Block Chrome below 120
RewriteCond %{HTTP_USER_AGENT} Chrome/([1-9][0-9]|10[0-9]|11[0-9])\. [NC]
RewriteRule .* - [F,L]
Cloudflare rule to block old Chrome versions (this one also blocks printpage requests, a common scrape target):
Code:
(http.user_agent contains "Chrome/100." or http.user_agent contains "Chrome/101." or http.user_agent contains "Chrome/102." or http.user_agent contains "Chrome/103." or http.user_agent contains "Chrome/104." or http.user_agent contains "Chrome/105." or http.user_agent contains "Chrome/106." or http.user_agent contains "Chrome/107." or http.user_agent contains "Chrome/108." or http.user_agent contains "Chrome/109." or http.user_agent contains "Chrome/110." or http.user_agent contains "Chrome/111." or http.user_agent contains "Chrome/112." or http.user_agent contains "Chrome/113." or http.user_agent contains "Chrome/114." or http.user_agent contains "Chrome/115." or http.user_agent contains "Chrome/116." or http.user_agent contains "Chrome/117." or http.user_agent contains "Chrome/118." or http.user_agent contains "Chrome/119." or http.request.uri.query contains "action=printpage" or http.request.uri.path contains "printpage")


Turn off certain forum features for guests.

Make sure your web server supports HTTP/2.

Tweak your PHP/database settings, and use the latest versions.

In general, tweak all settings across the web server, PHP, and database; the defaults are not enough for bigger sites.
Last Edit: February 26, 2026, 08:44:49 pm by vbgamer45
ElkarteMods.com - Addons, Products and more!

Re: Tips for Bots

Reply #1

Glad I'm not alone on this! I've been in bot-fighting mode on my sites for the last few days.

Other items that may help, depending on your site, traffic, location, etc.

Many requests are coming in on groups of IPv4 /16s, which is a block of ~65,536 addresses (xxx.xxx.0.0/16). For my sites that is not normal traffic, but YMMV. I wrote a script that groups those /16 hits from the access log, and if it finds more than xx IPs in a group (I use 10) in the last 15 minutes, it writes the subnet to a log file and uses fail2ban to block the entire xxx.xxx.0.0/16 (via ipset). If you have some really small local group inside one of those ranges, you can whitelist that subnet. I now have over 400 of those subnets blocked.
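A minimal sketch of that grouping idea (not the author's actual script; the log path, threshold, and the assumption that the client IP is the first field of each line are all placeholders):

```shell
#!/bin/sh
# flag_slash16s LOGFILE THRESHOLD
# Group client IPs (first field of each log line) by /16 and print any
# subnet with more than THRESHOLD distinct IPs -- candidates for an
# ipset/fail2ban block of the whole xxx.xxx.0.0/16.
flag_slash16s() {
  awk -v t="$2" '
    { split($1, o, ".") ; ip[$1] = o[1] "." o[2] }   # distinct IPs, keyed to their /16
    END {
      for (i in ip) count[ip[i]]++                   # distinct-IP count per /16
      for (s in count) if (count[s] > t) print s ".0.0/16", count[s]
    }' "$1"
}
```

The printed subnets can then be fed to a fail2ban jail or added directly with `ipset add`.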

nginx has geoip2 (via MaxMind), so you can use that as a GeoIP fence and block countries you know are not in your zone. I know some folks take issue with that, but honestly, too bad: you have to work through an attack! I will say, however, that most of the bot traffic came from US addresses (proxies), so Virginia, TX, and WA were common whois endpoints, but it still drops some of the crap.
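A sketch of that country fence, assuming the ngx_http_geoip2 module is compiled in and a MaxMind GeoLite2-Country database is downloaded to the path below (both path and country list are placeholders):

```nginx
# http {} context: look up the ISO country code for each client.
geoip2 /etc/nginx/GeoLite2-Country.mmdb {
    $geoip2_country_code country iso_code;
}

# Mark countries you want to fence off.
map $geoip2_country_code $blocked_country {
    default 0;
    CN 1;
    HK 1;
    VN 1;
}
```

Then inside the relevant server block, reject marked clients with `if ($blocked_country) { return 403; }`.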

The last thing that can be helpful: bots tend to flood connection attempts. Another script groups connection-limit failures (from the error.log) over a given limit/time threshold that also have PHPSESSID in the URL, and bans them. Guests are not opening 30+ connections to log in or browse a site; to be honest, even with cache off and hammering the site from your own IP, you will not trigger that either.
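A fail2ban sketch of that ban rule. The failregex below is an assumption matched against nginx's typical `limiting connections by zone` error-log message; adjust it to your server's actual log format before relying on it:

```ini
; /etc/fail2ban/filter.d/sessid-flood.conf (sketch)
[Definition]
failregex = ^.*limiting connections by zone.*client: <HOST>,.*PHPSESSID

; jail.local entry (thresholds are placeholders)
[sessid-flood]
enabled  = true
filter   = sessid-flood
logpath  = /var/log/nginx/error.log
maxretry = 30
findtime = 900
bantime  = 86400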

I may add that low-Chrome-version check; more bot pain! I've seen high version numbers, but those are from variants (Vivaldi, for example). I did not consider the old cruft; thanks for the idea!

Re: Tips for Bots

Reply #2

Yeah, it was wrecking me, and I had a hardware firewall/software firewall; I had to pull out all the stops. I did switch to Cloudflare as well, which helped a ton.
I have done so much Apache/FastCGI tweaking, along with the database, to handle the loads. The worst is when I would get hit by 50k to 100k bots at one time.
I do a lot of research on the ASNs; I use ip2location.com. If you use Cloudflare, be careful with your MX record proxying/email sending. I run reports via https://mxwhiz.com/ to double-check (mine, btw).

The downside of blocking older Chrome versions is that Windows 7 users would be cut off if they use Chrome (Chrome 109 was the last version to support Windows 7).

Re: Tips for Bots

Reply #3

The session_start on every GET request, combined with db session storage, has a dramatic impact on the server. As an immediate mitigation, I forced sessions to use cookies:

Code:
sources/Session.php:

@ini_set('session.use_only_cookies', true);

I then configured nginx to no longer serve requests that contain the session id. This only helps until the bots stop including the session id in their requests.
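A minimal nginx sketch of that rule (assumes the default PHPSESSID session name; place it in the relevant server block):

```nginx
# Refuse any request carrying a session id in the query string; real
# browsers now get their session via cookie only.
if ($args ~* "PHPSESSID=") {
    return 403;
}
```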

I'll probably move the session management to a ramdisk until I can figure out how to lean out the need for sessions by unregistered guests.

Re: Tips for Bots

Reply #4

I simply use Cloudflare as my DNS server, so I never use any of the above. Not sure about bots, as I never keep track of them.
Last Edit: April 04, 2026, 05:53:02 am by ahrasis

Re: Tips for Bots

Reply #5

Quote from: nwsw – The session_start on every GET request, combined with db session storage, has a dramatic impact on the server. As an immediate mitigation, I forced sessions to use cookies:

Code:
sources/Session.php:

@ini_set('session.use_only_cookies', true);

I then configured nginx to no longer serve requests that contain the session id. This only helps until the bots stop including the session id in their requests.

I'll probably move the session management to a ramdisk until I can figure out how to lean out the need for sessions by unregistered guests.

Note that PHP is deprecating the passing of PHPSESSID via URL in 8.x, and it will be removed in 9.0. 

That particular setting, 'use_only_cookies', will be retired soon - mainly because setting it to false is soon to be disallowed.  More here:
https://wiki.php.net/rfc/deprecate-get-post-sessions

So...  The idea is good - don't use PHPSESSID, and, since you're not generating it anymore, you can then block it via .htaccess. 
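That .htaccess block might look like this (a sketch in the style of the rules earlier in the thread; assumes the default PHPSESSID session name and that mod_rewrite is enabled):

```apache
# Block any request still passing a session id in the query string;
# once the forum no longer emits PHPSESSID in URLs, only bots and
# stale links will carry one.
RewriteEngine On
RewriteCond %{QUERY_STRING} PHPSESSID= [NC]
RewriteRule .* - [F,L]
```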

SMF implementation: https://github.com/SimpleMachines/SMF/pull/8394

One part of the SMF implementation, this commit, can save a LOT of resources. It's causing some issues for forums that have guest browsing disabled, though; those issues are currently being addressed:
https://github.com/SimpleMachines/SMF/pull/8394/changes/2f2a5e0ae404fd1adb408b87896ce00cca1715ec

The basic idea is that, since you cannot pass by URL, you MUST pass by cookie.  So...  When cookies are disabled, there is no way to pass the session.  At all...  So, don't even bother writing it.  Note certain classes of bots either block cookies or don't use them, or pass their own PHPSESSID...  All these variants cause more session writes. 

These changes will be a hard requirement before PHP 9.0.

You are effectively giving bots total control over your DB writes... One step further: since they can flood you with writes, they can overwhelm your undo/redo logs, which can then lead to issues with backups, which can cause performance issues and even bring your site down...

So stop that...

The savings can border on the ridiculous:
[Attachment: VGF-cpu-2025-01-13.png]

In addition, this note outlines even further savings.  The goal is to avoid driving up CPU during bot storms.  I've been testing these on my site.  Check out the CPU charts before/after:
https://www.simplemachines.org/community/index.php?msg=4199062

The broader notes found here might also help:
https://www.simplemachines.org/community/index.php?topic=593895.0
Last Edit: April 02, 2026, 03:18:54 pm by shawnb61

 

Re: Tips for Bots

Reply #7

Quote from: ahrasis –
Quote from: "shawnb61" – you can then block it via .htaccess.
What if we are not using apache2, but nginx instead?

This may help...
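For nginx, a rough equivalent of that .htaccess rule might be (a sketch; assumes the default PHPSESSID session name, placed inside the relevant server block):

```nginx
# Drop requests that still pass a session id in the query string;
# 444 closes the connection without sending any response at all.
if ($args ~* "PHPSESSID=") {
    return 444;
}
```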
Last Edit: April 04, 2026, 11:50:12 am by Steeley

// Deep inside every dilemma lies a solution that involves explosives //

Re: Tips for Bots

Reply #8

For the record, my site uses .htaccess to make a significant part of my site "private". My forum resides "behind the wall"; the public-facing pages explain the topic of the site and provide lots of general information. The public side also mentions the existence of the forum (I could certainly provide some sample screenshots if I were fishing for members, but in my case it's not necessary).

There's a link off the main menu of my site for requesting access to the restricted side. It's a two-step process: click the link and it brings up a simple form; you enter your email address and submit. The form submittal generates a reply to the entered address and embeds a 6-digit random code. It also copies me on that email.

If the applicant doesn't get an email, it means it was typed wrong. Go back and try again. I want to make sure I send the access credentials to a valid address!

Meanwhile, submittal also brings up a second form for the applicant with instructions about what to do now: specifically, it says to retrieve the email just sent to them, copy the email address AND the 6-digit code into the new form, and fill out a few other fields with information that I use to validate the applicant as someone allowed access to the forum. (In my case, it's people stationed at a particular location during the 'Nam war; things they know about it that nobody else would. Your mileage would vary, of course.) I ask for his nickname or call sign, which will inform the username I'll give him if he provides one; otherwise I'll tweak his first or last name.

When the applicant submits that second form, it comes to me in email.

[Note: If I were to use the same credentials for everyone and they got compromised, I'd have a problem: everything "behind the wall" would be compromised. I'd have to change the credentials and inform EVERYONE, and likely the unauthorized person too in the process. With unique credentials for each user, I know which account is compromised and can address that one.]
(Steeley's law: If you be lazy up front, you be hate'n your life later. All non-academic knowledge comes in suppository form).

I'll create a password for him based on info he provides, something easy he'll remember (OK, so he was a hydraulics mech with Squadron 345 and a Sgt, so password "bubbles345sgt" will suffice). I send his unique access info back.

All of the access email requests and my response email providing their unique access credentials are stored in my local computer. 

Voilà! Now they have access to the forum to register. (I don't allow guest read access, so they must register to get in.)

BOTTOM LINE:

My forum does not get spammed. At all. Ever. No bots, except the spiders looking for something new on the public side. Occasionally, they get lucky and find something new to link and memorialize. All the private personal information the guys share with each other is kept hidden, so as not to scare or freak out the civilian visitors; that is back in the restricted area, secure and snug as a bug in an old forum software version.

If you run a really large forum with lots and lots of users, just set up a database of users that forbids duplicate username entries, to keep them all unique. Otherwise, if you create a duplicate, it overwrites the original and the first user can't get in any more. (Note: username "Mac" is different from "MAC" and "mac" as far as htaccess usernames are concerned.)
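The "wall" itself can be sketched in .htaccess like this (paths and realm name are assumptions); the per-user entries live in the htpasswd file, one per member:

```apache
# Protect the forum directory with per-user credentials. Each member
# gets a unique entry, e.g.: htpasswd /var/www/private/.htpasswd Mac
AuthType Basic
AuthName "Members Only"
AuthUserFile /var/www/private/.htpasswd
Require valid-user
```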

Oddly enough, I get very few "fakers"... I do get a fair number of first emails, but the second form pretty much stops them in their tracks. They know they ain't gonna fake that stuff (many don't even know what the heck it's asking for in some fields). A couple of squirrels gave it a shot over the past 7+ years, but my BS detector is pretty good, and an email reply back to them requesting "clarification" never gets answered.

Anyway, if you want to see how it works, PM me for the website url.. If you're a legit user and not a bot, I'll "get back to ya.."
Last Edit: April 06, 2026, 02:02:12 am by Steeley

// Deep inside every dilemma lies a solution that involves explosives //

Re: Tips for Bots

Reply #9

I like the idea of what you did. Could we make this one of the ElkArte default features, with an option to disable it, so that once set up and installed, an ElkArte forum is bot-free, or at least has far fewer bots?

What do you think @Spuds?