Every once in a while, when I visit my forum, I see a large number of "guests" viewing the site under forum stats. Today it's 36; a few days ago it was 78. This is a newly built forum that I know is not THAT popular! I assume they are 90% spam bots trying to get in?
I did have a Vanilla forum at the site, but replaced it with Elk. Vanilla was letting 3 or 4 spam bots in a day, although they weren't able to post. And that was using Google's captcha! How are they bypassing captcha and still getting in? Are the ones that got in actual spammers? That's all I can figure.
Anyway, I'm happy to report that not one spam bot has gotten through Elk, and I'm only using a simple "What is 19-5=" question.
Most likely just search bots that are trying to index your site.
A guest is any IP address that is viewing the site but not logged in as a user. Sure, some of that could be spam bots, but in my experience it's mostly search bots (real and BS ones) and the like.
Also, I think the number is tied to an admin setting for how long a visitor counts as online (sorry, I don't remember the exact label right now). Basically, once an IP hits the site it stays "active" for the period set in the ACP, even after it has left. So a bot could hit the site for 5 seconds and still show up in the count for the whole period (I think it defaults to 15 minutes).
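To make the time-window idea concrete, here is a minimal sketch of how such a counter could work. It is not ElkArte's actual code; the class name, the method names and the 15-minute default are assumptions for illustration only.

```python
import time


class OnlineTracker:
    """Hypothetical 'who is online' counter based on a time window."""

    def __init__(self, window_seconds=15 * 60):
        # Assumed default of 15 minutes, as guessed in the post above.
        self.window_seconds = window_seconds
        self.last_seen = {}  # visitor key (e.g. IP address) -> time of last hit

    def hit(self, visitor, now=None):
        """Record a page request from a visitor."""
        self.last_seen[visitor] = time.time() if now is None else now

    def online_count(self, now=None):
        """Count visitors whose last hit falls inside the window."""
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        return sum(1 for seen in self.last_seen.values() if seen >= cutoff)
```

With a counter like this, a single request at second 0 is still counted at second 899 and only drops out once the window has passed, which would explain a guest count that looks much busier than the site really is.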
I think most of these "guests" are bots.
And this can only be solved if the forum detects all these bots.
Currently we have a list that holds more than 100 bots .. :o .. and we find more almost every week ...
Actually, it can be solved at the webserver or firewall level. Sure, that assumes the site is hosted on a server with firewall and/or webserver configuration access. There are resources to combat "bad bots". For example:
https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker
If you dig through some of his other scripts, there are items for taking the fight to the firewall level. This is a far more effective approach for controlling bad actors. Why let them use your resources at all? ;)
Your original question about statistics can be answered as "bots are guests and are counted as such". Then, somewhere in the admin panel there is an option to identify bots, which would show some of them as "bots" (though I think bots are still counted as guests, but don't quote me on that).
Then there is this sentence I'm not sure about:
There is no captcha to view the forum.
There are anti-spam measures to register and post (and some other stuff). But really anyone can watch the forum unless they are banned.
TBH, as @badmonkey said, stopping bots should be done at the server level rather than at "php-level". That said, if you are on shared hosting without any means to control anything on the server, then... having them access a forum page or another page would not change the resources used that much (well, yes, okay, there are some pages that may be a little hard on the server, but usually shared hosting should not be brought down by a few hits).
Well ... I found a simple and very fast way to detect all the bots that scan the net ...
It's a string with all the bot names we know/found and a preg_match ..
So I can simply detect all the bots ..
and this list currently has more than 150 bot names :o
So there is no need to add the bots manually .. if a bot is not in the bots table, it is added automatically ...
We have also added "blocking" for bots .. so these bots only see a 403 - Forbidden ;D
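For readers who want to see the shape of that approach, here is a rough sketch in Python rather than PHP (`re.search` playing the role of preg_match). The bot names, the blocked set and the function name are examples I picked, not the actual list or code described above, and the automatic adding of new bots is not shown.

```python
import re

# Example names only; the real list described above is much longer.
KNOWN_BOT_NAMES = ["Googlebot", "bingbot", "AhrefsBot", "SemrushBot", "MJ12bot"]

# One combined, case-insensitive pattern, similar in spirit to a single preg_match.
BOT_PATTERN = re.compile(
    "|".join(re.escape(name) for name in KNOWN_BOT_NAMES), re.IGNORECASE
)

# Hypothetical choice of which bots get blocked.
BLOCKED = {"ahrefsbot", "semrushbot", "mj12bot"}


def classify(user_agent):
    """Return (bot_name or None, HTTP status to send) for a User-Agent string."""
    match = BOT_PATTERN.search(user_agent or "")
    if match is None:
        return (None, 200)  # no listed name found: treated as a normal visitor
    name = match.group(0).lower()
    return (name, 403 if name in BLOCKED else 200)
```

Note that this only catches clients that announce one of the listed names in their User-Agent header; anything that sends a browser-like string passes as a normal visitor.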
How does your script know that a new bot (a bot with a name you don't know) is a bot and not a normal person? Isn't there still a risk that you will accidentally block normal people with your script?
That is a coding question, @Jorin. It all depends on how smart the code written for that is. I personally prefer default settings and will let bots harvest what they want, which is normally harmless info.
No risk .. we use the browser's HTTP_USER_AGENT for the detection ..
Well .. not all spiders are harmless .. a lot of them can slow down the pages because they make many, many requests in a short time.
So we added a function to block such spiders .. with an HTTP 403 (Forbidden)
That all works fantastically .. and we have a very long spider list ;)
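Blocking spiders that send too many requests in a short time could be done with a sliding-window counter, roughly like the sketch below. This is my own illustration, not the function described above; the class name and the limits are made up. It returns 403 to match the post, although HTTP also defines 429 Too Many Requests for this situation.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Hypothetical per-client sliding-window request limiter."""

    def __init__(self, max_requests=30, window_seconds=10):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # client key -> timestamps of recent requests

    def status_for(self, client, now):
        """Record a request at time `now` and return the HTTP status to send."""
        recent = self.hits[client]
        # Forget requests that have fallen out of the window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        recent.append(now)
        return 403 if len(recent) > self.max_requests else 200
```

A client that stays under the limit never sees the 403, and one that backs off is let in again once its old requests age out of the window.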
Right now I see 8 members logged in, 98 guests and 8 spiders on my forum (in the last 15 min). I checked the resource usage and all seems normal: CPU usage is below 10-15% and RAM is under 500 MB. I guess every shared host can handle this without problems.
That is as reliable as my knowledge of php. xD
Usually bots that identify themselves as bots are harmless. Those that really want to be there will not identify as a bot, and then detection by user agent is done for. (Been there, done that.)
When I mentioned "how are they getting past Google's Captcha", that was on the Vanilla forum that I replaced with Elk, on the same domain by the way.
Somehow, I was getting 3 or so "signups" a day, and that's with using Captcha. I don't know how they got past Captcha, unless they were real people spamming the forum. The only thing that stopped these bots or spammers from posting was when I turned on the "please confirm your email address" option. I had a lot of "Unconfirmed members" listed then.
So, since Elk is on the same domain, I think it's doing a better job of rejecting the spam that was plaguing Vanilla, by just showing a guest count. That's my theory anyway.
I'm going to do like Ahrasis is doing, leave the forum set the way it is. Like Radu81, I don't notice any load either on my server.
Captchas (including Google's, which has already been "broken" several times, e.g. https://nakedsecurity.sophos.com/2017/03/03/researcher-uses-googles-speech-tools-to-skewer-google-recaptcha/ ) are far from effective nowadays.
Were it not for the fact that it's there and it's easy, I would have already removed ours...
Interesting Emanuele. I did not know that about Captcha having a weakness. I learn something new every day.
I guess the old "12-7=" challenge question is more reliable? Who would have thunk it.... I wonder why websites still use Captcha? Just curious.
If the question and answer can be Googled, it is no good. Get creative if you want effective. Captcha is common enough that it was worth coding bots with the ability to solve it. Still, better than nothing.
You can upgrade the default captcha if you want. Or maybe use / create an addon. Anyway, I am not into this right now but am rather looking at changing the way ElkArte shows its board index, message index and display page. Lately, it seems I prefer the way these are displayed in Discourse, Flarum (EsoTalk), NodeBB etc., but maybe I will open that in another thread (or maybe not).
I have to agree with you again, Ahrasis, about changing the look to that of NodeBB or especially XenForo; I really like that one. However, I think it would be more likely for a rocket ship to be built for you and me to go to Mars before they would change Elk to look like XenForo or NodeBB. 8)
Heck, I'd be a most happy camper to just get SEF URLs implemented.
Just sayin....
Not really.
Computers can do math pretty well.
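To illustrate the point, a few lines are enough to answer this kind of question automatically. This is only a sketch of what a bot author could write, with a made-up function name; it handles questions of the form "number operator number" and nothing else.

```python
import operator
import re

# Operators the sketch understands.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def solve_challenge(question):
    """Return the answer to a simple 'a op b' question, or None if none is found."""
    match = re.search(r"(\d+)\s*([+\-*])\s*(\d+)", question)
    if match is None:
        return None
    left, op, right = match.groups()
    return OPS[op](int(left), int(right))
```

Fed "What is 19-5=", it returns 14. A question that needs knowledge about the site or its topic gives a script like this nothing to work with.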
I think he's talking about coding it himself, possibly as a mod. He's quite good at it, so I'll look forward to seeing his work. ;)
O:-) Thanks for "the compliment", @badmonkey.
I am actually just so so with coding and most of the knowledge and skills I gained are from and since I joined SMF, then ElkArte.
If I am gonna write something like that, I will start by improving on @inter's Flarum addon (https://www.elkarte.net/community/index.php?topic=4010), as that should be a good start, unless the ElkArte developers are already onto it while we are talking about it. ;D
/me is still messing with the autoloader. xD
https://stackoverflow.com/a/12445309/7352111 but still I don't understand anything about it.
In the end, I took composer "class loader" and stuffed it into ext. Easier to maintain.
Autoloader and class loader are the same thing: a piece of code that includes files based on class names and namespaces, making require_once and the like superfluous.
Back to topic: at the end of October I had a large number of guests visiting my board (five times as many as usual). I didn't notice anything happening with my board though. Should I be concerned or alarmed, what do you think?
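The core of such a loader is just a mapping from a class name to a file path. Below is a small sketch of that mapping in the style of the PSR-4 convention that Composer's class loader follows, written in Python for illustration. The function name, the namespace prefix and the directory are hypothetical; a real PHP autoloader would then require the resulting file.

```python
from pathlib import PurePosixPath


def class_to_path(class_name, prefix, base_dir):
    """Map a namespaced PHP class name to the file expected to contain it.

    Returns None when the class does not live under the given namespace prefix.
    """
    prefix = prefix.strip("\\") + "\\"
    name = class_name.lstrip("\\")
    if not name.startswith(prefix):
        return None
    # The rest of the namespace becomes directories, the class name the file.
    relative = name[len(prefix):].replace("\\", "/") + ".php"
    return str(PurePosixPath(base_dir) / relative)


# Hypothetical names: maps to "ext/Foo/Bar.php"
print(class_to_path("ElkArte\\ext\\Foo\\Bar", "ElkArte\\ext", "ext"))
```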
Are you concerned by the fact you caught a cold 2 years ago?
I'd say it's only periodic. Nothing much to be concerned about.
lol No, I'm not. O:-)
Thanks! :)