Topic: Delayed Statistic Counters (Read 6691 times)

Delayed Statistic Counters

Hi all,
Are there any plans to delay updates to statistics such as "Views" on topics and boards, to reduce server-side I/O during busy times?
Some other forum software does this with a cron job to minimize write operations.

Re: Delayed Statistic Counters

Reply #1

Heh, I was just looking at that thinking there should be a setting.

Code: [Select]
// Add 1 to the number of views of this topic (except for robots).
if (!$user_info['possibly_robot'] && (empty($_SESSION['last_read_topic']) || $_SESSION['last_read_topic'] != $topic))
{
	increaseViewCounter($topic);
	$_SESSION['last_read_topic'] = $topic;
}

First off, it shouldn't update for every view that you've made. It should only update once per session (no matter how many other topics you look at) and it should only update when there is a new post or edit in the topic (not sure about edits). Also, the circuit breaker should kick in and disable this completely when the load is too high. That should handle most issues, but a hook in the function could do it via caching or a topic_views table which would be in memory. That's another idea.

Anyway, are you having issues with load? I wouldn't start trying to change things until you start seeing issues. Especially with numbers like that. Users seem to be picky about their counts (downloads, views, stats, etc).

Re: Delayed Statistic Counters

Reply #2

It updates every time you open that topic. The only exception is that it will not update twice if you load the same topic repeatedly without viewing another thread in between.
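A minimal sketch of that behaviour, for illustration only. It mirrors the check quoted in Reply #1 but leaves out the robot test; the function name viewTopic() is hypothetical, $session stands in for $_SESSION, and $views stands in for the view counts stored in the topics table.

```php
<?php
// Simplified stand-in for the session check quoted in Reply #1.
// Assumption: no robot detection, and a plain array instead of the database.
function viewTopic(array &$session, array &$views, int $topic): void
{
    if (empty($session['last_read_topic']) || $session['last_read_topic'] != $topic) {
        // Stands in for increaseViewCounter($topic).
        $views[$topic] = ($views[$topic] ?? 0) + 1;
        $session['last_read_topic'] = $topic;
    }
}

$session = [];
$views = [];

// Open topic 1 twice in a row, then topic 2, then topic 1 again.
foreach ([1, 1, 2, 1] as $topic) {
    viewTopic($session, $views, $topic);
}

print_r($views);
```

The second consecutive load of topic 1 is skipped, but returning to topic 1 after topic 2 counts again, so topic 1 ends with 2 views and topic 2 with 1.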
-
I don't know how the circuit breaker works, but there is no sign of any exceptions in Topic.subs.php; it just writes the update to the database.
-
I don't know if this will be a problem for me. I delayed the view update on my MyBB board to avoid running into trouble if the load gets high. But MyBB uses a different data structure, and I don't know whether I can compare it to ElkArte. I think ElkArte puts a bit more load on the server, but this is just a guess at the moment. I need to run a few load tests as soon as my importer is ready.

By the way, I don't want to lose counts; I would just like the updates to be delayed.

Re: Delayed Statistic Counters

Reply #3

Where do you store them "in the meantime" waiting for the db update?
Bugs creator.
Features destroyer.
Template killer.

Re: Delayed Statistic Counters

Reply #4

memcached? (I know the data may be lost on a power failure and so on.)
MyBB inserts the views into a table and updates the stats only once an hour.

Re: Delayed Statistic Counters

Reply #5

So it is not really delaying the writing of the "counts", just the calculation of the stats? I'd have to look to see how often that occurs; it might be on stats page load, not sure.

For a really busy site, there are certainly my.cnf changes that you would make, as well as changing the default install's tables from MyISAM to InnoDB.

Re: Delayed Statistic Counters

Reply #6

I think we are talking about different things.
I am talking about elkarte_topics.views.

Re: Delayed Statistic Counters

Reply #7

I guess so

If you are having performance issues due to
Code: [Select]
UPDATE {db_prefix}topics SET num_views = num_views + 1 WHERE id_topic = {int:current_topic}
Then I really don't know what to say. In terms of performance bottlenecks, that would not even be in my top 1000, TBH.

If you are talking about updating the ?action=stats page only once every 30 minutes or so, then I could potentially see a need for that, as I don't think that page currently uses the cache, and it could.

Re: Delayed Statistic Counters

Reply #8

Judging by a quick inspection of the code, MyBB and ElkArte work basically the same in that respect (i.e. UPDATE view = view + 1).
Relevant index (i.e. the topic id) is the same, so it shouldn't make any difference.
MyBB has a "delayed" option that instead of UPDATE'ing the topics table, INSERTs a row in another table.
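To make the "delayed" idea concrete, here is a rough sketch under stated assumptions. It is not MyBB's or ElkArte's actual code: the class DelayedViewCounter and its methods are hypothetical, the $pending array stands in for the separate table (one entry per INSERTed row), and $topicViews stands in for the counter column in the topics table.

```php
<?php
// Hypothetical sketch: record each view cheaply, apply the totals later.
final class DelayedViewCounter
{
    /** @var int[] One entry per recorded view, like one INSERTed row. */
    private array $pending = [];

    public function recordView(int $topic): void
    {
        // Cheap append, comparable to an INSERT into a side table.
        $this->pending[] = $topic;
    }

    /**
     * Collapses the pending rows into one increment per topic and applies
     * them. Returns the number of topics that were updated.
     */
    public function flush(array &$topicViews): int
    {
        $increments = array_count_values($this->pending);
        foreach ($increments as $topic => $count) {
            // In SQL terms: one UPDATE per topic, adding $count at once.
            $topicViews[$topic] = ($topicViews[$topic] ?? 0) + $count;
        }
        $this->pending = [];

        return count($increments);
    }
}

$topicViews = [7 => 100, 9 => 5];
$counter = new DelayedViewCounter();

foreach ([7, 7, 9, 7] as $topic) {
    $counter->recordView($topic);
}

// Would run from a scheduled task, e.g. once an hour.
$updated = $counter->flush($topicViews);
print_r($topicViews);
```

Four recorded views turn into two counter updates here. The trade-off is the one raised earlier in the thread: anything still pending is lost if the buffer disappears before the flush.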

Reading around a bit (e.g. http://use-the-index-luke.com/sql/dml ), it seems INSERTs may be faster than UPDATEs, but usually only if the table doesn't have indexes (in that case an INSERT just appends to the end of the table).
In the specific case, MyBB has a KEY(tid) on the threadviews table as well, so the two operations (UPDATE/INSERT) should really be mostly the same.

Using a table without indexes for the insert may give an advantage (to be quantified[1]), even more so with a MEMORY table or a caching system (not file-based, though).

It may be worth doing some tests.
[1] On my localhost, with a testing forum of 1.5M random posts / 145k random topics, the update takes between 0.00038099 seconds (on a "recent" topic) and 0.01108718 seconds (on a "very old" topic the first time it is accessed; the second time it becomes similar to the "recent" topic, probably because of some MySQL caching).

Re: Delayed Statistic Counters

Reply #9

Is that an InnoDB or a MyISAM table? I would have suggested using INSERT DELAYED, but that has been deprecated. I also considered using a temporary MEMORY table that you flush to the main table from time to time, but then I thought, it's just an update :P

If it's a MyISAM table, could you try (if you don't already have this)
Code: [Select]
concurrent_insert = 2
in my.cnf and see if that improves things? It would help the insert but maybe not the update, not sure.

If it's InnoDB, do you have
Code: [Select]
innodb_flush_log_at_trx_commit = 2
or even set that to 0 to be more aggressive? Neither 0 nor 2 is ACID compliant, so you could have minor data loss (like a missed +1), less so with 2, but you get a performance boost for that minor risk.


Re: Delayed Statistic Counters

Reply #10

What I was talking about is the view count on the topics. It updates every time you visit the topic in your session, as long as you have visited another topic in between. That doesn't really make sense to me unless there's a new post. I would keep a "last_view_time" in the session with an array of $topic => time(). Then have a threshold of 5 minutes (tunable) for how often you'd update the topic. It would check the last post time of the topic as well, so increaseViewCounter() would only be called if $last_view_time < $last_post_time.
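One possible reading of that proposal, as a minimal sketch. The helper name shouldCountView() is hypothetical, and so is the way the two checks are combined: a first view in the session always counts, and a repeat view counts only when the threshold has passed and the topic has a post newer than the last counted view. Timestamps are passed in so the logic can be tried without a session or a database.

```php
<?php
// Hypothetical helper illustrating the threshold + last-post check.
// $lastViewTimes would live in $_SESSION['last_view_time'] ($topic => time()).
function shouldCountView(
    array &$lastViewTimes,
    int $topic,
    int $lastPostTime,
    int $now,
    int $threshold = 300 // 5 minutes, tunable
): bool {
    $lastView = $lastViewTimes[$topic] ?? null;

    // Assumption: both conditions must hold for a repeat view to count.
    $count = $lastView === null
        || ($now - $lastView >= $threshold && $lastView < $lastPostTime);

    if ($count) {
        // Only counted views are remembered.
        $lastViewTimes[$topic] = $now;
    }

    return $count;
}

$seen = [];
$results = [
    shouldCountView($seen, 42, 1000, 1100), // first view in this session
    shouldCountView($seen, 42, 1000, 1200), // inside the threshold
    shouldCountView($seen, 42, 1000, 1500), // threshold passed, no new post
    shouldCountView($seen, 42, 1450, 1600), // new post since last counted view
];

var_dump($results);
```

Only the first and the last call return true, so the caller would run increaseViewCounter($topic) twice for these four visits.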

INSERT is almost always faster than UPDATE. Appending to a file is a hell of a lot faster than finding a data point in the file and then changing its value. The biggest issue is that the UPDATE has to take a lock to run (a row lock on InnoDB, a table lock on MyISAM). An appending INSERT does not have to block readers, even on MyISAM when concurrent inserts apply.

Unless you're running one of the largest forums, you really shouldn't be worried about this kind of stuff though. Take a bit of advice from me, who has spent my entire dev life trying to work on performance, reliability, and scalability, it's not worth it.

Re: Delayed Statistic Counters

Reply #11

Quote from: Joshua Dickerson – Unless you're running one of the largest forums, you really shouldn't be worried about this kind of stuff though.
I have a large forum. With 500 or 1500 users running around concurrently, it makes a difference whether every move a user makes causes an UPDATE or an INSERT. I have had peaks of 1500 users a few times, and 400-500 users are always active.

Regarding MyISAM vs. InnoDB: it's true that InnoDB is better for updates, but MyISAM / Aria (I run MariaDB) is better in terms of read operations (and would support full-text search, but I use Sphinx, which is just a million times faster).

Re: Delayed Statistic Counters

Reply #12

Quote from: Spuds –
Code: [Select]
innodb_flush_log_at_trx_commit = 2
or even set that to 0 to be more aggressive? Neither 0 nor 2 is ACID compliant, so you could have minor data loss (like a missed +1), less so with 2, but you get a performance boost for that minor risk.
I delayed the writes once and ended up with corrupt databases, which I had to rebuild to get back online. That takes hours of manual work, if you are around. It is very bad if you happen to be on vacation ... ;)

Data needs to be flushed to disk.

Re: Delayed Statistic Counters

Reply #13

There's a big difference between 1500 users looking at the ElkArte stats and looking at the database connections. I guarantee you've never had 1500 concurrent connections to the database. The ElkArte stats default to keeping someone "online" if they've visited the site (anywhere) in the past 15 minutes. Consider that once you open a page, it takes a second for it to load (from click to full page load). Then you take time to read it. Then you take time to find your next link. So, it isn't 1500 concurrent connections or even 1500 views per second. To know the true usage, you need to use a better tool.

There are much better ways to see how many concurrent users you have. You can look at your Apache/Nginx access log with a script that reads it, or just use Google Analytics, which uses a JavaScript connection to monitor live stats.

Think about it like this: every page load takes a second, it takes 28 seconds (to make the math easy) to read the page, and another second to click a link or click back. That's 30 seconds per user per page, so at most the user can get 2 pages per minute. If you're like me, you open a bunch of tabs from the messageindex or boardindex and then read them by tab, but it still averages out to the same. So, 1500 users, 2 pages per minute over 15 minutes. That should be nothing for any decently run server.
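Worked through as a quick calculation, using only the figures above. The variable names are just for illustration, and the result is an upper bound because it assumes all 1500 users are browsing continuously.

```php
<?php
$users = 1500;                 // peak "online" figure from this thread
$secondsPerPage = 1 + 28 + 1;  // load + read + click

$pagesPerUserPerMinute = intdiv(60, $secondsPerPage);
$pagesPerMinute = $users * $pagesPerUserPerMinute;
$pagesPerSecond = $pagesPerMinute / 60;

printf(
    "%d pages per user per minute, %d page views per minute, about %d per second\n",
    $pagesPerUserPerMinute,
    $pagesPerMinute,
    $pagesPerSecond
);
```

That comes to 3000 page views per minute, or roughly 50 per second, and only the topic views among them would trigger the view counter update at all.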

Now, obviously everything is done in bursts. That's where caching comes in. That's what you're talking about here as well (along with row-level locking because you should be using InnoDB). How many seeks do you think happen on any given topic in the 1/10 of a second it takes for someone to run an update on that row? Also consider that the lock gets returned in that 1/10 of a second.

Like @Spuds said, you're worrying about the wrong thing. I will make a commit with the changes I described (at some point), but delaying the update for that row is probably not going to make any difference for your site or 99.9999% of the sites out there. Removing the query makes a difference because the SQL analyzer is what's taking the most time.

I hope you don't take this as me telling you you're wrong. You're not. You're just worried about something that is less important to your performance than some of the bigger things. I've gone down this rabbit hole for years working on YaBB SE, SMF, and for a little while with ElkArte before I realized I should be picking the low hanging fruit.

Re: Delayed Statistic Counters

Reply #14

Wait, are you running MyISAM?