Skip to main content
Balancing query hooks and caching Started by emanuele · · Read 5531 times 0 Members and 2 Guests are viewing this topic. previous topic - next topic

Balancing query hooks and caching

A while ago I started introducing some hooks to directly add new fields (and joins) to queries.
Now, I realized this may be nice from a "number of queries" stand point, but it may be detrimental if we want to push more the caching.

Why, you say?
Simply because if you inject a JOIN in a query, that for example should be executed, only for guests or for a particular members group, you are increasing the number of "cases" to be cached separately, leading to more resources used to run queries.
Instead, if the query is "stable" for the core, the caching strategy can be identified without guessing and hoping.

So I was wondering, what do you think?
Better "force" more simple queries that can either be cached separately or just not cached because likely not very relevant for the general impact, or let addon developers build (potentially) "bigger" queries that may be less easy to cache and have other (unforeseen, like conflicts between addons) issues?
Bugs creator.
Features destroyer.
Template killer.

Re: Balancing query hooks and caching

Reply #1

Perhaps this is a case of tunnel-vision on my end, but I see it mainly as adding extra columns. Sure, you could use JOINs and whatnot but it seems somewhat obvious that you might have a performance impact from that.

https://github.com/Frenzie/elk-stop-spammer/blob/d6cc6155b2e9d45d63f315ba24948216c66fcdf1/sources/StopSpammer_hooks.php#L128 (w00t, all hooks)

https://github.com/Frenzie/elk-stop-spammer/blob/d6cc6155b2e9d45d63f315ba24948216c66fcdf1/install_1.1.xml#L28 (would be nice to be 100% hook-based)

Custom queries would work too, but I would request an easy way to build on top of the default query rather than having to copy the whole thing and risking the thing becoming outdated.

Re: Balancing query hooks and caching

Reply #2

I know many won't like this, but just use another query for that one. I know it's off topic too.

Most query edits should be an additional query. Any query you change has a huge possibility of breaking.

For your mod, I would create a new table and have the PK be the user ID. You could cache it if you want but you could just add an index to cover that column and it would go straight to memory anyway. Incredibly fast.

Re: Balancing query hooks and caching

Reply #3

@emanuele
Do you have a specific example of one of these hooks to make the discussion more concrete? If someone were forced to perform a JOIN-circus to achieve what they want that would not be a good thing. If it's either or then directing users to separate queries by available hooks would likely be the better option.

Here is some related stuff from WordPress.
https://codex.wordpress.org/Custom_Queries (note how you can affect the query vars and use where and join filters)
https://codex.wordpress.org/Displaying_Posts_Using_a_Custom_Select_Query

@Joshua Dickerson
Could you identify what part of my reply you consider off topic and why?

QuoteMost query edits should be an additional query. Any query you change has a huge possibility of breaking.
Fair enough in principle, but in practice I think the query edits from most SMF mods still work on Elk precisely for the reasons I outlined above. They tend to only edit queries (as opposed to adding additional queries) only for simple changes like an extra column in the SELECT statement.

Re: Balancing query hooks and caching

Reply #4

Actually, looking at few queries right now searching for a potential example, I spotted a potentially problematic one, the one in loadBoard.
The query is cached, and the query itself is altered by hooks via integrate_load_board_query before being cached, but the hook is run only to refresh the cache.
So what does it happen?
If an addon joins a table based on... some condition (e.g. belonging to a particular group, then you may get inconsistent results.

Let's say we use the hook to add something like this:
Code: [Select]
if ($user_info['is_admin'])
{
    $select_columns[] = 'a.new_column';
    $select_tables[] = 'LEFT JOIN {db_prefix}my_table AS a ON (a.id_board = b.id_board);
}
if the first user visiting the page is not an admin, the query will be cached as "non-admin" and the new table will not be joined, when the admin will visit the page, the content of the query will be retrieved from the cache and the new_column field will not be there for the admin.

Conflict example is rather easy:
Code: (addon 1) [Select]
    $select_columns[] = 'a.new_column';
    $select_tables[] = 'LEFT JOIN {db_prefix}my_table AS a ON (a.id_board = b.id_board);
Code: (addon 2) [Select]
    $select_columns[] = 'a.some_column';
    $select_tables[] = 'LEFT JOIN {db_prefix}some_table AS a ON (a.id_board = b.id_board);

Of course these are not something I found in real example, but they are not far from reality.
Bugs creator.
Features destroyer.
Template killer.

Re: Balancing query hooks and caching

Reply #5

Frenzied, I was off topic. You're good

Re: Balancing query hooks and caching

Reply #6

Frenzie those mods would work in even more places if there were no query or table edits and they just added another table.

Re: Balancing query hooks and caching

Reply #7

@Joshua Dickerson
I'm afraid I don't understand what you're getting at. The only difference I see is that you're introducing some redundancy with all it entails. Like needing to make sure the relevant row is also deleted when you, e.g., delete a user. The added maintenance complexity is a much bigger concern overall than some hopefully minor uncached performance differences.
Last Edit: June 13, 2017, 05:54:33 am by Frenzie

Re: Balancing query hooks and caching

Reply #8

The only additional thing to worry about is on delete. You gain a lot in the portability of the code and the maintaining of your database though

 

Re: Balancing query hooks and caching

Reply #9

The mod I'm working on, I use integrate_message_query to get an additional column from the message table, integrate_topic_query for the same.

And I do use integrate_load_board_query but again just to add one column.

integrate_messageindex_topics I do use a join, but it is the same for everyone.

It seems to me that it would counter-intuitive to add a whole new table to add one column to an existing table.  Maybe it just needs to be added to documentation for what kind of additions should be avoided for queries that are cached.

Re: Balancing query hooks and caching

Reply #10

I'm on the phone, so I can't be very detailed, but let's imagine post moderation to work with hooks (and the idea would be to extract that as well as we did with many other features).
If the page load before the cache is done by a moderator it would include unapproved posts, that would be then presented to non-moderators until the cache expires.
If the page load before the cache is done by a non-moderator the query would exclude unapproved posts that would not be presented to moderators.
Bugs creator.
Features destroyer.
Template killer.