Skip to main content
Topic: Number of BBCode params and permutation (Read 11425 times) previous topic - next topic
0 Members and 1 Guest are viewing this topic.

Re: Number of BBCode params and permutation

Reply #15

Cool, thanks for taking a look at the problem.  Can't test at the moment as I'm in addon mania right now :P  .... That and I don't have a bbc code with lots of parameters to test so need to create one of those. 

Are you thinking it will save memory because of the use of array keys instead of the full arrays?   I saw that cute little substr in the loop as well, was going to address that at some point too.

Re: Number of BBCode params and permutation

Reply #16

Instead of using an array of strings, you're using small integers.

Re: Number of BBCode params and permutation

Reply #17

That link I posted is a much better solution though. We don't need to figure out all of the permutations at once. We can do it one at a time and see if it matches. If it matches, just return the match. I suspect that most of the time, it will be complete within a few permutations. This should speed it up for everything.

Re: Number of BBCode params and permutation

Reply #18

Quote from: Joshua Dickerson – Instead of using an array of strings, you're using small integers.
In my permute test file, thats how I've been testing.  Your fine through 8, but since the number grows exponentially, get to 9 will require ~225M of memory, at least on my old machine it did.  10 took almost 2G.  Thats with an input of array(1,2,3,4,5,6,7,8,9,10);

Quote from: Joshua Dickerson – That link I posted is a much better solution though. We don't need to figure out all of the permutations at once. We can do it one at a time and see if it matches. If it matches, just return the match. I suspect that most of the time, it will be complete within a few permutations. This should speed it up for everything.
Below say 5 (need to verify with "real" data), the permute overhead is measured in .00x of seconds, past that you do start to ramp up, again since its exponential.   Which link or do you mean the code you posted?  Either way, yes not having to figure all of the permutations up front will help in the cases where there are lots of params.  Actually thinking about that, the only permutes needed would be on the required params, if the others are optional, they can just be appended to the permuted regex of the required.

 

Re: Number of BBCode params and permutation

Reply #19

http://docstore.mik.ua/orelly/webprog/pcook/ch04_26.htm

This is where generators would come in handy. We could just do the checks as we go along. Then, you can essentially have unlimited. I am going to work on that in Notepad really quick.

Re: Number of BBCode params and permutation

Reply #20

nods thats a very good idea, being able to use the intermediate results should be a help.   We are still dealing with massive sets though,  be interesting to see what happens to execution time when you get to a possible solution set of 36Billion, not that it should take that many to "find" a solution but still

Re: Number of BBCode params and permutation

Reply #21

Driving from FL to NJ so I won't have this until later. I still think 14 parameters is ridiculous and ordering them might be better. So ultimately, the proposal I'd make is using the permutation method on save to get the parameters and then ordering them for later.

I like what XF does with an AST. I think this is the future but that's another topic.

Re: Number of BBCode params and permutation

Reply #22

Be kind... I just got back from 15 hours of driving, I left my computer in FL, took someone else's by accident, and wrote this in Notepad with severely bad internet connection (storms I presume).

@Spuds, can you give this a shot? I think it will work. I think you'll understand how it should be placed in Subs.php. I just took the example from that book and it looks like it will fit perfectly. It does a check per permutation instead of getting them all before.

Code: [Select]
function pc_next_permutation($p, $size) {
    // slide down the array looking for where we're smaller than the next guy
    for ($i = $size - 1; $p[$i] >= $p[$i+1]; --$i) { }

    // if this doesn't occur, we've finished our permutations
    // the array is reversed: (1, 2, 3, 4) => (4, 3, 2, 1)
    if ($i == -1) { return false; }

    // slide down the array looking for a bigger number than what we found before
    for ($j = $size; $p[$j] <= $p[$i]; --$j) { }

    // swap them
    $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;

    // now reverse the elements in between by swapping the ends
    for (++$i, $j = $size; $i < $j; ++$i, --$j) {
         $tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
    }

    return $p;
}

$set = array();
foreach ($possible['parameters'] as $p => $info)
{
    $set[] = '(\s+' . $p . '=' . (empty($info['quoted']) ? '' : '"') . (isset($info['match']) ? $info['match'] : '(.+?)') . (empty($info['quoted']) ? '' : '"') . ')' . (empty($info['optional']) ? '' : '?');
}
$size = count($set) - 1;
$keys = range(0, $size);
$message_part = substr($message, $pos1 - 1);

do {
    $match_string = '~^';
    foreach ($keys as $key)
    {
        $match_string .= $set[$key];
    }
    $match_string .= '\]~i';
    $match = preg_match($match_string, $message_part), $matches) != 0;
} while (!$match && ($keys = pc_next_permutation($keys, $size)));

Re: Number of BBCode params and permutation

Reply #23

I have no idea about what that means, and I have no idea how it is supposed to work either. What I do know is just one thing: you both scared and shocked me, you are amazing dude!
~ SimplePortal Support Team ~

Re: Number of BBCode params and permutation

Reply #24

thanks @Joshua Dickerson I'll take a look at this later today  :)

Re: Number of BBCode params and permutation

Reply #25

Summary ...

The memory usage / time taken is only that used to compute the N!, it does nothing with the results for this timing and memory usage table. 

As expected the new function, since it computes a permutation at a time, has a generally constant memory footprint, but begins to ramp up in the time taken to compute the combinations since its returning the results on a per loop basis.

The old permute function although much faster for larger sets, is memory hungry as the set size increases.  The final set (10) timings and memory size are estimated as it ate all I had in my testbed (>1.8G when it failed).

So we can solve the memory issue with returning the intermediate values, however execution time will become an issue.    I think we could safely allow 8 params (40,320 combos)  in conjunction with the new function, but I should test the time for (potentially) 40320
preg_match calls as well.  Could also simply "clamp" the existing function to a maximum of 5 parameters.

The ultimate fix would be to order any passed parameters to what the function expects, that however may hold some unexpected consequences in the parser.  I also think its a bit of an example of a solution looking for a problem, as IMO that is simply to many parameters for a single bbc code.

# keysTime
(PC_NEXT)
Memory
(PC_NEXT)
Time
(PERMUTE)
Memory
(PERMUTE)
30166k0166k
40166k0174k
50166k0215k
6.015166k0498k
7.047166k.0112,740k
8.280166k.09222,685k
92.316166k.795231,605k
1023.17166k~72,000,000k

Re: Number of BBCode params and permutation

Reply #26

Just thinking out loud, without any test or supporting data.
If I understood right, doing one permutation at a time means testing for example:
author
data
link
author + data
data + author
author + data + link
* author + link + data
etc.
one at each iteration.
But that means the first one could already match a tag
Code: [Select]
[quote author=emanuele data=12345678 link=topic=1] isn't it?
Unless we start from the latest one (the "longest") and come back to the shortest.
Bugs creator.
Features destroyer.
Template killer.

Re: Number of BBCode params and permutation

Reply #27

The permutations will always return the same number of keys it was passed ...  so pass it 1,2,3 you will get
1,2,3
1,3,2
2,1,3
2,3,1
3,1,2
3,2,1

Not necessarily in that order but whatever the algo returns.
Last Edit: August 11, 2015, 04:16:07 pm by Spuds

Re: Number of BBCode params and permutation

Reply #28

Told you I didn't look at the code. :P
Bugs creator.
Features destroyer.
Template killer.

Re: Number of BBCode params and permutation

Reply #29

LOL, I don't blame you.

Here is what the section in subs would look like if we move to the lexicographic permutation (above) which allows us to save memory.  This one I placed an iteration clamp on the function which will an exhaustive search of up to 7 parameters and a subset of more than that.  Could increase that but why?   The reality is the code was wrong for > 3 for what 10+ years?  I know that's not an excuse to ignore the error, its simply stated to frame the extent of the issue.

The benefits here are
- much lower memory requirements for the fringe cases that have > 3 parameters
- no real difference (memory or speed) for <= 3 parameters as compared to our current function
- actually works for >3 parameters, as I noted the current function has a simple error that makes it wrong for > 3
- even if 20 parameters are passed, it will still return some permutations, based on the max_iterations variable, I chose the number to allow an exhaustive search of 7 parameters which should be enough for 99.99 of the cases.

Code: [Select]
				// Okay, this may look ugly and it is, but it's not going to happen much and it is the best way
// of allowing any order of parameters but still parsing them right.
$param_size = count($preg) - 1;
$preg_keys = range(0, $param_size);
$message_stub = substr($message, $pos1 - 1);

// If an addon adds many parameters we can exceed max_execution time, lets prevent that
// 5040 = 7, 40,320 = 8, (N!) etc
$max_iterations = 5040;

// Step, one by one, through all possible permutations of the parameters until we have a match
do {
$match_preg = '~^';
foreach ($preg_keys as $key)
$match_preg .= $preg[$key];
$match_preg .= '\]~i';

// Check if this combination of parameters matches the user input
$match = preg_match($match_preg, $message_stub, $matches) !== 0;
} while (!$match && --$max_iterations && ($preg_keys = pc_next_permutation($preg_keys, $param_size)));