BBC Parsing

Re: BBC Parsing

Reply #90 – September 09, 2015, 02:02:46 am

Upgrade and conversions...

Re: BBC Parsing

Reply #91 – September 09, 2015, 08:02:23 am

Could you do it as part of preparse? Find autolink-able URLs and wrap them in an [autolink] tag. Then parsebbc works on those and avoids that preg. We can strip the autolink tag on edit. Or is that what you mean? (Sorry, brain is a bit slow this AM.)
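
A minimal sketch of that idea, written in Python rather than the forum software's PHP. The tag name `autolink` comes from the post above; the function names and the deliberately simple URL pattern are assumptions for illustration only, not the actual parser code:

```python
import re

# Deliberately simple pattern for bare http(s) URLs (an assumption, not the
# forum's real autolink regex). The lookbehind skips URLs that directly follow
# "]" or "=", i.e. ones already inside [url]...[/url], [url=...] or [autolink].
# Side effect: a URL right after any other tag, e.g. "[b]http://...", is skipped too.
URL_RE = re.compile(r'(?<![\]=])\bhttps?://[^\s\[\]<>"]+')

def preparse_autolink(message: str) -> str:
    """Wrap bare URLs in [autolink] tags once, when the post is saved."""
    return URL_RE.sub(lambda m: f"[autolink]{m.group(0)}[/autolink]", message)

def strip_autolink(message: str) -> str:
    """Remove the [autolink] wrappers again when the post is opened for editing."""
    return re.sub(r"\[/?autolink\]", "", message)
```

With something like this the expensive URL detection runs at save time, and the parser only has to handle one more ordinary tag when the post is displayed.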

Re: BBC Parsing

Reply #92 – September 09, 2015, 04:52:07 pm

That's exactly what I mean

Re: BBC Parsing

Reply #93 – September 09, 2015, 09:21:10 pm

If that preg is that large a performance hit, then it makes a lot of sense to set it up so it only has to run once, on save and/or modify.

Re: BBC Parsing

Reply #94 – September 09, 2015, 09:46:06 pm

Okay, the upgrade/converter will take on a lot more load. A test for that would require a database dump to see how much longer it will take.

Re: BBC Parsing

Reply #95 – September 27, 2015, 02:56:48 pm

Adding any codes significantly increases the amount of time it takes to parse. I added email_auto and url_auto to the parser and it increased the amount of time it takes to do the benchmark by 5%. I then changed it to zurl and zemail since there are no 'z' codes. Same result.

Re: BBC Parsing

Reply #96 – September 28, 2015, 09:00:20 am

Is that 5% in comparison to the "old" parse_bbc or just your new version?

So adding in autolinking detection is a 5% hit?

Re: BBC Parsing

Reply #97 – September 28, 2015, 09:56:43 am

The new one. I am going to see what happens when I add 10 more. It looks like changing it from an autolink to a BBC in the preparser adds about 5% to parsing time (the extra codes) and takes about 7% off (no autolink preg at parse time). So, a 2% win. If that holds up, it's not worth it.

Re: BBC Parsing

Reply #98 – September 28, 2015, 11:43:38 am

Ugh, the elusive benchmark results. Today I'm not getting any of the same results and I don't know why. Adding 10 more codes didn't add any measurable time. I restarted my computer after it had been running for 2 weeks. Maybe the VM was screwed up? It is taking 1/10 of the time it was taking yesterday. I tried it with url_auto and z_url (I think I'm going to use z_url just so I'm not adding to the u codes).
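
One way to make numbers like these less elusive is to repeat each measurement several times and compare the minimum and median instead of a single run. A rough sketch in Python (not the actual benchmark script from this thread; the `benchmark` helper and its parameters are made up for illustration):

```python
import statistics
import timeit

def benchmark(func, *args, repeats=5, number=1000):
    """Time func(*args) in several independent runs.

    Returns the minimum and median time per call in seconds. The minimum is
    the run least disturbed by other activity on the machine; a large gap
    between minimum and median suggests the measurements are noisy.
    """
    runs = timeit.repeat(lambda: func(*args), repeat=repeats, number=number)
    per_call = [total / number for total in runs]
    return {"min": min(per_call), "median": statistics.median(per_call)}
```

If the spread between runs of the same parser is larger than the 5% being chased, the difference between two parser versions probably can't be trusted either.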

Re: BBC Parsing

Reply #99 – September 28, 2015, 11:56:03 am

Can I just question that "2 weeks" thing? It will eventually break down if you brutalize it that way, and it's not good for any of its internal parts.

Re: BBC Parsing

Reply #100 – September 28, 2015, 12:43:00 pm

Quote from: Flavio93Zena – September 28, 2015, 11:56:03 am
Can I just question that "2 weeks" thing? It will eventually break down if you brutalize it that way, and it's not good for any of its internal parts.

I was away and left it in my backpack for most of the two weeks. I took it out once or twice. So, most of the time it was hibernating. If my computer dies because it's on for two weeks straight, I need to get a new computer because I am on my computer about 15 hours a day anyway.

Re: BBC Parsing

Reply #101 – September 28, 2015, 06:40:52 pm

The 15 hours is ok-ish, but let it rest. You sleep as well, right?

Re: BBC Parsing

Reply #102 – September 28, 2015, 07:58:14 pm

That's the 9 hours I didn't include in there.

Re: BBC Parsing

Reply #103 – September 28, 2015, 09:13:06 pm

Yeah, then let it sleep by turning it off entirely.

Re: BBC Parsing

Reply #104 – October 10, 2015, 02:10:10 am

I just pushed a commit to get the new preparser working. I need more messages to test.

I also fixed a minor bug with TestBBC.php and started looking at the regex parser again. If you remember, tokenizing the string took a long time and I decided it wasn't worth trying to go down that route since it took so long. Well, if we did that in the preparser and stored it, it wouldn't matter. Also, my regex is very slow. I am sure I can make it faster. I am going to once again put it on the backburner, but it is still a thought in my head. Store the messages as an array or even an AST?
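
To make the "store it as an array" idea concrete, here is a rough Python sketch (not the PHP regex parser mentioned above; the token layout, the `tokenize` name and the simplified tag pattern are all assumptions). It splits a message into a flat token list once and serializes it, so display time would only need to walk the stored tokens:

```python
import json
import re

# Simplified tag pattern (assumption): lowercase tag names, optional "=value".
# Matches opening tags like [b] or [url=...] and closing tags like [/b].
TAG_RE = re.compile(r"\[(/?)([a-z_]+)(?:=([^\]]*))?\]")

def tokenize(message):
    """Split a message into ["text", str] and ["open"/"close", tag, param] tokens."""
    tokens = []
    pos = 0
    for m in TAG_RE.finditer(message):
        if m.start() > pos:
            # Plain text between the previous tag and this one.
            tokens.append(["text", message[pos:m.start()]])
        kind = "close" if m.group(1) else "open"
        tokens.append([kind, m.group(2), m.group(3)])
        pos = m.end()
    if pos < len(message):
        # Trailing text after the last tag.
        tokens.append(["text", message[pos:]])
    return tokens

# Done once in the preparser; the JSON string is what would be stored.
stored = json.dumps(tokenize("[b]hi[/b] there"))
```

A flat list like this is only the first step. Turning it into a real AST would mean nesting the tokens by matching open and close tags, which this sketch does not attempt.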

I guess the next step is to fix and update my main repo for Elkarte and commit some changes to get at least the parser working.