Topic: BBC Parsing

Re: BBC Parsing

Reply #120

A nice DIC is https://r.je/dice.html ... lightweight and fast, plus it's a single file to add and it's BSD licensed. We could consider that, and since we do injections in a few other areas it may not be a bad move.

Past that, I'd almost bite the bullet and make them static, or maybe get the autoloader working on them (does it already?) ... since parse_bbc is so prevalent in the code, there is really darn little that works without it.

Oh .. the parser update is simply awesome sauce :D

Re: BBC Parsing

Reply #121

The autoloader would be fine, but the loading of the codes can take a little bit of time. So, I'd rather cache the instantiation of that.
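
A minimal sketch of what "cache the instantiation" could look like, assuming the expensive part is building the code list for a given set of disabled tags. CodesCache, its build() method and the five tag names are made up for illustration; they are not ElkArte classes.

```php
<?php
// Hypothetical sketch: CodesCache is an illustrative name, not an ElkArte API.
final class CodesCache
{
    /** @var array built code lists, keyed by a hash of the disabled tags */
    private $cache = array();

    /** @var int how many times the expensive build actually ran */
    public $builds = 0;

    /**
     * Return the code list for a set of disabled tags, building it only once.
     */
    public function get(array $disabled)
    {
        // Sort so that the same tags in a different order share one cache entry.
        sort($disabled);
        $key = md5(implode(',', $disabled));

        if (!isset($this->cache[$key]))
        {
            $this->cache[$key] = $this->build($disabled);
        }

        return $this->cache[$key];
    }

    /**
     * Stand-in for the slow part (loading every default code).
     */
    private function build(array $disabled)
    {
        $this->builds++;
        $all = array('b', 'i', 'color', 'img', 'url');

        return array_values(array_diff($all, $disabled));
    }
}
```

Asking twice for the same disabled set, in any order, only runs build() once. Whether the real cache should live in memory per request or in the forum's cache layer is a separate decision this sketch does not make.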

I am about to commit and push some changes just so you can see what I'm working on. Namely, the /Maker directory. This is going to be a stepping stone for the BBC to be based in the database.

Re: BBC Parsing

Reply #122

Well, with the current codebase, static methods are not that bad; I see them as a step toward making the code look better.
At some point we may be able to remove them almost entirely, but even singletons are an intermediate step for future development.
It's impossible to rewrite everything to be "DIC-proof" in just one go. In my mind 1.x is a set of experiments and changes made to shape the code, then 2.x will be the moment to decide what path to follow, and 3.x will be "The Version".
Bugs creator.
Features destroyer.
Template killer.

Re: BBC Parsing

Reply #123

The problem is that one of the main features is that you can have different functions for different areas. I see a couple of parsers: message_parser, sig_parser, news_parser, package_parser, email_parser. Most of the time they'll share the same actual parser, and the big differences would be in the first three. Say you don't allow colors in messages, no colors or images in signatures, but everything is allowed in the news. The way it would work is you'd set disabled in Codes to whatever you want to disable, then send that object to the parser. I guess you could change the disabled list in Codes through BBCParser::getBBC()... hmm... I dunno.
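
To make the "set disabled in Codes, then send that object to the parser" idea concrete, here is a rough sketch. SketchCodes and SketchParser are stand-ins invented for this example (the real Codes and BBCParser classes do far more), the parser only knows [b] and [color=...], and it assumes the text has already been HTML-escaped.

```php
<?php
// Stand-in for the Codes object: here it only tracks which tags are disabled.
class SketchCodes
{
    private $disabled = array();

    public function __construct(array $disabled)
    {
        foreach ($disabled as $tag)
        {
            $this->disabled[strtolower(trim($tag))] = true;
        }
    }

    public function isDisabled($tag)
    {
        return isset($this->disabled[strtolower($tag)]);
    }
}

// Stand-in parser: understands only [b] and [color=...], enough to show the idea.
class SketchParser
{
    private $codes;

    public function __construct(SketchCodes $codes)
    {
        $this->codes = $codes;
    }

    public function parse($text)
    {
        $text = $this->apply($text, 'b', '~\[b\](.*?)\[/b\]~is', '<strong>$1</strong>', '$1');
        $text = $this->apply(
            $text,
            'color',
            '~\[color=(#[\da-fA-F]{3,6}|[A-Za-z]{1,20})\](.*?)\[/color\]~is',
            '<span style="color: $1;">$2</span>',
            '$2'
        );

        return $text;
    }

    // A disabled tag keeps its content but loses its formatting.
    private function apply($text, $tag, $pattern, $html, $content_only)
    {
        $replacement = $this->codes->isDisabled($tag) ? $content_only : $html;

        return preg_replace($pattern, $replacement, $text);
    }
}

// Same parser class, different Codes per area.
$message_parser = new SketchParser(new SketchCodes(array('color')));
$news_parser = new SketchParser(new SketchCodes(array()));
```

With the input `[b]hi[/b] [color=red]there[/color]`, the message parser drops the color markup and keeps the text, while the news parser renders both tags. The point is only that the area-specific behaviour lives in the Codes object, not in separate parser classes.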

Still working on this maker. It is coming along and it is completely written in JS right now.

Re: BBC Parsing

Reply #124

This is the BBC Maker: http://joshuaadickerson.github.io/bbc-maker.html

I am still working on it, but since I'm doing everything the PHP backend will do in JavaScript first, this is a good way to see it. I will be pushing to it slowly, but you can already start to use it now to convert your BBC. I am working on getting the default tags' parameters to show up. The next step after that is to get it doing the rules checking (there are actually a lot of rules) and show errors. Then I want to have some help text for each. Finally, it should let the user pick whether a field is PHP/raw or a string instead of always adding quotes around it, with some hints when it can figure that out on its own.

I need help making it look nicer and then I'll need help making it work with Elkarte's templates since it is using Bootstrap right now.

Re: BBC Parsing

Reply #125

Okay, default tags and parameters now work.

Now the headaches with Bootstrap :(

Re: BBC Parsing

Reply #126

Out of curiosity: what would be left to do to bring this one into 1.1? O:-)

Re: BBC Parsing

Reply #127

As far as I can tell, nothing, just replacing parse_bbc() with it. I'm not satisfied with that though. I want it to do a lot more so I haven't gotten to do the simple part. If you squash the bugs for 1.1, I'll make it a feature ;)

Re: BBC Parsing

Reply #128

I am thinking about completely refactoring/redesigning the preparser.

Kind of YAML'd the idea. It would utilize much of what I did with the parser already. It would be copying, not extending, but it would take a lot of the same ideas. Instead of using regular expressions, use the parser. When it encounters a tag, it will handle it accordingly. In a future version, we might even be able to just combine the parser and preparser's codes.

Code:
NEXT_TAG_MUST_BE
TAGS_ONLY_CONTENT
REMOVE_EMPTY
ATTRIBUTE_IS_URL
EQUALS_IS_URL
NO_PARSE
FILTER_CONTENT
FILTER_EQUALS
FILTER_ATTRIBUTE
BLOCK_LEVEL
ADD_PARENT_IF_MISSING
REMOVE_EXTRA_CLOSING

----

b
  REMOVE_EMPTY
code
  NO_PARSE
  BLOCK_LEVEL
color
  FILTER_EQUALS
    search: '~\[color=(?:#[\da-fA-F]{3}|#[\da-fA-F]{6}|[A-Za-z]{1,20}|rgb\(\d{1,3}, ?\d{1,3}, ?\d{1,3}\))\]\s*\[/color\]~'
    replace: ''
li
  ADD_PARENT_IF_MISSING
    list
list
  BLOCK_LEVEL
  TAGS_ONLY_CONTENT
  NEXT_TAG_MUST_BE
    li
nobbc
  NO_PARSE
quote
  BLOCK_LEVEL
  REMOVE_EXTRA_CLOSING
  REMOVE_EMPTY
  ATTRIBUTES
    link
      ATTRIBUTE_IS_URL
s
  REMOVE_EMPTY
table
  BLOCK_LEVEL
  TAGS_ONLY_CONTENT
  NEXT_TAG_MUST_BE
    tr
td
  BLOCK_LEVEL
  ADD_PARENT_IF_MISSING
    tr
th
  BLOCK_LEVEL
  ADD_PARENT_IF_MISSING
    tr
tr
  BLOCK_LEVEL
  TAGS_ONLY_CONTENT
  NEXT_TAG_MUST_BE
    td
    th
  ADD_PARENT_IF_MISSING
    table
url
  EQUALS_IS_URL

This should make it considerably easier to add preparsing. It will also make it so we don't have to worry about changing nobbc to entities or worry about parsing in code tags.
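
As a rough illustration of "when it encounters a tag, it will handle it accordingly", the sketch below walks the message tag by tag and applies two of the flags from the list above, REMOVE_EMPTY and NO_PARSE. The function name sketchPreparse(), the array shape used for the rules and the token regex are all assumptions made for this example, not the planned implementation.

```php
<?php
// Hypothetical rule table: tag name => list of flags, mirroring part of the list above.
$rules = array(
    'b' => array('REMOVE_EMPTY'),
    's' => array('REMOVE_EMPTY'),
    'code' => array('NO_PARSE', 'BLOCK_LEVEL'),
    'nobbc' => array('NO_PARSE'),
);

function sketchPreparse($message, array $rules)
{
    // Split into tag tokens and text tokens, keeping the tags themselves.
    $tokens = preg_split(
        '~(\[/?[a-z]+(?:=[^\]]*)?\])~i',
        $message,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $out = array();
    $no_parse_until = null;

    foreach ($tokens as $token)
    {
        // Inside a NO_PARSE tag everything is copied through until its closing tag.
        if ($no_parse_until !== null)
        {
            if (strtolower($token) === $no_parse_until)
            {
                $no_parse_until = null;
            }

            $out[] = $token;
            continue;
        }

        // Plain text is copied as-is.
        if (preg_match('~^\[(/?)([a-z]+)(?:=[^\]]*)?\]$~i', $token, $m) !== 1)
        {
            $out[] = $token;
            continue;
        }

        $closing = $m[1] === '/';
        $tag = strtolower($m[2]);
        $flags = isset($rules[$tag]) ? $rules[$tag] : array();

        if (!$closing && in_array('NO_PARSE', $flags, true))
        {
            $no_parse_until = '[/' . $tag . ']';
        }
        elseif ($closing && in_array('REMOVE_EMPTY', $flags, true))
        {
            // Step back over whitespace-only text to find what this tag closes.
            $i = count($out) - 1;
            while ($i >= 0 && trim($out[$i]) === '')
            {
                $i--;
            }

            // If that is the matching opening tag, the pair is empty: drop both.
            if ($i >= 0 && preg_match('~^\[' . $tag . '(?:=[^\]]*)?\]$~i', $out[$i]) === 1)
            {
                array_splice($out, $i);
                continue;
            }
        }

        $out[] = $token;
    }

    return implode('', $out);
}
```

For example, `sketchPreparse('[b] [/b]hello[code][b][/b][/code]', $rules)` drops the empty bold pair but leaves the one inside [code] alone. Because removal happens as each closing tag is seen, nested empties such as `[b][s][/s][/b]` collapse in a single pass.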

Some other things I want to do with it: limit the number of parameters so we can catch that before they post, say 5 or so. Also, add another tag for list items, so that when it saves, an itemcode [ *] is saved as [list][li]...[/li][/list]
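
A small sketch of the itemcode conversion, assuming an itemcode is a line that starts with [*] and that consecutive itemcode lines belong to one list. convertItemCodes() is a hypothetical helper; the real preparser may treat nesting and the other itemcode markers differently.

```php
<?php
// Hypothetical helper: turn runs of "[*]item" lines into [list][li]...[/li][/list].
function convertItemCodes($message)
{
    $out = array();
    $in_list = false;

    foreach (explode("\n", $message) as $line)
    {
        if (preg_match('~^\s*\[\*\]\s*(.*)$~', $line, $m) === 1)
        {
            // First itemcode of a run opens the list.
            if (!$in_list)
            {
                $out[] = '[list]';
                $in_list = true;
            }

            $out[] = '[li]' . $m[1] . '[/li]';
            continue;
        }

        // Any other line ends the current run.
        if ($in_list)
        {
            $out[] = '[/list]';
            $in_list = false;
        }

        $out[] = $line;
    }

    // Close a list that runs to the end of the message.
    if ($in_list)
    {
        $out[] = '[/list]';
    }

    return implode("\n", $out);
}
```

Saving the converted form means the parser only ever has to deal with [list] and [li], not with the shorthand.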
Last Edit: November 03, 2015, 07:23:34 pm by Joshua Dickerson

Re: BBC Parsing

Reply #129

Code:
<?php

namespace BBC;

/**
 * Single entry point for parsing BBC in the different areas of the forum.
 * Each area (message, signature, news, ...) can have its parsers swapped by a hook.
 */
class ParserWrapper
{
	protected $disabled = array();
	protected $codes;
	protected $bbc_parser;
	protected $smiley_parser;
	protected $html_parser;
	protected $autolink_parser;
	protected $smileys_enabled = true;

	protected static $instance;

	/**
	 * Return the shared wrapper, creating it on first use.
	 */
	public static function getInstance()
	{
		if (self::$instance === null)
		{
			self::$instance = new self;
		}

		return self::$instance;
	}

	/**
	 * Check the server load; returns false when it is too high to parse BBC.
	 */
	protected function checkLoad()
	{
		global $modSettings, $context;

		if (!empty($modSettings['bbc']) && $modSettings['current_load'] >= $modSettings['bbc'])
		{
			$context['disabled_parse_bbc'] = true;
			return false;
		}

		return true;
	}

	/**
	 * Is BBC parsing turned on at all?
	 */
	protected function isEnabled()
	{
		global $modSettings;

		return !empty($modSettings['enableBBC']);
	}

	public function enableSmileys($toggle)
	{
		$this->smileys_enabled = (bool) $toggle;
		return $this;
	}

	/**
	 * Get the BBC and smiley parsers for an area, letting hooks replace either.
	 */
	protected function getParsersByArea($area)
	{
		$parsers = array(
			'bbc' => false,
			'smiley' => false,
		);

		// First see if any hooks set a parser.
		foreach ($parsers as $parser_type => &$parser)
		{
			call_integration_hook('integrate_' . $area . '_' . $parser_type . '_parser', array(&$parser, $this));

			// If not, use the default one.
			// 'bbc' resolves to getBbcParser(); method names are case-insensitive in PHP.
			if ($parser === false)
			{
				$parser = call_user_func(array($this, 'get' . ucfirst($parser_type) . 'Parser'));
			}
		}
		// Break the reference so later code cannot overwrite the last element.
		unset($parser);

		return $parsers;
	}

	public function getMessageParsers()
	{
		return $this->getParsersByArea('message');
	}

	public function getSignatureParser()
	{
		return $this->getParsersByArea('signature');
	}

	public function getNewsParser()
	{
		return $this->getParsersByArea('news');
	}

	/**
	 * Parse a string with the parsers registered for the given area.
	 */
	protected function parse($area, $message)
	{
		// If the load average is too high, don't parse the BBC.
		if (!$this->checkLoad())
		{
			return $message;
		}

		$parsers = $this->getParsersByArea($area);

		// BBC is turned off: at most run the smiley parser.
		if (!$this->isEnabled())
		{
			if ($this->smileys_enabled)
			{
				// Assumes the smiley parser returns the parsed string, like the BBC parser.
				$message = $parsers['smiley']->parse($message);
			}

			return $message;
		}

		$message = $parsers['bbc']->parse($message);

		if ($this->smileys_enabled)
		{
			$message = $parsers['smiley']->parse($message);
		}

		return $message;
	}

	public function parseMessage($message, $smileys_enabled)
	{
		return $this->enableSmileys($smileys_enabled)->parse('message', $message);
	}

	public function parseSignature($signature, $smileys_enabled)
	{
		return $this->enableSmileys($smileys_enabled)->parse('signature', $signature);
	}

	public function parseNews($news)
	{
		return $this->enableSmileys(true)->parse('news', $news);
	}

	public function parseEmail($message)
	{
		return $this->enableSmileys(false)->parse('email', $message);
	}

	public function parseCustomFields($field)
	{
		return $this->enableSmileys(true)->parse('customfields', $field);
	}

	public function parsePoll($field)
	{
		return $this->enableSmileys(true)->parse('poll', $field);
	}

	public function parseAgreement($agreement)
	{
		return $this->enableSmileys(true)->parse('agreement', $agreement);
	}

	public function parsePM($message)
	{
		return $this->enableSmileys(true)->parse('pm', $message);
	}

	public function parseReport($report)
	{
		return $this->enableSmileys(true)->parse('report', $report);
	}

	/**
	 * Set the tags to disable, e.g. from $modSettings['disabledBBC'].
	 */
	public function setDisabled(array $disabled)
	{
		foreach ($disabled as $tag)
		{
			$this->disabled[trim($tag)] = true;
		}

		return $this;
	}

	/**
	 * Lazily create the Codes object so the codes are only loaded once.
	 */
	public function getCodes()
	{
		if ($this->codes === null)
		{
			require_once(SUBSDIR . '/BBC/Codes.class.php');
			$this->codes = new \BBC\Codes(array(), $this->disabled);
		}

		return $this->codes;
	}

	public function getBBCParser()
	{
		if ($this->bbc_parser === null)
		{
			require_once(SUBSDIR . '/BBC/BBCParser.class.php');
			$this->bbc_parser = new \BBC\BBCParser($this->getCodes(), $this->getAutolinkParser());
		}

		return $this->bbc_parser;
	}

	public function getAutolinkParser()
	{
		if ($this->autolink_parser === null)
		{
			require_once(SUBSDIR . '/BBC/Autolink.class.php');
			$this->autolink_parser = new \BBC\Autolink($this->getCodes());
		}

		return $this->autolink_parser;
	}

	public function getSmileyParser()
	{
		if ($this->smiley_parser === null)
		{
			require_once(SUBSDIR . '/BBC/SmileyParser.class.php');
			$this->smiley_parser = new \BBC\SmileyParser;
		}

		return $this->smiley_parser;
	}

	public function getHtmlParser()
	{
		if ($this->html_parser === null)
		{
			require_once(SUBSDIR . '/BBC/HtmlParser.class.php');
			$this->html_parser = new \BBC\HtmlParser;
		}

		return $this->html_parser;
	}
}

I haven't completed it yet (busy) but here is what I plan on doing to implement this before we add a DIC.

Re: BBC Parsing

Reply #130

@Spuds, what do you think about the ParserWrapper? Do you think that's the way to go with this?

Re: BBC Parsing

Reply #131

Too late. Already started. Committing soon. Not well tested but that's what a beta process is for ;)

Re: BBC Parsing

Reply #132

Just pushed the change to my Elkarte repo. Hopefully I didn't break too much. I didn't add any tests (yet) in the interest of meeting @Spuds' goal to get it in 1.1b1.

It took me a lot more work than necessary due to the size of the controllers and I didn't want to add properties to them in this commit. I'd rather get everything working and then worry about getting it looking better.

Re: BBC Parsing

Reply #133

Debating whether or not I want to change the constants' values to be what they were previously so that there is less BC breakage. I mean, no matter what, there is a BC break but maybe it would be less?

Going along with that, I really want to get this "maker" finished, but the parameters stuff is killing me. I just committed some changes.

Re: BBC Parsing

Reply #134

Just don't go mad over it; nice and easy, buddy. It has been an untouched monster since SMF 1.1, and there were (are) reasons for that.