ElkArte Community

Elk Development => Bug Reports => Exterminated Bugs => Topic started by: Flavio93Zena on August 25, 2015, 03:56:02 am

Title: Brackets mistakenly becoming part of URL
Post by: Flavio93Zena on August 25, 2015, 03:56:02 am
Try to click the links by @TE here (http://www.elkarte.net/community/index.php?topic=1946.msg19612#msg19612) and you will get a 404, because the link parses the outer bracket.
FYI, not broken on SMF.

Related? http://www.elkarte.net/community/index.php?topic=2815.0
Title: Re: Brackets mistakenly becoming part of URL
Post by: TE on August 25, 2015, 04:42:14 am
Yep, seems to be a bug.. (I fixed the other post).
Here's another example (http://https://www.polymer-project.org/1.0/)

The closing bracket seems to be the issue..
Title: Re: Brackets mistakenly becoming part of URL
Post by: Spuds on August 25, 2015, 07:45:23 am
We need to add some extra "smarts" to the autolinker ... the basic problem is that a ) is a valid URL character (along with a few other odd things) ... so as far as the regex or filter_var is concerned, that trailing ) is a valid part of the link.

What we need to do is see if there is a leading ( as well, in which case we can assume that it was entered as (link)  -or-  we could do a more basic check and if it ends in a ) simply cut it as the odds of a link ending in that character are slim (but valid).

Either of those fixes would make the issue much less likely; not foolproof, but much better.
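
A minimal sketch of the first idea, for illustration only and not the autolinker's actual code. The hypothetical helper trim_trailing_paren() takes an already-matched URL plus a flag saying whether a ( came right before it. The fallback bracket count is an added assumption, not something proposed in this thread.

```php
<?php
// Hypothetical helper: decide whether a trailing ')' belongs to an
// auto-detected URL or to the text around it.
function trim_trailing_paren($url, $precededByParen)
{
	// Nothing to do unless the URL actually ends in ')'
	if (substr($url, -1) !== ')')
		return $url;

	// The link was typed as (link), so the ')' closes the outer bracket
	if ($precededByParen)
		return substr($url, 0, -1);

	// Assumed fallback: drop the ')' only if the URL has more ')' than '('
	if (substr_count($url, '(') < substr_count($url, ')'))
		return substr($url, 0, -1);

	return $url;
}

var_dump(trim_trailing_paren('http://example.com/page)', true));
var_dump(trim_trailing_paren('http://en.wikipedia.org/wiki/PHP_(disambiguation)', false));
```

The first call drops the bracket; the second keeps it, because the ) closes a ( inside the URL itself.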

Tracked here: https://github.com/elkarte/Elkarte/issues/2171
Title: Re: Brackets mistakenly becoming part of URL
Post by: Spuds on August 26, 2015, 05:52:20 pm
Well, I've looked at this a bit and honestly there is no great solution; it's a bit of a limitation of the regex.

I explored adding a lookbehind to catch the trailing ) (and other symmetric issues), but aside from getting a bit ugly, I was concerned that with both a lookahead and a lookbehind for this, we could end up in an exhaustive recursion situation.

So the solutions as I see them are:
1) Don't allow URLs with )'s in them.  This is not the worst thing since the autolinker is a "goodie" .. it will miss some valid links, which is what the url bbc is for anyway.
2) Pull that regex out of the current preg_replace array and add it to a separate preg_replace_callback where we can see if the link is preceded by a ( and if so trim any trailing ).  Something like:
$result = preg_replace_callback('~(?<=([\s>\.(;\'"])|^)((?:http|https)://[\w\-_%@:|]+(?:\.[\w\-_%]+)*(?::\d+)?(?:/[\p{L}\w\-_\~%\.@!,\?&;=#(){}+:\'\\\\]*)*[/\w\-_\~%@\?;=#}\\\\]?)~ui', function ($m) {
	// $m[1] is the character before the link, $m[2] is the link itself
	return $m[1] !== '(' ? "[url]$m[2][/url]" : '[url]' . rtrim($m[2], ')') . '[/url])';
}, $data);
// preg_replace_callback() returns null on error, so only keep a string result
if (is_string($result))
	$data = $result;
That covers the most used case and still allows links with )'s in them, even trailing ones

So 1) is easy and cheap, 2) is more correct for this instance but a tad more expensive (probably not in real use) in terms of processing.
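
For comparison, option 1 can be sketched with a deliberately simplified pattern. This is an assumption for illustration, not the real autolinker regex, and autolink_no_parens() is a hypothetical name. Brackets are left out of the path character class, so a match always stops before a ( or ).

```php
<?php
// Hypothetical helper showing option 1: URLs may not contain ( or ).
function autolink_no_parens($text)
{
	// Simplified pattern: no parentheses in the path character class
	$pattern = '~(?<=[\s>(]|^)(https?://[\w\-.]+(?::\d+)?(?:/[\w\-.\~%@!,?&;=#+:/]*)?)~u';

	return preg_replace($pattern, '[url]$1[/url]', $text);
}

// The outer bracket stays outside the link
echo autolink_no_parens('See the docs (http://example.com/guide/intro) for details.'), "\n";

// The cost: a valid link containing brackets gets cut short
echo autolink_no_parens('http://en.wikipedia.org/wiki/PHP_(disambiguation)'), "\n";
```

The second call shows the trade-off: the link ends at PHP_ and the rest is left as plain text, so the poster would need the url bbc for that one.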
Title: Re: Brackets mistakenly becoming part of URL
Post by: Flavio93Zena on August 26, 2015, 09:58:40 pm
And to be honest, I have no idea which one to pick either. I'd kindly pass and ask more expert people such as TE (no tag, already commented), @emanuele @Joshua Dickerson @ant59 etc. Pretty sure you need to discuss this one.
Title: Re: Brackets mistakenly becoming part of URL
Post by: emanuele on August 27, 2015, 08:22:23 am
https://github.com/elkarte/Elkarte/issues/2171
Title: Re: Brackets mistakenly becoming part of URL
Post by: Flavio93Zena on August 27, 2015, 09:06:44 am
Didn't see that, well, it's a bump for that then ;)
Title: Re: Brackets mistakenly becoming part of URL
Post by: emanuele on September 02, 2015, 04:55:57 pm
I knew I posted about this: http://www.elkarte.net/community/index.php?topic=499.0 :P
Duplicate. xD
Title: Re: Brackets mistakenly becoming part of URL
Post by: Flavio93Zena on September 02, 2015, 06:21:00 pm
Wth 2 years D:
Title: Re: Brackets mistakenly becoming part of URL
Post by: emanuele on September 03, 2015, 02:14:08 am
Flavio, bugs get solved when the conditions allow it.
Pointing out that a bug is 2 years old does nothing but irritate me (and likely others as well), and at the moment I'm not in the mood to be irritated on this forum as well. Thank you.
Title: Re: Brackets mistakenly becoming part of URL
Post by: Flavio93Zena on September 03, 2015, 03:29:53 am
I didn't mean it that way, you should know me well enough after 2500 messages or so. ;) If I have a problem with something or somebody, you know that person is the first to get a PM about it.