Topic: Locating $modSetting['disableEntityCheck']

Locating $modSetting['disableEntityCheck']

I'm having trouble locating an error in the handling of links in notification mails. Without going into too much detail, the problem comes down to a word wrap being applied with the wrong (too short) line length when the link contains entities such as "&". In the link the ampersand is replaced with "&amp;", but the string length counts it as 1 character, when in the textual representation it is 5 characters.
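To make the length mismatch concrete, here is a minimal sketch in plain PHP. The URL is a made-up example, and the built-in strlen() and html_entity_decode() are used only for illustration; this is not how Utils::strlen() is implemented.

```php
<?php
// Hypothetical URL, used only to illustrate the length mismatch.
$encoded = 'https://example.com/index.php?c=files&amp;a=download_file&amp;id=1';

// Decode the entities so each "&amp;" becomes a single "&" again.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

$entityCount = substr_count($encoded, '&amp;');

// Each "&amp;" is 5 characters as text but 1 character once decoded,
// so the two lengths differ by 4 for every entity in the string.
$difference = strlen($encoded) - strlen($decoded);

echo 'entities: ', $entityCount, ', length difference: ', $difference, PHP_EOL;
```

Any wrap limit computed from the decoded length would therefore be 4 characters short for every ampersand in the textual form.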

The function in question is Utils::strlen(). There I can spot logic that deals with exactly this problem. The code checks whether $modSetting['disableEntityCheck'] is set. I'm having real trouble finding where this setting flag comes from. It does not appear anywhere else in the code, and I couldn't find anything like it in the admin screens either.

Can someone shed light on this? Where could I find this setting, to check if that helps with my initial problem?

Re: Locating $modSetting['disableEntityCheck']

Reply #1
Without knowing the code, JP, my perhaps stupid question is whether it can be hard-coded as a constant (to test for effect): 'disableEntityCheck=True' / False? If that is feasible, at least you'll discover whether you're chasing a rabbit or a fairy (or an orphan).
__________________________________________________________________________
// Deep inside every dilemma lies a solution that involves explosives //

Re: Locating $modSetting['disableEntityCheck']

Reply #2
disableEntityCheck is a very, very old setting in SMF that as far as I know has never been enabled anywhere for any reason. And it probably should not ever be enabled. Something else is wrong.

The link is doing the correct thing in wrapping in emails; they're presumably being embedded as HTML links and encoding an ampersand in HTML is absolutely doing the correct thing in terms of encoding the link.

Is the link to something inside the site? If so why is it even using an ampersand? Where are you getting the assertion from that the strlen length is even relevant (because there is no scenario in which it should be)? Instead of focusing on what you think the cause is, how about starting with what the *actual* problem is? An example maybe of the symptoms you're seeing would be useful.

Re: Locating $modSetting['disableEntityCheck']

Reply #3
On 19 Nov 2020 at 22:56, Arantor via ElkArte Community wrote:

disableEntityCheck is a very, very old setting in SMF that as far as I know has never been enabled anywhere for any reason. And it probably should not ever be enabled. Something else is wrong.



Is that one of those "Don't Push The Red Button" things Arantor? :-)

(You're no fun..)

Thanks for a better response than mine... (as I noted, I don't know the code..)

-Steeley

(Email reply to see if the links below are removed..)

You can reply to this email and have it posted as a topic reply.

ElkArte Community Links:

To visit ElkArte Community on the web, go to:
https://www.elkarte.net/community

You can see this message by using this link:
https://www.elkarte.net/community/index.php?topic=5883.msg41558#msg41558

Regards, The ElkArte Community

Re: Locating $modSetting['disableEntityCheck']

Reply #4
Quote: disableEntityCheck is a very, very old setting in SMF that as far as I know has never been enabled anywhere for any reason. And it probably should not ever be enabled. Something else is wrong.

Blame that on me. I just got curious to find out what it does when I stumbled across it. In no way I want to advocate to work with this setting.

Quote: The link is doing the correct thing in wrapping in emails; they're presumably being embedded as HTML links and encoding an ampersand in HTML is absolutely doing the correct thing in terms of encoding the link.

And this is not correct. First of all, notification emails are sent as text only, at least on my system. (How can I change that? I'm curious to know!) Most (or all?) of the mail clients I know treat a URL that arrives as plain text as if it were a real link and turn it into a click target.

I'm trying to figure out the best way to present this.

Try to insert

Code: [Select]
[url=https://www.verylongurl.com/somefolder/index.php?c=files&a=download_file&id=11111]very long url[/url] 

into a post .. and then have a look at the notification email you will be receiving. That was the context I was mentioning. The link in the notification email is changed to

Code: [Select]
https://www.verylongurl.com/somefolder/index.php?c=files&amp;a=download_file&amp;id=11111

So for one, the line is most likely word wrapped. As I found, the word wrap limit is set dynamically based on the content. In the situation I described in my strlen question, it applies a word wrap limit that is too short. And once the link is word wrapped, it can no longer be followed by clicking on it in the mail.

But more importantly, the link is simply wrong! The target server cannot follow this link. I found that out later, after sending my original question.

I found a (closed) issue on github that deals with this very issue imho. What is the best way to discuss this?

Quote: Is the link to something inside the site? If so why is it even using an ampersand? Where are you getting the assertion from that the strlen length is even relevant (because there is no scenario in which it should be)? Instead of focusing on what you think the cause is, how about starting with what the *actual* problem is? An example maybe of the symptoms you're seeing would be useful.

The link is to an external site. It is using an ampersand as a delimiter for parameters (my guess). Maybe that's not best practice, but it is what it is.

I hope I could give some more background to this now, and I believe I did my homework. The encoding and re-coding activities are hard to track, when going back and forth between text, bbc, and html.

Re: Locating $modSetting['disableEntityCheck']

Reply #5
OK, so no, that GitHub issue is nothing to do with this issue. This is important to understand, that the preservation of entities as HTML is important for any time that HTML is sent, which is 99% of the time. And in all cases *except for sending newsletters as text*, that is exactly the correct thing to be doing, converting it to an amp entity. (The GitHub issue relates to double-encoding in posts, not unencoding for newsletters)

Yes, & is a delimiter in normal links; SMF adopted ; years ago to avoid this very issue in normal links because you *cannot* export the & without encoding it when exporting HTML (especially when SMF used strict XHTML).

As far as sending newsletters as HTML, isn't there a checkbox for doing so? There certainly was in SMF, but I never trusted the functionality ever, far better to use something like Threadloom or some fancy integration with SendGrid or Mailgun.

Thing is, disableEntityCheck has nothing to do with the stated 'word wrapping' or counting, firstly because no such word wrapping happens that I can see, secondly because disableEntityCheck is for numeric entities and fixing malformed entities, which this isn't. Far more likely is that your email client is choking on the ; for wrapping purposes, and that the &amp; construction breaks the parsing on the other end because it's being injected as-is rather than through HTML parsing (which would resolve this *back*, correctly).

Best suggestion, https://github.com/elkarte/Elkarte/blob/development/sources/ElkArte/AdminController/ManageNews.php before line 791 add in:

Code: [Select]
if (!$context['send_html'])
{
    $base_message = strtr($base_message, ['&amp;' => '&']);
}

This will, if not sending HTML, fix up the ampersands before sending on.
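As a rough standalone sketch of what that check is meant to do: the function name prepare_newsletter_body() and the sample message are hypothetical and only stand in for $base_message and $context['send_html'] from ManageNews.php.

```php
<?php
// Hypothetical helper, not part of ElkArte: decodes "&amp;" only when
// the message goes out as plain text and leaves HTML mail untouched.
function prepare_newsletter_body(string $message, bool $sendHtml): string
{
    if (!$sendHtml)
    {
        // Plain text has no HTML parser on the receiving end,
        // so the entity must become a literal ampersand again.
        $message = strtr($message, ['&amp;' => '&']);
    }

    return $message;
}

// Made-up sample message containing an entity-encoded link.
$sample = 'Download: https://example.com/index.php?c=files&amp;a=download_file';

$asText = prepare_newsletter_body($sample, false);
$asHtml = prepare_newsletter_body($sample, true);

echo $asText, PHP_EOL, $asHtml, PHP_EOL;
```

The HTML branch deliberately keeps the entity, since that is the correct form whenever the body is parsed as HTML.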

Re: Locating $modSetting['disableEntityCheck']

Reply #6
Quote: OK, so no, that GitHub issue is nothing to do with this issue. This is important to understand, that the preservation of entities as HTML is important for any time that HTML is sent, which is 99% of the time. And in all cases *except for sending newsletters as text*, that is exactly the correct thing to be doing, converting it to an amp entity. (The GitHub issue relates to double-encoding in posts, not unencoding for newsletters)

Quote: Yes, & is a delimiter in normal links; SMF adopted ; years ago to avoid this very issue in normal links because you *cannot* export the & without encoding it when exporting HTML (especially when SMF used strict XHTML).

Quote: As far as sending newsletters as HTML, isn't there a checkbox for doing so? There certainly was in SMF, but I never trusted the functionality ever, far better to use something like Threadloom or some fancy integration with SendGrid or Mailgun.

First of all thank you so much for taking the time to investigate and dig deeper into this! Sorry, it took me a while to get back to this issue. I still can't locate any setting for sending notifications in HTML. Could you please let me know how to integrate and use threadloom, sendgrid, or mailgun? I haven't heard anything about them yet, and I will definitely try to investigate.

The task I got from my community is to enable the use of the forum in maillist mode. As much as I personally don't like to use a forum that way, I have a bunch of users that are let's say a little bit old fashioned and very reluctant to changes in their habits. And since the functionality is existing in ElkArte, I wanted to make use of it.

Quote: Thing is, disableEntityCheck has nothing to do with the stated 'word wrapping' or counting, firstly because no such word wrapping happens that I can see, secondly because disableEntityCheck is for numeric entities and fixing malformed entities, which this isn't. Far more likely is that your email client is choking on the ; for wrapping purposes, and that the &amp; construction breaks the parsing on the other end because it's being injected as-is rather than through HTML parsing (which would resolve this *back*, correctly).

OK, got that .. thanks for explaining!

Quote: Best suggestion, https://github.com/elkarte/Elkarte/blob/development/sources/ElkArte/AdminController/ManageNews.php before line 791 add in:

Code: [Select]
if (!$context['send_html'])
{
    $base_message = strtr($base_message, ['&amp;' => '&']);
}

Quote: This will, if not sending HTML, fix up the ampersands before sending on.

I tried your suggested change, and unfortunately it didn't help with the problem. Instead I have tweaked the _convert_anchor() function in Html2Md, where the html element <a> is turned into markdown.

Re: Locating $modSetting['disableEntityCheck']

Reply #7
I didn't spend any effort on it really, just I remember how this stuff worked because I've been using it for years.

Integrating Threadloom - I've only seen it done on vBulletin and XenForo, I guess there's no market for them for the free packages. Integrating SendGrid, I haven't done in years, and the integration I did for Mailgun I'd have to rewrite to port it to ElkArte - but since I don't actually *use* ElkArte... that would be difficult. I was just trying to be helpful given the shared DNA between ElkArte and SMF.

Interesting that my tweak doesn't work - it makes me wonder if there is some other problem involved, because as far as I could see that happened late enough that the text should have been encoded only the once by then.
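One way to picture why the number of encoding passes matters, purely as an illustration and not a confirmed diagnosis of this case: a single strtr() pass undoes exactly one level of encoding. The sample string is made up.

```php
<?php
// Made-up link fragment that has been entity-encoded once.
$once = 'index.php?c=files&amp;a=download_file';

// Encoding it a second time turns "&amp;" into "&amp;amp;".
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

// strtr() does not rescan text it has already replaced,
// so one pass removes exactly one level of encoding.
$decodedOnce  = strtr($once, ['&amp;' => '&']);
$decodedTwice = strtr($twice, ['&amp;' => '&']);

echo $decodedOnce, PHP_EOL;   // literal ampersand
echo $decodedTwice, PHP_EOL;  // still contains "&amp;"
```

If the body had somehow been encoded twice by that point, one replacement would still leave "&amp;" in the mail; whether that is what happens here is not established in this thread.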

Changing the Markdown parser as you've done will probably break links in posts, you really do need to leave the conversion to entity form for all web use because you cannot have it not encode for the web. But I've already provided enough warnings on that subject, hope it doesn't break too badly for you.

Re: Locating $modSetting['disableEntityCheck']

Reply #8
Quote: I didn't spend any effort on it really, just I remember how this stuff worked because I've been using it for years.

Quote: Integrating Threadloom - I've only seen it done on vBulletin and XenForo, I guess there's no market for them for the free packages. Integrating SendGrid, I haven't done in years, and the integration I did for Mailgun I'd have to rewrite to port it to ElkArte - but since I don't actually *use* ElkArte... that would be difficult. I was just trying to be helpful given the shared DNA between ElkArte and SMF.

Quote: Interesting that my tweak doesn't work - it makes me wonder if there is some other problem involved, because as far as I could see that happened late enough that the text should have been encoded only the once by then.

Quote: Changing the Markdown parser as you've done will probably break links in posts, you really do need to leave the conversion to entity form for all web use because you cannot have it not encode for the web. But I've already provided enough warnings on that subject, hope it doesn't break too badly for you.

Thanks for the information about these packages! I had a look, and as a first quick impression I found they are more in the newsletter or mailings area. I will take some more time over the weekend to read ;-)

You are making an absolutely good point with regard to changing the markdown parser. Thinking about this made me change routes. I have taken back the changes and instead patched something in the Notifications module. So, if there are unwanted side effects, it should be isolated to the field of notifications.

For the record, I changed Notifications.subs.php - line 207 from

Code: [Select]
$replacements['MESSAGE'] = $data['body'];

to

Code: [Select]
$replacements['MESSAGE'] = strtr($data['body'], ['&amp;' => '&']);
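For anyone wanting to check what this replacement does and does not touch, here is a small standalone sketch. The helper name plain_text_notification_body() and the sample body are hypothetical; only the strtr() call mirrors the changed line.

```php
<?php
// Hypothetical helper wrapping the same strtr() call as the patched line.
function plain_text_notification_body(string $body): string
{
    // Only "&amp;" is mapped back; all other entities are left alone.
    return strtr($body, ['&amp;' => '&']);
}

// Made-up notification body with a link and some other entities.
$data = ['body' => 'See &lt;this&gt;: https://example.com/index.php?c=files&amp;a=download_file&amp;id=1'];

$replacements = [];
$replacements['MESSAGE'] = plain_text_notification_body($data['body']);

echo $replacements['MESSAGE'], PHP_EOL;
```

Because the mapping names only "&amp;", entities such as "&lt;" and "&gt;" pass through unchanged, which keeps the patch narrow.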