ElkArte Community

Elk Development => Bug Reports => Topic started by: shaitan on February 03, 2017, 01:30:51 pm

Title: Bad caractere inside mails forum in french
Post by: shaitan on February 03, 2017, 01:30:51 pm
I hope you understand the Title.  :)

Content of Mail:

Quote: Une réponse a été publiée sur un sujet que vous regardez par TofZeroSix.

Voir la réponse à: https://forum.mpdb.tv/index.php?topic=35559.msg264796#msg264796

Désabonnez-vous à ce sujet en utilisant ce lien: https://forum.mpdb.tv/index.php?action=notify;topic=35559.0

Le texte de la réponse est illustré ci-dessous:
J'ai appris le shell à l'école au milieu des années 90... :viok: Donc
, je connais un peu.C'était l'époque ou pour changer de ligne sur vi,
 fallait appuyer sur j...
J'ai installé mon premier Linux dans l'été 93... Version 0.99 du
 noyaux, 15 disquettes à passer dans le mange disque... Fallait le vouloir!
 
 
et je confirme... sed, grep, find, et tout le toutim, c'est super puissant
 quand on connait un peu !

Cordiales salutations,
 L'équipe Forum MPDB.tv

Title: Re: Bad caractere inside mails forum in french
Post by: Spuds on February 03, 2017, 02:04:40 pm
Could you include the original text of the message? That will make it easier to test a few things.

Also please attach / quote the full email with its headers if possible (you can remove your email address from that) ... What I want to see is if the system correctly made the multi part message as it should.

Emails contain multiple "versions" of the same text.  One will always be text/plain, which is what the above looks like since there are no ASCII characters for certain letters.   After the text/plain version there should be a MIME encoded version, probably in base64 or 7bit, which is capable of correctly displaying non-ASCII characters.   And then there may even be more MIME versions such as HTML or quoted-printable.  Generally the order is from least accurate to most accurate.

Your email client will pick the "best" one to display to you based on its settings and capabilities. 
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 03, 2017, 02:21:04 pm
Original message with mail attached.
I understand your explanation, thank you. I don't need to remove my address.
(I can find many other samples for you if needed).

Quote: J'ai appris le shell à l'école au milieu des années 90... :viok:  Donc, je connais un peu.C'était l'époque ou pour changer de ligne sur vi, fallait appuyer sur j...  :gniark:
J'ai installé mon premier Linux dans l'été 93... Version 0.99 du noyaux, 15 disquettes à passer dans le mange disque... Fallait le vouloir!

et je confirme... sed, grep, find, et tout le toutim, c'est super puissant quand on connait un peu !


Title: Re: Bad caractere inside mails forum in french
Post by: Spuds on February 03, 2017, 05:20:12 pm
Thanks, that helps, I'll see if I can determine what is going on
Title: Re: Bad caractere inside mails forum in french
Post by: Spuds on February 04, 2017, 02:05:35 pm
I'm still not sure why or where this is occurring.

Looking at the sample email, it shows that characters


Somewhere the UTF-8 text is being treated/saved as Windows-1252 or ISO-8859-1 ... meaning a character like é, which takes two bytes in UTF-8, is being treated not as a single two-byte character but as two individual single-byte characters.  I have not been able to reproduce this behavior on my system.
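
A minimal sketch of that effect, assuming PHP with the mbstring extension loaded and a source file saved as UTF-8. The helper name `misreadUtf8AsLatin1()` is made up for illustration and is not part of ElkArte; it only shows what happens when UTF-8 bytes are read as ISO-8859-1, not where in the mail pipeline that occurs.

```php
<?php
// Sketch only: assumes the mbstring extension is loaded and this file is saved as UTF-8.
// misreadUtf8AsLatin1() is a hypothetical helper, not part of ElkArte.
function misreadUtf8AsLatin1(string $utf8): string
{
    // Interpret each byte as one ISO-8859-1 character, then re-encode as UTF-8.
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$original = 'réponse';
$garbled  = misreadUtf8AsLatin1($original);

echo $garbled, "\n";                       // rÃ©ponse
echo mb_strlen($original, 'UTF-8'), "\n";  // 7
echo mb_strlen($garbled, 'UTF-8'), "\n";   // 8
```

The two bytes of é (0xC3 0xA9) come out as the two separate characters Ã and ©, which is why the garbled string is one character longer than the original.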

Do you have any modifications installed that hook into the mail system? 

I'm really not sure what to check next, I'm not even sure it's an ElkArte issue, or a system issue with setlocale, or some language/template file not saved as UTF-8 .... or ....

Really kind of stumped right now.
Title: Re: Bad caractere inside mails forum in french
Post by: emanuele on February 04, 2017, 02:25:02 pm
Testing idea, maybe it's worthless, but it's the only one I have so far: can you change the language in your profile to English for a while and see if it makes any difference?
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 04, 2017, 02:43:18 pm
Spuds, I understand. We have also been searching without success. (I have developers on my site).
I will try to install a second ElkArte forum on my host, clean, without mods and with the original language pack. I have many mods installed..  :D

Thank you for your time !

Emanuele: OK, I will also test that.
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 04, 2017, 02:55:20 pm
Detail: I use PHP for sending mail. On the first day after install, no mails arrived with SMTP.
Other mails from the forum are OK.
I will come back soon with the test results.

Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 05, 2017, 05:29:36 am
Tested:

Newly installed forum, in English only, no mods. Mail tested with PHP and SMTP.
https://forumtest.mpdb.tv/index.php

The problem remains the same.  :o


Title: Re: Bad caractere inside mails forum in french
Post by: Spuds on February 05, 2017, 09:03:51 am
Can you give me FTP access to that install so I can test a few things?  Just PM me the details.

What I'm thinking is that during the HTML to MD conversion it's getting converted for some reason, but I need to test several things to find out exactly what.
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 05, 2017, 09:17:13 am
Yes, with pleasure. Done, PM sent.
Title: Re: Bad caractere inside mails forum in french
Post by: Spuds on February 05, 2017, 12:06:22 pm
I have narrowed down the problem to our Html2Md.class.php file in the sources/subs directory .... This function is used to convert HTML to Markdown which we use for emails.

The issue is really with PHP's DomDocument function implementation which behaves differently on different versions of  PHP as well as variants of *nix.  This variation really is a PITA.

What we have now is a "trick" to let the function know that we want to handle all text as UTF-8.  We also use <?xml which allows us to silence any markup errors found.   You can find this trick recommended in many places and it has not failed ... until now !

Code: (find)
			$this->doc->loadHTML('<?xml encoding="UTF-8">' . $this->html);

Using the supplied FTP access (thank you!) I was able to test several things, and it appears that this will work for now
Code: (try)
			$this->doc->loadHTML('<?xml encoding="UTF-8"><html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head><body>' . $this->html . '</body></html>');

You can use that for now, but I have to check on a few things ... mainly I think we will need to capture the  text inside <body>(.*)</body> to prevent a double wrapping (not an issue unless you are using the email reply capability)
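
One way that capture could look, as a rough sketch only: `wrapHtmlForDom()` is a hypothetical helper name and this is not the code that ended up in Html2Md.class.php. It assumes the input is either an HTML fragment or a full document with a single `<body>` element.

```php
<?php
// Hypothetical helper for illustration; not ElkArte's actual implementation.
function wrapHtmlForDom(string $html): string
{
    // If the input is already a full document, keep only what is inside <body>
    // so the wrapper below is not applied around an existing <html>/<body> pair.
    if (preg_match('~<body[^>]*>(.*)</body>~is', $html, $matches) === 1) {
        $html = $matches[1];
    }

    // Same wrapper as the "try" snippet above: declare UTF-8 both ways.
    return '<?xml encoding="UTF-8"><html><head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
        . '</head><body>' . $html . '</body></html>';
}
```

The result would then be passed to `$this->doc->loadHTML()` in place of the hand-built string, so a fragment and an already wrapped document both end up with exactly one `<body>`.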
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 05, 2017, 12:34:06 pm
Bravo Spuds !  You're the best [1]

A developer friend who works on our projects had taken a look and told me something about Markdown, but he had no time to search further.

I will use the fix, great.

Quote: not an issue unless you are using the email reply capability

We want to try this function, but not yet; in a few weeks or months.

Big thank you.  :)
I hope Emanuele will never read that, I don't want to cause the third world war  :D
Title: Re: Bad caractere inside mails forum in french
Post by: emanuele on February 05, 2017, 12:45:11 pm
Quote from: Spuds – The issue is really with PHP's DomDocument function implementation which behaves differently on different versions of  PHP as well as variants of *nix.  This variation really is a PITA.
That is sort of normal. LOL
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 05, 2017, 03:45:20 pm
Quote: Une réponse a été publiée sur un sujet que vous regardez par BoneMiMine.

Voir la réponse à: https://forum.mpdb.tv/index.php?topic=35564.msg264936#msg264936

Désabonnez-vous à ce sujet en utilisant ce lien: https://forum.mpdb.tv/index.php?action=notify;topic=35564.0

Le texte de la réponse est illustré ci-dessous:
éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé
ééééééééééééééééééééééééééé 
àààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààààà
àààààààààààààààààà

Cordiales salutations,
 L'équipe Forum Mpdb.tv.

Beautiful, isn't it ?
Title: Re: Bad caractere inside mails forum in french
Post by: Trekkie101 on February 05, 2017, 04:18:51 pm
Damn French always making it difficult ;)

Love from your Auld Alliance (Vieille Alliance) friends en Ecosse!
Title: Re: Bad caractere inside mails forum in french
Post by: radu81 on February 05, 2017, 04:35:34 pm
I have my forum in Italian and we also have special characters like é or è, but no problem with my emails. I send them through SMTP.
Title: Re: Bad caractere inside mails forum in french
Post by: shaitan on February 05, 2017, 04:47:56 pm
Quote: Damn French always making it difficult ;)

  :D

Quote: Love from your Auld Alliance (Vieille Alliance) friends en Ecosse!

"The oldest alliance of the world!" [1]

For you my friend:

O Flower of Scotland
When will we see
Your like again,
That fought and died for
Your wee bit Hill and Glen
And stood against him
Proud Edward’s Army,
And sent him homeward
Tae think again.

Those days are past now
And in the past they must remain
But we can still rise now
And be the nation again
That stood against him
Proud Edward’s Army
And sent him homeward,
Tae think again.

Général De Gaulle