
Charset mismatch

Posted: 07 May 2008, 09:42
by Admin
I suppose that everybody (except maybe English-language users) has seen that there are various problems with the charset encodings. I have decided to switch completely to UTF-8 now.
I know that UTF-8 is slower than the ISO-* encodings: strings have to be parsed byte by byte in C instead of being handled directly by the Intel CPU's string instructions in assembler. I hope the hardware manufacturers will find a way to implement UTF-8 instructions in the next processor generations.
In the meantime, however, I will switch completely to UTF-8 NFC.
This concerns everything: the mail server, the Trashmail web page, the forum, etc.
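
As an illustration of what "UTF-8 NFC" means in practice, here is a minimal Python sketch. It is not the code actually used on the server; the helper name `to_utf8_nfc` is hypothetical. It normalizes text to Unicode Normalization Form C (composed characters) and then encodes it as UTF-8, so that the same visible character always produces the same bytes.

```python
import unicodedata


def to_utf8_nfc(text: str) -> bytes:
    """Normalize text to NFC, then encode it as UTF-8 (hypothetical helper)."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


# The same visible character can arrive in two different forms:
decomposed = "a\u0308"  # 'a' followed by COMBINING DIAERESIS
composed = "\u00e4"     # precomposed 'ä'

# As Python strings they differ, but after NFC + UTF-8 the bytes are identical.
same_bytes = to_utf8_nfc(decomposed) == to_utf8_nfc(composed)
```

Without the normalization step, the two forms of "ä" would be stored as different byte sequences and would not compare equal.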

Re: Charset mismatch

Posted: 12 May 2008, 06:08
by Z
Is the header information for the charset encoding OK now? It was missing earlier.

Answer: nope.

It seems that the encoding information is still missing, so email clients now fall back to their DEFAULT charset, which often isn't UTF-8.

Could you please add the encoding information to the headers, so that email programs know they should handle the messages as UTF-8?

In Europe I use ISO-8859-1 as the default, or better ISO-8859-15 (which includes the euro sign), but UTF-8 is still a bit rare; it's more often used in Asia.

It seems that many email clients can change the charset for outgoing mail on the fly.

If I send an email containing only English text, it is sent as 7-bit ASCII; if it includes scands (äöåÖÄÅ), it's ISO-8859-1; and if the €uro sign is included, then it's -15, etc.
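
The on-the-fly behaviour described above can be sketched roughly like this in Python. This is only an assumption about how such clients work, not the logic of any particular mail program; the function name `pick_charset` and the candidate order are hypothetical. It tries the narrowest charset first and falls back to UTF-8.

```python
def pick_charset(body: str) -> str:
    """Return the first candidate charset that can encode the whole body."""
    # Candidate order is an assumption: narrowest first, UTF-8 as last resort.
    for charset in ("us-ascii", "iso-8859-1", "iso-8859-15", "utf-8"):
        try:
            body.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent some character
        return charset
    raise ValueError("body cannot be encoded in any candidate charset")


plain = pick_charset("only english text")
scands = pick_charset("scands äöåÖÄÅ")
euro = pick_charset("price: 5 €")
```

Whatever charset is picked, the client still has to name it in the message headers; otherwise the receiving side can only guess.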

Re: Charset mismatch

Posted: 12 May 2008, 20:38
by Admin
[quote="Z"]Is the header information for the charset encoding OK now? It was missing earlier.

Answer: nope.

It seems that the encoding information is still missing, so email clients now fall back to their DEFAULT charset, which often isn't UTF-8.

Could you please add the encoding information to the headers, so that email programs know they should handle the messages as UTF-8?

In Europe I use ISO-8859-1 as the default, or better ISO-8859-15 (which includes the euro sign), but UTF-8 is still a bit rare; it's more often used in Asia.

It seems that many email clients can change the charset for outgoing mail on the fly.

If I send an email containing only English text, it is sent as 7-bit ASCII; if it includes scands (äöåÖÄÅ), it's ISO-8859-1; and if the €uro sign is included, then it's -15, etc.[/quote]
Ok,
I would like to add it. Do you know the exact header line, as defined in the RFC, that advertises the UTF-8 encoding?
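
For reference, MIME (RFC 2045 and RFC 2046) declares the body charset through the charset parameter of the Content-Type header, for example `Content-Type: text/plain; charset=UTF-8`, accompanied by `MIME-Version: 1.0` and a `Content-Transfer-Encoding` header. A minimal sketch with Python's standard `email` package follows; the addresses and subject are placeholders, and this is not necessarily how the Trashmail server builds its messages.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"      # placeholder address
msg["To"] = "recipient@example.org"     # placeholder address
msg["Subject"] = "Charset test"         # placeholder subject

# set_content() adds the Content-Type header with the charset parameter,
# a Content-Transfer-Encoding header, and MIME-Version.
msg.set_content("Scands äöåÖÄÅ and the € sign", charset="utf-8")

declared_charset = msg.get_content_charset()
content_type_header = msg["Content-Type"]
```

Printing `msg.as_string()` shows the complete header block that a receiving email client would use to decode the body.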