Why do (Russian) characters in some received emails change when reading in David InfoCenter?

3

1

I'm using David InfoCenter as my email software, and I have trouble with some of my emails in Russian. It affects only a few letters, in some emails (sent by different people): for example, the "R" ("Р" in Russian) is shown as a "Т". In other emails in Russian, the problem doesn't appear. Isn't it strange? Has anyone had the same problem already and found what causes it?

When I forward that email to an external mailbox (an Internet email account), it's even worse: all Russian letters are replaced by symbols.

The default encoding was "Russian (ISO)", I changed it to "Russian (Windows)", but same problem. Another weird reaction is when I write an internal email and name it "Test" in Russian (Тест), with Тест in the text window, it changes the title to "Oano"? But the content stays in Russian.

With Mailinator I got the following, for message and subject "Тест":

Subject: ????
[..]
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----_=_NextPart_000_00017783.4AF7FB71"
This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.
------_=_NextPart_000_00017783.4AF7FB71
Content-Type: text/plain;
charset="utf-8"
Content-Transfer-Encoding: base64
0KLQtdGB0YI=
------_=_NextPart_000_00017783.4AF7FB71
Content-Type: text/html;
charset="utf-8"
Content-Transfer-Encoding: base64
PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv
L0VOIj4NCjxIVE1MPjxIRUFEPg0KPE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVu
dD0idGV4dC9odG1sOyBjaGFyc2V0PXV0Zi04Ij4NCjxNRVRBIG5hbWU9R0VORVJBVE9SIGNvbnRl
bnQ9Ik1TSFRNTCA4LjAwLjYwMDEuMTg4NTIiPjwvSEVBRD4NCjxCT0RZIHN0eWxlPSJGT05UOiAx
MHB0IENvdXJpZXIgTmV3OyBDT0xPUjogIzAwMDAwMCIgbGVmdE1hcmdpbj01IHRvcE1hcmdpbj01
Pg0KPERJViBzdHlsZT0iRk9OVDogMTBwdCBDb3VyaWVyIE5ldzsgQ09MT1I6ICMwMDAwMDAiPtCi
0LXRgdGCPFNQQU4gDQppZD10b2JpdF9ibG9ja3F1b3RlPjxTUEFOIGlkPXRvYml0X2Jsb2NrcXVv
dGU+PC9ESVY+PC9TUEFOPjwvU1BBTj48L0JPRFk+PC9IVE1MPg==
------_=_NextPart_000_00017783.4AF7FB71--

waszkiewicz

Posted 2009-11-05T09:52:17.243

Reputation: 421

Any chance you can upload that picture again? – Arjan – 2010-05-20T07:50:23.013

Sorry, I don't work with David anymore... – waszkiewicz – 2010-06-16T09:03:12.847

Sounds like an encoding problem. Does David InfoCenter have any preferences where you can specify a character encoding? – quack quixote – 2009-11-05T10:05:52.130

Yes, there is a "default encoding" choice. – waszkiewicz – 2009-11-05T10:29:51.613

But I don't think it's the problem. It was already set to "Russian (ISO)"; I changed it to "Russian (Windows)", but same problem. Another weird reaction is when I write an internal email and name it TEST in Russian, with TEST in the text window, it changes the title to "OANA"? But the content stays in Russian... I really don't get it. – waszkiewicz – 2009-11-05T10:37:06.730

It changes the text after you've sent it, right? So you see "OANA" in your Inbox but the Russian "TEST" in your Sent Items? And what exactly is the Russian word for "TEST"? – Arjan – 2009-11-05T11:28:52.617

The image uses both Russian and German, so I suppose you should first ensure you send as Unicode? (I don't know if a Russian character set would include that üßö, and any "normal" characters?) Note that this is still not the "source" of the received message. Do you know where to find that? – Arjan – 2009-11-05T12:46:47.717

No, actually I don't... – waszkiewicz – 2009-11-05T18:34:52.567

yeah, ISO/Windows encodings aren't what you want here; you want to be using Unicode. it's possible your application isn't handling Unicode properly. – quack quixote – 2009-11-05T21:09:24.673

@waszkiewicz, was the subject just question marks? If so, please repeat using a subject like "test / тест". Also, please use a short test message, like "A test / тест for Super User. Grüßen!"...? – Arjan – 2009-11-09T10:01:56.977

And what about my earlier questions: It changes the text after you've sent it, right? So you see "OANO" in your Inbox but the Russian "тест" in your Sent Items? – Arjan – 2009-11-09T10:09:27.540

(And for a new test: please tell us what the test subject and message were.) – Arjan – 2009-11-09T10:10:02.827

Message and subject are "тест". – waszkiewicz – 2009-11-09T10:29:49.353

So, exactly the same problems when only using Russian? In your original question you were mixing Russian with German. – Arjan – 2009-11-09T10:36:06.940

Yes. Actually I don't have problems with German, French, or Polish. It's really only with Russian and Greek (or maybe some other Asian languages I don't use). – waszkiewicz – 2009-11-09T11:26:06.917

Are you sure that, for example, the funny Ł in the Polish "fałszywy" does not yield problems when used in the Subject? – Arjan – 2009-11-09T15:04:50.893

Answers

5

To break down the message:

Subject: ????

Too bad, your David InfoCenter is not doing things right. The above should have been something like:

Subject: =?utf-8?Q?=D0=A2=D0=B5=D1=81=D1=82?=

So, this is a bug that should be reported, and fixed.
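As a rough illustration of what a correctly encoded Subject carries, the encoded word shown above can be decoded with Python's standard `email.header` module. This is only a sketch; the variable names are illustrative and nothing here reflects how David InfoCenter works internally.

```python
from email.header import decode_header, make_header

# The encoded word the Subject header should have contained.
encoded_subject = "=?utf-8?Q?=D0=A2=D0=B5=D1=81=D1=82?="

# decode_header returns (bytes, charset) pairs;
# make_header joins them back into readable text.
pairs = decode_header(encoded_subject)
subject = str(make_header(pairs))

print(pairs)
print(subject)
```

A client that sends literal question marks instead has already thrown the original characters away, so no receiving software can recover them.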

Next:

MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="----_=_NextPart_000_00017783.4AF7FB71"

The above tells the recipient that after each line consisting of "--" followed by the boundary "----_=_NextPart_000_00017783.4AF7FB71" it will find the very same message in a different format. Good.

Next:

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

The above will be visible to users of old email software that does not understand MIME. Good.

Next:

------_=_NextPart_000_00017783.4AF7FB71
Content-Type: text/plain;
 charset="utf-8"
Content-Transfer-Encoding: base64
0KLQtdGB0YI=

The above is the plain text, without bold, italic, etcetera. Using the great Online Base64 Decoder from FileFormat.info, the 0KLQtdGB0YI= translates back to Тест. Aha, not the lowercase тест like you wrote...? Anyway, seems fine, and a good email client should understand this part.

In some more detail: 0KLQtdGB0YI= actually decodes to hexadecimal d0 a2 d0 b5 d1 81 d1 82, and you (should) see the same hexadecimal numbers in the Subject above. (When not properly decoded as UTF-8, for example when erroneously interpreted as Windows-1252, this would show as something like Ð¢ÐµÑÑ‚.)
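The decoding described above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library. Python's `cp1252` codec has no character for byte 0x81, so the mis-decoding step uses `errors="replace"`, which puts U+FFFD in its place.

```python
import base64

# Body of the text/plain part from the message source.
raw = base64.b64decode("0KLQtdGB0YI=")

# The raw bytes, shown as space-separated hexadecimal.
hex_bytes = " ".join(f"{b:02x}" for b in raw)

# Correct interpretation: the part declares charset="utf-8".
as_utf8 = raw.decode("utf-8")

# Wrong interpretation: the same bytes read as Windows-1252.
as_cp1252 = raw.decode("cp1252", errors="replace")

print(hex_bytes)
print(as_utf8)
print(as_cp1252)
```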

Next:

------_=_NextPart_000_00017783.4AF7FB71
Content-Type: text/html;
 charset="utf-8"
Content-Transfer-Encoding: base64
PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv
L0VOIj4NCjxIVE1MPjxIRUFEPg0KPE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVu
dD0idGV4dC9odG1sOyBjaGFyc2V0PXV0Zi04Ij4NCjxNRVRBIG5hbWU9R0VORVJBVE9SIGNvbnRl
bnQ9Ik1TSFRNTCA4LjAwLjYwMDEuMTg4NTIiPjwvSEVBRD4NCjxCT0RZIHN0eWxlPSJGT05UOiAx
MHB0IENvdXJpZXIgTmV3OyBDT0xPUjogIzAwMDAwMCIgbGVmdE1hcmdpbj01IHRvcE1hcmdpbj01
Pg0KPERJViBzdHlsZT0iRk9OVDogMTBwdCBDb3VyaWVyIE5ldzsgQ09MT1I6ICMwMDAwMDAiPtCi
0LXRgdGCPFNQQU4gDQppZD10b2JpdF9ibG9ja3F1b3RlPjxTUEFOIGlkPXRvYml0X2Jsb2NrcXVv
dGU+PC9ESVY+PC9TUEFOPjwvU1BBTj48L0JPRFk+PC9IVE1MPg==

The above is the very same, as an HTML-formatted message. This will look about the same, though this is not at all valid HTML: the tags are not properly nested (the DIV is closed before the SPANs that were opened inside it), and an id should be unique but id=tobit_blockquote is used twice in this one-line message. Actually, the word "blockquote" suggests that you might have copied the word Тест from another message?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<META name=GENERATOR content="MSHTML 8.00.6001.18852"></HEAD>
<BODY style="FONT: 10pt Courier New; COLOR: #000000" leftMargin=5 topMargin=5>
<DIV style="FONT: 10pt Courier New; COLOR: #000000">Тест<SPAN 
id=tobit_blockquote><SPAN id=tobit_blockquote></DIV></SPAN></SPAN>
</BODY></HTML>
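The two HTML problems mentioned above (a duplicate id and badly nested tags) can be spotted mechanically. The following is a rough sketch built on Python's standard `html.parser`. The `NestingChecker` class is a hypothetical helper written for this illustration, not a real validator, and it only looks at the DIV/SPAN fragment from the message.

```python
from collections import Counter
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Hypothetical helper: counts id values and records misnested end tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # stack of currently open tags
        self.ids = Counter()     # how often each id value occurs
        self.misnested = []      # end tags that did not match the innermost open tag

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id":
                self.ids[value] += 1
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            return
        self.misnested.append(tag)
        if tag in self.open_tags:
            # Drop the nearest matching open tag so checking can continue.
            index = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[index]


fragment = """<DIV style="FONT: 10pt Courier New; COLOR: #000000">Тест<SPAN 
id=tobit_blockquote><SPAN id=tobit_blockquote></DIV></SPAN></SPAN>"""

checker = NestingChecker()
checker.feed(fragment)
duplicate_ids = [value for value, count in checker.ids.items() if count > 1]

print(duplicate_ids)
print(checker.misnested)
```

The parser lowercases tag names, so the misnested tag is reported as `div`.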

Also, there's no need to send HTML for simple messages...

Finally (note the two trailing dashes):

------_=_NextPart_000_00017783.4AF7FB71--

This tells the email software that the end of all parts has been reached.
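To see how a MIME-aware client walks through these parts, the structure can be rebuilt and parsed with Python's standard `email` package. This is a sketch under assumptions: the text/plain part is taken from the message source, but the HTML part is replaced by a short stand-in generated in the code, and the blank line that MIME requires between each part's headers and its body is written out explicitly.

```python
import base64
from email import message_from_string

BOUNDARY = "----_=_NextPart_000_00017783.4AF7FB71"

# Shortened stand-in for the HTML part (an assumption, not the original markup).
html_b64 = base64.b64encode("<div>Тест</div>".encode("utf-8")).decode("ascii")

raw_message = "\n".join([
    "MIME-Version: 1.0",
    "Content-Type: multipart/alternative;",
    f' boundary="{BOUNDARY}"',
    "",
    "This message is in MIME format.",
    f"--{BOUNDARY}",
    "Content-Type: text/plain;",
    ' charset="utf-8"',
    "Content-Transfer-Encoding: base64",
    "",
    "0KLQtdGB0YI=",
    f"--{BOUNDARY}",
    "Content-Type: text/html;",
    ' charset="utf-8"',
    "Content-Transfer-Encoding: base64",
    "",
    html_b64,
    f"--{BOUNDARY}--",
    "",
])

message = message_from_string(raw_message)

parts = []
for part in message.walk():
    if part.is_multipart():
        continue  # skip the multipart/alternative container itself
    charset = part.get_content_charset() or "ascii"
    # decode=True undoes the base64 transfer encoding and yields bytes.
    text = part.get_payload(decode=True).decode(charset)
    parts.append((part.get_content_type(), text))

for content_type, text in parts:
    print(content_type, "->", text)
```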

This test message does not explain how Тест could become Oano, as the question marks could never translate into that. Maybe the question marks are not real question marks after all. Anyway: the Subject being wrong is a bug in your email client, which does not send the correct Subject. Also the HTML is buggy. Stop using that software.
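One conceivable route to "Oano" is sketched below. It is a guess that nothing in the headers above confirms, and whether David InfoCenter does anything like this is purely an assumption. If "Тест" is stored as Windows-1251 bytes, those bytes are then read as Latin-1, and the accents are finally stripped to get plain ASCII, the result is exactly that string.

```python
import unicodedata


def strip_accents(text):
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Step 1 (assumed): the subject is encoded as Windows-1251.
cp1251_bytes = "Тест".encode("cp1251")

# Step 2 (assumed): the bytes are misread as Latin-1.
misread = cp1251_bytes.decode("latin-1")

# Step 3 (assumed): accents are removed to produce plain ASCII.
flattened = strip_accents(misread)

print(cp1251_bytes.hex())
print(misread)
print(flattened)
```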

Arjan

Posted 2009-11-05T09:52:17.243

Reputation: 29 084

Damn, typing all that to get upvoted for the last phrase. ;-) – Arjan – 2009-11-09T11:42:17.710

(Markdown gave me hard times on this; adding a newline after that boundary= (so, having the closing PRE on the next line) changes the whole post...) – Arjan – 2009-11-09T11:44:21.687

2

It surely must be a character set and/or encoding problem. Nowadays all the different character sets like "Russian (ISO)" and "Russian (Windows)" should no longer be required when using Unicode. And when using Unicode, most messages will be encoded using UTF-8.

So:

  • Does changing the character set to Unicode help?
  • Does changing the encoding to UTF-8 help?
  • If not: can you post the source of the test message, after you received that? (Be careful to replace any email addresses with something like me@example.com before adding it to your question.)
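To make the difference between these character sets concrete, here is a small Python sketch. It assumes that "Russian (ISO)" means ISO-8859-5 and that "Russian (Windows)" means Windows-1251. That mapping is my reading of the option names, not something confirmed for David InfoCenter.

```python
word = "Тест"

# Assumed mapping from the option labels to Python codec names.
codecs_by_label = {
    "Russian (ISO)": "iso8859_5",
    "Russian (Windows)": "cp1251",
    "Unicode (UTF-8)": "utf-8",
}

# The same word becomes different bytes in each character set.
encoded = {
    label: " ".join(f"{b:02x}" for b in word.encode(codec))
    for label, codec in codecs_by_label.items()
}


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for label, hex_bytes in encoded.items():
    print(f"{label}: {hex_bytes}")

# German letters such as ü and ß are missing from the Russian single-byte sets.
print(can_encode("Grüßen", "cp1251"), can_encode("тест Grüßen", "utf-8"))
```

Only UTF-8 can carry Russian and German in the same message, which is why sending as Unicode is the first thing to try.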

All email clients have a different way to show the true source, so maybe using some online service might be the easiest way to explain how to see what is received:

  • Send a test message to some Mailinator account. No need to create an account: anything you put before @mailinator.com will work, but note that anyone who guesses that address can read the Inbox.
  • Go to its Inbox at mailinator.com
  • Click on the subject to open the message
  • While viewing the message, click the "(text view)" link:

Mailinator Inbox

  • This will show something like:

    Received: from [..] 
      by [..] 
      for <johndoe@mailinator.com>; Fri, 6 Nov 2009 11:58:10 +0100 (CET)
    Subject: =?utf-8?Q?Test_/_=D1=82=D0=B5=D1=81=D1=82?=
    From: Arjan <[..]>
    Content-Type: text/plain; charset=utf-8; format=flowed
    Message-Id: [..]
    Date: Fri, 6 Nov 2009 11:58:08 +0100
    To: johndoe@mailinator.com
    Content-Transfer-Encoding: quoted-printable
    Mime-Version: 1.0 (Apple Message framework v1076)
    X-Mailer: Apple Mail (2.1076)
    X-Virus-Scanned: by XS4ALL Virus Scanner
    
    A test / =D1=82=D0=B5=D1=81=D1=82 for Super User.
    
    Gr=C3=BC=C3=9Fen!
    
    Arjan.=
    .
    

Above, some personal details have been removed: no need to show us your email address or server details (like IP addresses).

(For some reason Mailinator shows the UTF-8 encoded source of the message, "A test / тест for Super User. Grüßen!", as ASCII in the above screen capture. Seeing things like Ã¼ for ü and ÃŸ for ß is typical of UTF-8 encoded text that has not been decoded as UTF-8. Still, it converts the subject just fine. And the last dot is actually a leftover from the SMTP communication, and could have been removed by Mailinator.)
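The quoted-printable body lines in the source above can be decoded with Python's standard `quopri` module. This is a small sketch: the last step deliberately decodes the bytes as Windows-1252 to reproduce the kind of garbling described here.

```python
import quopri

# Two body lines exactly as they appear in the message source.
body_lines = [
    b"A test / =D1=82=D0=B5=D1=81=D1=82 for Super User.",
    b"Gr=C3=BC=C3=9Fen!",
]

# Undo quoted-printable first, then interpret the bytes as UTF-8.
decoded = [quopri.decodestring(line).decode("utf-8") for line in body_lines]

# The same bytes read as Windows-1252 instead of UTF-8.
garbled = quopri.decodestring(body_lines[1]).decode("cp1252")

print(decoded)
print(garbled)
```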

Arjan

Posted 2009-11-05T09:52:17.243

Reputation: 29 084

Changing to Unicode or UTF-8 doesn't solve the problem. – waszkiewicz – 2009-11-05T11:48:14.467

Ok, can you show some source then? – Arjan – 2009-11-05T11:49:09.050