Wednesday, April 24, 2013

Non-ASCII characters through MIME

Hi guys

Recently I ran into an issue with Danish characters in HTML that was created through MIME.
Below is a simple example that reproduces it:

Dim s As New NotesSession
Dim doc As NotesDocument
Dim mime As NotesMIMEEntity
Dim stream As NotesStream

' Do not convert MIME to rich text while the document is being built
s.ConvertMime = False

Set doc = s.CurrentDatabase.CreateDocument()
Set stream = s.CreateStream()
Call stream.WriteText({ØØØØØØØØ})

Set mime = doc.CreateMIMEEntity("Body")
' No charset and ENC_NONE: this is the call that produces the question marks
Call mime.SetContentFromText(stream, "text/html", ENC_NONE)

Call doc.Save(True, False)

' Restore the default conversion behaviour
s.ConvertMime = True

When I opened the document using a form with a single rich text item "Body", I saw question marks (?) instead of the Danish character Ø.

I tried many things: defining a charset, setting different encoding types, and so on.
Finally I found a solution.
The interesting part is that it is clearly described in the standard help database.
Wise people always told me: "RTFM first!" :-)
So, yes, they were right.
Here is a snippet from the description of the SetContentFromText method:


If the NotesStream input is the result of NotesStream.WriteText, translation of the internal Unicode defaults to US-ASCII. To translate characters other than US-ASCII, append a charset parameter such as "charset=UTF-8" or "charset=Unicode-1-1" to the type/subtype.
If Content-Type specifies "text" and the "charset" parameter specifies a known Internet encoding, and encoding is ENC_IDENTITY_8BIT or ENC_IDENTITY_BINARY, content storage is with the specified character set. Otherwise, content storage is attempted with US-ASCII.


So the key is to use both "charset=UTF-8" and ENC_IDENTITY_8BIT.
Neither one worked on its own.
The actual fix is the following:

Call mime.SetContentFromText(stream, "text/plain;charset=UTF-8", ENC_IDENTITY_8BIT)
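
For reference, here is a sketch of the whole example with that one line swapped in. It is the same script as above, with only the content type and encoding arguments changed. It assumes the same setup as before (a current database, and a form with a single rich text item "Body").

```lotusscript
Dim s As New NotesSession
Dim doc As NotesDocument
Dim mime As NotesMIMEEntity
Dim stream As NotesStream

' Do not convert MIME to rich text while the document is being built
s.ConvertMime = False

Set doc = s.CurrentDatabase.CreateDocument()
Set stream = s.CreateStream()
Call stream.WriteText({ØØØØØØØØ})

Set mime = doc.CreateMIMEEntity("Body")
' Both parts are needed: the charset parameter on the content type
' and the 8-bit identity encoding
Call mime.SetContentFromText(stream, "text/plain;charset=UTF-8", ENC_IDENTITY_8BIT)

Call doc.Save(True, False)

' Restore the default conversion behaviour
s.ConvertMime = True
```

Note that the fix uses "text/plain", while the failing example above used "text/html".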

1 comment:

  1. Hi
    Does it solve the same problem when working with HTML content? I tried to insert 'text/html' instead of 'text/plain' in SetContentFromText as follows:

    Call mime.SetContentFromText(stream, "text/html;charset=UTF-8", ENC_IDENTITY_8BIT)

    It does not solve the problem with Danish characters.

    Thanks in advance
