"Use of Win32" demonstrations
Unicode and language tests


   Introduction
   The four application types
   Using codepages and DrawTextA
   Displaying Unicode with TextOutW
   Displaying Unicode with ExtTextOutW
   Using the Microsoft Layer for Unicode
   Testbug's unicows.dll test

The mslu loader code is Copyright Jeremy Gordon.

Introduction
This help file touches on two inter-related subjects:-

The main problem is that under W9x/ME most of the Unicode APIs are not available. Because of this, programmers have needed a work-around to write applications which run on both W9x/ME and NT/2000/XP, and to display non-Roman characters (typically non-"Western" characters) under W9x/ME. Here I briefly examine the problems and some practical solutions.

For more information see my article "Writing Unicode programs" at www.GoDevTool.com.

The four application types

Using codepages and DrawTextA
DrawTextA is available on both W9x/ME and on NT/2000/XP machines. It can be used to display non-Roman characters if those characters are in a character set (codepage) which is installed on the machine. The character set is a property of the font which is selected into the device context. So you need to obtain a font handle for the font and character set you want to use, and then select that font handle into the device context used by DrawTextA. The Testbug listing below calls the extended form, DrawTextExA, which takes one extra parameter but chooses its characters in the same way.
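To illustrate why the chosen character set matters, here is a minimal sketch in C. It is a model only, not Testbug code: byte_to_unicode is a hypothetical helper, and it covers only bytes C0h to FFh under two character sets (ANSI_CHARSET=0, assumed here to mean codepage 1252, and RUSSIAN_CHARSET=204, codepage 1251).

```c
#include <stdint.h>

/* Character set identifiers as stored in LOGFONT.lfCharSet. */
enum { CHARSET_ANSI = 0, CHARSET_RUSSIAN = 204 };

/* Hypothetical helper (not a Windows API): returns the Unicode code point
   that a byte in the range C0h..FFh stands for under the given character
   set, or 0 when this sketch does not model the combination. */
uint32_t byte_to_unicode(uint8_t byte, int charset)
{
    if (byte < 0xC0)
        return 0;                                  /* range not modelled here */
    switch (charset) {
    case CHARSET_ANSI:                             /* codepage 1252 (assumed) */
        return byte;                               /* C0h..FFh = U+00C0..U+00FF */
    case CHARSET_RUSSIAN:                          /* codepage 1251 */
        return 0x0410u + (uint32_t)(byte - 0xC0);  /* C0h..FFh = U+0410..U+044F */
    default:
        return 0;                                  /* other sets not modelled */
    }
}
```

So the byte D0h, which appears in the test string used by Testbug, stands for U+00D0 under the ANSI character set but for U+0420 (a Cyrillic letter) under the Russian one. The byte is the same; only the font's character set changes what is drawn.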

Note that DrawTextW (which would accept a Unicode string) is not available in W9x/ME unless the Microsoft Layer for Unicode (mslu) is also used.

In this demonstration, if no writeable character is available in the chosen character set for the value given to DrawTextA, Windows will draw a default character, usually a question mark, box or line.
You may find that you get only a line of default characters for some of the character sets. That is probably because the chosen character set is not installed on the machine you are using.

To watch this demonstration in the debugger set the breakpoint to ANSILANG_TEST or ANSILANGPAINT. When coming into ANSILANG_TEST the value in AL will depend on the menu item chosen.
Single-step in ANSILANGPAINT (which is called on the WM_PAINT message) and see the font handle being obtained and selected into the device context before the draw.
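The Testbug listing fills in LOGFONT through raw offsets (17h for the character set, 1Ch for the typeface name). As a cross-check, here is a sketch in C which mirrors the 32-bit ANSI LOGFONT layout with fixed-width types. The names LogFontMirror and prepare_logfont are made up for this illustration; the field names follow the Windows structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the 32-bit ANSI LOGFONT layout (illustrative type name). */
typedef struct {
    int32_t lfHeight;
    int32_t lfWidth;
    int32_t lfEscapement;
    int32_t lfOrientation;
    int32_t lfWeight;
    uint8_t lfItalic;
    uint8_t lfUnderline;
    uint8_t lfStrikeOut;
    uint8_t lfCharSet;
    uint8_t lfOutPrecision;
    uint8_t lfClipPrecision;
    uint8_t lfQuality;
    uint8_t lfPitchAndFamily;
    char    lfFaceName[32];                /* LF_FACESIZE=32 */
} LogFontMirror;

/* The offsets used in the assembler listing, checked at compile time. */
_Static_assert(offsetof(LogFontMirror, lfCharSet) == 0x17, "charset offset");
_Static_assert(offsetof(LogFontMirror, lfFaceName) == 0x1C, "typeface offset");
_Static_assert(sizeof(LogFontMirror) == 60, "7 dwords + 32 bytes");

/* Same steps as GET_ANSILANGFONT before it calls CreateFontIndirectA. */
void prepare_logfont(LogFontMirror *lf, uint8_t charset)
{
    memset(lf, 0, 7 * sizeof(int32_t));    /* clear the first 7 dwords */
    lf->lfHeight = 16;                     /* font height of 16 */
    lf->lfCharSet = charset;               /* the byte at offset 17h */
    lf->lfFaceName[0] = '\0';              /* empty name: any typeface will do */
}
```

Calling prepare_logfont(&lf, 204) leaves the structure as the assembler code does for the Russian character set.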

Here is the code which is used in Testbug:-

DATA
;
WNDCLASS DD 10D DUP 0
LOGFONT     DD 7 DUP 0       ;pitch and widths etc.
            DB 32 DUP 0      ;typeface LF_FACESIZE=32 in ansi version
PAINTSTRUCT DD 16D DUP 0
CLIENT_RECT DD 4 DUP 0       ;structure to hold client area size inf
;******************** Window message table 
ANSILANGMESSAGES DD (ENDOF_ANSILANGMESSAGES-$-4)/8  ;=number to be done
         DD  1h,ANSILANGCREATE,2h,ANSILANGDESTROY,0Fh,ANSILANGPAINT
ENDOF_ANSILANGMESSAGES:   ;label used to work out how many messages
;*****************************************
ANSILANG_MESS   DB 'Here is a string of random characters between 128 and 255:-',0Dh,0Ah
                DB 0F8h,0DBh,0D0h,0B4h,0A0h,0FEh,0ACh,0BAh,0EEh,0EBh,0A4h,0A5h
                DB 0F9h,0DCh,0D1h,0B5h,0A1h,0FFh,0ADh,0BBh,0EFh,0ECh,0A5h,0A6h,0
PROPOSED_CHARSET DB 0
;********************
CODE
;
GET_ANSILANGFONT:
PUSH EDI                ;edi is already being used
MOV EDI,ADDR LOGFONT
PUSH EDI                ;push that for CreateFontIndirectA
MOV EDX,EDI             ;and save in edx
XOR EAX,EAX
MOV ECX,7
REP STOSD               ;clear main parts of LOGFONT
MOV D[EDX],16           ;use font height of 16
MOV AL,[PROPOSED_CHARSET]
MOV B[EDX+17h],AL       ;insert in LOGFONT
MOV B[EDX+1Ch],0        ;any typeface will do
CALL CreateFontIndirectA
MOV ESI,EAX             ;keep handle in esi, if found
POP EDI
RET
;
ANSILANGCREATE:         ;one of the few messages dealt with by this prog
XOR EAX,EAX             ;return zero to make window
RET
;
ANSILANGDESTROY:        ;one of the few messages dealt with by this prog
PUSH 0
CALL PostQuitMessage    ;exit via the message loop
STC                     ;go to DefWindowProc too
RET
;
ANSILANGPAINT:
PUSH ADDR PAINTSTRUCT,[EBP+8h]     ;EBP+8h=hwnd
CALL BeginPaint         ;get device context to use, initialise paint
MOV EDI,EAX             ;keep device context in edi
;*******************
MOV EBX,ADDR CLIENT_RECT
PUSH EBX,[EBP+8h]       ;EBP+8h=hwnd
CALL GetClientRect      ;get client area of window in CLIENT_RECT
;******************* centre the text ..
MOV EAX,[EBX+8h]        ;get available width
SHR EAX,2               ;get a quarter of it
MOV [EBX],EAX           ;give that to x-pos
SUB [EBX+8h],EAX        ;shorten rectangle on right hand side
MOV EAX,[EBX+0Ch]       ;get available height
SHR EAX,2               ;get a quarter of it
MOV [EBX+4h],EAX        ;give that to y-pos
SUB [EBX+0Ch],EAX       ;reduce height of rectangle
;******************* set the font and colour
PUSH 0CC3333h,EDI       ;colour, device context
CALL SetTextColor
CALL GET_ANSILANGFONT   ;get, in esi, the font to try
;****** in fact CreateFontIndirect always returns one font or another
PUSH ESI,EDI            ;esi=handle to font, edi=device context
CALL SelectObject       ;select the font into the device context
;*************** push now to deselect later 
PUSH EAX,EDI            ;eax=previous font handle, edi=device context
;******************************* 
PUSH 0                  ;no formatting options
PUSH 1h+10h+100h        ;DT_CENTER+DT_WORDBREAK+DT_NOCLIP
PUSH EBX                ;formatting rectangle
PUSH -1                 ;Windows to calculate length of message
PUSH ADDR ANSILANG_MESS ;address of message
PUSH EDI                ;device context
CALL DrawTextExA
;********************************
CALL SelectObject       ;deselect font handle in dc
PUSH ESI
CALL DeleteObject       ;delete the font (handle in esi)
;*******************
PUSH ADDR PAINTSTRUCT,[EBP+8h]      ;EBP+8h=hwnd
CALL EndPaint
XOR EAX,EAX
RET
;
AnsiLangWndProc:
MOV EDX,ADDR ANSILANGMESSAGES   ;give edx the list of messages to deal with
CALL GENERAL_WNDPROC    ;call the generic message handler
RET 10h                 ;restore the stack as required by caller
;
AT44:                   ;esi: 0=enable, 1=disable
MOV EBX,970
L0:
PUSH ESI,EBX,[hMenu]    
CALL EnableMenuItem     ;enable/disable the correct menu items
INC EBX
CMP EBX,983
JNZ L0
RET
;
ANSILANG_TEST:
MOV [PROPOSED_CHARSET],AL   ;keep character set to try
XOR ESI,ESI             ;0=enable, 1=disable
INC ESI
CALL AT44               ;enable/disable the correct menu items
CALL INITIALISE_WNDCLASS    ;get ready to register window class
;********** now add things specific to the window to be made
MOV D[EBX],1h+2h+20h    ;CS_VREDRAW+CS_HREDRAW+CS_OWNDC (window class style)
MOV D[EBX+4],ADDR AnsiLangWndProc         ;window procedure
MOV D[EBX+24h],ADDR ANSILANG_CLASSNAME    ;window class name
PUSH EBX                ;address of structure with window class data
CALL RegisterClassA     ;register the window class
PUSH 0,[hInst],0,0      ;owner=desktop
PUSH 200D               ;height
PUSH 320D               ;width
PUSH 50D,50D            ;position y then x
PUSH 90000000h +0C00000h+40000h +80000h +20000h     +10000h      ;window style
;(POPUP+VISIBLE)+CAPTION+SIZEBOX+SYSMENU+MINIMIZEBOX+MAXIMIZEBOX
PUSH 'ANSI language test'      ;window title
PUSH ADDR ANSILANG_CLASSNAME   ;window class name
PUSH 0                  ;extended window style
CALL CreateWindowExA    ;make window, returning handle in EAX
;************************ now enter the main message loop (ANSI version)
L1:
PUSH 0,0,0
PUSH ADDR MSG
CALL GetMessageA        ;wait for message from Windows
OR EAX,EAX              ;see if it is WM_QUIT
JZ >L2                  ;yes
PUSH ADDR MSG
CALL TranslateMessage   ;no so convert message to character if necessary
PUSH ADDR MSG
CALL DispatchMessageA   ;and send the message to the window procedure
JMP L1                  ;after message dealt with, loop back for next one
L2:
PUSH [hInst],ADDR ANSILANG_CLASSNAME    ;message was WM_QUIT
CALL UnregisterClassA   ;ensure class is removed
XOR ESI,ESI             ;0=enable, 1=disable
CALL AT44               ;enable/disable the correct menu items
RET                     ;return to caller
;
INITIALISE_WNDCLASS:    ;get ready for all windows
MOV EBX,ADDR WNDCLASS
MOV EAX,9
L30:
MOV D[EBX+EAX*4],0      ;fill it with zeroes (may have been filled with other)
DEC EAX
JNS L30
MOV EAX,[hInst]         ;obtained in another part of the prog
MOV [EBX+10h],EAX
MOV EAX,[hIcon]         ;obtained in another part of the prog
MOV [EBX+14h],EAX
MOV EAX,[hCurs]         ;obtained in another part of the prog
MOV [EBX+18h],EAX       ;and give to WNDCLASS
MOV D[EBX+1Ch],6D       ;COLOR_WINDOW+1
RET
;
GENERAL_WNDPROC:        ;eax can be used to convey information to the call
PUSH EBP                ;use ebp to avoid using eax which may hold information
MOV EBP,[ESP+10h]       ;uMsg
MOV ECX,[EDX]           ;get number of messages in the table
ADD EDX,4               ;jump over size dword
L33:
DEC ECX
JS >L46
CMP [EDX+ECX*8],EBP     ;see if its the correct message
JNZ L33                 ;no
MOV EBP,ESP
PUSH ESP,EBX,EDI,ESI    ;save registers as required by Windows
ADD EBP,4               ;allow for the extra call to here
;now [EBP+8]=hWnd, [EBP+0Ch]=uMsg, [EBP+10h]=wParam, [EBP+14h]=lParam,
CALL [EDX+ECX*8+4]      ;call the correct procedure for the message
POP ESI,EDI,EBX,ESP
JNC >L48                ;nc=return value in eax - don't call DefWindowProc
L46:
PUSH [ESP+18h],[ESP+18h],[ESP+18h],[ESP+18h]     ;allowing for change of ESP
CALL DefWindowProcA
L48:
POP EBP
RET
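The table-driven window procedure above may be easier to follow in a higher-level form. Here is a sketch in C of the same idea, under some assumptions: the names MessageEntry, dispatch, default_proc and demo_table are made up, default_proc is only a stand-in for DefWindowProcA, and a bool return value plays the part of the carry flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One table entry: a message number and its handler. The handler stores its
   return value in *result and returns true to ask for default processing
   as well (the role played by the carry flag in the assembler version). */
typedef struct {
    uint32_t message;
    bool (*handler)(long *result);
} MessageEntry;

int default_calls = 0;                 /* counts calls to the stand-in below */

/* Stand-in for DefWindowProcA, for illustration only. */
long default_proc(uint32_t message)
{
    (void)message;
    default_calls++;
    return -1;
}

bool on_create(long *result)  { *result = 0; return false; }  /* fully handled */
bool on_destroy(long *result) { *result = 0; return true; }   /* default too */

const MessageEntry demo_table[] = {
    { 0x01, on_create  },              /* WM_CREATE */
    { 0x02, on_destroy },              /* WM_DESTROY */
};

long dispatch(const MessageEntry *table, size_t count, uint32_t message)
{
    for (size_t i = count; i-- > 0; ) {        /* scan from the last entry down */
        if (table[i].message == message) {
            long result = 0;
            if (!table[i].handler(&result))
                return result;                 /* handler supplied the value */
            break;                             /* handler wants default too */
        }
    }
    return default_proc(message);              /* unknown message, or default asked for */
}
```

With this table, dispatch(demo_table, 2, 0x01) returns the handler's own value, while message 2 and any message not in the table end up in default_proc.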

Displaying Unicode with TextOutW
TextOutW is available on both W9x/ME and on NT/2000/XP. It accepts Unicode strings as input for display. It is therefore useful when making applications which can run on both platforms, since the source code does not have to be different.
However, if you want to display non-Roman characters, W9x/ME requires some extra work to make sure that the correct font is loaded. This is because an API can only display characters which are available within the font it is using. With XP this is all easier since XP is very smart at finding the correct font to use to display non-Roman characters. With W9x/ME, as with DrawTextA, it is necessary to select a font containing the correct codepage/character set into the device context. Once this is done, the characters in the codepage are mapped to the Unicode characters correctly.
In the code below, GetVersionExA discovers which platform is being used, and GET_ANSILANGFONT is called if necessary (the code for this is given above). Note how in XP the size of string given to TextOutW must include the null terminator. This seems to assist Windows to find the correct font automatically.
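The platform test relies on two numbers in the Testbug code: the structure size 148 and the offset 10h of the platform id. Here is a sketch in C which checks both against a mirror of the ANSI OSVERSIONINFO layout and makes the same decision. OsVersionInfoMirror and needs_charset_font are names made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the ANSI OSVERSIONINFO layout (illustrative type name). */
typedef struct {
    uint32_t dwOSVersionInfoSize;
    uint32_t dwMajorVersion;
    uint32_t dwMinorVersion;
    uint32_t dwBuildNumber;
    uint32_t dwPlatformId;
    char     szCSDVersion[128];
} OsVersionInfoMirror;

_Static_assert(sizeof(OsVersionInfoMirror) == 148, "structure size");
_Static_assert(offsetof(OsVersionInfoMirror, dwPlatformId) == 0x10, "platform id");

/* Platform id values reported by GetVersionEx. */
enum { PLATFORM_WIN32_WINDOWS = 1, PLATFORM_WIN32_NT = 2 };

/* Same test as CMP D[ESI+10h],1 followed by JA: anything above 1 is
   NT/2000/XP and skips the special font handling. */
bool needs_charset_font(const OsVersionInfoMirror *info)
{
    return info->dwPlatformId <= PLATFORM_WIN32_WINDOWS;
}
```

In the real program the structure is filled by GetVersionExA after its first dword has been set to 148; this sketch only models the decision made afterwards.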

To see this test in the debugger use the breakpoint DRAW_UNICODE_TEXTOUT, then single-step through the code. Here is the source code used in this test:-

DATA SECTION
;
UNICODE_STRING DUS 'Cheers! in Russian: За здоровье!',0
;
CODE SECTION
;
UNICODE_TEXTOUT:
PUSH 0,[hInst],951,[hWnd]   ;parent=main window, id=951
PUSH 100,260            ;size (height, width)
PUSH 50,50              ;position
PUSH 50002C01h          ;(CHILD+VISIBLE)+1=default push button
PUSH 'Click me to see text drawn using TextOutW'     ;text
PUSH 'BUTTON'           ;button class
PUSH 0                  ;extended window style
CALL CreateWindowExA    ;make button window, returning handle in EAX
RET
;
DRAW_UNICODE_TEXTOUT2:
;******************************* 
PUSH ADDR UNICODE_STRING
CALL lstrlenW           ;get length of unicode string in chars
MOV EBX,EAX
INC EBX                 ;make sure string includes the terminating null
;************** ebx now holds number of characters in the string
;************** now arrange for the text to be centered
;************** by getting size of string in SIZE structure
PUSH ADDR SIZE,EBX,ADDR UNICODE_STRING,EDI
CALL GetTextExtentPoint32W     ;an api available on all platforms
;**** first the x-pos
MOV EAX,[RECT+8h]       ;get width of draw rectangle less 4
ADD EAX,4               ;allow for border removed earlier
SUB EAX,[SIZE]          ;less width of the characters
SHR EAX,1               ;halve result for correct x-pos to centre the text
;**** and now the y-pos
MOV EDX,[RECT+0Ch]      ;get bottom of draw rectangle less 4
ADD EDX,4               ;allow for border removed earlier
SUB EDX,[SIZE+4h]       ;less height of the characters
SHR EDX,1               ;halve result for correct y-pos to centre the text
;**************
PUSH EBX                ;number of characters in string
PUSH ADDR UNICODE_STRING    ;address of string
PUSH EDX                ;y coord
PUSH EAX                ;x coord
PUSH EDI                ;handle to device context
CALL TextOutW           ;write string to screen
;**************
RET
;
DRAW_UNICODE_TEXTOUT:
PUSH [EBP+14h]          ;handle to button
CALL GetDC
MOV EDI,EAX             ;save handle of device context
;*************** rub out existing text
MOV EBX,ADDR RECT
PUSH EBX,[EBP+14h]      ;handle to button
CALL GetClientRect      ;get in RECT the client area of pane
MOV EDX,4
ADD [EBX],EDX           ;left allow for border
ADD [EBX+4h],EDX        ;top allow for border
SUB [EBX+8h],EDX        ;right allow for border
SUB [EBX+0Ch],EDX       ;bottom allow for border
;*************** get ready to draw new text
PUSH 15D                ;COLOR_3DFACE
CALL GetSysColorBrush   ;get brush of face colour of 3 dimension objects
PUSH EAX,ADDR RECT,EDI
CALL FillRect           ;fill child window with background colour (brush)
PUSH 1,EDI              ;transparent
CALL SetBkMode
;*************** W9x and ME need special attention here
MOV ESI,ADDR BUFFER
MOV D[ESI],148          ;OSVERSIONINFO structure size
PUSH ESI
CALL GetVersionExA
CMP D[ESI+10h],1        ;see from platform id, if NT,2000,XP and above
JA >L10                 ;yes, so no need to get font correct
;*************** W9x and ME version - ensure font with Cyrillic codepage is used
MOV B[PROPOSED_CHARSET],204   ;Russian
CALL GET_ANSILANGFONT   ;get, in esi, the font to use
PUSH ESI,EDI            ;esi=handle to font, edi=device context
CALL SelectObject       ;select the font into the device context
;*************** push now to deselect later 
PUSH EAX,EDI            ;eax=previous font handle, edi=device context
CALL DRAW_UNICODE_TEXTOUT2
CALL SelectObject       ;deselect font handle in dc
PUSH ESI
CALL DeleteObject       ;delete the font (handle in esi)
JMP >L12
;*************** NT,2000,XP and above do not need special font handling
L10:
CALL DRAW_UNICODE_TEXTOUT2
L12:
;*****************************************************************************
PUSH EDI,[EBP+14h]
CALL ReleaseDC
PUSH 4000
CALL Sleep
RET
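The centring arithmetic in DRAW_UNICODE_TEXTOUT2 can be summarised as follows. This is a sketch in C with made-up names (RectMirror, SizeMirror, centre_text); it assumes the rectangle has already been reduced by the border on every side, as the Testbug code does with a border of 4. Unlike the SHR instruction in the listing, it uses signed division, so text wider than the button gives a negative position instead of a very large one.

```c
#include <stdint.h>

typedef struct { int32_t left, top, right, bottom; } RectMirror;  /* as RECT */
typedef struct { int32_t cx, cy; } SizeMirror;                    /* as SIZE */

/* inset:  client rectangle already reduced by `border` on every side
   text:   extent of the string, as given by GetTextExtentPoint32W
   x, y:   receive the position at which to draw the string */
void centre_text(const RectMirror *inset, const SizeMirror *text,
                 int32_t border, int32_t *x, int32_t *y)
{
    *x = (inset->right  + border - text->cx) / 2;   /* (full width - text width)/2 */
    *y = (inset->bottom + border - text->cy) / 2;   /* (full height - text height)/2 */
}
```

For the 260 by 100 button made by UNICODE_TEXTOUT and a string measuring 100 by 16, this gives x=80 and y=42.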

Displaying Unicode with ExtTextOutW
ExtTextOutW is also available on both W9x/ME and on NT/2000/XP. The following code is similar to the code for TextOutW except that we can allow ExtTextOutW to fill-in the background for us.

To see this test in the debugger use the breakpoint DRAW_UNICODE_EXTTEXTOUT, then single-step through the code. Here is the source code used in this test:-

DRAW_UNICODE_EXTTEXTOUT2:
PUSH ADDR UNICODE_STRING
CALL lstrlenW           ;get length of unicode string in chars
MOV EBX,EAX
INC EBX                 ;make sure string includes the terminating null
;************** ebx now holds number of characters in the string
;************** now arrange for the text to be centered
;************** by getting size of string in SIZE structure
PUSH ADDR SIZE,EBX,ADDR UNICODE_STRING,EDI
CALL GetTextExtentPoint32W     ;an api available on all platforms
;**** first the x-pos
MOV EAX,[RECT+8h]       ;get width of draw rectangle less 4
ADD EAX,4               ;allow for border removed earlier
SUB EAX,[SIZE]          ;less width of the characters
SHR EAX,1               ;halve result for correct x-pos to centre the text
;**** and now the y-pos
MOV EDX,[RECT+0Ch]      ;get bottom of draw rectangle less 4
ADD EDX,4               ;allow for border removed earlier
SUB EDX,[SIZE+4h]       ;less height of the characters
SHR EDX,1               ;halve result for correct y-pos to centre the text
;**************
PUSH 0
PUSH EBX                ;number of characters in string
PUSH ADDR UNICODE_STRING    ;address of string
PUSH ADDR RECT          ;clipping rectangle
PUSH 2                  ;options ETO_OPAQUE=2
PUSH EDX                ;y coord
PUSH EAX                ;x coord
PUSH EDI                ;handle to device context
CALL ExtTextOutW        ;write string to screen
RET
;
DRAW_UNICODE_EXTTEXTOUT:
PUSH [EBP+14h]          ;handle to button
CALL GetDC
MOV EDI,EAX             ;save handle of device context
;*************** rub out existing text
MOV EBX,ADDR RECT
PUSH EBX,[EBP+14h]      ;handle to button
CALL GetClientRect      ;get in RECT the client area of pane
MOV EDX,4
ADD [EBX],EDX           ;left allow for border
ADD [EBX+4h],EDX        ;top allow for border
SUB [EBX+8h],EDX        ;right allow for border
SUB [EBX+0Ch],EDX       ;bottom allow for border
;************** set up background colour
PUSH 15D                ;COLOR_3DFACE
CALL GetSysColor        ;get face colour of 3 dimension objects
PUSH EAX,EDI
CALL SetBkColor         ;insert new correct background colour in case writing opaque
;*************** W9x and ME need special attention here
MOV ESI,ADDR BUFFER
MOV D[ESI],148          ;OSVERSIONINFO structure size
PUSH ESI
CALL GetVersionExA
CMP D[ESI+10h],1        ;see from platform id, if NT,2000,XP and above
JA >L20                 ;yes, so no need to get font correct
;*************** W9x and ME version - ensure font with Cyrillic codepage is used
MOV B[PROPOSED_CHARSET],204   ;Russian
CALL GET_ANSILANGFONT   ;get, in esi, the font to use
PUSH ESI,EDI            ;esi=handle to font, edi=device context
CALL SelectObject       ;select the font into the device context
;*************** push now to deselect later 
PUSH EAX,EDI            ;eax=previous font handle, edi=device context
CALL DRAW_UNICODE_EXTTEXTOUT2
CALL SelectObject       ;deselect font handle in dc
PUSH ESI
CALL DeleteObject       ;delete the font (handle in esi)
JMP >L22
;*************** NT,2000,XP and above do not need special font handling
L20:
CALL DRAW_UNICODE_EXTTEXTOUT2
L22:
;*****************************************************************************
PUSH EDI,[EBP+14h]
CALL ReleaseDC
PUSH 4000               ;4 second delay
CALL Sleep
RET

Using the Microsoft Layer for Unicode
The Microsoft Layer for Unicode ("mslu") is one way to make just one version of your application which will run both under Windows 9x/ME and under NT/2000/XP. The mslu is contained in a Dll called "unicows.dll". This is redistributable, so the intention is that you will ship this with your executable for placement in the same folder as your executable.
Basically unicows.dll acts as a wrapper around the ANSI APIs so that they can be called as Unicode APIs under Windows 9x/ME. This helps reduce the complexity of your source code, since you can call the Unicode APIs from your application directly. For example, using mslu you can call CreateWindowExW under W9x/ME, passing Unicode strings. The strings will be converted by unicows.dll to ANSI strings, and then unicows.dll will call CreateWindowExA instead.
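To give an idea of what converting a Unicode string to an ANSI string involves, here is a small sketch in C. It is a model only and makes no claim about how unicows.dll does the work internally. The function wide_to_cp1251 is hypothetical; it handles just ASCII and the Cyrillic range U+0410 to U+044F of codepage 1251, and substitutes a question mark for anything else, much as Windows draws a default character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter: UTF-16 string to codepage 1251 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t wide_to_cp1251(const uint16_t *wide, unsigned char *out, size_t out_size)
{
    size_t n = 0;
    if (out_size == 0)
        return 0;                                      /* no room even for the null */
    while (*wide != 0 && n + 1 < out_size) {
        uint16_t ch = *wide++;
        if (ch < 0x80)
            out[n] = (unsigned char)ch;                /* ASCII is unchanged */
        else if (ch >= 0x0410 && ch <= 0x044F)
            out[n] = (unsigned char)(0xC0 + (ch - 0x0410));  /* Cyrillic letters */
        else
            out[n] = '?';                              /* not in this codepage */
        n++;
    }
    out[n] = '\0';                                     /* always terminate */
    return n;
}
```

For example, the two Cyrillic letters U+0417 and U+0430 become the bytes C7h and E0h, which is how an ANSI API expects to receive them when a Russian character set is in use.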
Under Windows NT/2000/XP, if your application is properly linked, unicows.dll will not get involved at all and will not even be loaded into memory. A call to CreateWindowExW will be a direct call to that API.
To achieve the above, at link-time special code is added to the application for execution at load-time. When the application loads, this code identifies the platform and if it is W9x/ME, it will divert all relevant API calls to unicows.dll.
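The diversion can be pictured as patching a function pointer once at load time. Here is a sketch in C which simulates this. Everything in it is a model: g_platform, the system_ and layer_ functions, message_box_w and loader_model are made-up names, and the real loader works on the import table of the executable. The error value 120 is ERROR_CALL_NOT_IMPLEMENTED.

```c
#include <stdint.h>

enum { PLATFORM_WIN32_WINDOWS = 1, PLATFORM_WIN32_NT = 2 };
enum { ERROR_CALL_NOT_IMPLEMENTED = 120 };

unsigned g_platform = PLATFORM_WIN32_NT;   /* simulated platform id */
unsigned g_last_error = 0;                 /* simulated last-error value */

typedef int (*MessageBoxFn)(const uint16_t *text);

/* Simulated system export: works on NT, is only a stub on W9x/ME. */
int system_message_box_w(const uint16_t *text)
{
    (void)text;
    if (g_platform == PLATFORM_WIN32_NT)
        return 1;                          /* message box shown */
    g_last_error = ERROR_CALL_NOT_IMPLEMENTED;
    return 0;                              /* nothing shown */
}

/* Simulated ANSI export: works on every platform. */
int system_message_box_a(const char *text)
{
    (void)text;
    return 1;
}

/* Simulated layer entry: would convert the string, then call the A form. */
int layer_message_box_w(const uint16_t *text)
{
    (void)text;                            /* conversion left out of the model */
    return system_message_box_a("converted text");
}

/* The slot through which the application calls the API. */
MessageBoxFn message_box_w = system_message_box_w;

/* Model of the loader: on W9x/ME divert the slot to the layer,
   on NT/2000/XP leave it pointing at the system. */
void loader_model(void)
{
    if (g_platform <= PLATFORM_WIN32_WINDOWS)
        message_box_w = layer_message_box_w;
}
```

With g_platform set to 1, a call through message_box_w fails until loader_model has run and succeeds afterwards. With g_platform set to 2 the loader changes nothing and the call goes directly to the system.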

The way Microsoft gets this loader code into your application is via unicows.lib at link-time. Unfortunately not all tools support this and applications written with some tools will still have to call unicows.dll even when running under NT/2000/XP. In this case the switching is done inside unicows.dll itself. Unicows.dll knows the version of Windows being used and will simply pass on the call to the appropriate system Dll. The disadvantage of this method is that when unicows.dll loads, it also loads a number of other Dlls on which it relies. This, and the extra switching involved, slows down the application.

Using GoLink you can add the mslu loader code very simply. Just add the /mslu switch to the GoLink command line or file. GoLink does not use unicows.lib but has its own code instead. Since GoLink is written wholly in assembler, this code works fast and is small, at less than 800 bytes plus a dword for each API within unicows.dll which has to be dealt with. This therefore provides a simple way to add an mslu loader to your application.

Testbug's unicows.dll test
In this test we make two calls to the API MessageBoxIndirectW. The first call is made within Testbug.exe and the second within Testdll1.dll. Testbug.exe is linked normally (no mslu loader), whereas Testdll1.dll contains an mslu loader. This loader is called by the main executable (Testbug.exe) soon after it starts (see below for an explanation of this).
If you are running under Windows NT, 2000 or XP you will see a message box each time MessageBoxIndirectW is called. This is because a working version of the API exists on those platforms within User32.dll.
If you are running under Windows 9x or ME, only the second message box will appear. There is no mslu loader in Testbug.exe, so the first call to MessageBoxIndirectW goes straight to the W9x/ME version of User32.dll. In that version the API does exist, but it does nothing except fail with ERROR_CALL_NOT_IMPLEMENTED. The second call, however, is different. That is made from Testdll1.dll, and that Dll was linked to include the mslu loader. This time the call is diverted to unicows.dll first, where the strings are translated to ANSI, and then MessageBoxIndirectA is called.
To single-step through the Unicows test in Testbug set the breakpoint to UNICODE_TEST1. Here is the code you will see in Testbug.exe:-

UNICODE_TEST1:
MOV ESI,ADDR MB_PARAMS   ;structure for MessageBoxIndirectW
MOV D[ESI],40D           ;size of structure
MOV EAX,[hWnd]
MOV [ESI+4],EAX
MOV EAX,[hInst]
MOV [ESI+8],EAX
MOV EAX,ADDR UTMESS1     ;message strings
MOV [ESI+0Ch],EAX
MOV EAX,ADDR UTMESS2     ;message strings
MOV [ESI+10h],EAX
MOV D[ESI+14h],40h       ;information+ok button only
PUSH ESI
CALL MessageBoxIndirectW ;wait till ok pressed
PUSH ADDR MB_PARAMS      ;push on stack for next
JMP UNICODE_TEST1A
Then here is the code you will see in Testdll1.dll:-
UNICODE_TEST1A:
POP ESI                  ;get structure into ESI
MOV EAX,ADDR UTMESS1     ;message strings (in the dll)
MOV [ESI+0Ch],EAX
MOV EAX,ADDR UTMESS2     ;message strings (in the dll)
MOV [ESI+10h],EAX
PUSH ESI
CALL MessageBoxIndirectW ;wait till ok pressed
RET
The mslu loader in Dlls
Note that the mslu loader in Dlls must be called by the main executable soon after starting. This is needed because the loader code cannot be run from the Dll's entry point when the Dll attaches. Instead, GoLink creates a function within the Dll called MSLU_LOADER. Your application has the responsibility of calling this loader before using any function in the Dll which might require unicows.dll. The loader should only be called once. A convenient time to call MSLU_LOADER is just after your main Exe starts. Use the syntax CALL  DllName:MSLU_LOADER or just CALL  MSLU_LOADER. You can see this in Testbug.exe at start up, and you can single-step through the loader to see how it works under W9x/ME as compared with NT/2000/XP. Here is the code at the start of Testbug:-
START:
PUSH 0
CALL GetModuleHandleA
MOV [hInst],EAX
CALL InitCommonControls       ;ensure common control library is loaded
CALL MSLU_LOADER
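Testbug meets the "call once" rule simply by calling the loader at start up. If the call might be reached from more than one place, a guard is one way to keep to the rule. Here is a sketch in C with made-up names (call_loader_once, sample_loader, loader_runs); sample_loader only stands in for MSLU_LOADER, and the guard is not safe for use from several threads at once.

```c
#include <stdbool.h>

int loader_runs = 0;                       /* how often the stand-in has run */

/* Stand-in for the Dll's MSLU_LOADER, for illustration only. */
void sample_loader(void)
{
    loader_runs++;
}

/* Runs the loader on the first call and does nothing on later calls. */
void call_loader_once(void (*loader)(void))
{
    static bool done = false;              /* remembered between calls */
    if (!done) {
        done = true;
        loader();
    }
}
```

However many times call_loader_once(sample_loader) is called, the loader itself runs only once.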