Sun Microsystems, Inc.

TTY Environment Setup

Depending on the terminal and terminal emulator that you are using, you might need to push certain codeset-specific STREAMS modules onto your streams.

For more information on STREAMS modules and streams in general, see the STREAMS Programming Guide.

The following table shows STREAMS modules supported by the en_US.UTF-8 locale in the terminal environment.

Table 5-7 32-bit STREAMS Modules Supported by en_US.UTF-8

32-bit STREAMS Module       Description
/usr/kernel/strmod/u8lat1   Code conversion STREAMS module between UTF-8 and ISO8859-1 (Western European)
/usr/kernel/strmod/u8lat2   Code conversion STREAMS module between UTF-8 and ISO8859-2 (Eastern European)
/usr/kernel/strmod/u8koi8   Code conversion STREAMS module between UTF-8 and KOI8-R (Cyrillic)

The following table lists the 64-bit STREAMS modules supported by en_US.UTF-8.

Table 5-8 64-bit STREAMS Modules Supported by en_US.UTF-8

64-bit STREAMS Module               Description
/usr/kernel/strmod/sparcv9/u8lat1   Code conversion STREAMS module between UTF-8 and ISO8859-1 (Western European)
/usr/kernel/strmod/sparcv9/u8lat2   Code conversion STREAMS module between UTF-8 and ISO8859-2 (Eastern European)
/usr/kernel/strmod/sparcv9/u8koi8   Code conversion STREAMS module between UTF-8 and KOI8-R (Cyrillic)

Loading a STREAMS Module into the Kernel

To load a STREAMS module into the kernel, first become root.

To determine whether you are running a 64-bit Solaris or 32-bit Solaris system, use the isainfo(1) utility as follows:

system# isainfo -v
64-bit sparcv9 applications
32-bit sparc applications

If the command returns this information, you are running the 64-bit Solaris system. If you are running the 32-bit Solaris system, the utility shows the following:

system# isainfo -v
32-bit sparc applications
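
The check above can be scripted. The following is a minimal sketch, not part of the original guide: the function name strmod_path is hypothetical, and it assumes that the output of isainfo -v contains the string "64-bit" on a 64-bit SPARC system, as in the sample output above.

```shell
# Hypothetical helper: choose the STREAMS module path that matches the system.
# $1 = output of `isainfo -v`, $2 = module name (for example, u8lat1)
strmod_path() {
    case "$1" in
        *64-bit*) echo "/usr/kernel/strmod/sparcv9/$2" ;;
        *)        echo "/usr/kernel/strmod/$2" ;;
    esac
}

# Possible use on a Solaris system, as root:
#   modload "$(strmod_path "$(isainfo -v)" u8lat1)"
```

Passing the isainfo output as an argument keeps the path selection separate from the command that produces it, so the logic can be checked on any system.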

Use modinfo(1M) to be certain that your system has not already loaded the STREAMS module:

system# modinfo | grep  modulename

If the STREAMS module, such as u8lat1, is already loaded, the output looks as follows:

system# modinfo | grep u8lat1
89 ff798000  4b13  18   1  u8lat1 (UTF-8 <--> ISO 8859-1 module)

If the module is already loaded, you do not need to load it again. If the module has not yet been loaded, use modload(1M) as follows:

system# modload /usr/kernel/strmod/u8lat1

This command loads the 32-bit u8lat1 STREAMS module into the kernel so that you can push it onto a stream. If you are running the 64-bit Solaris product, use modload(1M) as follows:

system# modload /usr/kernel/strmod/sparcv9/u8lat1

The STREAMS module is now loaded into the kernel, and you can push it onto a stream.

To unload a module from the kernel, use modunload(1M), as shown below. In this example, the u8lat1 module is being unloaded.

system# modinfo | grep u8lat1
89 ff798000  4b13  18   1  u8lat1 (UTF-8 <--> ISO 8859-1 module)
system# modunload -i 89
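
The module ID passed to modunload -i is the first column of the modinfo output. As a rough sketch, the lookup can be automated as follows. The function name module_id is hypothetical, and the sketch assumes the column layout shown above, with the ID in field 1 and the module name in field 6.

```shell
# Hypothetical helper: print the module ID for a named module.
# Reads `modinfo` output on standard input; $1 = module name.
# Assumes the ID is field 1 and the module name is field 6.
# On Solaris, /usr/bin/awk may not accept -v; nawk or
# /usr/xpg4/bin/awk can be substituted.
module_id() {
    awk -v name="$1" '$6 == name { print $1 }'
}

# Possible use, as root:
#   id=$(modinfo | module_id u8lat1)
#   [ -n "$id" ] && modunload -i "$id"
```

Matching on the exact field value avoids the false positives that a plain grep can produce when one module name is a substring of another.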

dtterm and Terminals Capable of Input and Output of UTF-8 Characters

Unlike in previous releases of the Solaris operating environment, the dtterm(1) terminal emulator and any other terminals that support input and output of the UTF-8 codeset do not need any additional STREAMS modules in their stream. The ldterm(7M) module is now codeset independent and supports Unicode/UTF-8 as well.

To set up the proper terminal environment for the Unicode locales, use the stty(1) utility. To query the current settings, use the -a option of stty(1), as shown below:

system% /bin/stty -a

Note - Because /usr/ucb/stty is not internationalized, use /bin/stty instead.


Terminal Support for Latin-1, Latin-2, or KOI8-R

For terminals that support only Latin-1 (ISO8859-1), Latin-2 (ISO8859-2), or KOI8-R, you should have the following STREAMS configuration:

head <-> ttcompat <->  ldterm <->  u8lat1 <-> TTY

The configuration shown applies to terminals that support Latin-1. For Latin-2 terminals, replace the STREAMS module u8lat1 with u8lat2. For KOI8-R terminals, replace it with u8koi8.
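
The choice of module can be expressed as a small lookup. This is only an illustrative sketch: the function name codeset_module is hypothetical, and the codeset spellings are the ones used in the tables earlier in this section.

```shell
# Hypothetical helper: map a terminal codeset to its conversion module.
# $1 = terminal codeset (ISO8859-1, ISO8859-2, or KOI8-R)
codeset_module() {
    case "$1" in
        ISO8859-1) echo u8lat1 ;;
        ISO8859-2) echo u8lat2 ;;
        KOI8-R)    echo u8koi8 ;;
        *)         echo "no conversion module for codeset: $1" >&2
                   return 1 ;;
    esac
}
```

For example, codeset_module KOI8-R prints u8koi8, and an unlisted codeset produces an error message and a nonzero exit status.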

Make sure you already have the STREAMS module loaded into the kernel.

To set up the STREAMS configuration shown above, list the modules in a file and pass that file to strchg(1) with the -f option:

system% cat > /tmp/mystreams
ttcompat
ldterm
u8lat1
ptem
^D
system% strchg -f /tmp/mystreams

Be sure that you are either root or the owner of the device when you use strchg(1). To see the current configuration, use strconf(1) as follows:

system% strconf
ttcompat
ldterm
u8lat1
ptem
pts
system%

To reset the original configuration, set the STREAMS configuration as follows:

system% cat > /tmp/orgstreams
ttcompat
ldterm
ptem
^D
system% strchg -f /tmp/orgstreams
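
Both module lists differ only in the conversion module, so they can be generated rather than typed. The following is a minimal sketch under that assumption. The function name streams_list is hypothetical, and the list mirrors the pseudo-terminal examples above (ttcompat, ldterm, optional conversion module, ptem).

```shell
# Hypothetical helper: print a module list suitable for strchg -f.
# $1 = conversion module (u8lat1, u8lat2, or u8koi8);
#      omit it to print the original configuration.
streams_list() {
    echo ttcompat
    echo ldterm
    if [ -n "${1:-}" ]; then
        echo "$1"
    fi
    echo ptem
}

# Possible use:
#   streams_list u8lat2 > /tmp/mystreams && strchg -f /tmp/mystreams
```

Calling streams_list with no argument reproduces the original three-module list used to reset the stream.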

Saving the Settings in ~/.cshrc

Assuming the necessary STREAMS modules are already loaded into the kernel, you can save the following lines in your .cshrc file (C shell example) for convenience:

setenv LANG en_US.UTF-8
if ($?USER != 0 && $?prompt != 0) then
     cat >! /tmp/mystreams$$ << _EOF
     ttcompat
     ldterm
     u8lat1
     ptem
_EOF
     /bin/strchg -f /tmp/mystreams$$
     /bin/rm -f /tmp/mystreams$$
     /bin/stty cs8 -istrip defeucw
endif

With these lines in your .cshrc file, you do not have to type all of the commands each time you use the STREAMS module. Note that the second _EOF, which ends the here-document, must start in the first column of the file.

Code Conversions

Unicode locale support adds various code conversions among major codesets of many countries through iconv(1), iconv(3C), and sdtconvtool(1).

In the Solaris 9 environment, the geniconvtbl utility enables user-defined code conversions. Conversions created with geniconvtbl can be used with both iconv(1) and iconv(3C). For more detail on this utility, refer to the geniconvtbl(1) and geniconvtbl(4) man pages.

The available fromcode and tocode names that can be applied to iconv(1), iconv_open(3C), and sdtconvtool(1) are shown in the tables in Appendix A, iconv Code Conversions. For more details on iconv code conversion, see the iconv(1), iconv_open(3C), iconv(3C), iconv_close(3C), geniconvtbl(1), geniconvtbl(4), and sdtconvtool(1) man pages. For more information on available code conversions, see the iconv_en_US.UTF-8(5), iconv(5), iconv_ja(5), iconv_ko(5), iconv_zh(5), and iconv_zh_TW(5) man pages.
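
As an illustration of iconv(1), the following sketch converts one character from UTF-8 to ISO8859-1 and prints the resulting byte in hexadecimal. The fromcode and tocode spellings UTF-8 and ISO8859-1 are assumptions here; check them against Appendix A or the man pages for your system. The variable name latin1_hex is hypothetical.

```shell
# Convert the UTF-8 encoding of "e with acute accent" (bytes c3 a9)
# to ISO8859-1 and dump the result as hex digits.
# printf octal escapes \303\251 produce the two UTF-8 bytes.
latin1_hex=$(printf '\303\251' | iconv -f UTF-8 -t ISO8859-1 | od -An -tx1 | tr -d ' \n')
echo "$latin1_hex"   # e9
```

The two-byte UTF-8 sequence becomes the single byte e9, which is the same kind of conversion that the u8lat1 STREAMS module performs on a terminal stream.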


Note - UCS-2, UCS-4, UTF-16, and UTF-32 are Unicode/ISO/IEC 10646 representation forms that recognize the Byte Order Mark (BOM) character defined in the Unicode 3.1 and ISO/IEC 10646-1:2000 standards if the character appears at the beginning of the character stream. UCS-2BE, UCS-4BE, UTF-16BE, and UTF-32BE do not recognize the BOM character and assume big-endian byte ordering. UCS-2LE, UCS-4LE, UTF-16LE, and UTF-32LE also do not recognize the BOM character and assume little-endian byte ordering.
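
The effect of byte ordering can be seen by converting a single character. This is a small sketch, assuming that the tocode names UTF-16BE and UTF-16LE are available to iconv(1) on your system. The function name utf16_hex is hypothetical.

```shell
# Hypothetical helper: convert UTF-8 on standard input to the given
# tocode and print the resulting bytes as hex digits.
# $1 = tocode name (for example, UTF-16BE)
utf16_hex() {
    iconv -f UTF-8 -t "$1" | od -An -tx1 | tr -d ' \n'
}

printf 'A' | utf16_hex UTF-16BE; echo   # 0041
printf 'A' | utf16_hex UTF-16LE; echo   # 4100
```

The letter A is U+0041, so the big-endian form puts the zero byte first and the little-endian form puts it last.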

For associated scripts and languages of ISO8859-* and KOI8-*, see http://czyborra.com/charsets/iso8859.html.


 
 
 