"The good news about computers is that they do what you tell them to do. The bad news is that they do what you tell them to do."
-- unsure; often attributed without source to information philosopher Ted Nelson
Native language support (NLS) is the capability of a software package to interact in the user's native language.
NLS generally includes two parts: internationalization is enabling a program to support translations, and localization is providing a translation for a specific language. Internationalization is often abbreviated "i18n" and localization "l10n" (the first and last letters, with the number of letters in between!).
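The abbreviation rule is simple enough to sketch in a few lines of C. The function name numeronym is made up for this illustration and is not part of any NLS library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an abbreviation such as "i18n": first letter, the count of
 * letters in between, last letter. Returns 0 on success, or -1 if the
 * word has fewer than three letters or the buffer is too small. */
int numeronym(const char *word, char *out, size_t outsize)
{
    size_t len;
    int written;

    assert(word != NULL && out != NULL);
    len = strlen(word);
    if (len < 3)
        return -1;
    written = snprintf(out, outsize, "%c%zu%c",
                       word[0], len - 2, word[len - 1]);
    return (written < 0 || (size_t)written >= outsize) ? -1 : 0;
}
```

Called with "internationalization" (20 letters) this yields "i18n", and with "localization" (12 letters) it yields "l10n".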
With NLS, the programmer writes messages in one language, typically U.S. English. Later, translations of those messages can be provided for other languages. When the program is started, it loads whichever translation the user requests.
True NLS is very difficult to implement. Not only output messages but also documentation must be translated. Local variations in things such as the display of dates, numbers and currency values must be taken into account. Even input files and command-line arguments should be translated. Unfortunately, NLS tools are still in their infancy.
A locale is a formal description of a particular set of cultural habits, such as language, character encoding, and how to display dates, together with all translations for the particular language.
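A program can query some of these cultural habits through the standard C locale interface. The following is a minimal sketch using setlocale() and localeconv() from <locale.h>; the helper name decimal_point_for is hypothetical.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Switch to the named locale ("" means: take it from the environment)
 * and return its decimal-point string, or NULL if that locale is not
 * available on this system. */
const char *decimal_point_for(const char *locale_name)
{
    assert(locale_name != NULL);
    if (setlocale(LC_ALL, locale_name) == NULL)
        return NULL;
    return localeconv()->decimal_point;
}
```

In the default "C" locale the result is ".". A German locale would normally report "," instead, provided one is installed; locale names vary between systems, so none is assumed here.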
Traditionally, programmers have used the ASCII character set for text. As software has gone global, ASCII has proven insufficient to handle native language support. In response, international standards bodies have specified a standard variously known as ISO 10646, UCS and Unicode, which can encode the characters of every modern and historical human language.
The major difference between ASCII and Unicode is that each ASCII character is a 7-bit code stored in one byte, while a Unicode character takes one to four bytes in the common UTF-8 encoding. In a programming context, bigger-than-ASCII characters are often called "wide characters."
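To make the size difference concrete, here is a sketch of an encoder for UTF-8, one widely used Unicode encoding in which a character occupies between one and four bytes. The function name utf8_encode is an assumption for this example; real programs would normally rely on a library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8. Writes up to four bytes to
 * out and returns the number of bytes written, or 0 if cp is not a
 * valid character (above U+10FFFF, or a UTF-16 surrogate). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                    /* ASCII range: one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                 /* three bytes */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The letter "A" (U+0041) stays a single byte, identical to its ASCII code, while the euro sign (U+20AC) needs three bytes.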
Unicode support is just starting to catch on, so availability is patchy and support tools vary. Generally, operating systems and GUI toolkits are working to support Unicode with minimal changes to applications.
Modern C/C++ libraries implement wide character support with the wchar_t type and wide-character equivalents of standard text functions. If you wish to enable Unicode support in your package, see the links in the Tools section.
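As a small illustration of the wide-character equivalents, the sketch below uses wcslen() and towupper(), the wide counterparts of strlen() and toupper(). The helper name wide_upcase is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Upper-case a wide string in place and return its length in
 * characters (not bytes). */
size_t wide_upcase(wchar_t *s)
{
    size_t i, len;

    assert(s != NULL);
    len = wcslen(s);
    for (i = 0; i < len; i++)
        s[i] = (wchar_t)towupper((wint_t)s[i]);
    return len;
}
```

Which characters beyond ASCII are converted depends on the current locale, so a program would normally call setlocale() before using these functions.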
The GNU gettext C library provides basic NLS, allowing a program's text messages to be translated. It cannot help with translating documentation and other aspects of NLS, but at least the user interface components can be translated. It is an easy way to add basic NLS to a program. gettext is licensed under the GPL, not the LGPL, so any programs using it must also be GPL'ed if they are distributed. (Some systems, notably Sun, provide their own versions of gettext, which are licensed differently. Also, some other systems provide a similar but more difficult system called catgets.)
gettext is intended to be used in conjunction with the autoconf/automake OS portability tools, and will not easily work without them. The programmer should modify the autoconf configuration files in the top-level package directory as follows:
In "configure.in," add the following lines:
ALL_LINGUAS=""
AM_GNU_GETTEXT
AC_OUTPUT([ intl/Makefile po/Makefile.in ])
Your AC_OUTPUT section will list other files as well, but the above is what is needed for gettext support. When support for a new language is added to your package, add the language code to the ALL_LINGUAS variable. For example, if your package has translations for German and French, the line will read ALL_LINGUAS="de fr".
In "acconfig.h," add the following lines:
#undef PACKAGE
#undef VERSION
#undef HAVE_LIBSM
#undef HAVE_CATGETS
#undef HAVE_GETTEXT
#undef HAVE_LC_MESSAGES
#undef HAVE_STPCPY
#undef ENABLE_NLS
In the top-level "Makefile.am", add "intl" and "po" to SUBDIRS.
Once you've set up autoconf and automake to work with gettext, create the gettext support files:
From the top-level package directory, run gettextize once.
This creates the necessary support files for gettext, including copying
the gettext library into an "intl/" subdirectory, and setting up a
"po/" subdirectory to hold translation files. This way, gettext is
distributed with your package so users don't need to install it separately.
(The intl/ subdirectory will contain links, so if you use tar to archive
your distribution, give it the -h option. automake does this
automatically for make dist.)
In the "intl/" subdirectory, run make all-yes once.
Create a file "po/POTFILES.in" listing all the source code files that have translatable strings, and keep this file up-to-date. For example:
# Source files containing translatable strings:
lib/hello_msg.c
lib/hello_help.c
src/hello.c
Running make in the po/ directory will run xgettext to rebuild
the ".pot" file containing all translatable strings. This will normally
be done automatically from a make in the top-level directory, but
can also be done independently if desired.
To use gettext, the programmer should:
Edit each source code directory's Makefile.am to make sure LIBS and INCLUDES include the following:
LIBS=-L../../intl -lintl
INCLUDES=-DLOCALEDIR=\"$(datadir)/locale\" -I../../intl
Replace "../../intl" with the location of your intl directory.
Add the following near the top of each source code file, or in a common header file:
#include <libintl.h>
#define _(s) gettext(s)
#define gettext_noop(s) (s)
#define N_(s) gettext_noop (s)
The defines aren't absolutely necessary but they make the source code cleaner. If you #include <gnome.h>, do not do the above, as gnome.h will do it for you.
Include the following code early in each program's main() function:
setlocale (LC_ALL, "");
bindtextdomain (PACKAGE, LOCALEDIR);
textdomain (PACKAGE);
PACKAGE is the package name and LOCALEDIR is the package's locale directory, as specified in config.h and the make file respectively. In certain cases the LC_ALL category might not be the right choice. See the gettext documentation and the locale(7) man page for details.
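Put together, the startup code and the _() macro might look like the sketch below. The fallback values for PACKAGE and LOCALEDIR are placeholders invented for this example so that it compiles on its own; a real package takes them from config.h and the make file. The function names init_nls and greeting are likewise hypothetical.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Placeholder values, for illustration only. */
#ifndef PACKAGE
#define PACKAGE "nls-demo"
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/usr/local/share/locale"
#endif

#define _(s) gettext(s)

/* Select the user's locale and this package's message catalog.
 * Returns the name of the active text domain. */
const char *init_nls(void)
{
    const char *domain;

    setlocale(LC_ALL, "");               /* locale from the environment */
    bindtextdomain(PACKAGE, LOCALEDIR);  /* where the catalogs live */
    domain = textdomain(PACKAGE);        /* default domain for gettext() */
    assert(domain != NULL);              /* NULL only if out of memory */
    return domain;
}

/* When no translation is found, gettext() returns its argument
 * unchanged, so the program still works without any catalogs. */
const char *greeting(void)
{
    return _("Hello, world!");
}
```

A program would call init_nls() once, early in main(), before printing any translated message.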
To ease translation, use whole, standalone strings when creating messages, as opposed to building them from pieces. For example:
/* the wrong way */
printf("You have %d guess%s left.\n", nguesses, (nguesses == 1)? "" : "es");
/* the right way */
if (nguesses == 1)
    printf("You have one guess left.\n");
else
    printf("You have %d guesses left.\n", nguesses);
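Languages differ in how many plural forms they have, so a two-way branch cannot be translated correctly for every language. GNU gettext releases newer than the one this text describes, as well as the GNU C library, provide ngettext() for this case. The following is a sketch under the assumption that your libintl has ngettext(); format_guesses is a hypothetical helper.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Format the "guesses left" message into buf. ngettext() selects the
 * form that fits nguesses in the current language; with no catalog
 * loaded it falls back to the English rule (first string only when
 * the count is exactly 1). Returns the value of snprintf(). */
int format_guesses(char *buf, size_t size, unsigned long nguesses)
{
    assert(buf != NULL);
    return snprintf(buf, size,
                    ngettext("You have %lu guess left.\n",
                             "You have %lu guesses left.\n",
                             nguesses),
                    nguesses);
}
```

Both strings stay whole sentences, so translators still see complete messages, and the catalog for each language decides which form applies to a given count.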
Wrap all translatable strings in each program with _(), including output and error messages, user interface components like menu names, and so forth. For example:
printf( _("You have %d guesses left.\n"), nguesses);
The only exception is where a function call is not allowed, such as in static initializers; mark these strings with N_() instead, and apply _() where they are actually used. For example:
static const char *messages[] = {
    N_("potato"),
    N_("carrot"),
    N_("onion")
};
...
printf( _("You guessed %s.\n"), _(messages[i]) );
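For reference, a self-contained version of this pattern could look like the sketch below. It defines the two macros locally rather than taking them from a common header, and the lookup helper message_text is a name invented for this example.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

#define _(s) gettext(s)
#define N_(s) (s)   /* only marks the string for xgettext; no call */

/* N_() expands to the bare literal, so it is legal in an initializer. */
static const char *const messages[] = {
    N_("potato"),
    N_("carrot"),
    N_("onion")
};

#define NMESSAGES (sizeof messages / sizeof messages[0])

/* Translate a table entry at the point of use. */
const char *message_text(size_t i)
{
    assert(i < NMESSAGES);
    return _(messages[i]);
}
```

The table itself always holds the untranslated strings; the lookup in the message catalog happens each time message_text() is called, after the locale has been set.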
See the gettext documentation for details on providing translation files for other languages.
(TO DO)