138 lines
6.3 KiB
Plaintext
138 lines
6.3 KiB
Plaintext
# The wxWidgets language database
|
|
|
|
## Regeneration of the wxWidgets language database related files
|
|
|
|
Run the `genlang.py` script from the top level wxWidgets directory to
|
|
update `include/wx/language.h` (wxLanguage enum), `interface/wx/language.h`
|
|
(its documentation), `src/common/languageinfo.cpp` (conversion tables)
|
|
and the actual tabular data in `include/wx/private/lang_*.h` with the data
|
|
from `langtabl.txt`, `synonymtabl.txt`, `scripttabl.txt`, `likelytabl.txt`,
|
|
`matchingtabl.txt`, and `regiongrouptabl.txt`.
|
|
|
|
The files with the raw tabular data - `langtabl.txt`, `synonymtabl.txt`,
|
|
`scripttabl.txt`, `likelytabl.txt`, `matchingtabl.txt`, and
|
|
`regiongrouptabl.txt` - are derived from public data of the Unicode
|
|
organization. The scripts to generate the files are provided.
|
|
|
|
## Description of the raw data files
|
|
|
|
`langtabl.txt` contains a tabular list of language entries. Each entry
|
|
contains
|
|
|
|
- a symbolic language identifier used in enum wxLanguage,
|
|
- a wxWidgets version when the entry was first introduced (a hyphen if not known)
|
|
- a BCP 47-like locale identifier,
|
|
- a Unix locale identifier,
|
|
- a Unix locale identifier including a region id (if the default Unix
|
|
locale identifier does not include a region identifier) (mainly for
|
|
compatibility with wxWidgets version below 3.1.6),
|
|
- numeric Windows language identifier (1),
|
|
- numeric Windows sublanguage identifier (1),
|
|
- language and region description in English
|
|
- language and region description in native language.
|
|
|
|
`scripttabl.txt` contains a list of 4-letter script codes and their
|
|
aliases (English) based on the ISO 15924 standard (2), restricted to
|
|
entries for which aliases are defined. This list is used in wxWidgets
|
|
to convert between script code used in BCP 47-like identifiers and
|
|
script modifiers used in Unix locale names. The data in (2) can be used
|
|
to update scripttabl.txt if necessary.
|
|
|
|
`synonymtabl.txt` contains a list of aliases for symbolic language identifiers.
|
|
This list is used to generate specific entries in wxLanguage enumeration.
|
|
|
|
The following 3 files are used in the algorithm determining the best translation
|
|
language based on the list of preferred UI languages and the list of available
|
|
translations.
|
|
|
|
`likelytabl.txt` contains for most locales the likely subtags for script and
|
|
region.
|
|
|
|
`matchingtabl.txt` contains a map relating languages to possible replacement
|
|
languages giving the relative distance between 2 languages.
|
|
|
|
`regiongrouptabl.txt` contains for some languages region groups. This allows
|
|
to determine whether certain regional language variants are closer to each
|
|
other or not.
|
|
|
|
**Note**: None of the files `langtabl.txt`, `synonymtabl.txt`, `scripttabl.txt`,
|
|
`likelytabl.txt`, `matchingtabl.txt`, and``regiongrouptabl.txt` should be
|
|
edited manually. Instead these files should be regenerated under Windows 11 or
|
|
above.
|
|
|
|
## Regeneration process
|
|
|
|
Windows provides an extensive list of locales. This list is used to regenerate
|
|
the files `langtabl.txt`, `synonymtabl.txt`, and `scripttabl.txt`. The
|
|
subdirectory `util` contains the C source of a small utility application
|
|
`showlocales.c` that queries Windows for a list of known locales.
|
|
|
|
**Note**: It is recommended to run `showlocales` on a Windows 11 Pro desktop
|
|
computer, because the desktop versions of Windows get updates of locale data
|
|
more frequently than server versions of Windows. Unfortunately, there is no
|
|
easy method to determine when the locale data were last updated. The date of
|
|
last modification of the file `Windows/System32/locale.nls` can be an
|
|
indication when the last update occurred.
|
|
|
|
2 additional tools are required to perform the regeneration process:
|
|
|
|
1) SQLite3 shell
|
|
Precompiled binaries can be downloaded from https://www.sqlite.org/download.html.
|
|
The download link is under the heading "Precompiled Binaries for Windows" and
|
|
looks like "sqlite-tools-win32-x86-3xxyyzz.zip" (where xx, yy, zz denote the
|
|
current SQLite version). Alternatively, the SQLite shell can be compiled from
|
|
sources - the archive "sqlite-amalgamation-3xxyyzz.zip" contains the required
|
|
source files.
|
|
The scripts expect that the executable `sqlite3.exe` can be found on the path.
|
|
|
|
2) Lua shell
|
|
Precompiled binaries are available at https://luabinaries.sourceforge.net/.
|
|
Download lua-x.y.z_Win32_bin.zip or lua-x.y.z_Win64_bin.zip (where x, y, z
|
|
denotes the lua version) from the download page.
|
|
Adjust the script file `misc/languages/data/setupenv.ps1`, so that the
|
|
environment variable `$env:luashell` contains the name of the executable,
|
|
and add the location of the executables to the path.
|
|
|
|
All provided scripts are PowerShell scripts. It is recommended to use
|
|
PowerShell 7 or higher. The regeneration process consists of the following steps:
|
|
|
|
1) Regenerate the list of known Windows locales (optional)
|
|
This step is usually only required when a new major Windows version is published.
|
|
The utility `showlocales` should be invoked from a command prompt as follows:
|
|
|
|
showlocales > win-locale-table-win.txt
|
|
|
|
The resulting file `win-locale-table-win.txt` has to be placed into subdirectory
|
|
`data/windows`. Alternatively, the script `getwindowsdata.ps1` can be used.
|
|
|
|
2) Update the Unicode data files (optional)
|
|
This step is only required when the Unicode data were actually updated.
|
|
To perform this step execute the script file `getunicodefiles.ps1`, located
|
|
in the `data` subdirectory.
|
|
|
|
3) Regenerate the tabular data files `langtabl.txt`, `synonymtabl.txt`,
|
|
`scripttabl.txt` etc.
|
|
To perform this step execute the script files `gensqlfiles.ps1` and
|
|
`makelangdb.ps1`, located in the `data` subdirectory.
|
|
The new versions will be placed in the `data` directory.
|
|
|
|
4) Check the resulting new tabular data.
|
|
The messages from step 3 issued by the script files and the resulting files
|
|
should be carefully checked. If no errors occurred, the script file
|
|
replacetables.bat can be executed.
|
|
|
|
5) Run the Python script `genlang.py` from the top level wxWidgets directory.
|
|
|
|
6) Commit the changes.
|
|
|
|
Notes:
|
|
1) Do not perform the regeneration process for older wxWidgets versions.
|
|
The scripts expect the data table files in a new format that was first introduced
|
|
in version 3.3.0.
|
|
2) If you need to add locales not present in the list of known Windows locales, then
|
|
they should be added at the end of the script `win_genlocaletable.lua`.
|
|
|
|
Footnotes
|
|
(1) used on Windows only, deprecated by Microsoft
|
|
(2) http://www.unicode.org/iso15924/iso15924-codes.html
|