DIHYPH / DITECT Unicode-16 Version
Character coding
Using editable code table "dhco??" or "dtco??" (?? = language-no. 01-50)
DIHYPH and DITECT work with a 1-byte character code, so the 2-byte
Unicode is internally converted into ISO-8859 code by calling:
DMCREPUC (1, nn, ptu, pti);
If needed, the ISO code is converted back to Unicode by calling:
DMCREPUC (2, nn, ptu, pti);
             |    |    |__ pointer to ISO-code string (char)
             |    |_______ pointer to Unicode string (unsigned short)
             |____________ language-no. (e.g. 02 = UK-English)
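The two conversion directions can be illustrated with a minimal sketch. This is not the DMCREPUC module: the real module takes a language number and uses the code conversion file DMCREPUC, whereas the hypothetical functions `demo_to_iso` and `demo_to_unicode` below assume ISO-8859-1 only, where code points 0 to 255 map unchanged. The function names, the use of `unsigned char` for the 1-byte string, and the '?' substitution for unmappable characters are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for direction 1 (Unicode -> ISO code).
   Assumes ISO-8859-1, where code points 0..255 map unchanged;
   the real DMCREPUC uses a per-language conversion file instead. */
size_t demo_to_iso(const unsigned short *ptu, unsigned char *pti)
{
    size_t i = 0;
    while (ptu[i] != 0) {
        /* Characters outside the 1-byte range become '?' (assumption). */
        pti[i] = (ptu[i] <= 0xFF) ? (unsigned char)ptu[i]
                                  : (unsigned char)'?';
        i++;
    }
    pti[i] = 0;            /* terminate the 1-byte string */
    return i;              /* number of characters converted */
}

/* Hypothetical stand-in for direction 2 (ISO code -> Unicode). */
size_t demo_to_unicode(const unsigned char *pti, unsigned short *ptu)
{
    size_t i = 0;
    while (pti[i] != 0) {
        ptu[i] = pti[i];   /* widen each byte to a 16-bit value */
        i++;
    }
    ptu[i] = 0;            /* terminate with 00 00 */
    return i;
}
```

Both output buffers must have room for the terminator in addition to the converted characters.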
When calling DIHYPH or DITECT, every Unicode string must:
- start without a BOM (byte order mark)
- end with hex: 00 00
- be accompanied by variable 'alc' set to the number of characters in the string.
e.g.: Unicode string (hex.):
00 65 00 78 00 61 00 6D 00 70 00 6C 00 65 00 00
alc = 7; (or higher !).
If 'alc' is set to a high number (e.g. 'alc=9999;'), the program recognizes
the end of the string by the characters 00 00.
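The interplay between 'alc' and the 00 00 terminator can be sketched as follows. The helper `demo_string_length` is hypothetical and only illustrates the rule stated above (stop at the terminator, or after 'alc' values if that comes first); how DIHYPH and DITECT scan the string internally is not described here.

```c
#include <stddef.h>

/* Count the characters of a Unicode-16 string held as unsigned short
   values. Scanning stops at the 00 00 terminator or after 'alc'
   values, whichever comes first. */
size_t demo_string_length(const unsigned short *s, size_t alc)
{
    size_t n = 0;
    while (n < alc && s[n] != 0)
        n++;
    return n;
}
```

For the string "example" shown above, a limit of 7 and a limit of 9999 both give 7 characters, which is why a generous 'alc' is safe as long as the terminator is present.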
To tell the program whether a Unicode string is 'LE' (little-endian byte order)
or 'BE' (big-endian byte order), the switch "unibom=0;" is defined in DHDEF.C;
the program then checks the byte order of the strings automatically.
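How the automatic byte order check works is not described here. As an illustration only, one common heuristic for BOM-less 16-bit text is sketched below: in mostly Latin text the high byte of each character is 00, so zero bytes cluster at even offsets for BE and at odd offsets for LE. The names `demo_order` and `demo_guess_order` are hypothetical, and this should not be read as the method used by DIHYPH or DITECT.

```c
#include <stddef.h>

enum demo_order { DEMO_UNKNOWN, DEMO_BE, DEMO_LE };

/* Guess the byte order of BOM-less 16-bit text from its raw bytes.
   Heuristic (assumption): zero bytes at even offsets suggest BE,
   zero bytes at odd offsets suggest LE. 'nbytes' should exclude
   the 00 00 terminator. */
enum demo_order demo_guess_order(const unsigned char *buf, size_t nbytes)
{
    size_t zero_even = 0, zero_odd = 0, i;

    for (i = 0; i + 1 < nbytes; i += 2) {
        if (buf[i] == 0)
            zero_even++;
        if (buf[i + 1] == 0)
            zero_odd++;
    }
    if (zero_even > zero_odd)
        return DEMO_BE;
    if (zero_odd > zero_even)
        return DEMO_LE;
    return DEMO_UNKNOWN;   /* no evidence either way */
}
```

The heuristic cannot decide for text whose characters mostly lie outside the 1-byte range, so a caller would need a fallback for the DEMO_UNKNOWN case.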
Module DMCREPUC.c uses the code conversion file DMCREPUC.
DITECT proposal word list
The proposal word list with Unicode characters is stored in the array
unsigned short prbufuc[1000] (20 lines of 50 unsigned short values),
while the array char prbuf[1000] holds 20 lines of 50 ISO characters.
The first value in every prbufuc line gives the percentage; the proposal
string starts at the second unsigned short value.
Every proposal word in prbufuc[] ends with CR/LF.
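The layout described above can be read with a small helper like the one below. It is a sketch under stated assumptions: the name `demo_get_proposal` is hypothetical, and treating a 0 value as the end of a word (in addition to CR/LF) is a guess about how unused space in a line is filled.

```c
#include <stddef.h>

#define DEMO_LINES    20   /* lines in prbufuc */
#define DEMO_LINE_LEN 50   /* unsigned short values per line */

/* Copy proposal 'line' (0..19) out of a prbufuc-style array.
   The first value of the line is the percentage; the word starts at
   the second value and runs up to CR/LF. Stopping at a 0 value as
   well is an assumption about unused space.
   Returns the word length; 'out' needs room for DEMO_LINE_LEN values. */
size_t demo_get_proposal(const unsigned short *prbufuc, int line,
                         unsigned short *percent, unsigned short *out)
{
    const unsigned short *p;
    size_t i, n = 0;

    *percent = 0;
    out[0] = 0;
    if (line < 0 || line >= DEMO_LINES)
        return 0;          /* no such line */

    p = prbufuc + (size_t)line * DEMO_LINE_LEN;
    *percent = p[0];
    for (i = 1; i < DEMO_LINE_LEN; i++) {
        if (p[i] == 0x000D || p[i] == 0x000A || p[i] == 0)
            break;
        out[n++] = p[i];
    }
    out[n] = 0;            /* terminate the copy with 00 00 */
    return n;
}
```

A caller would typically loop over lines 0 to 19 and stop at the first line that yields an empty word.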
DIHYPH / DITECT Unicode-Programs
Unicode-programs differ from standard DIHYPH / DITECT as follows:
All Unicode-modules have the same type of name:
dx????uc
 |  |  |____ Unicode version
 |  |
 |  |_______ ???? unspecific letters
 |
 |__________ h = for DIHYPH program
 |__________ t = for DITECT program
 |__________ m = for DIHYPH and DITECT
To link DIHYPH or DITECT according to the installation description,
the following programs and files have to be used for Unicode:
file         instead of   for
__________   __________   _______________________
DHTESTUC.c   DHTEST.c     DIHYPH-testprogram *)
DTTESTUC.c   DTTEST.c     DITECT-testprogram *)
DHYPHEUC.c   DHYPH.c      DIHYPH
DTECTEUC.c   DTECT.c      DITECT
DMRDWTUC.c                (additional C-source) **)
DMCREPUC.c                (additional C-source)
DMCREPUC                  (additional code file)
*) used for testprograms only !
**) used for Exception and testprograms !
The text/publishing system has to call DIHYPH as follows:
alc = cnt; (cnt = number of characters in 'luni' string or higher)
rc = DHYPH (luni, nn);
             |    |____ (integer) language-no. 1 - 50
             |_________ (unsigned short) Unicode character string.
The text/publishing system has to call DITECT as follows:
alc = cnt; (cnt = number of characters in 'luni' string or higher)
rc = DTECT (luni, nn);
             |    |____ (integer) language-no. 1 - 50
             |_________ (unsigned short) Unicode character string.
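The call sequence for both programs (set 'alc', then call with the string and the language number) can be wrapped as sketched below. Several things here are assumptions, since only the argument kinds are given above: that 'alc' is an int, that DHYPH and DTECT return an int and share the signature `int f(unsigned short *, int)`, and the names `demo_engine_fn`, `demo_call` and `demo_fake_engine`, which are hypothetical. The fake engine exists only so the sketch runs without the library.

```c
/* Stand-in for the library's 'alc' variable; its real type and
   linkage are not given in the text (assumed int here). */
int alc;

/* Assumed signature shared by DHYPH and DTECT: Unicode string plus
   language number, integer return code. */
typedef int (*demo_engine_fn)(unsigned short *luni, int nn);

/* Set 'alc' to the character count of 'luni', then call the engine. */
int demo_call(demo_engine_fn engine, unsigned short *luni, int nn)
{
    int count = 0;

    while (luni[count] != 0)   /* string must end with 00 00 */
        count++;
    alc = count;
    return engine(luni, nn);
}

/* Dummy engine so the sketch runs without the library: it returns
   the 'alc' value it observed. */
int demo_fake_engine(unsigned short *luni, int nn)
{
    (void)luni;
    (void)nn;
    return alc;
}
```

With the real library linked, and provided the prototypes match the assumptions above, the intended use would be along the lines of `rc = demo_call(DHYPH, luni, 2);` for UK-English hyphenation.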
DIHYPH / DITECT test programs:
d?testuc nn infile [outfile] [(...)]
 |       |
 |       |__________ language-no. 01 - 50
 |
 |__________________ h = DIHYPH hyphenation test
 |__________________ t = DITECT spelling check test
infile = test words in Unicode-16, one word per line
outfile = test results in Unicode-16 [option]
(...) = test commands [option]. (?) displays possible commands.
Test results are also displayed on screen in ISO encoding.
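A test input file with one word per line could be produced along the following lines. This is a sketch with assumptions that the text above does not settle: big-endian byte order (as in the hex example earlier), no BOM, a single line feed (00 0A) as line ending, and ASCII-only words. The function name `demo_write_word` is hypothetical.

```c
#include <stdio.h>

/* Write one ASCII word as big-endian 16-bit values, followed by a
   line feed (00 0A). Byte order and line ending are assumptions.
   Returns 0 on success, -1 on a write error. */
int demo_write_word(FILE *fp, const char *word)
{
    for (; *word != '\0'; word++) {
        if (fputc(0x00, fp) == EOF)
            return -1;                       /* high byte */
        if (fputc((unsigned char)*word, fp) == EOF)
            return -1;                       /* low byte */
    }
    if (fputc(0x00, fp) == EOF)
        return -1;
    if (fputc(0x0A, fp) == EOF)
        return -1;
    return 0;
}
```

The file should be opened in binary mode (for example with "wb") so that the C library does not translate the line feed byte.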
Contact