Unternehmensberatung Dieckmann
Character encoding
Using editable code table "dhco??" or "dtco??" (?? =language-no. 01-50)
DIHYPH and DITECT work with a 1-byte character code, so 2-byte Unicode
is internally converted into ISO-8859 code by calling:
  DMCREPUC (1, nn, ptu, pti);
If needed, the ISO code is converted back to Unicode by calling:
  DMCREPUC (2, nn, ptu, pti);
               |   |    |__ pointer to ISO code string (char)
               |   |_______ pointer to Unicode string (unsigned short)
               |___________ language-no. (e.g. 02 = UK-English)
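To illustrate the idea of the 2-byte to 1-byte conversion, here is a minimal sketch restricted to ISO-8859-1, where every byte value equals the Unicode code point. It is not the DMCREPUC implementation: the real module takes a language number and uses the code conversion file DMCREPUC, neither of which is reproduced here. The function names `unicode_to_latin1` and `latin1_to_unicode` and the `'?'` replacement character are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT the real DMCREPUC: converts 16-bit Unicode
   values to ISO-8859-1 bytes. Values above 0xFF have no ISO-8859-1
   equivalent and are replaced by '?'. 'max' is the size of 'pti'.
   Returns the number of characters written, not counting the final 0. */
size_t unicode_to_latin1(const unsigned short *ptu, char *pti, size_t max)
{
    size_t i = 0;
    assert(ptu != NULL && pti != NULL && max > 0);
    while (ptu[i] != 0 && i + 1 < max) {
        pti[i] = (ptu[i] <= 0xFF) ? (char)(unsigned char)ptu[i] : '?';
        i++;
    }
    pti[i] = '\0';
    return i;
}

/* Reverse direction: each ISO-8859-1 byte maps to the Unicode value
   with the same number. 'max' is the number of elements in 'ptu'. */
size_t latin1_to_unicode(const char *pti, unsigned short *ptu, size_t max)
{
    size_t i = 0;
    assert(ptu != NULL && pti != NULL && max > 0);
    while (pti[i] != '\0' && i + 1 < max) {
        ptu[i] = (unsigned char)pti[i];
        i++;
    }
    ptu[i] = 0;
    return i;
}
```

Other parts of ISO-8859 need a lookup table for the upper half of the byte range, which is presumably the role of the editable code tables mentioned above.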
When calling DIHYPH or DITECT, every Unicode string must:
- start without a BOM (byte order mark)
- end with hex 00 00
- be accompanied by the variable 'alc', set to the number of characters in the string.
Example: Unicode string for the word "example" (hex):
00 65 00 78 00 61 00 6D 00 70 00 6C 00 65 00 00
alc = 7; (or higher!)
If 'alc' is set to a high number (e.g. 'alc=9999;'), the program recognizes
the end of the string by the characters 00 00.
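The string requirements above can be checked before a call with two small helpers. This is a sketch only: `starts_with_bom` and `unicode_length` are hypothetical names that are not part of DIHYPH or DITECT, and the string is assumed to be held as an array of unsigned short values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the string starts with a byte order
   mark (0xFEFF, or 0xFFFE when it was read with swapped byte order). */
int starts_with_bom(const unsigned short *s)
{
    assert(s != NULL);
    return s[0] == 0xFEFF || s[0] == 0xFFFE;
}

/* Hypothetical helper: counts the 16-bit characters up to, but not
   including, the terminating 00 00. The result is a valid value
   for 'alc'. */
int unicode_length(const unsigned short *s)
{
    int n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}
```

For the example string shown above, `unicode_length` returns 7, which matches 'alc = 7;'.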
To handle Unicode strings in both 'LE' (little-endian) and 'BE' (big-endian)
byte order, the switch "unibom=0;" is defined in DHDEF.C, and the program then
checks the byte order of the strings automatically.
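This document does not describe how the automatic byte order check works. As an illustration only, the sketch below shows one common heuristic for text without a BOM: for characters in the range 0000 to 00FF, the zero byte comes first in big-endian order and second in little-endian order. The names `guess_byte_order` and `enum byte_order` are assumptions for this example and are not taken from DHDEF.C.

```c
#include <assert.h>
#include <stddef.h>

enum byte_order { ORDER_UNKNOWN, ORDER_BE, ORDER_LE };

/* Hypothetical heuristic, not DITECT's documented method: counts in
   which position of each 2-byte character the zero byte appears and
   picks the byte order that explains the majority of characters. */
enum byte_order guess_byte_order(const unsigned char *bytes, size_t nbytes)
{
    size_t i, zero_first = 0, zero_second = 0;
    assert(bytes != NULL);
    for (i = 0; i + 1 < nbytes; i += 2) {
        if (bytes[i] == 0 && bytes[i + 1] == 0)
            break;                      /* terminating 00 00 */
        if (bytes[i] == 0)
            zero_first++;
        if (bytes[i + 1] == 0)
            zero_second++;
    }
    if (zero_first > zero_second)
        return ORDER_BE;
    if (zero_second > zero_first)
        return ORDER_LE;
    return ORDER_UNKNOWN;
}
```

A heuristic like this fails for text that lies mostly outside the 0000 to 00FF range, so it should be read as a demonstration of the problem rather than a dependable detector.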
Module DMCREPUC.c uses the code conversion file DMCREPUC.
DITECT proposal word list
The proposal word list with Unicode characters is stored in the array
unsigned short prbufuc[1000] (20 lines of 50 unsigned short values),
while the array char prbuf[1000] holds 20 lines of 50 ISO characters.
The first value in every prbufuc line holds the percentage; the proposal
string starts with the second unsigned short value.
Every proposal word in prbufuc[] ends with CR/LF.
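The layout of prbufuc[] described above can be read line by line as in the following sketch. The function `read_proposal` and the constants `PR_LINES` and `PR_WIDTH` are hypothetical. The sketch also assumes that values are stored in the host byte order and that an unused line contains 0 in its second value; neither is stated in this document.

```c
#include <assert.h>
#include <stddef.h>

#define PR_LINES 20   /* number of proposal lines in prbufuc[]  */
#define PR_WIDTH 50   /* unsigned short values per line         */

/* Hypothetical reader for one line of prbufuc[]. Stores the first
   value of the line in '*percent' and copies the proposal word, which
   starts at the second value, into 'word' (0-terminated, must hold
   PR_WIDTH values). The word is taken to end at CR (0x000D), at a 0
   value, or at the end of the line, whichever comes first.
   Returns the word length in characters. */
int read_proposal(const unsigned short *prbufuc, int line,
                  unsigned short *word, int *percent)
{
    const unsigned short *p;
    int n;
    assert(prbufuc != NULL && word != NULL && percent != NULL);
    assert(line >= 0 && line < PR_LINES);
    p = prbufuc + line * PR_WIDTH;
    *percent = p[0];
    for (n = 0; n < PR_WIDTH - 1; n++) {
        unsigned short c = p[n + 1];
        if (c == 0x000D || c == 0)
            break;
        word[n] = c;
    }
    word[n] = 0;
    return n;
}
```

A caller would loop over lines 0 to 19 after DITECT returns and stop at the first line for which the returned length is 0.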
Unicode programs differ from standard DIHYPH / DITECT as follows:
All Unicode modules have the same type of name:
  dx????uc
   | |  |____ uc = Unicode version
   | |_______ ???? = unspecific letters
   |_________ h = for DIHYPH program
   |_________ t = for DITECT program
   |_________ m = for DIHYPH and DITECT
To link DITECT according to the installation description,
the following programs and files have to be used for Unicode:
  file          instead of    for
  __________    __________    _________________________
  DTTESTUC.c    DTTEST.c      DITECT test program *)
  DTECTEUC.c    DTECT.c       DITECT
  DMRDWTUC.c                  (additional C source) **)
  DMCREPUC.c                  (additional C source)
  DMCREPUC                    (additional code file)
*) used for test programs only!
**) used for Exception and test programs!
DITECT has to be called by the text/publishing system as follows:
  alc = nn;   (nn = number of characters in 'luni' string or higher)
  rc = DTECT (luni, nn);
              |     |____ (integer) language-no. 1 - 50
              |__________ (unsigned short) Unicode character string
DITECT test program:
  DTTESTUC nn infile [outfile] [(...)]
           |__________ language-no. 01 - 50
  infile  = test words in Unicode-16, one word per line
  outfile = test results in Unicode-16 (optional)
  (...)   = test commands (optional). (?) displays possible commands.
Test results are also displayed on screen in ISO encoding.