Unternehmensberatung Dieckmann

DITECT Unicode-16 Version




Character encoding

The editable code table "dhco??" or "dtco??" is used (?? = language-no. 01-50).
DIHYPH and DITECT work internally with a 1-byte character code, so 2-byte
Unicode is converted to ISO-8859 code by calling:

DMCREPUC (1, nn, ptu, pti);

If needed, the ISO code is converted back to Unicode by calling:

DMCREPUC (2, nn, ptu, pti);
              |   |    |__ pointer to ISO code string (char)
              |   |_______ pointer to Unicode string  (unsigned short)
              |___________ language-no. (e.g. 02 = UK English)
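
The two conversion directions can be illustrated with a minimal sketch. It is not the DMCREPUC module: it assumes plain ISO-8859-1, where every byte value equals the Unicode code point, whereas DMCREPUC works with a language-specific code table. The function names uc_to_iso and iso_to_uc, the buffer-size parameter and the '?' replacement character are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for direction 1 (Unicode -> ISO code).
   Assumes ISO-8859-1, where code points 0x00-0xFF map 1:1 to bytes.
   Converts up to the 0x0000 terminator and returns the number of
   characters written (not counting the terminating '\0'). */
size_t uc_to_iso(const unsigned short *ptu, char *pti, size_t max)
{
    size_t i = 0;
    if (max == 0) return 0;
    while (ptu[i] != 0 && i + 1 < max) {
        /* Assumption: characters outside ISO-8859-1 become '?'. */
        pti[i] = (ptu[i] <= 0xFF) ? (char)(unsigned char)ptu[i] : '?';
        i++;
    }
    pti[i] = '\0';
    return i;
}

/* Hypothetical stand-in for direction 2 (ISO code -> Unicode).
   Each ISO-8859-1 byte is widened to one 16-bit value; the result
   is terminated with 0x0000. */
size_t iso_to_uc(const char *pti, unsigned short *ptu, size_t max)
{
    size_t i = 0;
    if (max == 0) return 0;
    while (pti[i] != '\0' && i + 1 < max) {
        ptu[i] = (unsigned char)pti[i];
        i++;
    }
    ptu[i] = 0;
    return i;
}
```

Both helpers take the capacity of the destination buffer so that an overlong input is truncated instead of overflowing the buffer.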

When calling DIHYPH or DITECT, every Unicode string must:
- start without a BOM (byte order mark)
- end with hex 00 00
and the variable 'alc' must be set to the number of characters in the string.
E.g. Unicode string (hex):
00 65 00 78 00 61 00 6D 00 70 00 6C 00 65 00 00
alc = 7; (or higher)
If 'alc' is set to a high number (e.g. 'alc=9999;'), the program recognizes
the end of the string by the characters 00 00.
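
As an illustration of these rules, the following sketch counts the characters of a 00 00-terminated string to obtain a value for 'alc', and skips a leading BOM if one is present. The helper names uc_length and uc_skip_bom are hypothetical and not part of DIHYPH or DITECT; the string is assumed to be held as an array of unsigned short values.

```c
#include <stddef.h>

/* Counts the 16-bit values before the 0x0000 terminator.
   The result is a suitable value for 'alc'. */
size_t uc_length(const unsigned short *s)
{
    size_t n = 0;
    while (s[n] != 0) n++;
    return n;
}

/* Returns a pointer past a leading BOM, if there is one.
   0xFEFF is the BOM in host byte order, 0xFFFE its byte-swapped form. */
const unsigned short *uc_skip_bom(const unsigned short *s)
{
    return (s[0] == 0xFEFF || s[0] == 0xFFFE) ? s + 1 : s;
}
```

For the string "example" shown above, uc_length returns 7, which matches 'alc = 7;'.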

To tell the program whether a Unicode string has 'LE' (little-endian) or
'BE' (big-endian) byte order, the switch "unibom=0;" is defined in DHDEF.C,
and the program checks the byte order of the strings automatically.
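
How the automatic check works is not described here. One possible heuristic for strings without a BOM, shown purely as an assumption and not as DITECT's actual method, relies on the fact that characters in the range U+0000 to U+00FF have a zero high byte: if the zero bytes sit mostly in the first position of each 2-byte pair, the string is probably big-endian, otherwise little-endian. The names uc_guess_order and enum uc_order are hypothetical.

```c
#include <stddef.h>

enum uc_order { UC_UNKNOWN = 0, UC_BE = 1, UC_LE = 2 };

/* Hypothetical heuristic, not DITECT's actual check.
   Counts zero bytes in the first and in the second position of each
   2-byte pair, stopping at the 00 00 terminator or after max_bytes. */
enum uc_order uc_guess_order(const unsigned char *bytes, size_t max_bytes)
{
    size_t zero_first = 0, zero_second = 0, i;
    for (i = 0; i + 1 < max_bytes; i += 2) {
        if (bytes[i] == 0 && bytes[i + 1] == 0) break;  /* terminator */
        if (bytes[i] == 0) zero_first++;
        if (bytes[i + 1] == 0) zero_second++;
    }
    if (zero_first > zero_second) return UC_BE;  /* high byte comes first */
    if (zero_second > zero_first) return UC_LE;  /* low byte comes first  */
    return UC_UNKNOWN;                           /* no evidence either way */
}
```

Such a guess can fail for text whose characters lie mostly outside the U+0000 to U+00FF range, so a caller would need a fallback for the UC_UNKNOWN case.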

Module DMCREPUC.c uses the code conversion file DMCREPUC.



DITECT proposal word list

The proposal word list with Unicode characters is stored in the array
unsigned short prbufuc[1000] (20 lines of 50 unsigned short values),
while the array char prbuf[1000] holds 20 lines of 50 ISO characters.
The first value in every prbufuc line is the percentage; the proposal
string starts at the second unsigned short value.
Every proposal word in prbufuc[] ends with CR/LF.
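
The layout described above can be read with a small helper such as the following sketch. The name pr_get and the constants PR_LINES and PR_WIDTH are hypothetical. Two assumptions go beyond the description: the percentage is taken to be stored as a plain number, and unused lines are taken to be zero-filled.

```c
#include <stddef.h>

#define PR_LINES 20   /* number of proposal lines in prbufuc[]  */
#define PR_WIDTH 50   /* unsigned short values per proposal line */

/* Reads proposal line 'line' (0 .. PR_LINES-1) from prbufuc[].
   Stores the percentage in *pct, lets *word point at the first value
   of the proposal word, and returns the word length without CR/LF. */
size_t pr_get(const unsigned short *prbufuc, int line,
              unsigned short *pct, const unsigned short **word)
{
    const unsigned short *row = prbufuc + (size_t)line * PR_WIDTH;
    size_t n = 0;

    *pct  = row[0];      /* first value: percentage (assumed numeric) */
    *word = row + 1;     /* proposal starts at the second value       */

    /* Stop at CR (0x000D). The checks for 0 and for the line width
       cover the assumed zero-filled or unterminated lines. */
    while (1 + n < PR_WIDTH && row[1 + n] != 0x000D && row[1 + n] != 0)
        n++;
    return n;
}
```

A caller would loop over the lines 0 to PR_LINES-1 and treat a returned length of 0 as an empty line.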



The Unicode programs differ from standard DIHYPH / DITECT as follows:

All Unicode modules follow the same naming pattern:
dx????uc
 |  | |____  uc   = Unicode version
 |  |
 |  |______  ???? = unspecific letters
 |
 |_________  h  = for DIHYPH program
 |_________  t  = for DITECT program
 |_________  m  = for DIHYPH and DITECT


To link DITECT as described in the installation description, the
following programs and files must be used for Unicode:

file           instead of     for
__________     __________     _______________________
DTTESTUC.c     DTTEST.c       DITECT test program   *)
DTECTEUC.c     DTECT.c        DITECT
DMRDWTUC.c     (additional C source)               **)
DMCREPUC.c     (additional C source)
DMCREPUC       (additional code file)

*)  used for test programs only
**) used for Exception and test programs



The text or publishing system has to call DITECT as follows:
alc = n;    (n = number of characters in the 'luni' string, or higher)
rc = DTECT (luni, nn);
             |    |____  (integer)  language-no.  1 - 50
             |_________  (unsigned short) Unicode character string



DITECT test program:

DTTESTUC  nn  infile  [outfile]  [(...)]
          |
          |__________  language-no.  01 - 50


infile  =  test words in Unicode-16, one word per line

outfile =  test results in Unicode-16 [optional]

(...)   =  test commands [optional]; (?) displays the possible commands.

Test results are also displayed on screen in ISO encoding.