Major Corrections to 5-level code Extension Scheme

From: jsavard@ecn.ab.ca ()
Date: Sat, 29 Mar 2003 19:49:44 GMT

After recent clarifications to my web page at

http://home.ecn.ab.ca/~jsavard/crypto/mi6133.htm

I realized that one thing still needed a clear explanation: how to parse
the sequences of multiple shift codes used in my scheme, which attempted
to represent a large set of characters - even a gamut as large as that of
Unicode - within 5-level code.

This wasn't a sort of UTF-5, however, since it wasn't based on Unicode;
instead, it required each language to be adapted separately to 5-level
code for maximum bandwidth efficiency.

In doing this, I found I had made some errors. I had to change some of
the sequences (though only less-used ones, if one can use the term
"less-used" about something that has yet to be implemented anywhere) so
that I could propose a simple and unambiguous decoding rule. I also found
that I had left the notion of a superfluous FIGS code dangling about.
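
For readers unfamiliar with shift codes, the general idea of decoding with
a shift state can be sketched as follows. This is only a minimal
illustration of ordinary two-shift (LTRS/FIGS) decoding in a conventional
5-level code, not the extended multi-shift scheme described on the page.
The function name decode, the four-character subset of the code table, and
the choice of bit 1 as the least significant bit are all assumptions made
for the example.

```python
# Illustrative subset of a conventional two-shift 5-level code (ITA2-style).
# Numeric values assume bit 1 is the least significant bit (an assumption).
# This is NOT the extended scheme from the page, only plain LTRS/FIGS decoding.
LTRS = 0x1F  # shift to letters case
FIGS = 0x1B  # shift to figures case

# Hypothetical partial tables: only a few codes are listed for brevity.
LETTERS = {0x01: "E", 0x03: "A", 0x10: "T", 0x04: " "}
FIGURES = {0x01: "3", 0x03: "-", 0x10: "5", 0x04: " "}


def decode(codes):
    """Decode a sequence of 5-bit codes, tracking the current shift state."""
    table = LETTERS  # assume the link starts in letters case
    out = []
    for code in codes:
        if code == LTRS:
            table = LETTERS
        elif code == FIGS:
            # A repeated FIGS leaves the state unchanged, so it is harmless
            # but carries no information.
            table = FIGURES
        else:
            # "?" marks codes outside this small illustrative subset.
            out.append(table.get(code, "?"))
    return "".join(out)


print(decode([0x10, 0x01, 0x03, FIGS, 0x01, FIGS, 0x10, LTRS, 0x01]))
# prints TEA35E
```

In this sketch the second FIGS is redundant: the decoder is already in
figures case, so the code changes nothing. A scheme with more than two
shifts needs a rule saying what each shift code means in each state, which
is where ambiguity can creep in.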

My page, therefore, is now corrected and updated. I hope it is now
understandable enough to show that, although 5-level code hardly seems
like the vehicle for a large character gamut with 8-bit codes so widely
available, I really have proposed something which would, in the same way
that Recommendation S.2 did, give 5-level links a bit more life and
versatility where bandwidth limitations require them.

John Savard


