An MIT/GNU Scheme character consists of a code part and a bucky bits part. The MIT/GNU Scheme set of characters can represent more characters than ASCII can; it includes characters with Super and Hyper bucky bits, as well as Control and Meta. Every ASCII character corresponds to some MIT/GNU Scheme character, but not vice versa.
MIT/GNU Scheme uses a 21-bit character code with 4 bucky bits. The character code contains the Unicode scalar value for the character. This is a change from earlier versions of the system, which used the ISO-8859-1 scalar value, but it is upwards compatible with previous usage, since ISO-8859-1 is a proper subset of Unicode.
(make-char code bucky-bits) builds a character from code and bucky-bits. Both code and bucky-bits must be exact non-negative integers in the appropriate range. Use char-code and char-bits to extract the code and bucky bits from the character. If 0 is specified for bucky-bits, make-char produces an ordinary character; otherwise, the appropriate bits are turned on as follows:
    1    Meta
    2    Control
    4    Super
    8    Hyper
For example,
    (make-char 97 0) ⇒ #\a
    (make-char 97 1) ⇒ #\M-a
    (make-char 97 2) ⇒ #\C-a
    (make-char 97 3) ⇒ #\C-M-a
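As a minimal sketch of how the two parts combine, the helpers below split a character into its code and bucky bits and then rebuild it. The names char-parts and parts->char are hypothetical and not part of MIT/GNU Scheme; the sketch assumes make-char, char-code, and char-bits behave as described here.

```scheme
;; Hypothetical helper: return the code and bucky bits of CHAR as a list.
(define (char-parts char)
  (list (char-code char) (char-bits char)))

;; Hypothetical helper: rebuild a character from a list made by char-parts.
(define (parts->char parts)
  (make-char (car parts) (cadr parts)))

(char-parts (make-char 97 3))                   ; ⇒ (97 3)
(char=? (parts->char (char-parts (make-char 97 3)))
        (make-char 97 3))                       ; ⇒ #t
```

The round trip returns the original character because the code and the bucky bits together are everything make-char takes as input.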
(char-bits char) returns the exact integer representation of char’s bucky bits. For example,
    (char-bits #\a) ⇒ 0
    (char-bits #\m-a) ⇒ 1
    (char-bits #\c-a) ⇒ 2
    (char-bits #\c-m-a) ⇒ 3
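The integer returned by char-bits is a sum of the bit weights 1 (Meta), 2 (Control), 4 (Super), and 8 (Hyper). The sketch below decodes that sum into a list of names. Both bucky-bit-names and char-bucky-names are hypothetical names introduced for illustration, and the symbols used for the bit names are an assumption of this sketch.

```scheme
;; Bit weights for the four bucky bits; the symbols are chosen for this sketch.
(define bucky-bit-names
  '((1 . meta) (2 . control) (4 . super) (8 . hyper)))

;; Hypothetical helper: list the names of the bucky bits set in CHAR.
;; A bit of weight W is set in BITS exactly when (quotient BITS W) is odd.
(define (char-bucky-names char)
  (let ((bits (char-bits char)))
    (let loop ((entries bucky-bit-names) (names '()))
      (cond ((null? entries)
             (reverse names))
            ((odd? (quotient bits (car (car entries))))
             (loop (cdr entries) (cons (cdr (car entries)) names)))
            (else
             (loop (cdr entries) names))))))

(char-bucky-names (make-char 97 0))    ; ⇒ ()
(char-bucky-names (make-char 97 3))    ; ⇒ (meta control)
(char-bucky-names (make-char 97 12))   ; ⇒ (super hyper)
```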
(char-code char) returns the character code of char, an exact integer. For example,
    (char-code #\a) ⇒ 97
    (char-code #\c-a) ⇒ 97
Note that in MIT/GNU Scheme, the value of char-code is the Unicode scalar value for char.
The variables char-code-limit and char-bits-limit define the (exclusive) upper limits for the character code and bucky bits, respectively. The character code and bucky bits are always exact non-negative integers, and are strictly less than the value of their respective limit variable.
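One way to use the limit variables is to validate arguments before building a character. The wrapper below is a sketch only: checked-make-char and exact-nonneg-int? are hypothetical names, and the wrapper checks the two exclusive upper limits but does not check for codes that are not valid Unicode scalar values.

```scheme
;; Hypothetical predicate: is X an exact non-negative integer?
(define (exact-nonneg-int? x)
  (and (integer? x) (exact? x) (>= x 0)))

;; Hypothetical wrapper: return #f when either argument falls outside
;; the exclusive limits, otherwise build the character with make-char.
(define (checked-make-char code bits)
  (and (exact-nonneg-int? code)
       (exact-nonneg-int? bits)
       (< code char-code-limit)
       (< bits char-bits-limit)
       (make-char code bits)))

(checked-make-char 97 char-bits-limit)   ; ⇒ #f
(checked-make-char char-code-limit 0)    ; ⇒ #f
(char-bits (checked-make-char 97 1))     ; ⇒ 1
```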
char->integer returns the character code representation for char. integer->char returns the character whose character code representation is k.
In MIT/GNU Scheme, if (char-ascii? char) is true, then the following expression is also true:
    (eqv? (char->ascii char) (char->integer char))
However, this behavior is not required by the Scheme standard, and code that depends on it is not portable to other implementations.
These procedures implement order isomorphisms between the set of characters under the char<=? ordering and some subset of the integers under the <= ordering. That is, if
    (char<=? a b) ⇒ #t and (<= x y) ⇒ #t
and x and y are in the range of char->integer, then
    (<= (char->integer a) (char->integer b)) ⇒ #t
    (char<=? (integer->char x) (integer->char y)) ⇒ #t
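The order-preserving property can be spot-checked for any pair of characters. In the sketch below, order-preserved? is a hypothetical helper that compares the answer given by char<=? with the answer given by <= on the corresponding integers.

```scheme
;; Hypothetical check: do char<=? on the characters and <= on their
;; integer representations give the same answer for A and B?
(define (order-preserved? a b)
  (eq? (char<=? a b)
       (<= (char->integer a) (char->integer b))))

(order-preserved? #\a #\b)   ; ⇒ #t  (both comparisons are true)
(order-preserved? #\b #\a)   ; ⇒ #t  (both comparisons are false)
```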
In MIT/GNU Scheme, the specific relationship implemented by these procedures is as follows:
    (define (char->integer c)
      (+ (* (char-bits c) #x200000)
         (char-code c)))

    (define (integer->char n)
      (make-char (remainder n #x200000)
                 (quotient n #x200000)))
This implies that char->integer and char-code produce identical results for characters that have no bucky bits set, and that such characters are ordered according to their Unicode scalar values.
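The relationship can be worked through with plain integer arithmetic. The sketch below mirrors the formulas on integers only; pack-char-integer, unpack-char-integer, and code-span are hypothetical names that do not exist in MIT/GNU Scheme.

```scheme
;; #x200000 is 2^21, the number of values a 21-bit character code can take.
(define code-span #x200000)

;; Hypothetical helper: combine a code and bucky bits into one integer,
;; following the char->integer formula.
(define (pack-char-integer code bits)
  (+ (* bits code-span) code))

;; Hypothetical helper: recover (code bits) from the integer,
;; following the integer->char formula.
(define (unpack-char-integer n)
  (list (remainder n code-span) (quotient n code-span)))

(pack-char-integer 97 0)        ; ⇒ 97       (no bucky bits: same as the code)
(pack-char-integer 97 1)        ; ⇒ 2097249  (2097152 + 97, Meta set)
(unpack-char-integer 2097249)   ; ⇒ (97 1)
```

With no bucky bits the packed integer equals the code, which is why char->integer and char-code agree on ordinary characters.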
Note: If the argument to char->integer or integer->char is a constant, the compiler will constant-fold the call, replacing it with the corresponding result. This is a very useful way to denote unusual character constants or ASCII codes.
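For instance, definitions like the following have constant arguments, so they are candidates for constant folding. The variable names are hypothetical and chosen only for this sketch.

```scheme
;; An unusual character constant, written by its Unicode scalar value
;; (U+03BB, GREEK SMALL LETTER LAMDA).
(define greek-lambda-char (integer->char #x3BB))

;; An ASCII code, written by naming the character.
(define code-of-upper-a (char->integer #\A))

(char->integer greek-lambda-char)   ; ⇒ 955
code-of-upper-a                     ; ⇒ 65
```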
The range of char->integer is defined to be the exact non-negative integers that are strictly less than the value of the variable char-integer-limit. Note, however, that there are some holes in this range, because the character code must be a valid Unicode scalar value.
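To illustrate where such holes come from: Unicode defines scalar values as the code points 0 through #x10FFFF, excluding the surrogate range #xD800 through #xDFFF. The predicate below is a sketch of that definition; scalar-value-code? is a hypothetical name, and whether MIT/GNU Scheme performs its check in exactly this way is an assumption.

```scheme
;; Hypothetical predicate: is N a Unicode scalar value?
;; Scalar values are 0..#xD7FF and #xE000..#x10FFFF; the gap between
;; them is the surrogate range, which is excluded.
(define (scalar-value-code? n)
  (and (integer? n)
       (exact? n)
       (or (<= 0 n #xD7FF)
           (<= #xE000 n #x10FFFF))))

(scalar-value-code? 97)         ; ⇒ #t
(scalar-value-code? #xD800)     ; ⇒ #f  (surrogate)
(scalar-value-code? #x110000)   ; ⇒ #f  (beyond the last code point)
```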
Note that the Control bucky bit is different from the ASCII control key. This means that #\SOH (ASCII ctrl-A) is different from #\C-A. In fact, the Control bucky bit is completely orthogonal to the ASCII control key, making possible such characters as #\C-SOH.
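The distinction can be seen by building the three characters with make-char rather than reader syntax. This is a sketch; the variable names are hypothetical, and it relies on SOH having code 1, A having code 65, and the Control bucky bit having weight 2.

```scheme
(define ascii-ctrl-a   (make-char 1 0))    ; SOH: code 1, no bucky bits
(define bucky-ctrl-a   (make-char 65 2))   ; A with the Control bucky bit
(define bucky-ctrl-soh (make-char 1 2))    ; SOH with the Control bucky bit

(char=? ascii-ctrl-a bucky-ctrl-a)   ; ⇒ #f
(char-code ascii-ctrl-a)             ; ⇒ 1
(char-bits ascii-ctrl-a)             ; ⇒ 0
(char-code bucky-ctrl-soh)           ; ⇒ 1
(char-bits bucky-ctrl-soh)           ; ⇒ 2
```

The ASCII control character differs from its plain letter in the code part, while the Control bucky bit leaves the code untouched and changes only the bits part.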