ZX Spectrum character set

The ZX Spectrum character set is the variant of ASCII used in the ZX Spectrum family of computers. It is based on ASCII-1967, but the characters ^, ` and DEL are replaced with ↑, £ and ©. It also differs in its use of the C0 control codes other than the common BS and CR, and it makes use of the 128 high-bit characters beyond the ASCII range.[1] The ZX Spectrum's main set of printable characters and its system font are also used by the Jupiter Ace computer.

The ZX Spectrum character set as rendered in the system font (not including User-Defined Graphics characters).

Printable characters

 
Screenshot of output from a Sinclair BASIC program that demonstrates all printable code points including BASIC keywords and the User-Defined Graphics characters (by default defined as copies of the A-U glyphs).

The Spectrum character set includes standard US-ASCII, 0x20–0x7F, with three exceptions: code point 0x5E is an up-arrow (↑) instead of a caret (^), 0x60 is the pound sign (£) instead of the grave accent (`), and 0x7F is the copyright sign (©) instead of the control character DEL. The older 1963 version of ASCII also had ↑ at 0x5E. The £ sign was not mapped to 0x23 as in the British variant of ASCII (ISO-646-GB), so both the pound sign and the number sign (#) are available at the same time. The ↑ character is the exponentiation operator in the Spectrum's BASIC, just as the ^ it replaces is used for exponentiation in many other dialects of BASIC and in other programming languages.

Beyond 0x7F, the Spectrum character set uses the high-bit range 0x80–0xFF for special purposes. Code points 0x80–0x8F contain the same 2×2 block graphics characters that the ZX80 and ZX81 character sets have (at other locations); these are also available in the Block Elements Unicode block. However, the ZX Spectrum's standard character set does not include the 50% dithered 1×2 block graphics characters of the ZX80 and ZX81. Code points 0x90–0xA4 originally contained the 21 User-Defined Graphics (UDG) characters, and 0xA5–0xFF contain BASIC keywords tokenized as single code points. In the 128 BASIC mode introduced later, this was changed to 19 UDG characters ending at 0xA2, followed by the two new tokens SPECTRUM (0xA3) and PLAY (0xA4). Code points 0xC7–0xC9 are the two-character operators <=, >= and <>, similarly tokenized into single code points. These tokens allow a BASIC command like PRINT to be entered with the single keypress P at the beginning of a line (i.e. in command mode), which generates the token 0xF5. The token is displayed on screen as the full keyword PRINT, but only a single byte is stored, so only that byte needs to be parsed by the interpreter or saved to and loaded from external storage such as tape.
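The token scheme can be illustrated with a minimal Python sketch. The token values are taken from the character-set table in this article; the function name `detokenize` and the `<XX>` placeholder for unhandled bytes are assumptions made for illustration, and the sketch does not reproduce the spacing that a real listing places around keywords.

```python
# Partial token table, copied from the character-set table in this article.
# A full table would cover every code point from 0xA5 to 0xFF.
TOKENS = {
    0xC7: "<=",
    0xC8: ">=",
    0xC9: "<>",
    0xCB: "THEN",
    0xEC: "GO TO",
    0xF5: "PRINT",
    0xFA: "IF",
}


def detokenize(data):
    """Expand single-byte keyword tokens; pass printable ASCII through unchanged."""
    out = []
    for byte in data:
        if byte in TOKENS:
            out.append(TOKENS[byte])
        elif 0x20 <= byte <= 0x7E:
            out.append(chr(byte))
        else:
            out.append("<%02X>" % byte)  # anything this sketch does not handle
    return "".join(out)


# One stored byte (0xF5) stands for the five letters of PRINT.
stored = bytes([0xF5]) + b'"HI"'
listing = detokenize(stored)
```

Here the stored form is 5 bytes long, while the expanded listing text is 9 characters long.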

All non-UDG Spectrum characters can be mapped to Unicode. The three characters that differ from ASCII-1967, namely ↑, £ and ©, are at U+2191, U+00A3 and U+00A9. The 2×2 block graphics characters are in the Block Elements block at U+2580–U+259F, although font support for these is not universal.
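As a rough sketch, the mapping for code points 0x20–0x8F could be written in Python as follows. The names `to_unicode`, `OVERRIDES` and `BLOCKS` are illustrative. The order of the block graphics assumes that bit 0 of the low nibble selects the top-right quadrant, bit 1 the top-left, bit 2 the bottom-right and bit 3 the bottom-left; this layout is not stated in the article and should be checked against the manual before being relied on.

```python
# The three code points that differ from ASCII-1967.
OVERRIDES = {0x5E: "\u2191", 0x60: "\u00A3", 0x7F: "\u00A9"}

# Block graphics 0x80-0x8F, indexed by the low nibble.
# Assumed bit layout: bit 0 = top-right, bit 1 = top-left,
# bit 2 = bottom-right, bit 3 = bottom-left.
BLOCKS = (
    " \u259D\u2598\u2580"
    "\u2597\u2590\u259A\u259C"
    "\u2596\u259E\u258C\u259B"
    "\u2584\u259F\u2599\u2588"
)


def to_unicode(code):
    """Map one Spectrum code point in 0x20-0x8F to a Unicode character."""
    if code in OVERRIDES:
        return OVERRIDES[code]
    if 0x20 <= code <= 0x7E:
        return chr(code)
    if 0x80 <= code <= 0x8F:
        return BLOCKS[code - 0x80]
    raise ValueError("no single-character mapping for 0x%02X" % code)


text = "".join(to_unicode(c) for c in b"2\x5e8 = \x60256")
```

UDG characters and keyword tokens are deliberately left out, since they have no single-character Unicode equivalent.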

The shapes of the UDG characters are stored in RAM and are initialized to copies of the characters A–U, but they can be redefined arbitrarily, for example using the BASIC command POKE. Like all characters in the system font, they use an 8×8 pixel grid stored in 8 bytes. Redefining them changes their appearance in subsequent PRINT statements, but it does not change any UDG characters already drawn on the screen. The location of a UDG character's definition can be determined with the BASIC function USR with the character as the argument, e.g. USR "A" for the first one. By default the definitions occupy the last 168 (21×8) bytes of RAM, at memory addresses 65368 (0xFF58) to 65535 (0xFFFF) on a 48K Spectrum. The start of this area is held in the system variable UDG,[2] which is found at memory addresses 23675/6 (0x5C7B/C) and can be changed. The TK90X, a Brazilian clone of the ZX Spectrum, included an in-ROM application to edit these UDG characters graphically, along with functionality to preload them with the accented letters used in Portuguese. (For this, the TK90X defined two extra BASIC commands at the codes 0 and 1, "trace" and "udg" respectively.)[3]
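The address arithmetic can be sketched in Python. The function names `udg_address` and `glyph_bytes` are hypothetical, and the sketch assumes that the leftmost pixel of each row is the most significant bit of its byte, which the article does not state.

```python
UDG_DEFAULT = 0xFF58  # default start of the UDG area on a 48K machine (65368)


def udg_address(letter, udg_base=UDG_DEFAULT):
    """Address of the first definition byte of UDG 'A'..'U', as USR "A" would give."""
    index = ord(letter.upper()) - ord("A")
    if not 0 <= index < 21:
        raise ValueError("UDG letters run from A to U")
    return udg_base + 8 * index


def glyph_bytes(rows):
    """Pack eight 8-character rows ('#' = ink, anything else = paper) into 8 bytes.

    Assumes the leftmost pixel is the most significant bit of each byte.
    """
    if len(rows) != 8 or any(len(row) != 8 for row in rows):
        raise ValueError("a glyph is 8 rows of 8 pixels")
    return bytes(
        int("".join("1" if c == "#" else "0" for c in row), 2) for row in rows
    )


# (address, value) pairs that a series of POKE commands would write for UDG "A".
box = glyph_bytes(["########"] + ["#......#"] * 6 + ["########"])
pokes = [(udg_address("A") + i, value) for i, value in enumerate(box)]
```

Each pair in `pokes` corresponds to one POKE address, value command in BASIC.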

The definitions of the main system font, code points 32 (space) to 127 (copyright), are referenced by the system variable CHARS, which is found at memory addresses 23606/7 (0x5C36/7). Its value is defined as 256 bytes lower than the address of the first byte of the space character, which simplifies the formula for locating a character's definition to CHARS+8×code point. CHARS defaults to 15360 (0x3C00), with the system font at the end of the Spectrum's ROM at addresses 15616 (0x3D00) to 16383 (0x3FFF). Entire alternative fonts can be loaded into RAM and the CHARS variable re-pointed accordingly.[2]
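The CHARS formula can be checked with a short Python sketch. The function names are illustrative, and the RAM address 0xF000 in the last line is an arbitrary example, not a value from the article.

```python
CHARS_DEFAULT = 0x3C00  # default value of the CHARS system variable (15360)


def glyph_address(code, chars=CHARS_DEFAULT):
    """Address of the first of the 8 bytes defining a character in 32..127."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("the system font covers code points 32 to 127")
    return chars + 8 * code


def chars_for_font(font_start):
    """CHARS value to use for a font whose space glyph begins at font_start."""
    return font_start - 256  # 256 = 8 bytes x 32 skipped code points


space_glyph = glyph_address(0x20)             # 0x3D00, the start of the ROM font
copyright_last_byte = glyph_address(0x7F) + 7  # 0x3FFF, the last byte of the ROM
custom_chars = chars_for_font(0xF000)          # hypothetical font loaded at 0xF000
```

Because the offset of 256 bytes is built into CHARS, no subtraction of 32 is needed when looking up a glyph.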

Control codes

In the control codes area (the C0 range), the Spectrum mostly uses proprietary controls, such as INK and PAPER to control foreground and background colour. However, the common BS and CR code points are the same as in ASCII. Cursor-down (0x0A, ASCII Line Feed) can be simulated with 32 spaces printed with OVER 1 (transparent overprint), and cursor-up (0x0B, ASCII Vertical Tabulation) can be simulated with 32 backspaces. The system ROM has a fault which prevents cursor-right at 0x09 (cf. ASCII Horizontal Tabulation) from working.[4][5]

Control code 0x0E is used to indicate that a floating-point number follows, to accelerate text processing. In a Sinclair BASIC program numeric constants are stored as ASCII followed by a 0x0E byte and a 5-byte binary floating point representation. When listing a BASIC program only the ASCII part is used but at runtime only the binary representation is used. Some Spectrum programs exploited this to obfuscate numbers, while others did so to save memory.[6] For example, a BASIC line displayed as GO TO 10 could contain the ASCII characters for digits 1 and 0 followed by a 0x0E byte and the floating-point representation of 100 instead of 10. Anyone listing that program saw the number 10, but when executed the program jumped to line 100.
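The GO TO example can be reproduced byte by byte in Python. This is a sketch under one stated assumption that goes beyond the article: that whole numbers from 0 to 65535 are held in the 5-byte field as 0x00, a sign byte, the low byte, the high byte and 0x00. The helper names are hypothetical, and the line header and line terminator of a stored BASIC line are omitted.

```python
GO_TO = 0xEC   # token for GO TO, from the character-set table
NUMBER = 0x0E  # marker that precedes the 5-byte binary value


def small_int_bytes(value):
    """Assumed 5-byte form for integers 0..65535: 00, sign, low, high, 00."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("this sketch only covers 0..65535")
    return bytes([0x00, 0x00, value & 0xFF, value >> 8, 0x00])


def numeric_constant(shown, actual):
    """ASCII digits shown in a listing, then 0x0E and the value used at runtime."""
    return shown.encode("ascii") + bytes([NUMBER]) + small_int_bytes(actual)


# Lists as GO TO 10, but the binary part holds 100.
honest = bytes([GO_TO]) + numeric_constant("10", 10)
obfuscated = bytes([GO_TO]) + numeric_constant("10", 100)
```

The two statements are the same length and differ only in the low byte of the binary part (0x0A against 0x64), which is why the substitution is invisible in a listing.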

Undefined codes

Code points 0x00–0x05, 0x07, 0x0A–0x0C, 0x0F and 0x18–0x1F are undefined. In most cases, they produce a question mark if printed to the display. However, they may be used to represent their literal numeric values as parameters of certain control codes: for example, 0x10 followed by 0x07 sets the ink (foreground text) colour to colour number 7 (white).
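The way a parameter byte follows a control code can be sketched as a small Python stream splitter. The control names come from the character-set table. The parameter counts are an assumption not spelled out in the article: one byte after each of the colour controls 0x10–0x15, and two bytes after AT and TAB. The function name `split_controls` is hypothetical.

```python
NAMES = {0x10: "INK", 0x11: "PAPER", 0x12: "FLASH", 0x13: "BRIGHT",
         0x14: "INVERSE", 0x15: "OVER", 0x16: "AT", 0x17: "TAB"}

# Assumed number of parameter bytes that follow each control code.
PARAMS = {0x10: 1, 0x11: 1, 0x12: 1, 0x13: 1, 0x14: 1, 0x15: 1,
          0x16: 2, 0x17: 2}


def split_controls(data):
    """Split a byte stream into text runs and (control name, parameters) tuples."""
    out = []
    text = bytearray()
    i = 0
    while i < len(data):
        code = data[i]
        if code in PARAMS:
            if text:  # flush the text collected so far
                out.append(text.decode("latin-1"))
                text = bytearray()
            count = PARAMS[code]
            out.append((NAMES[code], tuple(data[i + 1:i + 1 + count])))
            i += 1 + count
        else:
            text.append(code)
            i += 1
    if text:
        out.append(text.decode("latin-1"))
    return out


# 0x10 followed by 0x07 selects ink colour 7 (white) before the text.
parts = split_controls(b"\x10\x07Hi")
```

In this reading, the byte 0x07 is never printed as a character: it is consumed as the argument of the preceding INK control.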

Character set

Spectrum Character Set[1]
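|    | 0_ keypress | 0_ character | 1_ | 2_ | 3_ | 4_ | 5_ | 6_ | 7_ | 8_ | 9_ | A_ | B_ | C_ | D_ | E_ | F_ |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| _0 |  |  | INK | space | 0 | @ | P | £[a] | p | block | (A)[b] | (Q)[b] | VAL | USR | FORMAT | LPRINT | LIST |
| _1 |  |  | PAPER | ! | 1 | A | Q | a | q | block | (B)[b] | (R)[b] | LEN | STR$ | MOVE | LLIST | LET |
| _2 |  |  | FLASH | " | 2 | B | R | b | r | block | (C)[b] | (S)[b] | SIN | CHR$ | ERASE | STOP | PAUSE |
| _3 |  |  | BRIGHT | # | 3 | C | S | c | s | block | (D)[b] | (T)[c] | COS | NOT | OPEN # | READ | NEXT |
| _4 | true video |  | INVERSE | $ | 4 | D | T | d | t | block | (E)[b] | (U)[d] | TAN | BIN | CLOSE # | DATA | POKE |
| _5 | inv video |  | OVER | % | 5 | E | U | e | u | block | (F)[b] | RND | ASN | OR | MERGE | RESTORE | PRINT |
| _6 | caps lock | comma | AT | & | 6 | F | V | f | v | block | (G)[b] | INKEY$ | ACS | AND | VERIFY | NEW | PLOT |
| _7 | edit |  | TAB | ' | 7 | G | W | g | w | block | (H)[b] | PI | ATN | <= | BEEP | BORDER | RUN |
| _8 | left | left[e] |  | ( | 8 | H | X | h | x | block | (I)[b] | FN | LN | >= | CIRCLE | CONTINUE | SAVE |
| _9 | right | right[f] |  | ) | 9 | I | Y | i | y | block | (J)[b] | POINT | EXP | <> | INK | DIM | RANDOMIZE |
| _A | down |  |  | * | : | J | Z | j | z | block | (K)[b] | SCREEN$ | INT | LINE | PAPER | REM | IF |
| _B | up |  |  | + | ; | K | [ | k | { | block | (L)[b] | ATTR | SQR | THEN | FLASH | FOR | CLS |
| _C | delete |  |  | , | < | L | \ | l | \| | block | (M)[b] | AT | SGN | TO | BRIGHT | GO TO | DRAW |
| _D | enter | enter |  | - | = | M | ] | m | } | block | (N)[b] | TAB | ABS | STEP | INVERSE | GO SUB | CLEAR |
| _E | extend | number[g] |  | . | > | N | ↑[a] | n | ~ | block | (O)[b] | VAL$ | PEEK | DEF FN | OVER | INPUT | RETURN |
| _F | graphics |  |  | / | ? | O | _ | o | ©[a] | block | (P)[b] | CODE | IN | CAT | OUT | LOAD | COPY |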

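See also

  • ZX80 character set
  • ZX81 character set
  • PETSCII
  • ATASCII
  • Atari ST character set
  • Extended ASCII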

Notes

  a. Different from US-ASCII.
  b. UDG (User-Defined Graphics) character.
  c. UDG T in 48 BASIC, keyword SPECTRUM in 128 BASIC.
  d. UDG U in 48 BASIC, keyword PLAY in 128 BASIC.
  e. In the standard ROM, CHR$ 8 fails when backing from line 1 to line zero, and fails in a different way when backing off line zero.
  f. In the standard ROM, CHR$ 9 does not actually move the text output position.
  g. Used in BASIC programs as a marker prefixing a 5-byte floating-point number.

References

  1. ZX Spectrum manual, Appendix A, the character set.
  2. ZX Spectrum manual, Chapter 25, the system variables.
  3. "Los Comandos Exclusivos de la TK 90X".
  4. Logan, Ian (1983). Understanding Your Spectrum. Melbourne House. p. 189. ISBN 086161111X.
  5. Wearmouth, Geoff. An Assembly File Listing to generate a 16K ROM for the ZX Spectrum. Archived from the original on August 25, 2015.
  6. Swann, Richard P. "Part 4 Decrypters". HOW TO HACK on the ZX Spectrum.
  • Sinclair Basic Manual, Steven Vickers, Robin Bradbeer (ed.); pub. Sinclair Research Limited. Online copy at World of Spectrum

External links

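  • Sinclair Spectrum 48K Character Set. From Michael Zaretski's website
  • Mapping table from Sinclair Spectrum 48K Character Set to Unicode. From the same site
  • The floating point package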
