The Ascii-Baudot filter mechanism converts 7-bit ASCII characters to 5-bit Baudot characters, and vice versa. A 7-bit ASCII character can also be stored in 8 bits, but every character that can be encoded to Baudot falls within the 128 code points of 7-bit ASCII. Even within this set, not all characters can be converted. A Baudot character is represented in 5 bits, so only 32 code combinations are available to represent the 26 letters of the alphabet, the 10 numerals, and punctuation and control characters. With this constraint, some characters must inevitably be left out.
Baudot stretches its character set by dividing characters into two classes: letters and figures (henceforth denoted LTRS and FIGS respectively). The LTRS class contains all 26 letters of the alphabet, and the FIGS class contains the numerals, some punctuation marks and some control characters. LTRS and FIGS are themselves special characters. Some characters are neutral and belong to both classes: blank (BLANK), space (SP), carriage return (CR) and line feed (LF).
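
The capacity argument can be checked with a little arithmetic. The sketch below assumes that six codes mean the same thing in both classes (BLANK, SP, CR, LF and the two special characters LTRS and FIGS), as the character tables in this section show. The variable names are illustrative only.

```python
CODES = 2 ** 5                  # 32 distinct 5-bit codes
LETTERS, DIGITS = 26, 10

# Letters and digits alone already exceed what a single 5-bit table can hold.
print(LETTERS + DIGITS > CODES)             # True

SHARED = 6                      # BLANK, SP, CR, LF, LTRS, FIGS (assumed from the tables)
per_class = CODES - SHARED      # codes that carry one LTRS and one FIGS meaning
distinct_symbols = 2 * per_class + SHARED
print(per_class, distinct_symbols)          # 26 58
```

The 26 remaining codes are exactly enough for the alphabet on the LTRS side, which leaves 26 FIGS slots for the 10 numerals plus punctuation and control characters.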
There are two variants of Baudot: ITA2 (International Telegraph Alphabet 2), which is the CCITT version, and USTTY, the United States variant. The following table shows both character sets; rows marked with * in the DIFF column are those where the two variants differ.
HEX | BITS | USTTY LTRS | USTTY FIGS | ITA2 LTRS | ITA2 FIGS | DIFF |
0x00 | 00000 | BLANK | BLANK | BLANK | BLANK | |
0x01 | 00001 | E | 3 | E | 3 | |
0x02 | 00010 | LF | LF | LF | LF | |
0x03 | 00011 | A | - | A | - | |
0x04 | 00100 | SP | SP | SP | SP | |
0x05 | 00101 | S | BELL | S | ' | * |
0x06 | 00110 | I | 8 | I | 8 | |
0x07 | 00111 | U | 7 | U | 7 | |
0x08 | 01000 | CR | CR | CR | CR | |
0x09 | 01001 | D | $ | D | # | * |
0x0A | 01010 | R | 4 | R | 4 | |
0x0B | 01011 | J | ' | J | BELL | * |
0x0C | 01100 | N | , | N | , | |
0x0D | 01101 | F | ! | F | @ | * |
0x0E | 01110 | C | : | C | : | |
0x0F | 01111 | K | ( | K | ( | |
0x10 | 10000 | T | 5 | T | 5 | |
0x11 | 10001 | Z | " | Z | + | * |
0x12 | 10010 | L | ) | L | ) | |
0x13 | 10011 | W | 2 | W | 2 | |
0x14 | 10100 | H | # | H | $ | * |
0x15 | 10101 | Y | 6 | Y | 6 | |
0x16 | 10110 | P | 0 | P | 0 | |
0x17 | 10111 | Q | 1 | Q | 1 | |
0x18 | 11000 | O | 9 | O | 9 | |
0x19 | 11001 | B | ? | B | ? | |
0x1A | 11010 | G | & | G | * | * |
0x1B | 11011 | FIGS | FIGS | FIGS | FIGS | |
0x1C | 11100 | M | . | M | . | |
0x1D | 11101 | X | / | X | / | |
0x1E | 11110 | V | ; | V | = | * |
0x1F | 11111 | LTRS | LTRS | LTRS | LTRS | |
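
As a minimal sketch, the two character sets can be held as lists indexed by the 5-bit code, which also makes it easy to list the codes where the variants differ. The list names below (`LTRS_ROW`, `USTTY_FIGS`, `ITA2_FIGS`) are hypothetical and are not part of the filter; the entries are copied from the table above.

```python
# Symbols for codes 0x00..0x1F, in code order, copied from the table.
# The LTRS class is identical in both variants.
LTRS_ROW = "BLANK E LF A SP S I U CR D R J N F C K T Z L W H Y P Q O B G FIGS M X V LTRS".split()
USTTY_FIGS = "BLANK 3 LF - SP BELL 8 7 CR $ 4 ' , ! : ( 5 \" ) 2 # 6 0 1 9 ? & FIGS . / ; LTRS".split()
ITA2_FIGS = "BLANK 3 LF - SP ' 8 7 CR # 4 BELL , @ : ( 5 + ) 2 $ 6 0 1 9 ? * FIGS . / = LTRS".split()

# Each list must cover all 32 five-bit codes.
assert len(LTRS_ROW) == len(USTTY_FIGS) == len(ITA2_FIGS) == 32

# Codes whose FIGS meaning differs between USTTY and ITA2.
differing = [code for code in range(32) if USTTY_FIGS[code] != ITA2_FIGS[code]]

for code in differing:
    print(f"0x{code:02X} {code:05b}  USTTY {USTTY_FIGS[code]:<5} ITA2 {ITA2_FIGS[code]}")
```

Under these assumptions the loop prints eight rows, for codes 0x05, 0x09, 0x0B, 0x0D, 0x11, 0x14, 0x1A and 0x1E.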
A run of consecutive letters in the data stream is preceded by a single LTRS character, which stays in effect until a FIGS-class character is encountered; a FIGS character is then inserted before the run of figures, and vice versa. Neutral characters such as SP do not trigger a switch. For example, when the string "The 5 men of 223rd St" is encoded to Baudot-USTTY, the byte stream would be as follows:
Text | | T | h | e | (space) | | 5 | (space) | | m | e | n | (space) | o | f | (space) | | 2 | 2 | 3 | | r | d | (space) | S | t |
Baudot | LTRS | T | H | E | SP | FIGS | 5 | SP | LTRS | M | E | N | SP | O | F | SP | FIGS | 2 | 2 | 3 | LTRS | R | D | SP | S | T |
Byte Stream | 0x1F | 0x10 | 0x14 | 0x01 | 0x04 | 0x1B | 0x10 | 0x04 | 0x1F | 0x1C | 0x01 | 0x0C | 0x04 | 0x18 | 0x0D | 0x04 | 0x1B | 0x13 | 0x13 | 0x01 | 0x1F | 0x0A | 0x09 | 0x04 | 0x05 | 0x10 |
Notice that lowercase letters are encoded as their uppercase equivalents, because Baudot has no lowercase letters. As a result, an encoded Baudot stream always decodes to uppercase. In the above example, the byte stream would decode to "THE 5 MEN OF 223RD ST".
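
The shift behaviour described above can be sketched as a small USTTY encoder and decoder. This is an illustration rather than the filter's actual implementation: the function names `encode` and `decode` are hypothetical, raising `ValueError` for unconvertible characters is an assumption, and so is starting the decoder in the LTRS state. BLANK is mapped to NUL and BELL to ASCII 0x07 here, which is also an assumption.

```python
LTRS, FIGS = 0x1F, 0x1B  # shift codes from the character table

# Assumed ASCII equivalents for the named control characters.
NAMES = {"BLANK": "\x00", "LF": "\n", "SP": " ", "CR": "\r", "BELL": "\x07"}

def _row(tokens):
    """Map 5-bit code -> ASCII character, skipping the two shift codes."""
    return {code: NAMES.get(tok, tok)
            for code, tok in enumerate(tokens.split())
            if tok not in ("LTRS", "FIGS")}

LETTERS = _row("BLANK E LF A SP S I U CR D R J N F C K T Z L W H Y P Q O B G FIGS M X V LTRS")
FIGURES = _row("BLANK 3 LF - SP BELL 8 7 CR $ 4 ' , ! : ( 5 \" ) 2 # 6 0 1 9 ? & FIGS . / ; LTRS")

ENC_L = {ch: code for code, ch in LETTERS.items()}
ENC_F = {ch: code for code, ch in FIGURES.items()}
NEUTRAL = set(ENC_L) & set(ENC_F)  # BLANK, LF, SP, CR: valid in either class

def encode(text):
    """Encode ASCII text to USTTY codes, inserting shifts only when needed."""
    out, shift = [], None
    for ch in text.upper():            # Baudot has no lowercase letters
        if ch in NEUTRAL:
            out.append(ENC_L[ch])      # neutral characters never force a shift
            continue
        if ch in ENC_L:
            wanted, table = LTRS, ENC_L
        elif ch in ENC_F:
            wanted, table = FIGS, ENC_F
        else:
            raise ValueError(f"cannot encode {ch!r} in Baudot-USTTY")
        if shift != wanted:            # emit a shift only at a class change
            out.append(wanted)
            shift = wanted
        out.append(table[ch])
    return bytes(out)

def decode(data):
    """Decode USTTY codes to ASCII text, assuming the stream starts in LTRS."""
    table, out = LETTERS, []
    for code in data:
        if code == LTRS:
            table = LETTERS
        elif code == FIGS:
            table = FIGURES
        else:
            out.append(table[code])    # KeyError for values outside 0x00..0x1F
    return "".join(out)

stream = encode("The 5 men of 223rd St")
print(stream.hex(" "))
# 1f 10 14 01 04 1b 10 04 1f 1c 01 0c 04 18 0d 04 1b 13 13 01 1f 0a 09 04 05 10
print(decode(stream))
# THE 5 MEN OF 223RD ST
```

The encoder tracks only the current shift state, so a space between two letter runs costs one byte and no extra shift character.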
This encoding mechanism is not widely used mainly because: