Framework EDI Reference. Encoding Mechanism
ASCII-Baudot

The ASCII-Baudot filter mechanism converts 7-bit ASCII characters to 5-bit Baudot characters, and vice versa.  A 7-bit ASCII character may also be stored in 8 bits, but every character that can be encoded to Baudot lies within the 128 codes of 7-bit ASCII.  Even within this set, not all characters can be converted: a Baudot character has only 5 bits, so only 32 code combinations are available to represent the 26 letters of the alphabet, the 10 numerals, and punctuation and control characters.  With this constraint, some characters must inevitably be left out.

Baudot extends its character set by dividing characters into two classes, letters and figures (henceforth denoted LTRS and FIGS respectively), so that the same 5-bit code can stand for one character in each class.  The LTRS class holds all 26 letters of the alphabet; the FIGS class holds the numerals, some punctuation marks and some control characters.  LTRS and FIGS are themselves special characters that switch the stream to the corresponding class.  Some characters are neutral and belong to both classes: blank (BLANK), space (SP), carriage return (CR) and line feed (LF).

There are two variants of Baudot: ITA2 (International Telegraph Alphabet 2), which is the CCITT version, and USTTY, the United States variant.  The following table shows both character sets.  The LTRS column is the same in both variants; the DIFFERS column marks the rows where the FIGS character differs between the two.

HEX   BITS   LTRS   FIGS (USTTY)  FIGS (ITA2)  DIFFERS
0x00  00000  BLANK  BLANK         BLANK
0x01  00001  E      3             3
0x02  00010  LF     LF            LF
0x03  00011  A      -             -
0x04  00100  SP     SP            SP
0x05  00101  S      BELL          '            yes
0x06  00110  I      8             8
0x07  00111  U      7             7
0x08  01000  CR     CR            CR
0x09  01001  D      $             #            yes
0x0A  01010  R      4             4
0x0B  01011  J      '             BELL         yes
0x0C  01100  N      ,             ,
0x0D  01101  F      !             @            yes
0x0E  01110  C      :             :
0x0F  01111  K      (             (
0x10  10000  T      5             5
0x11  10001  Z      "             +            yes
0x12  10010  L      )             )
0x13  10011  W      2             2
0x14  10100  H      #             $            yes
0x15  10101  Y      6             6
0x16  10110  P      0             0
0x17  10111  Q      1             1
0x18  11000  O      9             9
0x19  11001  B      ?             ?
0x1A  11010  G      &             *            yes
0x1B  11011  FIGS   FIGS          FIGS
0x1C  11100  M      .             .
0x1D  11101  X      /             /
0x1E  11110  V      ;             =            yes
0x1F  11111  LTRS   LTRS          LTRS
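
The two character sets can also be written down as lookup rows indexed by the 5-bit code, which makes the neutral characters and the variant differences easy to check. The following is only a sketch, not the filter's actual data structure: the names LTRS_ROW, USTTY_FIGS and ITA2_FIGS are made up for illustration, BLANK is assumed to be ASCII NUL, BELL is assumed to be ASCII BEL, and the letters I and H sit at 0x06 and 0x14.

```python
# Shift characters; each occupies the same code in both classes.
FIGS, LTRS = 0x1B, 0x1F

# One entry per 5-bit code 0x00..0x1F; None marks the two shift codes.
# Assumption: BLANK is ASCII NUL (\x00) and BELL is ASCII BEL (\x07).
LTRS_ROW = list(
    "\x00E\nA SIU"   # 0x00-0x07
    "\rDRJNFCK"      # 0x08-0x0F
    "TZLWHYPQ"       # 0x10-0x17
    "OBG"            # 0x18-0x1A
) + [None] + list("MXV") + [None]

USTTY_FIGS = list(
    "\x003\n- \x0787"
    "\r$4',!:("
    "5\")2#601"
    "9?&"
) + [None] + list("./;") + [None]

ITA2_FIGS = list(
    "\x003\n- '87"
    "\r#4\x07,@:("
    "5+)2$601"
    "9?*"
) + [None] + list("./=") + [None]

# Five bits give exactly 32 codes per row.
assert len(LTRS_ROW) == len(USTTY_FIGS) == len(ITA2_FIGS) == 2 ** 5

# Codes whose FIGS character depends on the variant.
differing = [c for c in range(32) if USTTY_FIGS[c] != ITA2_FIGS[c]]

# Codes that decode to the same character under either shift.
neutral = [c for c in range(32)
           if LTRS_ROW[c] is not None and LTRS_ROW[c] == USTTY_FIGS[c]]

print([f"0x{c:02X}" for c in differing])
print([f"0x{c:02X}" for c in neutral])
```

With the rows as given, the differing codes are 0x05, 0x09, 0x0B, 0x0D, 0x11, 0x14, 0x1A and 0x1E, and the neutral codes are 0x00, 0x02, 0x04 and 0x08.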

 

Whenever the data stream contains a run of letters, an LTRS character precedes the run, and the stream stays in the letters class until a FIGS-class character is encountered; the same holds in reverse for a run of figures.  Neutral characters such as SP do not cause a shift.  For example, when the string "The 5 men of 223rd St" is encoded to Baudot-USTTY, the byte stream is as follows:

Text     Baudot  Byte Stream
(none)   LTRS    0x1F
T        T       0x10
h        H       0x14
e        E       0x01
(space)  SP      0x04
(none)   FIGS    0x1B
5        5       0x10
(space)  SP      0x04
(none)   LTRS    0x1F
m        M       0x1C
e        E       0x01
n        N       0x0C
(space)  SP      0x04
o        O       0x18
f        F       0x0D
(space)  SP      0x04
(none)   FIGS    0x1B
2        2       0x13
2        2       0x13
3        3       0x01
(none)   LTRS    0x1F
r        R       0x0A
d        D       0x09
(space)  SP      0x04
S        S       0x05
t        T       0x10
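
The shift rule described above can be sketched as a small encoder. This is an illustration under stated assumptions, not the filter's actual implementation: encode_ustty and the table names are hypothetical, one Baudot code is written per output byte, BLANK is taken to be ASCII NUL and BELL to be ASCII BEL, and raising ValueError for a character with no Baudot equivalent is a choice made here (the text only says such characters cannot be converted).

```python
FIGS, LTRS = 0x1B, 0x1F

# USTTY rows indexed by 5-bit code; None marks the two shift codes.
LTRS_ROW = list("\x00E\nA SIU" "\rDRJNFCK" "TZLWHYPQ" "OBG") \
    + [None] + list("MXV") + [None]
USTTY_FIGS = list("\x003\n- \x0787" "\r$4',!:(" "5\")2#601" "9?&") \
    + [None] + list("./;") + [None]

# Reverse maps: character -> 5-bit code.
LTRS_CODE = {ch: c for c, ch in enumerate(LTRS_ROW) if ch is not None}
FIGS_CODE = {ch: c for c, ch in enumerate(USTTY_FIGS) if ch is not None}


def encode_ustty(text):
    """Encode ASCII text to Baudot-USTTY, one 5-bit code per byte."""
    out = []
    shift = None  # no shift character has been sent yet
    for ch in text:
        if ch.isascii():
            ch = ch.upper()  # Baudot has no lower case
        in_ltrs, in_figs = ch in LTRS_CODE, ch in FIGS_CODE
        if not (in_ltrs or in_figs):
            raise ValueError(f"{ch!r} has no Baudot-USTTY equivalent")
        if in_ltrs and in_figs:
            # Neutral character: same code in both classes, no shift needed.
            out.append(LTRS_CODE[ch])
            continue
        wanted = LTRS if in_ltrs else FIGS
        if shift != wanted:
            out.append(wanted)  # announce the change of class
            shift = wanted
        out.append(LTRS_CODE[ch] if in_ltrs else FIGS_CODE[ch])
    return bytes(out)


stream = encode_ustty("The 5 men of 223rd St")
print(" ".join(f"0x{b:02X}" for b in stream))
```

For the example string this yields 26 codes. The digit 5 comes out as 0x10, the code the USTTY table assigns to T in the LTRS class and 5 in the FIGS class, and the SP after it is sent without a shift because SP is neutral.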

Notice that lower-case letters are encoded to their upper-case equivalents.  Because of this, an encoded Baudot stream always decodes to upper case.  In the above example, the encoded byte stream would decode to "THE 5 MEN OF 223RD ST".
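
The loss of case can be seen in a matching decoder sketch. As before, this is a hypothetical illustration rather than the filter's code: decode_ustty and the table names are made up, one Baudot code per input byte is assumed, and starting in the LTRS class when no shift has been seen yet is an assumption. The input is the stream for the example string, with the digit 5 at 0x10 as the USTTY table gives it.

```python
FIGS, LTRS = 0x1B, 0x1F

# USTTY rows indexed by 5-bit code; None marks the two shift codes.
# Assumption: BLANK is ASCII NUL (\x00) and BELL is ASCII BEL (\x07).
LTRS_ROW = list("\x00E\nA SIU" "\rDRJNFCK" "TZLWHYPQ" "OBG") \
    + [None] + list("MXV") + [None]
USTTY_FIGS = list("\x003\n- \x0787" "\r$4',!:(" "5\")2#601" "9?&") \
    + [None] + list("./;") + [None]


def decode_ustty(data):
    """Decode a Baudot-USTTY byte stream (one code per byte) to ASCII."""
    row = LTRS_ROW  # assumed initial class
    out = []
    for code in data:
        if code > 0x1F:
            raise ValueError(f"0x{code:02X} is not a 5-bit Baudot code")
        if code == LTRS:
            row = LTRS_ROW      # following codes are letters
        elif code == FIGS:
            row = USTTY_FIGS    # following codes are figures
        else:
            out.append(row[code])
    return "".join(out)


stream = bytes.fromhex(
    "1f 10 14 01 04 1b 10 04 1f 1c 01 0c 04"
    " 18 0d 04 1b 13 13 01 1f 0a 09 04 05 10"
)
print(decode_ustty(stream))
```

Because the LTRS row contains only upper-case letters, nothing in the stream can bring back the original "The" or "men"; the output is all upper case.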

This encoding mechanism is not widely used mainly because: