Table of Contents for Programming Languages: a survey

Chapter: encoding

Instruction encoding

Fixed or variable length

The various syntactic constructs of a target language program can be of fixed or variable length.

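An advantage of fixed-length encodings is that instructions can be decoded in parallel: the start of every instruction is known in advance, so decoding one instruction does not have to wait until the length of the previous one has been determined.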
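As a rough illustration of why fixed-length encodings decode independently, here is a minimal sketch. The 4-byte little-endian instruction word and the names `WIDTH`, `instruction_at` and `decode_all` are assumptions made for this example; they do not describe any particular instruction set.

```python
import struct

WIDTH = 4  # assumed instruction width in bytes (hypothetical format)


def instruction_at(code: bytes, index: int) -> int:
    """Return the index-th instruction word without looking at earlier ones."""
    offset = index * WIDTH  # the boundary is pure arithmetic, no scanning needed
    (word,) = struct.unpack_from("<I", code, offset)
    return word


def decode_all(code: bytes) -> list:
    # Each call is independent of the others, so in principle the
    # instructions could be handed to separate decoders at the same time.
    return [instruction_at(code, i) for i in range(len(code) // WIDTH)]


# Three made-up instruction words packed back to back.
program = struct.pack("<3I", 0x01020304, 0xAABBCCDD, 0x00000010)
```

With a variable-length encoding, the offset of instruction `i` would depend on the lengths of all instructions before it, so `instruction_at` could not be written as a single multiplication.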

Variable length encodings

For constructs with variable-length encodings, a way of finding the end of each construct must be provided. The possibilities and tradeoffs are similar to string length/termination schemes, and particularly to the tradeoffs between C-style strings and Pascal-style strings (see 'String length/termination schemes', below), except:

Data encoding

String length/termination schemes

The two most popular schemes are 'Pascal-style' strings, in which each string has a fixed-length header that specifies the string length (two bytes is a popular choice), and 'C-style' strings, in which each string is terminated by a zero byte, the NUL character.
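The two schemes can be sketched side by side as follows. This is a minimal illustration, not a reference implementation: the two-byte little-endian header layout and all function names are assumptions chosen for the example.

```python
import struct


def encode_pascal(data: bytes) -> bytes:
    # Assumed layout: two-byte little-endian length header, then the payload.
    if len(data) > 0xFFFF:
        raise ValueError("payload too long for a two-byte length header")
    return struct.pack("<H", len(data)) + data


def decode_pascal(buf: bytes, offset: int = 0):
    """Return (payload, offset just past the string)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("buffer is shorter than the declared length")
    return buf[start:end], end


def encode_c(data: bytes) -> bytes:
    # The terminator value cannot appear inside the payload.
    if b"\x00" in data:
        raise ValueError("payload may not contain a zero byte")
    return data + b"\x00"


def decode_c(buf: bytes, offset: int = 0):
    """Return (payload, offset just past the terminator)."""
    end = buf.index(b"\x00", offset)  # linear scan for the zero byte
    return buf[offset:end], end + 1
```

In this sketch, the header-based decoder finds the end of the string with one addition, while the terminator-based decoder has to scan every byte. The header-based encoder must reject payloads longer than the header can express, and the terminator-based encoder must reject payloads containing a zero byte.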

Pascal-style strings have the disadvantages that:

C-style strings have the disadvantages that:

Pascal-style strings can be extended by various encoding schemes that allow arbitrary lengths to be encoded into a variable-length header. For example:
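One widely used scheme of this kind is a base-128 variable-length integer, in which each header byte carries seven bits of the length and the high bit signals that another byte follows. The sketch below is an illustration chosen for this example and is not necessarily the scheme the text has in mind; all function names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte of the header
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0):
    """Return (value, offset just past the encoded integer)."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def encode_string(data: bytes) -> bytes:
    # Variable-length header followed by the payload.
    return encode_varint(len(data)) + data


def decode_string(buf: bytes, offset: int = 0):
    """Return (payload, offset just past the string)."""
    length, start = decode_varint(buf, offset)
    end = start + length
    return buf[start:end], end
```

Under this scheme, strings shorter than 128 bytes pay only one byte of header, and longer strings grow the header by one byte for every additional seven bits of length, so no fixed maximum length has to be chosen in advance.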