4.4 The PES packet format

As stated earlier, some elementary bit streams go through PES layer packetization before entering the transport layer. The PES header carries various rate, timing, and descriptive information as set by the encoder. The PES packet length is given in a field provided for that purpose. The PES packetization interval is application dependent, resulting in packets of variable length with a maximum definable size of 2^16 bytes. If the PES packet length is set to zero, the PES packet can be of any length. A value of zero can be used only when the PES packet payload is a video elementary stream.

figure 30

An example of data access flow in the receiver

It is useful to have PES packets start on Group of Pictures (GOP) boundaries when dealing with compressed video and associated audio. The example described in this section represents a subset of the general MPEG-2 description which allows simplification of the receiver. In this example, all data for a PES packet, including the header, are transmitted contiguously as the payload of transport packets. A new PES packet always starts a new transport packet, and PES packets that end in the middle of a transport packet are followed by stuffing bytes for the remaining length of the transport packet.
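The constrained mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 184-byte payload size (a 188-byte transport packet minus a 4-byte header) and the 0xFF stuffing value are not given in this section, the function name is hypothetical, and transport headers are left out entirely.

```python
TS_PAYLOAD_SIZE = 184   # assumed: 188-byte transport packet minus 4-byte header
STUFFING_BYTE = 0xFF    # assumed stuffing value


def split_pes_into_payloads(pes_packet: bytes) -> list:
    """Split one PES packet into fixed-size transport payloads.

    Every PES packet starts a fresh payload, and the last payload is
    filled with stuffing bytes so that the next PES packet also starts
    at the beginning of a transport packet.
    """
    payloads = []
    for offset in range(0, len(pes_packet), TS_PAYLOAD_SIZE):
        chunk = pes_packet[offset:offset + TS_PAYLOAD_SIZE]
        padding = TS_PAYLOAD_SIZE - len(chunk)
        payloads.append(chunk + bytes([STUFFING_BYTE]) * padding)
    return payloads
```

A 200-byte PES packet, for instance, would occupy two payloads, the second carrying 16 bytes of PES data followed by 168 stuffing bytes.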

figure 31

PES Packet Structure

A PES packet consists of a PES_packet_start_code, PES header flags, PES packet header fields, and a data block payload as shown in Fig. 31. The packet payload is a stream of contiguous bytes of a single elementary stream, and for video or audio packets, the payload is a sequence of access units provided by the encoder corresponding to the video pictures and the audio frames.

Each elementary stream is identified by a unique stream_id which is carried by the PES packet. PES packets carrying various types of elementary streams can be multiplexed to form a program or transport stream. The stream_id can take on a number of values indicating the type of data in the payload as shown in Table 2.

Table 2

PES Packet Overview




packet_start_code_prefix

Indicates the start of a new packet. Together with the stream_id, it forms the packet start code. Takes the value 0x000001.


stream_id

Specifies the type and number of the stream to which the packet belongs:

1011 1100 - Reserved stream.
1011 1101 - Private Stream 1.
1011 1110 - Padding Stream.
1011 1111 - Private Stream 2.

110x xxxx - MPEG Audio Stream Number xxxxx.
1110 xxxx - MPEG Video Stream Number xxxx.
1111 0000 - ECM stream.
1111 0001 - EMM stream.
1111 0010 - DSM CC stream.
1111 0011 - MHEG stream.
1111 0100 to 1111 1000 - ITU-T Rec. H.222.1 type A to type E.
1111 1001 - Ancillary stream.
1111 1010 to 1111 1110 - Reserved data stream.

1111 1111 - Program stream directory.


PES_packet_length

Specifies the number of bytes remaining in the packet after this field.

0x0000 - this is the only value allowed for video. Audio details to be determined.
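As an illustration of the fields in Table 2, the sketch below reads the first six bytes of a PES packet and maps the stream_id to the stream types listed above. The byte layout (3-byte start code prefix, 1-byte stream_id, 2-byte big-endian length) is an assumption taken from the general MPEG-2 systems syntax, and the function names are hypothetical.

```python
FIXED_STREAM_IDS = {
    0xBC: "reserved stream",
    0xBD: "private stream 1",
    0xBE: "padding stream",
    0xBF: "private stream 2",
    0xF0: "ECM stream",
    0xF1: "EMM stream",
    0xF2: "DSM CC stream",
    0xF3: "MHEG stream",
    0xF9: "ancillary stream",
    0xFF: "program stream directory",
}


def classify_stream_id(stream_id: int) -> str:
    """Return a readable name for a stream_id value from Table 2."""
    if stream_id & 0xE0 == 0xC0:              # 110x xxxx
        return f"MPEG audio stream {stream_id & 0x1F}"
    if stream_id & 0xF0 == 0xE0:              # 1110 xxxx
        return f"MPEG video stream {stream_id & 0x0F}"
    if 0xF4 <= stream_id <= 0xF8:             # H.222.1 type A to type E
        return "ITU-T Rec. H.222.1 type " + "ABCDE"[stream_id - 0xF4]
    if 0xFA <= stream_id <= 0xFE:
        return "reserved data stream"
    return FIXED_STREAM_IDS.get(stream_id, "unknown")


def parse_pes_start(data: bytes) -> tuple:
    """Return (stream_id, PES_packet_length) from the first six bytes."""
    if len(data) < 6 or data[0:3] != b"\x00\x00\x01":
        raise ValueError("not a PES packet start")
    stream_id = data[3]
    packet_length = int.from_bytes(data[4:6], "big")  # 0 means unbounded
    return stream_id, packet_length
```

For a video packet in this example, `parse_pes_start` would return a length of 0, which the caller has to treat as "length not specified" rather than as an empty packet.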

figure 32

PES Header Flags

The PES header flags for a constrained example of the MPEG-2 system are shown in Fig. 32 and described in Table 3. They indicate properties of the bit stream and the existence of additional flags in the PES header.

Table 3

PES Header Flags



PESSC (PES_scrambling_control)

Indicates the scrambling of the PES packet received:

00 - Not scrambled.
01 - User defined.
10 - User defined.
11 - User defined.

(In this example, set = 00.)

PESP (PES_priority)

Indicates the priority of this packet with respect to other packets: 1=high priority; 0=no priority.

DAI (data_alignment_indicator)

Indicates the nature of alignment of the first start code occurring in the payload. The type of data in the payload is indicated by the data_stream_alignment_descriptor.

1 - Aligned;
0 - No indication of alignment. (Must be aligned for video.)

CY (copyright)

Indicates the copyright nature of the associated PES packet payload:

1 - Copyrighted.
0 - Not defined.

OOC (original_or_copy)

Indicates whether the associated PES packet payload is the original program or a copy:

1 - Original;
0 - Copy.

TSF (PTS_DTS_flags)

Indicates whether the PTS, or both the PTS and the DTS, are present in the PES header:

00 - Neither PTS nor DTS is present in the header.
1x - PTS field is present.
11 - Both PTS and DTS are present in the header.

(The PTS flag is set when the video data_alignment_indicator is set. The DTS may be included to signal any special requirements to the decoder. PTS transmissions should be spaced less than 700 ms apart.)

ESCR (ESCR_flag)

Indicates whether the Elementary Stream Clock Reference field is present in the PES header. (In this example, set = 0.)

RATE (ES_rate_flag)

Indicates whether the Elementary Stream Rate field is present in the PES header. (In this example, set = 0.)

TM (DSM_trick_mode_flag)

Indicates the presence of an 8 bit field describing the DSM (Digital Storage Media) operating mode:

1 - Field is present.
0 - Field is not present. (For broadcasting purposes, set = 0.)

ACI (additional_copy_info_flag)

Indicates the presence of the additional_copy_info field.

1 - Field is present.
0 - Field is not present.

CRC (PES_CRC_flag)

Indicates the presence of a CRC field in the PES packet. (In this example, set = 0.)

EXT (PES_extension_flag)

The flag is set as necessary to indicate that extension flags are set in the PES header. Its use includes support of private data.

1 - Field is present.
0 - Field is not present.
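The flags in Table 3 can be unpacked from two bytes with simple shifts and masks. The sketch below assumes the bit order of the general MPEG-2 PES syntax (two marker bits, then the flags in the order listed in Table 3), since Fig. 32 is not reproduced here; the function name is hypothetical.

```python
def decode_pes_flags(flag_bytes: bytes) -> dict:
    """Decode the two PES header flag bytes into named fields.

    Assumed layout, most significant bit first:
      byte 0: '10' marker, PESSC (2 bits), PESP, DAI, CY, OOC
      byte 1: TSF (2 bits), ESCR, RATE, TM, ACI, CRC, EXT
    """
    b0, b1 = flag_bytes[0], flag_bytes[1]
    return {
        "PES_scrambling_control": (b0 >> 4) & 0x03,
        "PES_priority": (b0 >> 3) & 0x01,
        "data_alignment_indicator": (b0 >> 2) & 0x01,
        "copyright": (b0 >> 1) & 0x01,
        "original_or_copy": b0 & 0x01,
        "PTS_DTS_flags": (b1 >> 6) & 0x03,
        "ESCR_flag": (b1 >> 5) & 0x01,
        "ES_rate_flag": (b1 >> 4) & 0x01,
        "DSM_trick_mode_flag": (b1 >> 3) & 0x01,
        "additional_copy_info_flag": (b1 >> 2) & 0x01,
        "PES_CRC_flag": (b1 >> 1) & 0x01,
        "PES_extension_flag": b1 & 0x01,
    }
```

Under that assumed layout, the bytes 0x84 0x80 would describe an unscrambled, aligned packet that carries a PTS and no other optional fields.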

The PES header follows the PES_header_length field, which indicates the header size in bytes. The size of the header includes all the header fields, any extension fields, and stuffing_bytes. The organization of the PES header is described by the PES header flags, and all of the fields of the PES header are optional. Certain applications require particular fields to be set. For example, the DTTB transport of video PES packets requires that the data_alignment_indicator be set, while the trick mode flag is generally not set. For digital storage media, retrieval of video requires the opposite conditions to be true. The encoder associated with each application must set the appropriate flags and encode the appropriate fields.
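Because the header length is carried explicitly, a receiver can skip optional fields it does not understand and jump straight to the payload. The sketch below assumes that the length byte sits at offset 8 (after the 6-byte packet start and the 2 flag bytes) and counts only the bytes that follow it, as in the general MPEG-2 PES syntax; the function name is hypothetical.

```python
HEADER_LENGTH_OFFSET = 8  # assumed: 6-byte packet start + 2 flag bytes


def pes_payload(packet: bytes) -> bytes:
    """Return the payload of a PES packet by skipping the optional header."""
    if len(packet) <= HEADER_LENGTH_OFFSET:
        raise ValueError("packet too short to contain a header length")
    header_length = packet[HEADER_LENGTH_OFFSET]
    payload_start = HEADER_LENGTH_OFFSET + 1 + header_length
    if payload_start > len(packet):
        raise ValueError("header length exceeds packet size")
    return packet[payload_start:]
```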

figure 33

PES Header Organization

The PES header is shown in Fig. 33 and described in Table 4.

Table 4

PES Header



PTS (presentation_time_stamp)

DTS (decoding_time_stamp)

PTS informs the decoder of the intended time of presentation of a presentation unit. DTS informs the decoder of the intended time of decoding of an access unit. An access unit is an encoded presentation unit. When encoded, the PTS refers to the presentation unit corresponding to the first access unit occurring in the packet. If an access unit does not occur in a packet, it shall not contain a PTS. Under normal conditions, the DTS may be derived from the PTS and need not be encoded. A video access unit occurs if the first byte of the picture start code is present in the PES packet payload. An audio access unit occurs if the first byte of the audio frame is present.
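For readers who want to see how a time stamp is recovered in practice, the sketch below decodes a 33-bit PTS or DTS from its 5-byte encoded form and measures the gap between two stamps, which can be compared against the 700 ms spacing mentioned in Table 3. The 5-byte layout with marker bits and the 90 kHz time base come from the general MPEG-2 systems syntax and are not described in this section; the function names are hypothetical.

```python
PTS_CLOCK_HZ = 90_000      # assumed time base of PTS/DTS values
PTS_MODULUS = 1 << 33      # time stamps are 33-bit counters that wrap


def decode_timestamp(field: bytes) -> int:
    """Decode a 33-bit PTS or DTS from its 5-byte encoded form.

    Assumed layout: 4 prefix bits, 3 value bits, marker bit,
    15 value bits, marker bit, 15 value bits, marker bit.
    """
    if len(field) != 5:
        raise ValueError("a time stamp field is 5 bytes long")
    return (
        (((field[0] >> 1) & 0x07) << 30)
        | (field[1] << 22)
        | ((field[2] >> 1) << 15)
        | (field[3] << 7)
        | (field[4] >> 1)
    )


def pts_gap_seconds(previous: int, current: int) -> float:
    """Return the time from one PTS to the next, allowing for wrap-around."""
    return ((current - previous) % PTS_MODULUS) / PTS_CLOCK_HZ
```

A receiver could flag a stream whose consecutive `pts_gap_seconds` values reach 0.7 or more as violating the spacing guideline.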


DSM trick mode

An eight-bit field indicating the nature of the information encoded. The field is further partitioned as follows:

trick_mode_control (3 bits),
field_id (2 bits),
intra_slice_refresh (1 bit), and
frequency_truncation (2 bits).


trick_mode_control

Indicates the nature of the DSM mode:

000 - Fast Forward
001 - Slow Motion
010 - Freeze Frame
011 - Fast reverse
1xx - Reserved.


field_id

This identifier is valid for interlaced pictures only and describes how the current frame is to be displayed:

00 - Display field 1 only.
01 - Display field 2 only.
10 - Display complete frame.
11 - Reserved.


frequency_truncation

This field indicates the selection of coefficients from the DSM:

00 - Only DC coefficients are sent.
01 - The first three coefficients in scan order on average.
10 - The first six coefficients in scan order on average.

The field is for information purposes only. At times more than the specified number of coefficients may be sent, and at other times fewer.


intra_slice_refresh

This field indicates that each picture is composed of intra slices with possible gaps between them. The decoder should replace the missing slices by repeating the collocated sites from the previously decoded picture.


This field indicates how many times the decoder should repeat field #1 as both the top and bottom fields alternately. After field #1 is displayed, field #2 is displayed the same number of times. This identifier set to "0" is equivalent to a freeze frame with field_id set = "10".
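The partitioning of the eight-bit trick mode field can be illustrated as follows. The sketch assumes that trick_mode_control occupies the three most significant bits and that the meaning of the remaining five bits depends on the mode, as in the general MPEG-2 syntax: field_id, intra_slice_refresh, and frequency_truncation for fast forward and fast reverse, a repeat count for slow motion, and field_id alone for freeze frame. The function name and the `repeat_count` key are hypothetical.

```python
TRICK_MODES = {
    0b000: "fast forward",
    0b001: "slow motion",
    0b010: "freeze frame",
    0b011: "fast reverse",
}


def decode_trick_mode(field: int) -> dict:
    """Split the 8-bit trick mode field into its sub-fields."""
    control = (field >> 5) & 0x07
    info = {"trick_mode_control": TRICK_MODES.get(control, "reserved")}
    if control in (0b000, 0b011):          # fast forward / fast reverse
        info["field_id"] = (field >> 3) & 0x03
        info["intra_slice_refresh"] = (field >> 2) & 0x01
        info["frequency_truncation"] = field & 0x03
    elif control == 0b001:                 # slow motion
        info["repeat_count"] = field & 0x1F
    elif control == 0b010:                 # freeze frame
        info["field_id"] = (field >> 3) & 0x03
    return info
```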

figure 34

PES Extension Flags Field

The PES header can contain additional flags if the EXT flag is set. These flags are transmitted in a one byte data field as shown in Fig. 34 and are described in Table 5. The flags indicate whether further extensions to the PES header exist. In each case the flag is set to "1" if the header field is present.

Table 5

PES Extension Flags





Indicates whether the PES packet contains private data.

As defined.


Indicates whether an MPEG-1 systems packet header or an MPEG-2 program stream packet header is present.

As defined.


Indicates whether the STD_buffer_scale and the STD_buffer_size flags are encoded.

Set = 0 for this example.


Indicates the presence of additional data in the PES header.

As defined.

