Published: TV Technology magazine
Issue: 26 January 98
(sidebar to “Solving the HDTV Switching Delay,” published same date)

What is Bitstream Splicing?

MPEG bitstream splicing is the process by which programs are switched, edited, etc., in the compressed domain, much as conventional uncompressed video and audio are switched. It involves joining each of the component elementary streams of one program with the corresponding streams from another.

The following information comes courtesy of Mike Knowles, systems development manager, NDS, Ltd., based in Heathrow, England.

MPEG Splicing Applications

There are many potential applications, including local ad insertion from a video server into a live feed; local programming in a national network; remote precompressed programming feeds (e.g., sports events, news, etc.); and program editing in the compressed domain.

Standardization

MPEG defines the basic toolkit and “hooks” for splicing of bitstreams. SMPTE (in the form of Working Group PT20.02) is currently in the process of producing a standard called “Splice Points for MPEG-2 Transport Streams” that defines markers encoders should place in bitstreams to indicate suitable places to splice, and constraints on the coded bitstream to ensure the ability to splice between any two compliant bitstreams.

Buffer Management

The management of the decoder buffers is one of the most important aspects of splicing. In an MPEG bitstream, there is constant delay from input to the encoder to output of the decoder, but the fill level of the encoder and decoder buffers varies greatly over time, depending on the type of video frame being transmitted and on the complexity of the scene. Thus, it may be required to splice from a bitstream that has the decoder buffer relatively full to one that is expecting it to be close to empty. This is almost certain to cause a buffer overflow, because the new bitstream was coded on the assumption that the buffer starts from that low level.
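The overflow risk described above can be illustrated with a toy model. The sketch below is only an illustration, not the MPEG buffer model itself: the function name, the one-picture-per-period timing, and all the numbers (a 3 Mbps stream at 30 pictures per second, a 1.8 Mbit buffer, the picture sizes) are assumptions chosen to make the arithmetic easy to follow.

```python
def simulate_decoder_buffer(start_fill, picture_sizes, bits_per_period, buffer_size):
    """Toy decoder-buffer model (hypothetical helper, all values in bits).

    At each decode instant one coded picture is removed from the buffer,
    then one picture period's worth of bits arrives at constant rate.
    Returns (status, picture_index); picture_index is None when status is "ok".
    """
    fill = start_fill
    for index, size in enumerate(picture_sizes):
        if size > fill:
            return ("underflow", index)   # picture not fully received in time
        fill -= size                      # decoder removes the picture
        fill += bits_per_period           # channel keeps delivering bits
        if fill > buffer_size:
            return ("overflow", index)    # incoming bits have nowhere to go
    return ("ok", None)


# Assumed figures for illustration only.
BITS_PER_PERIOD = 100_000      # 3 Mbps / 30 pictures per second
BUFFER_SIZE = 1_800_000        # illustrative decoder buffer size

# The new stream was coded expecting a nearly empty buffer, so it opens
# with modest pictures that let the buffer fill up gradually.
new_stream = [150_000] + [40_000] * 11

# Start fill the new stream's encoder assumed: decoding proceeds normally.
print(simulate_decoder_buffer(200_000, new_stream, BITS_PER_PERIOD, BUFFER_SIZE))
# -> ('ok', None)

# Start fill actually left behind by the old stream at the splice.
print(simulate_decoder_buffer(1_500_000, new_stream, BITS_PER_PERIOD, BUFFER_SIZE))
# -> ('overflow', 6)
```

In this model the same pictures decode cleanly from the fill level the encoder assumed, yet overflow after a few pictures when the splice hands over a nearly full buffer.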
MPEG defines two types of splicing: seamless splicing, in which decoder video buffers are not allowed to underflow or overflow, but audio buffers are expected to underflow; and nonseamless splicing, in which all decoder buffers are allowed to underflow, causing a reset and restart of decoding.

As the names imply, seamless splicing results in a much cleaner “splice” of the programs; in fact, the splice is invisible. Nonseamless splicing will normally result in a “freeze-frame” effect on video at the decoder while the buffers refill and decoding starts. There is also the possibility of what has become known as “near-seamless” splicing, in which there is only a short period of freeze-frame: the decoder buffers are not allowed to empty completely, but are deliberately managed during the transition across the splice.

Video Bitstream Constraints

To make continuous decoding possible, the first picture after the splice has to be an I-frame. That’s fairly easy to arrange if you are splicing to a program coming off a server, or to a co-sited local encoder. If, however, the two bitstreams are coming from remote sites, or you have no control over them for other reasons, the splicer has to carefully control the transition to make the splice as unobtrusive as possible.

If B-frames are used, it is necessary to ensure that the first Group of Pictures (GOP) after the splice is “closed”; that is, there are no B-frames dependent on I- or P-frames from the preceding GOP (which the receiver never sees, because it was receiving the other program).

In 60 Hz environments in which 3:2 pull-down is used, it is necessary to ensure that the field-parity sequence is maintained over the splice point. SMPTE specifies out-points after a bottom-field and in-points before a top-field.

Bit-rate Differences

What do you do if the first program is coded at 2 Mbps and the second at 3 Mbps?
You have to ensure that you always have the capacity required in the outgoing transport stream. You are therefore really stuck with splicing to a stream of the same or lower bit-rate.

The much bigger problem is statistically multiplexed bitstreams. More and more broadcasters are looking to utilize such systems to improve the efficiency of the encoding. In such systems, the bit-rate of the video component of any one program is continually varying over a wide range.

To splice into such a bitstream, you either need to ensure that the program you are splicing to is coded at the lowest bit-rate utilized by the program you are replacing, or you have to take the program in question out of the statistical multiplexing algorithm for the duration of the insert. The use of multiple logical groups of programs, with real-time allocation of programs to those groups, enables broadcasters to combine the advantages of statistical multiplexing with the flexibility of bitstream splicing.

Clock References and Time Stamps

Obviously, the chances of the two programs to be spliced together having the same PCR (program clock reference) values are extremely small. There are two solutions to this problem: change the PCR, PTS (presentation time stamp) and DTS (decoding time stamp) values of the second bitstream to match those of the first, such that the decoder does not see any discontinuity; or signal a system time base discontinuity at the splice point and leave the decoder to recover according to the MPEG standard.

Packet Identifiers (PIDs)

An MPEG decoder determines the PIDs that carry each of the elementary streams from the PSI (program-specific information). As this information is only transmitted every few hundred milliseconds, and the response time to changes is decoder-specific, the splicer must remap the PIDs of the second bitstream to match those of the first to ensure there is no delay in starting to decode the second bitstream.
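The PID remapping described above can be sketched at the transport-packet level. This is a minimal sketch, not a description of any particular splicer: the function names, the mapping table and the example PID values are assumptions, and the code only rewrites the 13-bit PID field in the 4-byte header of an unscrambled 188-byte transport packet. It does not touch the PSI tables or any time stamps.

```python
SYNC_BYTE = 0x47          # first byte of every MPEG-2 transport packet
TS_PACKET_SIZE = 188      # fixed transport packet length in bytes
MAX_PID = 0x1FFF          # the PID field is 13 bits wide


def get_pid(packet):
    """Return the 13-bit PID: low 5 bits of byte 1 followed by all of byte 2."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def remap_pid(packet, pid_map):
    """Return a copy of `packet` with its PID replaced according to `pid_map`.

    `pid_map` is a hypothetical table {PID in second bitstream: PID in first}.
    Packets whose PID is not in the table are returned unchanged.
    """
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte transport packet")
    new_pid = pid_map.get(get_pid(packet), get_pid(packet))
    if not 0 <= new_pid <= MAX_PID:
        raise ValueError("PID does not fit in 13 bits")
    out = bytearray(packet)
    # Keep the three flag bits at the top of byte 1, replace only the PID bits.
    out[1] = (packet[1] & 0xE0) | (new_pid >> 8)
    out[2] = new_pid & 0xFF
    return bytes(out)


# Illustrative values: the inserted program uses PIDs 0x31 (video) and
# 0x34 (audio); the program being replaced uses 0x100 and 0x101.
pid_map = {0x31: 0x100, 0x34: 0x101}

# A dummy packet on PID 0x31 with the payload_unit_start_indicator flag set.
packet = bytes([SYNC_BYTE, 0x40, 0x31, 0x10]) + bytes([0xFF] * 184)
remapped = remap_pid(packet, pid_map)
print(hex(get_pid(remapped)))   # -> 0x100
```

Because the decoder keeps seeing the PIDs it already learned from the first program's PSI, it does not have to wait for a new table before decoding the inserted program.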
Consideration also has to be given to the possibility that the programs being spliced have different numbers of elementary streams (e.g., the first could have a second-language audio component, while the second only has a single audio component).

Encrypted Bitstreams

Bitstreams scrambled at the Transport Layer have all the PES (packetized elementary stream) headers scrambled. It is thus impossible to read or change any of their contents. This means that only nonseamless splicing, with time base discontinuities, is possible. For either seamless or near-seamless splicing, the bitstream needs to be unscrambled.

Control of the Splice

Unlike in uncompressed signals, parts of the components that are to be presented simultaneously occur in the bitstream at very different times. For example, video data is typically transmitted well ahead of the associated audio, due to decoding delays.

Thus, manual control of splicing is virtually impossible unless the operator can see the program prior to encoding. If operators only have access to the decoded program, there will be a delay of up to a second or more between their pressing the “splice” button on cue and the point where the splice actually occurs. Work in SMPTE has assumed automated splicing according to predetermined schedules.

There’s more than video and audio!

There are still a number of questions relating to bitstream splicing that have not even been addressed up to this point. Many people only consider the video and audio components of a program. But what about other components, such as subtitles, teletext and DSM-CC carousels used for data broadcasting or interactive TV applications? ATSC closed-captions are dealt with by the SMPTE proposed standard, as they are carried in the video “user data” field, but the others are only mentioned in passing.

Joe Fedele