DirectShow

This page contains an introduction to DirectShow and some example filter designs using Delphi. The filters are used in my component packs and are delivered in their own download package. The installer puts the filters into <drive>:\Program Files\Common Files\DtsMedia and registers them. They are registered with a Merit of MERIT_DO_NOT_USE, which means they will not be included in the list of filters to try automatically. My components specifically ask for them and, if you want to use them, so should your code. This Merit also prevents their being chosen by Windows Media Player, so if you want it to use them, you must raise their Merit value above that of any competing filter using a tool such as GraphStudio. The installer includes an uninstaller.
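To illustrate the effect of the Merit value, the sketch below mimics how automatic graph building ranks candidate filters. The merit constants are the standard DirectShow values. The filter names and the `candidates_for_auto_connect` helper are hypothetical, and the real Filter Graph Manager also matches media types, which is left out here. Python is used only to keep the example short.

```python
# Standard DirectShow merit constants (values from the DirectShow headers).
MERIT_PREFERRED = 0x800000
MERIT_NORMAL = 0x600000
MERIT_UNLIKELY = 0x400000
MERIT_DO_NOT_USE = 0x200000


def candidates_for_auto_connect(filters):
    """Return the filter names automatic connection would try, best first.

    `filters` maps a (hypothetical) filter name to its merit. Filters whose
    merit is MERIT_DO_NOT_USE or lower are never tried automatically.
    """
    eligible = [(merit, name) for name, merit in filters.items()
                if merit > MERIT_DO_NOT_USE]
    # Highest merit first; ties are broken by name for a stable result.
    eligible.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in eligible]


registered = {
    "Stock demultiplexer": MERIT_NORMAL,          # hypothetical competitor
    "DtsMedia demultiplexer": MERIT_DO_NOT_USE,   # merit set by the installer
}
print(candidates_for_auto_connect(registered))
```

With the installed merit, only the competing filter is listed. Raising the second entry to `MERIT_PREFERRED` puts it at the head of the list, which is the change you would make with a tool such as GraphStudio.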

Each filter has an interface unit that allows your code to control the filter and access its various call-back functions. The interface units are written in Pascal but can easily be converted to C++ to allow an application written in that language to use them.

DirectShow is the Microsoft library for media-based applications. It provides a filter graph and filters, which you instantiate and connect to replay audio, video, television and teletext content from files, PC hardware and the internet.

Most filters are implemented using C++, the language in which Microsoft publishes its DirectShow API. If you want to build a filter in C++ you need not only a mastery of COM objects, you also need to find your way around the DirectShow header files, of which there are many. You need to pepper your code with AddRef and Release calls to manage the lifetime of your interfaces, or add special typecasting; forget this and your filters will throw exceptions or refuse to be destroyed. The learning curve is very steep but, once you've ascended it, there's little competition.

Delphi, surprisingly, is an excellent environment for DirectShow coding. C++ enthusiasts will claim their code is much faster, but this is a total myth. Delphi code is just as fast and its stricter type checking avoids most of the hard-to-find bugs that can arise. If you really are a sucker for punishment, you can write your code in assembler, which Delphi supports with panache.

In true Delphi style, you need only one additional file called BaseClass.pas. This file collects Delphi versions of the numerous C++ headers into one place. Delphi's code location tools (hold Ctrl down, put your mouse over a method or parameter and click it to go to the implementation or parameter value) make it easy to find the base class methods and parameters you build upon to code your filters.

Delphi codes interfaces as interfaced objects, which means there's no need to think about AddRef and Release calls; the compiler and interfaced objects do this for you. AddRef and Release are still available if you want to do something very "special", or cope with some irregularity in the C++ DirectShow implementation, of which there are a few. BaseClass.pas calls through DirectShow9 into the core Windows implementation in exactly the same way as the Microsoft base classes, so there is no language conversion penalty.

You can get all the files you need by installing the DShow package from SourceForge. This package provides wrappers for DirectShow Filter Graphs, Filters and Video Renderers. You can build a simple media player using this package. There are some sample filters in the DShow package to help you on your way. If you just want to build filters, however, a more direct approach is to add BaseClass.pas, and perhaps DSUtil.pas for some small utility methods, into your project. The only reason you need the Microsoft DirectShow SDK is for its help files.

Mpeg Demultiplexer

mpeg-demux property page

This filter demultiplexes MPEG-2 Program and Elementary Streams.

The original use of MPEG-2 was for live transmissions over the air, where it now finds widespread use in satellite and terrestrial television. MPEG gave little thought to playing or editing the files on a computer. As a consequence, the data stream does not contain indexing information to allow a given frame to be found or the media to be scanned through at speed, both of which are essential for professional use. This indexing feature was implemented as an addition to MPEG-2 by companies such as Elecard.

On the other hand, MPEG's impressive compression efficiency makes it very attractive for large-scale media storage and production. Additional tools were developed by BBC Research Department and their partners in the EC-funded AC078 Atlantic Project. This work proved the ability to build an entire programme chain, from studio capture to display in the home, that used only MPEG-2 compressed signals. The work was well before its time, as usual for the BBC, but is now becoming commercially available from companies such as Solveigmm.

This filter is an integral part of an end-to-end programme chain. It can be configured to index an MPEG-2 file when it is opened and store the index in the media folder or an assigned index folder. If this option is selected, the filter will read an index file created earlier and use it to index the media. Alternatively, it can be configured just to read an earlier index file, or to use an approximation based on bit-rate as a standard demultiplexer would. Only the indexed method is acceptable for professional use, since using bit-rate causes a storm of network activity while the demultiplexer searches for an I-Frame, which will not normally be the one you asked for!
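The difference between the two seeking methods can be sketched as follows. The index layout (a sorted list of I-frame numbers and byte offsets) and both function names are assumptions made for illustration; the filter's actual index file format is not described here.

```python
from bisect import bisect_right


def seek_with_index(index, target_frame):
    """Return (frame, byte_offset) of the last I-frame at or before target_frame.

    `index` is a list of (frame_number, byte_offset) pairs for I-frames,
    sorted by frame number. This layout is assumed for illustration only.
    """
    frames = [frame for frame, _ in index]
    pos = bisect_right(frames, target_frame) - 1
    if pos < 0:
        raise ValueError("target frame precedes the first indexed I-frame")
    return index[pos]


def seek_by_bit_rate(target_seconds, bits_per_second):
    """Estimate a byte offset by assuming the bit rate is constant."""
    return int(target_seconds * bits_per_second / 8)


# Hypothetical index: an I-frame every 12 frames at irregular byte offsets.
index = [(0, 0), (12, 61_440), (24, 118_784), (36, 190_464)]
print(seek_with_index(index, 30))
print(seek_by_bit_rate(2, 4_000_000))
```

The indexed lookup lands exactly on the I-frame needed to decode the requested frame. The bit-rate estimate only gives a starting point from which the stream must be searched, and in a variable bit-rate file that point can be a long way from the right one.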

Avi Splitter

Avi Splitter property page

All Avi files are supposed to contain an index describing the file offset for each frame in the file. This index identifies all frames, not just the key-frames. This means that a well-designed Avi splitter should be able to do frame-accurate positioning and "scrub" play at least as well as an indexed MPEG-2 file. Sadly though, this is not the case for the stock Avi splitter.

My Avi splitter builds an internal index from the file's index, which allows it to do rapid seeking and "scrub" play through an Avi file. I have built the filter to deliver the same protocol for frame-accurate seeking as used on the MPEG-2 demultiplexer, but so far I have not found any media decoders, other than FFDShow, capable of seeking to a non-key frame, at least for the media types I have examples of. Consequently, seeking is not generally frame accurate. It does seem to work a lot faster than the stock Avi splitter though, which is a bit of a mystery.

Avi files are a bit of a problem since the file format does not define what type of media the file contains. To determine this you have to get the MediaType from the file content which can be a bit tricky. I have included all the types of media that I could find enough information about but there will be lots of others I have missed. I suppose this is one of the problems with keeping your file format proprietary.
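One piece of this puzzle is well defined: when a stream is identified by a four-character code (FOURCC), DirectShow derives the media subtype GUID from it with a fixed pattern. The sketch below shows that mapping only. The function name is hypothetical, and this is not necessarily how the splitter described here builds its media types.

```python
import struct
import uuid


def fourcc_to_subtype(fourcc):
    """Build the DirectShow media subtype GUID for a four-character code.

    The FOURCC, read as a little-endian 32-bit value, becomes the first
    field of the GUID; the remaining fields are fixed:
    XXXXXXXX-0000-0010-8000-00AA00389B71.
    """
    if len(fourcc) != 4:
        raise ValueError("a FOURCC has exactly four characters")
    (data1,) = struct.unpack("<I", fourcc.encode("ascii"))
    return uuid.UUID(fields=(data1, 0x0000, 0x0010, 0x80, 0x00, 0x00AA00389B71))


print(fourcc_to_subtype("YUY2"))  # 32595559-0000-0010-8000-00aa00389b71
```

Codes that do not follow this pattern, or streams with no usable code at all, are where the tricky part begins.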

Video Mixer

Video Mixer property page

This filter allows mixing of video streams from numerous source files. A more detailed description can be found in BBC R&D White Paper WHP129.

The mixing is frame accurate, provided the source files are indexed and decoded with index capable filters. There is a selection of twenty simple effects that can be applied to the junctions between sections. The video can also be flipped, flopped or both and there is the possibility to add a bitmap indicating that the media for a shot is not currently available.

The filter has a dynamically variable number of input pins - a new empty pin appears after a source "Sub-Graph" is connected. The filter assumes all input streams have the same pixel counts, so if you want to use streams with different pixel counts you must first interpolate them to the values the mixer chose when it was first connected. My Video Pre-Filter allows you to enforce a media type on this filter and convert media with different pixel counts to this chosen format.
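The pin behaviour can be pictured with a toy model. The `MixerPins` class is purely illustrative and is not part of the filter's interface unit; it only shows the rule that one empty input pin is always on offer.

```python
class MixerPins:
    """Toy model of a mixer that always keeps one unconnected input pin."""

    def __init__(self):
        self.pins = [None]  # None marks an empty (unconnected) pin

    def connect(self, source_name):
        """Connect a source to the empty pin, then append a new empty pin.

        Returns the position of the pin that was used.
        """
        empty = self.pins.index(None)
        self.pins[empty] = source_name
        if None not in self.pins:
            self.pins.append(None)
        return empty


mixer = MixerPins()
mixer.connect("shot 1 sub-graph")
mixer.connect("shot 2 sub-graph")
print(mixer.pins)
```

After two connections the model holds three pins, the last of which is empty and ready for the next Sub-Graph.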

The filter requires only one graph to run. This graph can have numerous source files loaded, each one containing a source reader, a demultiplexer and a decoder (a "Sub-Graph"). Sub-Graphs that are not currently contributing to the output stream are paused and so do not use CPU power.

The filter can use a variety of interleaved video formats. It prefers YUV but this format can be overridden if the filter is to be used in a graph that requires an Alpha channel, for video overlay or Colour separation overlay for instance.

The graph can be played at normal speed on all but the lowliest PCs and can be scrub-played or fast played so you can quickly scan through your work. All of the features mentioned above are implemented in RSMediaSequence which is part of EditPack.

Video Pre-Filter

Video Pre-filter property page

This filter allows you to pre- or post-condition a video media stream. You can place it after the Video Mixer to set the video Gain, Sit, Hue and Saturation of the entire production, or in a source graph before the mixer to set these properties for individual shots in a production. You can also set the image size, centre and rotation in the same way. These settings can be changed continuously, allowing dynamic effects to be implemented. This process can be synchronised to the video stream using the filter's new image call-back.

When you place it after the mixer, you can ask it to enforce a given pixel count and aspect ratio on the mixer by unsetting the Free Output Format property. The output pixel count and aspect ratio of the filter will then be the values set in the Output Format settings. If the Do Pre Processing property is set, the input stream will be converted to the output pixel count. If the property is not set then the input pixel count will also be constrained, allowing you to force a pixel count on the mixer.

Once the Mixer has its pixel count set, you can use this filter to convert sources having different pixel counts to the mixer's pixel count by setting the filter's Free Output Format to true, which allows it to connect to the mixer input pin. You must also set the Do Pre Processing property to true so that the filter's pixel count conversion methods are enabled.
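The interplay of the two properties can be summarised in a small model. This is one reading of the description above, not the filter's code; `pin_constraints` and its return convention are hypothetical.

```python
def pin_constraints(free_output_format, do_pre_processing, output_format):
    """Return (input_constraint, output_constraint, converts).

    A constraint of None means the pin accepts whatever its peer offers.
    `output_format` stands for the values on the Output Format settings,
    shown here as a (width, height) pair.
    """
    output_constraint = None if free_output_format else output_format
    if do_pre_processing:
        input_constraint = None               # any input; the filter converts it
    else:
        input_constraint = output_constraint  # input must match the output
    return input_constraint, output_constraint, do_pre_processing


# After the mixer: force 720x576 on both pins, and so on the mixer itself.
print(pin_constraints(False, False, (720, 576)))
# Before the mixer: accept any source and convert to whatever the mixer wants.
print(pin_constraints(True, True, (720, 576)))
```

In the first case both pins are pinned to the chosen format and nothing is converted. In the second case neither pin is pinned by the filter, so the mixer's input pin dictates the output format and the filter converts the source to it.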

AllowYUV and UseAlpha are mutually exclusive. If you are using Alpha channels for overlay then you cannot use a YUV format, since it has no space for the Alpha data.

The Free Output Format and Do Pre Processing properties must be set before the filter is connected. They are persisted in the registry when set from the property sheet to allow application debugging. The Width, Height, Horizontal Position, Vertical Position and Rotation properties on the Position settings page are not persisted in the registry. You need to set these in your application. The filter control is implemented in EditPack's RSMediaSequence control and allows you to set a chosen pixel count for your production, select input files of any resolution and apply property changes to the entire production or to individual shots.

Audio Mixer

Audio Mixer property page

This filter allows mixing of audio streams from numerous source files. A more detailed description can be found in BBC R&D White Paper WHP129.

The mixing is accurate to a single video frame period, provided the source files are indexed and decoded with index capable filters or are intrinsically *timeable* (uncompressed).

There are three effects, cross-fade, fade out and in, and butt-edit that can be applied to the junctions between sections. The butt edit is in fact a rapid fade out and in and so does not make an objectionable click when it is used. The audio levels can also be adjusted at points within a shot and the length of the fade in and fade out can be changed. Level adjustments can be common to both stereo tracks or done differently for each.
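A minimal sketch of these junction effects is given below, assuming linear gain ramps and floating-point samples. The ramp shape and the function names are assumptions; the filter may well use a different fade law.

```python
def cross_fade(outgoing, incoming):
    """Linearly cross-fade two equal-length blocks of samples."""
    if len(outgoing) != len(incoming):
        raise ValueError("blocks must have the same length")
    n = len(outgoing)
    mixed = []
    for i in range(n):
        gain_in = (i + 1) / (n + 1)  # rises towards 1 across the block
        mixed.append(outgoing[i] * (1.0 - gain_in) + incoming[i] * gain_in)
    return mixed


def fade_out_in(outgoing, incoming):
    """Fade the outgoing block down to silence, then fade the incoming one up.

    Called with very short blocks, this behaves like the click-free butt edit.
    """
    n_out, n_in = len(outgoing), len(incoming)
    faded_out = [s * (1.0 - (i + 1) / n_out) for i, s in enumerate(outgoing)]
    faded_in = [s * (i / n_in) for i, s in enumerate(incoming)]
    return faded_out + faded_in


print(cross_fade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))   # [0.75, 0.5, 0.25]
print(fade_out_in([1.0] * 4, [1.0] * 4))
```

Because the last outgoing sample and the first incoming sample are both scaled to zero, the two waveforms meet at silence and there is no step at the junction to produce a click.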

There is the possibility to add a silent clip where the media for a shot is not currently available to match the "no media" facility in the video mixer.

The filter has a dynamically variable number of input pins - a new empty pin appears after a source "Sub-Graph" is connected. The filter assumes all input streams have the same audio format, so if you want to use streams with different audio formats you must first interpolate them to the values the mixer chose when it was first connected. My Audio Pre-Filter allows you to enforce a media type on this filter and converts media with different data rates, word depth and number of channels to this chosen format.

The filter requires only one graph to run. This graph can have numerous source files loaded, each one containing a source reader, a demultiplexer and a decoder or a parser for uncompressed files (a "Sub-Graph"). Sub-Graphs that are not currently contributing to the output stream are paused and so do not use CPU power.

The graph can be scrub-played or fast played so you can quickly scan through your work but the audio is muted when not playing at normal speed. All of the features mentioned above are implemented in RSMediaSequence which is part of EditPack.

Audio Pre-Filter

Audio Pre-filter property page

This filter allows you to pre- or post-condition an audio media stream. You can place it after the Audio Mixer to set the audio Gain and Balance of the entire production, or in a source graph before the mixer to set these properties for individual shots in a production. You can also apply a low-pass, high-pass, band-pass or band-stop audio filter in the same way. These settings can be changed continuously, allowing dynamic effects to be implemented. This process can be synchronised to the audio stream using the filter's new sample call-back.

The filter also provides an audio level call-back, allowing your application to display an audio level bar for the left and right hand channels.
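What an application might do with such a call-back can be sketched as follows. The sample layout (interleaved signed 16-bit stereo) and the use of peak level in dB relative to full scale are assumptions; the call-back's real signature and units are defined in the filter's interface unit, not here.

```python
import math


def channel_levels_db(interleaved, full_scale=32768):
    """Return (left_dB, right_dB) peak levels for interleaved stereo samples.

    Samples are assumed to be signed 16-bit integers ordered L, R, L, R, ...
    A silent channel is reported as negative infinity.
    """
    def peak_db(samples):
        peak = max((abs(s) for s in samples), default=0)
        return 20.0 * math.log10(peak / full_scale) if peak else float("-inf")

    return peak_db(interleaved[0::2]), peak_db(interleaved[1::2])


left, right = channel_levels_db([16384, 0, -8192, 0])
print(round(left, 1), right)
```

A level bar would typically map a range such as -60 dB to 0 dB onto its length and clamp anything quieter to the bottom of the bar.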

When you place it after the mixer, you can ask it to enforce a given sample rate, bit-depth and channel count on the mixer by unsetting the Free Output Format property. The output sample rate, bit-depth and channel count of the filter will then be the values set in the Output Format settings. If the Do Pre Processing property is set, the input stream will be converted to the output sample rate, bit-depth and channel count. If the property is not set then the input values will also be constrained, allowing you to enforce them on the mixer.

Once the Mixer has its sample rate, bit-depth and channel count set, you can use this filter to convert sources having different values by setting the filter's Free Output Format to true, which allows it to connect to the mixer input pin. You must also set the Do Pre Processing property to true so that the filter's sample rate, bit-depth and channel count conversion methods are enabled.

The Free Output Format and Do Pre Processing properties must be set before the filter is connected. They are persisted in the registry when set from the property sheet to allow application debugging. The filter values in the Filter Settings page are not persisted in the registry. You need to set these in your application. The filter control is implemented in EditPack's RSMediaSequence control and allows you to set a chosen sample rate, bit-depth and channel count for your production, select input files with different sample rates, bit-depths and mono/stereo layouts, and apply property changes to the entire production or to individual shots.

Cut Detector

Cut detector property page

This filter implements cut detection on a video stream. It is based on work I did around 1980 using dedicated hardware, which was published in BBC RD 1984/7, so it is safely out of patent coverage. Jim Easterbrook did the first C++ DirectShow implementation. My implementation is more or less the same but uses Delphi, has some additional property pages and allows a video loop-through that need not be connected.

The filter assumes that you start your detection at the start of the file and continue until you reach the end. In this way, you do not need an indexed file since you can count the frames as they pass through.

The filter can also detect short cross-fades, which often occur in cartoons or in conversions from a different frame rate. It cannot, however, detect long cross-fades since these do not have an identifiable temporal pattern. To detect long cross-fades you would require some fairly sophisticated content recognition - you would not find such a filter on a freeware site and probably not even on this planet.

It has a threshold parameter which can be adjusted to optimise detection efficiency, and for each shot change detected it provides an output that indicates its confidence in the detection. You can access the results by providing a call-back method in your application. TRSThreadCutDetector, which is part of EditPack, provides these features and also wraps the filter graph in a thread so that your application can continue responding to user input while a shot change detection is running.
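The roles of the threshold, the frame count and the confidence value can be shown with a generic detector. This is not the algorithm published in BBC RD 1984/7; it is a simple mean-absolute-difference detector, and the confidence formula and function name are assumptions chosen for illustration.

```python
def detect_cuts(frames, threshold):
    """Return a list of (frame_number, confidence) for suspected cuts.

    `frames` is a sequence of equal-length lists of luma values (0 to 255),
    read from the start of the file so frames can be counted as they pass.
    A cut is reported at frame n when the mean absolute difference between
    frame n and frame n - 1 exceeds `threshold` (which must be below 255).
    Confidence lies in (0, 1] and grows with the margin over the threshold.
    """
    cuts = []
    for n in range(1, len(frames)):
        previous, current = frames[n - 1], frames[n]
        diff = sum(abs(a - b) for a, b in zip(previous, current)) / len(current)
        if diff > threshold:
            confidence = (diff - threshold) / (255.0 - threshold)
            cuts.append((n, confidence))
    return cuts


# Four tiny "frames": a steady dark shot followed by a steady bright shot.
frames = [[10] * 4, [12] * 4, [200] * 4, [201] * 4]
for frame_number, confidence in detect_cuts(frames, threshold=20):
    print(frame_number, round(confidence, 2))
```

Lowering the threshold catches more subtle shot changes at the cost of more false detections, which is why a per-detection confidence value is useful to the application receiving the call-back.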