Introduction
This article explains how to use the DirectShow API for simple audio conversion, particularly WAV to MP3 conversion.
Audio codecs in the DirectShow API are of three types: native codecs, ACM codecs, and DMO (DirectX Media Object) codecs.
There are only a few native audio codecs for audio compression. For MP3 encoding, the only one that I've found is the LAME DirectShow wrapper from Elecard. Most MP3 encoders are in the ACM (Audio Compression Manager) format, which was introduced with the Windows Multimedia API. The CDSEncoder class and its related classes, CDSCodec and CDSCodecFormat, enumerate the ACM codecs and their respective compression parameters, construct a graph, and do the encoding.
The GraphBuilder and other filters
The graph consists of five filters:
- File Source (Async) for reading the input WAV file,
- WAV Parser for WAV parsing,
- ACM codec for audio compression (in this case, the MP3 ACM codec, wrapped by the ACM Wrapper filter),
- WAV Dest for WAV output multiplexing,
- File Writer for writing the output file.
Important note
The WAV Dest filter is not one of the standard filters; it needs to be compiled from the DirectX SDK (SDK_root\Samples\Multimedia\DirectShow\Filters\WavDest). For convenience, the compiled WAV Dest filter is included in the demo zip, but you have to register it by running regsvr32 wavdest.ax.
ACM codecs and the ACM Wrapper filter
All of the ACM codecs are listed in DirectShow in the Audio Compressors filter category (CLSID_AudioCompressorCategory) and cannot be instantiated directly. We have to use the Device Enumerator to use them.
Note: Depending on your configuration, several ACM codecs for the same format can be installed on your computer. This can be the case for MP3 codecs. You can set their priority or deactivate some of them through the Control Panel, as shown in the following figure.
The Device Enumerator, or how to browse ACM codecs
The Device Enumerator must be used to retrieve an instance of an ACM codec. It returns the list of codecs through the IEnumMoniker interface, so we can get the filter interface (IBaseFilter) by a call to IMoniker::BindToObject() and the filter name by a call to IMoniker::BindToStorage().
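The enumeration described above can be sketched roughly as follows. This is a minimal illustration, not the code of the demo project: the function name ListAudioCompressors is hypothetical, the caller is assumed to have already initialized COM (for example with CoInitialize), and on Windows the program is assumed to link against strmiids.lib, ole32.lib, and oleaut32.lib. The non-Windows branch exists only so that the sketch compiles elsewhere; it returns an empty list.

```cpp
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#include <dshow.h>
#endif

// Hypothetical helper: returns the friendly names of the filters registered
// in the Audio Compressors category. Assumes COM is already initialized.
std::vector<std::wstring> ListAudioCompressors()
{
    std::vector<std::wstring> names;
#ifdef _WIN32
    ICreateDevEnum* pDevEnum = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, nullptr,
                                  CLSCTX_INPROC_SERVER, IID_ICreateDevEnum,
                                  reinterpret_cast<void**>(&pDevEnum));
    if (FAILED(hr))
        return names;

    IEnumMoniker* pEnum = nullptr;
    hr = pDevEnum->CreateClassEnumerator(CLSID_AudioCompressorCategory,
                                         &pEnum, 0);
    pDevEnum->Release();
    if (hr != S_OK)          // S_FALSE means the category is empty
        return names;

    IMoniker* pMoniker = nullptr;
    while (pEnum->Next(1, &pMoniker, nullptr) == S_OK)
    {
        // The property bag gives access to the codec name.
        IPropertyBag* pBag = nullptr;
        hr = pMoniker->BindToStorage(nullptr, nullptr, IID_IPropertyBag,
                                     reinterpret_cast<void**>(&pBag));
        if (SUCCEEDED(hr))
        {
            VARIANT var;
            VariantInit(&var);
            if (SUCCEEDED(pBag->Read(L"FriendlyName", &var, nullptr)) &&
                var.vt == VT_BSTR)
            {
                names.push_back(var.bstrVal);
            }
            VariantClear(&var);
            pBag->Release();
        }
        pMoniker->Release();
    }
    pEnum->Release();
#endif
    return names;
}
```

To instantiate one of the listed codecs, the same moniker would be passed to IMoniker::BindToObject() with IID_IBaseFilter instead of being released straight away.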
Configuring the ACM codec with the IAMStreamConfig interface
Once the desired codec is instantiated, we can obtain an IBaseFilter interface for filter configuration. Since each IBaseFilter has one or more pins, we have to search for the output pin using the IEnumPins interface and IPin::QueryDirection() calls.
With the output pin, we can query the IAMStreamConfig interface to configure the following properties:
- Number of channels,
- Samples per second,
- Average bytes per second,
- Bits per sample.
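These four properties correspond to fields of the WAVEFORMATEX structure (nChannels, nSamplesPerSec, nAvgBytesPerSec, wBitsPerSample). As a rough illustration of how the values relate to each other, here is a small portable sketch. PcmFormat, MakePcmFormat, and BytesPerSecFromBitRate are hypothetical names that only mirror the relevant fields; they are not part of DirectShow or of the classes presented in this article.

```cpp
#include <cstdint>

// Hypothetical stand-in mirroring the WAVEFORMATEX fields listed above.
struct PcmFormat
{
    std::uint16_t channels;        // nChannels
    std::uint32_t samplesPerSec;   // nSamplesPerSec
    std::uint32_t avgBytesPerSec;  // nAvgBytesPerSec
    std::uint16_t blockAlign;      // nBlockAlign
    std::uint16_t bitsPerSample;   // wBitsPerSample
};

// For uncompressed PCM, the derived fields follow from the other three.
inline PcmFormat MakePcmFormat(std::uint16_t channels,
                               std::uint32_t samplesPerSec,
                               std::uint16_t bitsPerSample)
{
    PcmFormat f{};
    f.channels = channels;
    f.samplesPerSec = samplesPerSec;
    f.bitsPerSample = bitsPerSample;
    // Bytes needed for one sample frame across all channels.
    f.blockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    f.avgBytesPerSec = samplesPerSec * f.blockAlign;
    return f;
}

// For a constant-bit-rate compressed stream such as MP3, the average
// bytes per second comes from the bit rate instead.
inline std::uint32_t BytesPerSecFromBitRate(std::uint32_t bitsPerSecond)
{
    return bitsPerSecond / 8;
}
```

For example, 16-bit stereo PCM at 44100 Hz gives 176400 bytes per second, while a 128 kbit/s MP3 stream gives 16000 bytes per second.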
Note: For some codecs (including MP3), the call to IAMStreamConfig::SetFormat() must be made after the graph has been rendered.
The classes
CDSEncoder
CDSEncoder performs the following tasks:
- Enumerate the audio codecs (CLSID_AudioCompressorCategory),
- Build, render, and run the graph.
class CDSEncoder : public CArray<CDSCodec*, CDSCodec*>
{
public:
    void BuildGraph(CString szSrcFileName, CString szDestFileName,
                    int nCodec, int nFormat);
    CDSEncoder();
    virtual ~CDSEncoder();
protected:
    void BuildCodecArray();
    HRESULT AddFilterByClsid(IGraphBuilder *pGraph, LPCWSTR wszName,
                             const GUID& clsid, IBaseFilter **ppF);
    BOOL SetFilterFormat(AM_MEDIA_TYPE* pStreamFormat,
                         IBaseFilter* pBaseFilter);
    IGraphBuilder *m_pGraphBuilder;
};
As CDSEncoder inherits from CArray, the collection of codecs is exposed through the CArray methods, with each codec returned as a CDSCodec object.
CDSCodec
CDSCodec performs the following tasks:
- Enumerate the parameters supported by the codec,
- Expose the codec name.
class CDSCodec : public CArray<CDSCodecFormat*, CDSCodecFormat*>
{
public:
    CDSCodec();
    virtual ~CDSCodec();
    CString m_szCodecName;
    IMoniker *m_pMoniker;
    void BuildCodecFormatArray();
};
As CDSCodec inherits from CArray, the collection of parameter sets supported by the codec is exposed through the CArray methods, with each parameter set returned as a CDSCodecFormat object.
CDSCodecFormat
CDSCodecFormat exposes the properties of one set of codec parameters:
- Number of channels,
- Samples per second,
- Bytes per second,
- Bits per sample.
class CDSCodecFormat
{
public:
    WORD BitsPerSample();
    DWORD BytesPerSec();
    DWORD SamplesPerSecond();
    WORD NumberOfChannels();
    CDSCodecFormat();
    virtual ~CDSCodecFormat();
public:
    AM_MEDIA_TYPE* m_pMediaType;
};
Known issues
Error checking
The goal of this article is to demonstrate the use of DirectShow for simple audio conversion. These classes are not as safe as they should be. Please keep this in mind if you plan to use them in a production environment.
Source WAV format
There is no sampling rate conversion, so you can only generate 44 kHz output files if you use a 44 kHz WAV source file.
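One way to live with this limitation is to offer only the codec formats whose sampling rate matches the source file. The sketch below shows the idea in portable C++. FormatInfo and MatchingFormats are hypothetical helpers standing in for CDSCodecFormat and its SamplesPerSecond() accessor; they are not code from the demo project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one codec format.
struct FormatInfo
{
    std::uint32_t samplesPerSec;
    std::uint32_t bytesPerSec;
};

// Returns the indexes of the formats whose sampling rate equals the
// sampling rate of the source file.
inline std::vector<std::size_t> MatchingFormats(
    const std::vector<FormatInfo>& formats,
    std::uint32_t sourceSamplesPerSec)
{
    std::vector<std::size_t> indexes;
    for (std::size_t i = 0; i < formats.size(); ++i)
    {
        if (formats[i].samplesPerSec == sourceSamplesPerSec)
            indexes.push_back(i);
    }
    return indexes;
}
```

An empty result means that none of the codec formats can be used with the source file as it is.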
Windows Media
The Windows Media format can be used only with a certificate, which can be obtained through the Windows Media SDK from Microsoft.