Use of the MPEG4IP Server Side Tools

The MPEG4IP Server Side Tools allow for the conversion of raw audio and video into MPEG-4 compressed media stored in an MP4 file which can be streamed by the Darwin Streaming Server.

The easiest way to proceed is to use the 'mp4encode' script which is installed with MPEG4IP. This script makes some simplifying assumptions about the input and the desired output. See "The Simple Process" below.

If 'mp4encode' doesn't meet your needs, you can then either create a modified 'mp4encode' script or use the tools directly. See "The Detailed Process" below for the details of the underlying tools.

The Simple Process

1) Capture raw audio and/or video into an AVI file

The details vary, but generally you use a capture program to record audio and video from either a camera or a capture board into an AVI file on disk.

The audio format should be PCM raw audio, 16 bits/sample, little-endian, 2 channels, 44100 samples/sec.

The video format should be YUV12, aka YV12. (FYI: This is a planar format sampled at 4:2:0). Any frame size and frame rate are acceptable.

Personally, I recommend 320x240 @ 24 frames/sec as a good compromise between video quality and size. At these settings, 5 minutes of raw audio/video will consume approximately 1 GB of disk space. (Don't worry, things get much better once we encode.)
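The disk space figure above can be checked with a little arithmetic. The following is a rough sketch, not part of MPEG4IP: the function names are made up for illustration, YV12 is taken as 1.5 bytes per pixel, and 1 Kbps is assumed to mean 1000 bits/sec. Container overhead is ignored.

```shell
#!/bin/sh
# Estimate raw capture size. YV12 is 4:2:0 planar: one full-size luma plane
# plus two quarter-size chroma planes, i.e. 1.5 bytes per pixel.
raw_capture_bytes() {
    # usage: raw_capture_bytes WIDTH HEIGHT FPS SECONDS
    video_per_sec=$(( $1 * $2 * 3 / 2 * $3 ))
    # 44100 samples/sec * 2 channels * 2 bytes per sample
    audio_per_sec=$(( 44100 * 2 * 2 ))
    echo $(( (video_per_sec + audio_per_sec) * $4 ))
}

# Estimate encoded size from the combined audio + video bitrate.
encoded_bytes() {
    # usage: encoded_bytes TOTAL_KBPS SECONDS   (assumes 1 Kbps = 1000 bits/sec)
    echo $(( $1 * 1000 / 8 * $2 ))
}

raw_capture_bytes 320 240 24 300     # 5 minutes raw: prints 882360000
encoded_bytes $(( 500 + 96 )) 300    # 5 minutes at the default bitrates: prints 22350000
```

So 5 minutes of raw capture is roughly 0.88 GB, while the same content at the default 500 Kbps video and 96 Kbps audio comes to roughly 22 MB.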

2) Encode content

Invoke 'mp4encode' with the name of the raw AVI file to produce the encoded MP4 file.

The following parameters can be specified to 'mp4encode':

-w <uint>   The input video frame width in pixels, default value is 320

-h <uint>   The input video frame height in pixels, default value is 240

-r <uint>   The video frame rate in frames per second, default value is 24

-V <uint>   The desired video bitrate in Kbps, default value is 500

-A <uint>   The desired audio bitrate in Kbps, default value is 96

-a <float>  The desired aspect ratio, default value is 1.33 (4:3). Typically letterbox is 2.35.

-I            Use the ISO MPEG-4 video encoder instead of the OpenDivX encoder

-M            Use the MP3 audio encoder instead of the AAC encoder

-d            Debug mode, leave intermediate files.

Examples:

mp4encode mycontent.avi (results are written to mycontent.mp4)

mp4encode -w 640 -h 480 -r 30 -I -V 1000 -M -A 128 mycontent.avi mycontent_iso_1000_mp3_128.mp4
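If you have several AVI files to encode, a small wrapper script can save typing. This is only a sketch: the helper names and the DRY_RUN variable are invented here, and the script prints the commands by default rather than running them. It relies only on the 'mp4encode' usage shown in the examples above (options, then input file, then output file).

```shell
#!/bin/sh
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 in the environment to actually invoke mp4encode.
DRY_RUN=${DRY_RUN:-1}

mp4_name_for() {
    # mycontent.avi -> mycontent.mp4
    echo "${1%.avi}.mp4"
}

encode_one() {
    # usage: encode_one INPUT.avi [mp4encode options...]
    in=$1
    shift
    out=$(mp4_name_for "$in")
    if [ "$DRY_RUN" = 1 ]; then
        echo "mp4encode $* $in $out"
    else
        mp4encode "$@" "$in" "$out"
    fi
}

# Encode every AVI file in the current directory at the default bitrates.
for f in *.avi; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    encode_one "$f" -V 500 -A 96
done
```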

3) Make content available

Copy the MP4 file to a directory under the configured media root for your streaming server. See the documentation for your server as to how to generate a URL for the file, but typically it would look like rtsp://myserver/mycontent.mp4

Example:

cp mycontent.mp4 /usr/local/movies

 

The Detailed Process

1) Acquire raw audio and/or video

The details vary, but generally you use a capture program to record audio and video from either a camera or a capture board into one or two files on disk.

The audio format should be PCM raw audio, 16 bits/sample, little-endian, 2 channels, 44100 samples/sec.

The video format should be YUV12, aka YV12. (FYI: This is a planar format sampled at 4:2:0). Any frame size and frame rate are acceptable.

If you use a Windows system for the media acquisition, your data will frequently be stored in an AVI file. If that is the case, the avi2raw utility is provided to extract either the audio or the video track from an AVI file and write the raw track data to a file of your choice.

Examples:

avi2raw -a mymedia.avi myaudio.pcm

avi2raw -v mymedia.avi myvideo.yuv

The avi2raw utility also has very basic editing capabilities. You can specify a time to start the extraction in seconds from the beginning of the file and a length in seconds. For example, to extract 10 seconds of video starting at 5 seconds from the beginning of the file:

avi2raw -s 5 -l 10 -v mymedia.avi myvideo.yuv
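A quick way to sanity-check an extracted video file is to confirm that its size is a whole number of frames. The sketch below assumes the .yuv file holds nothing but back-to-back YV12 frames with no headers, and that the source was 320x240 at 24 frames/sec; the function names are made up for illustration.

```shell
#!/bin/sh
yuv_frame_bytes() {
    # usage: yuv_frame_bytes WIDTH HEIGHT   (YV12 = 1.5 bytes per pixel)
    echo $(( $1 * $2 * 3 / 2 ))
}

yuv_check() {
    # usage: yuv_check FILE_SIZE_BYTES WIDTH HEIGHT
    # Prints the frame count, or fails if the size is not a whole number of frames.
    frame=$(yuv_frame_bytes "$2" "$3")
    if [ $(( $1 % frame )) -ne 0 ]; then
        echo "size is not a whole number of ${2}x${3} frames" >&2
        return 1
    fi
    echo $(( $1 / frame ))
}

# 10 seconds at 320x240, 24 fps should be 240 frames of 115200 bytes each.
yuv_check 27648000 320 240    # prints 240

# For a real file:  yuv_check "$(wc -c < myvideo.yuv)" 320 240
```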

2) Encode audio

There are two options for audio encoding: MP3 and AAC.

2a) MP3

If you wish to use MP3, the provided lame tool will generate the MP3 file.

The -b parameter specifies the desired bitrate in Kbps, typically between 64 and 128.

The -h parameter specifies that high quality output should be generated at the expense of slower encoding. It is recommended since it does not significantly impact the encoding time in our experience.

If a raw PCM file is provided as the input, then the -r and -x options must be used.

Examples:

lame -h -b 128 myaudio.wav

lame -h -b 96 -r -x myaudio.pcm

2b) AAC

If you wish to use AAC, the provided faac tool will generate the MPEG-4 AAC file.

Two parameters are required: the profile to be used, typically "LOW" for low-complexity, and the desired bitrate in Kbps, typically between 16 and 128.

If a raw PCM file is provided as the input, then the -r option must be used.

Examples:

faac -pLOW -b128 myaudio.wav

faac -pLOW -b96 -r myaudio.pcm

       (output file is myaudio.aac)

ISSUE: Currently the FAAC code is hard-coded with respect to the parameters of the raw format, which match those described in step 1. If you must use a different format, you should embed the audio input in a WAV file, which faac will read to determine the correct configuration parameters to use with your audio data.
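If your capture program cannot write WAV directly, raw PCM can be wrapped in a WAV file by prepending the standard 44-byte PCM header. The script below is a minimal sketch and not part of MPEG4IP; the function names are invented here. It assumes the raw data is little-endian, which is what WAV expects, so the samples are copied unchanged.

```shell
#!/bin/sh
# Print an integer as little-endian bytes (WAV header fields are little-endian).
le16() { printf "$(printf '\\%03o\\%03o' $(( $1 & 255 )) $(( ($1 >> 8) & 255 )))"; }
le32() { le16 $(( $1 & 65535 )); le16 $(( ($1 >> 16) & 65535 )); }

pcm_to_wav() {
    # usage: pcm_to_wav IN.pcm OUT.wav SAMPLE_RATE CHANNELS BITS_PER_SAMPLE
    data_size=$(( $(wc -c < "$1") ))
    block_align=$(( $4 * $5 / 8 ))      # bytes per sample frame
    {
        printf 'RIFF'; le32 $(( 36 + data_size )); printf 'WAVE'
        printf 'fmt '; le32 16          # size of the fmt chunk
        le16 1                          # format tag 1 = uncompressed PCM
        le16 "$4"; le32 "$3"            # channels, sample rate
        le32 $(( $3 * block_align ))    # bytes per second
        le16 "$block_align"; le16 "$5"  # block alignment, bits per sample
        printf 'data'; le32 "$data_size"
        cat "$1"                        # the samples themselves, unchanged
    } > "$2"
}
```

For example, 'pcm_to_wav myaudio.pcm myaudio.wav 22050 1 16' would wrap mono 22050 samples/sec audio, after which 'faac -pLOW -b96 myaudio.wav' can be used without the -r option.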

3) Encode video

There are two options for video encoding: ISO reference and OpenDivX. Both generate MPEG-4 video bitstreams. The OpenDivX encoder produces a Simple Profile stream only. The ISO encoder uses most of the video tools defined for MPEG-4, and hence can produce many of the defined profiles. In terms of video quality and encoding speed we have found the OpenDivX encoder to be superior.

If you want to use the OpenDivX encoder, use the provided divxenc tool.

The following options should be used to inform the encoder of the configuration parameters for the compressed video bitstream.

-h <uint>   --height <uint>          Use the given number as the video frame height in pixels, default value is 240

-w <uint>   --width <uint>            Use the given number as the video frame width in pixels, default value is 320

-r <uint>   --rate <uint>             Use the given number as the video frame rate in frames per second, default value is 30

-b <uint>   --bitrate <uint>         Use the given number as desired bitrate in Kbps, default value is 500

-i <uint>   --ifrequency <uint>     Use the given number as the frequency of I frames in frames, default value is 30, i.e. once a second at 30 fps

Example:

divxenc -w 320 -h 240 -r 24 -b 500 -i 24 myvideo.yuv myvideo.divx
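Note that the -i value is given in frames, so it should follow the frame rate if you want a fixed time between I frames. A hypothetical helper (the name divxenc_cmd is made up, and it only prints the command line) shows the relationship:

```shell
#!/bin/sh
divxenc_cmd() {
    # usage: divxenc_cmd WIDTH HEIGHT FPS KBPS SECONDS_BETWEEN_IFRAMES IN OUT
    ifreq=$(( $3 * $5 ))   # I-frame interval in frames = fps * seconds
    echo "divxenc -w $1 -h $2 -r $3 -b $4 -i $ifreq $6 $7"
}

# One I frame per second at 24 fps reproduces the example above.
divxenc_cmd 320 240 24 500 1 myvideo.yuv myvideo.divx

# One I frame every 2 seconds at 30 fps gives -i 60.
divxenc_cmd 320 240 30 500 2 myvideo.yuv myvideo.divx
```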

If you want to use the ISO encoder, use the provided mp4venc tool. This tool takes one parameter, which is the name of a file where the many configuration parameters are given. A template parameter file, 'mp4venc_template.par', is installed into the /usr/local/share directory.

Example:

mp4venc myvideo.par

4) Packetize audio-only

Use the provided mp4apkt tool to generate an MP4 file from the MP3 or AAC encoded file. There are two required parameters: the first is the input file name, the second is the output file name.

Examples:

mp4apkt myaudio.mp3 myaudio.mp4

mp4apkt myaudio.aac myaudio.mp4

5) Packetize video-only

Use the provided mp4vpkt tool to generate an MP4 file from the MPEG-4 Video encoded file. There are two required parameters: the first is the input file name, the second is the output file name.

The following options should be used to inform the packetizer of the configuration parameters of the compressed video bitstream.

-h <uint>   --height <uint>          Use the given number as the video frame height in pixels, default value is 240

-w <uint>   --width <uint>            Use the given number as the video frame width in pixels, default value is 320

-r <uint>   --rate <uint>             Use the given number as the video frame rate in frames per second, default value is 30

Examples:

mp4vpkt myvideo.divx myvideo.mp4

mp4vpkt -w 160 -h 120 -r 15 myvideo.cmp myvideo.mp4

If you use the ISO encoder, the video frames can be encoded to use bi-directional motion vectors, aka B-frames. If this is the case, then you will also need to specify how many B-frames occur between non-B-frames (I-frames or P-frames). A value of 2 is common. This parameter can be specified via the 'bfrequency' command line argument. It is expected that in a future version of the tool, this parameter will be detected automatically.

Also if you use B-frames or other advanced coding tools of MPEG-4 you should designate which video profile you are complying with. By default all streams are marked as "Simple@L3". If you are using advanced tools, "Main@L2" is probably what you want.

Example:

mp4vpkt --bfrequency=2 --profile="Main@L2" myvideo.cmp myvideo.mp4

6) Packetize both audio and video

Use the provided mp4apkt or mp4vpkt tool to first generate an MP4 file for one of the media. Then use the '--merge' option when packetizing the second media. The resulting MP4 file will contain both media tracks and their associated hint tracks.

Example:

mp4apkt myaudio.aac mymedia.mp4

mp4vpkt --merge myvideo.cmp mymedia.mp4
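Steps 1 through 6 can be chained together in a script. The following is a sketch under several assumptions: the function names and the DRY_RUN variable are invented here, the settings are fixed at 320x240 @ 24 frames/sec with AAC audio and OpenDivX video, and the script prints the commands by default instead of running them. It uses only the options described in the steps above.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 in the environment to actually invoke the tools.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

encode_avi() {
    # usage: encode_avi BASENAME   (reads BASENAME.avi, writes BASENAME.mp4)
    base=$1
    # Step 1: split the AVI into raw audio and raw video
    run avi2raw -a "$base.avi" "$base.pcm" || return 1
    run avi2raw -v "$base.avi" "$base.yuv" || return 1
    # Step 2: encode audio; faac writes BASENAME.aac
    run faac -pLOW -b96 -r "$base.pcm" || return 1
    # Step 3: encode video
    run divxenc -w 320 -h 240 -r 24 -b 500 -i 24 "$base.yuv" "$base.divx" || return 1
    # Steps 4 to 6: packetize audio, then merge in the video track
    run mp4apkt "$base.aac" "$base.mp4" || return 1
    run mp4vpkt -w 320 -h 240 -r 24 --merge "$base.divx" "$base.mp4"
}

encode_avi mymedia
```

Stopping at the first failed command avoids packetizing a stale or partial intermediate file.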

7) Useful options to both packetizers

-s <uint>   --payloadsize <uint>    Use the given number as the RTP payload size, the default value is 1460 bytes

-d     --dump       Dumps MP4 file structure to stdout

-t     --trace      Dumps RTP packetization information to stdout
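The default payload size of 1460 bytes is consistent with a 1500-byte Ethernet MTU less the usual packet headers, though that reasoning is our reading and not something the tools state. If your network path has a smaller MTU, a smaller payload size can be worked out the same way. The sketch below assumes IPv4 without options, RTP over UDP, and the fixed 12-byte RTP header; the function name is made up.

```shell
#!/bin/sh
rtp_payload_size() {
    # usage: rtp_payload_size MTU
    # 20-byte IPv4 header + 8-byte UDP header + 12-byte fixed RTP header
    echo $(( $1 - 20 - 8 - 12 ))
}

rtp_payload_size 1500   # Ethernet: prints 1460
rtp_payload_size 1492   # e.g. a PPPoE link: prints 1452
```

The result can then be passed to either packetizer with the -s option.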

8) Make content available

Copy the MP4 file to a directory under the configured media root for your streaming server. See the documentation for your server as to how to generate a URL for the file, but typically it would look like rtsp://myserver/mymedia.mp4

Example:

cp mymedia.mp4 /usr/local/movies
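If your server maps the media root directly onto the root of the URL space (an assumption; check your server's documentation), the URL for a file can be derived from its path. The function name rtsp_url is made up for illustration.

```shell
#!/bin/sh
rtsp_url() {
    # usage: rtsp_url SERVER MEDIA_ROOT FILE
    case $3 in
        "$2"/*)
            rel=${3#"$2"/}          # path relative to the media root
            echo "rtsp://$1/$rel"
            ;;
        *)
            echo "file is not under media root $2" >&2
            return 1
            ;;
    esac
}

rtsp_url myserver /usr/local/movies /usr/local/movies/mymedia.mp4
# prints rtsp://myserver/mymedia.mp4
```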