--- trunk/xvidcore/doc/xvid-encoder.txt 2002/03/08 02:46:11 3 +++ trunk/xvidcore/doc/xvid-encoder.txt 2002/06/24 17:03:03 239 @@ -1,21 +1,53 @@ -/************************************************************* - * Short explanation for the XviD data strutures and routines - * - * encoding part - * - * if you have further questions, visit http://www.xvid.org - * - **************************************************************/ ++--------------------------------------------------------------------+ + Short explanation for the XviD data structures and routines -/* these are are structures/routines from xvid.h needed for encoding */ + The encoding part --------------------------------------------------------------------------- + If you have further questions, visit http://www.xvid.org ++--------------------------------------------------------------------+ -#define API_VERSION ((1 << 16) | (0)) +Document version : +$Id: xvid-encoder.txt,v 1.2 2002-06-24 17:03:03 edgomez Exp $ -This is the revision of the xvid.h file that you have in front of you. Check it against the library's version. ++--------------------------------------------------------------------+ +| Abstract ++--------------------------------------------------------------------+ + +This document presents the basic structures and API of XviD. It tries +to explain how to use them to obtain a simple profile compliant MPEG4 +stream by feeding the encoder a sequence of frames. + ++-------------------------------------------------------------------+ +| Document ++-------------------------------------------------------------------+ + + + + Chapter 1 : The XviD version + +-----------------------------------------------------------------+ + +The XviD version is defined at library compilation time using the +constant defined in xvid.h : + +#define API_VERSION ((2 << 16) | (1)) + +where 2 stands for the major XviD version, and 1 for the minor version +number. 
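The packing used by API_VERSION (major number in the upper 16 bits,
minor number in the lower 16 bits) can be sketched as follows. The
helper names make_api_version, api_major and api_minor are
hypothetical and are not part of xvid.h; they only illustrate the
arithmetic behind the macro.

```c
#include <assert.h>

/* Hypothetical helper (not in xvid.h): packs a major/minor pair the
 * same way the API_VERSION macro does. */
static int make_api_version(int major, int minor)
{
    assert(major >= 0 && major <= 0x7fff);
    assert(minor >= 0 && minor <= 0xffff);
    return (major << 16) | minor;
}

/* Hypothetical helpers: split a packed version back into its parts. */
static int api_major(int version) { return (version >> 16) & 0xffff; }
static int api_minor(int version) { return version & 0xffff; }

/* Usage sketch: version 2.1 round-trips through the packed form. */
static void demo_api_version(void)
{
    int v = make_api_version(2, 1);   /* same value as ((2 << 16) | (1)) */
    assert(api_major(v) == 2);
    assert(api_minor(v) == 1);
}
```

Keeping both numbers in a single int is what lets a plain integer
comparison detect a header/library mismatch.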
+ + The current version of the API is 2.1 and should be incremented each +time a user defined structure is modified (XVID_INIT_PARAM, +XVID_ENC_PARAM ... we will discuss them later). + +When you're writing a program/library which uses the XviD library, you +must check your XviD API version against the available library +version. We will see how to check the version number in the next +chapter. + + + + Chapter 2 : The XVID_INIT_PARAM + +-----------------------------------------------------------------+ --------------------------------------------------------------------------- typedef struct { @@ -24,124 +56,295 @@ int core_build; [out] } XVID_INIT_PARAM; -This is filled by xvid_init with the correct CPU flags for initialization -(auto-detect), unless you pass flag to it (cpu_flags!=0). Do not use that -unless you really know what you are doing. -api_version can (should) be checked against API_VERSION, to see if you -have the right core library. - Used in: xvid_init(NULL, 0, &xinit, NULL); --------------------------------------------------------------------------- +This structure is used and filled by the xvid_init function depending +on the cpu_flags value. + +List of valid flags for the cpu_flags member : + + - XVID_CPU_MMX : cpu feature + - XVID_CPU_MMXEXT : cpu feature + - XVID_CPU_SSE : cpu feature + - XVID_CPU_SSE2 : cpu feature + - XVID_CPU_3DNOW : cpu feature + - XVID_CPU_3DNOWEXT : cpu feature + - XVID_CPU_TSC : cpu feature + - XVID_CPU_IA64 : cpu feature + - XVID_CPU_CHKONLY : command + - XVID_CPU_FORCE : command + +In order to set a flag : xinit.cpu_flags |= desired_flag_constant; + +1st case : if you call xvid_init without setting the XVID_CPU_CHKONLY or +the XVID_CPU_FORCE flag, the xvid_init function automatically detects +the host cpu features and fills the cpu_flags member. The xvid_init +function also initializes all internal function pointers +according to the detected features and then returns XVID_ERR_OK. 
+ +2nd case : if you call xvid_init with the XVID_CPU_CHKONLY flag set, the +xvid_init function will just detect the host cpu features and return +XVID_ERR_OK without initializing the internal function pointers (NB: +The XviD library is not usable after such a call to xvid_init). + +3rd case : you call xvid_init with the XVID_CPU_FORCE flag and the +desired feature flags set in cpu_flags (eg : XVID_CPU_SSE | XVID_CPU_MMX). In +this case you force XviD to use the cpu features passed in the +cpu_flags member. Use this only if you know what you're doing. + +NB for PowerPC archs : the ppc arch has no automatic detection; the +library must be compiled for a specific ppc target using the right +Makefile (the cpu_flags member is irrelevant for these archs). Use +Makefile.linuxppc for standard ppc optimized functions and +Makefile.linuxppc_altivec for altivec simd optimized functions. + +NB for IA64 archs : optimized ia64 assembly functions are provided +in the library; they must be forced using the +XVID_CPU_FORCE|XVID_CPU_IA64 pair of flags. + +To check the XviD library version against your own XviD header file, +you just have to call the xvid_init function (no matter the cpu_flags) +and compare the returned xinit.api_version integer with your +API_VERSION number. The core_build member is not relevant at the +moment but is reserved for future use (once XviD has reached a +certain stability in its API and releases). 
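The three cpu_flags cases of this chapter can be summarized with a
small self-contained sketch. It does not include xvid.h: the DEMO_*
constants are stand-ins with placeholder values (real programs must
use the XVID_CPU_* constants from the header), and classify_cpu_flags
is a hypothetical helper, not a library function. The combination of
both command flags is not described in this document, so the sketch
reports it as undocumented rather than guessing.

```c
#include <assert.h>

/* Placeholder bit values, for illustration only. Real code must use
 * the XVID_CPU_* constants from xvid.h, whose values may differ. */
#define DEMO_CPU_MMX      0x00000001u
#define DEMO_CPU_SSE      0x00000004u
#define DEMO_CPU_CHKONLY  0x40000000u
#define DEMO_CPU_FORCE    0x80000000u

typedef enum {
    MODE_AUTODETECT,   /* 1st case: detect features, init pointers   */
    MODE_CHECK_ONLY,   /* 2nd case: detect only, library not usable  */
    MODE_FORCE,        /* 3rd case: use exactly the given features   */
    MODE_UNDOCUMENTED  /* both command flags set: not covered here   */
} init_mode;

/* Hypothetical helper: tells which documented case a cpu_flags
 * value falls into. */
static init_mode classify_cpu_flags(unsigned int cpu_flags)
{
    int chk   = (cpu_flags & DEMO_CPU_CHKONLY) != 0;
    int force = (cpu_flags & DEMO_CPU_FORCE) != 0;

    if (chk && force) return MODE_UNDOCUMENTED;
    if (chk)          return MODE_CHECK_ONLY;
    if (force)        return MODE_FORCE;
    return MODE_AUTODETECT;
}

/* Usage sketch: flags are accumulated with |= as shown above. */
static void demo_cpu_flags(void)
{
    unsigned int flags = 0;
    assert(classify_cpu_flags(flags) == MODE_AUTODETECT);

    flags |= DEMO_CPU_FORCE;
    flags |= DEMO_CPU_SSE | DEMO_CPU_MMX;
    assert(classify_cpu_flags(flags) == MODE_FORCE);
}
```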
+ + + + Chapter 3 : XVID_ENC_PARAM structure + +-----------------------------------------------------------------+ + typedef struct { - int width, height; - int fincr; // frame increment is relative to fbase - int fbase; // so each frame takes "fincr/fbase" seconds - int bitrate; // the bitrate of the target encoded stream, in bits/second - int rc_period; // the intended rate control averaging period - int rc_reaction_period; // the reaction period for rate control - int rc_reaction_ratio; // the ratio for down/up rate control - int max_quantizer; // the upper limit of the quantizer - int min_quantizer; // the lower limit of the quantizer - int max_key_interval; // the maximum interval between key frames - int motion_search; // the quality of compression ( 1=fastest, 6=best ) - int lum_masking; // lum masking on/off - int quant_type; // 0=h.263, 1=mpeg4 - - void * handle; // [out] encoder instance handle - -} XVID_ENC_PARAM; - -This structure has to be filled to create a new encoding instance: - -width and height are the size of the image to be encoded. - -fincr and fbase are the MPEG-way of defining the framerate. -If you have an integer framerate, say 24, 25 or 30fps, use -fincr=1, fbase=framerate. -However, if framerate is non-integer, like 23.996fps you can -e.g. multiply with 1000, getting fincr=1000 and fbase=23996, -giving you integer values again. - -rc-parameters are for ratecontrol, you don't have to change them, -good defaults are -rc_period = 2000 rc_reacton_period = 10 rc_reaction_ratio = 20 - -min_quantizer, max_quantizer limit the range of allowed quantizers. -normally quantizers range is [1..31], so min=1 and max=31. -!!! the HIGHER the quantizer, the LOWER the quality !!! -!!! the HIGHER the quantizer, the HIGHER the compression ratio !!! 
- -min_quant=1 is somewhat overkill, min_quant=2 is good enough -max_quant depends on what you encode, leave it with 31 or lower it -to something like 15 or 10 for better quality (but encoding with -very low bitrate might fail then). - -max_key_interval is the maximum value of frames between two keyframe -(I-frames). Keyframes are also inserted dynamically at scene breaks. -It is important to have some keyframes, even in longer scenes, if you -want to skip position in the resulting file, because skipping is only -possible from one keyframe to the next. However, keyframes are much larger -than non-keyframes, so do not use too many of them. -A value of framerate*10 is a good choice normally. - -motion_search determines the quality of motion search done by the codec. -The better the search, the smaller the files (or the better the quality). -Since low modes (1-3) are hardly faster than high modes (4,5) a value of -5 is a good choice normally. 6 is possible, but a little slower. If you -want absolutely highest quality, use 6. - -lum_masking stand for "luminance masking" which is an experimental feature. -It tries to compress better by using facts about the human eye. -You might try to switch it on and decide yourself, if you gain anything from it. - -quant_type is technical, is changes the way coefficient are quantized. -Both values are okay, though a value of 0 might be faster. + int width, height; [in] + int fincr, fbase; [in] + int rc_bitrate; [in] + int rc_reaction_delay_factor; [in] + int rc_averaging_period; [in] + int rc_buffer; [in] + int max_quantizer; [in] + int min_quantizer; [in] + int max_key_interval; [in] + + void *handle; [out] +} +XVID_ENC_PARAM; Used in: xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL); --------------------------------------------------------------------------- +This structure has to be filled to create a new encoding instance: + + - width and height. +They have to be set to the size of the image to be encoded. 
+ + - fincr and fbase (<0 forces default value 25fps - [25,1]). + +They are the MPEG-way of defining the framerate. If you have an +integer framerate, say 24, 25 or 30fps, use fincr=1, fbase=framerate. +However, if the framerate is non-integer, like 23.996fps, you can +e.g. multiply by 1000, getting fincr=1000 and fbase=23996, giving +you integer values again. + + - rc_bitrate (<0 forces default value : 900000). + +This is the desired target bitrate. XviD will try to do its best to +respect this setting but keep in mind XviD is still in development and +it has not been tuned for very low bitrates. + + - All other rc_xxxx parameters are for the bit rate controller, which + uses them to respect your rc_bitrate setting as closely as it can. (<0 forces + default values) + +The defaults are good enough and you should not change them. + +ToDo : describe briefly their impact on the bit rate variations and +on how closely the rc_bitrate setting is respected. + + - min_quantizer and max_quantizer (<0 forces default values : 1,31). + +These 2 members limit the range of allowed quantizers. Normally, +the quantizer range is [1..31], so min=1 and max=31. + +NB : the HIGHER the quantizer, the LOWER the quality. + the HIGHER the quantizer, the HIGHER the compression ratio. + +min_quantizer=1 is somewhat overkill; min_quantizer=2 is good enough. max_quantizer +depends on what you encode : leave it at 31 or lower it to something +like 15 or 10 for better quality (but encoding with very low bitrate +might fail then). + + - max_key_interval (<0 forces default value : 10*framerate == 10s) + +This is the maximum number of frames between two keyframes +(I-frames). Keyframes are also inserted dynamically at scene breaks. +It is important to have some keyframes, even in longer scenes, if you +want to seek in the resulting file, because seeking is only +possible from one keyframe to the next. However, keyframes are much +larger than non-keyframes, so do not use too many of them. A value of +framerate*10 is normally a good choice. 
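The fincr/fbase convention and the framerate*10 rule for
max_key_interval can be sketched as follows. This is only an
illustration under stated assumptions : the frame_rate type and both
helpers are hypothetical (not part of xvid.h), the frame rate is
passed pre-multiplied by 1000 to stay in integers, and rounding the
key interval to the nearest frame is a choice made here, not something
the library prescribes.

```c
#include <assert.h>

/* Hypothetical pair mirroring the fincr/fbase members : each frame
 * lasts fincr/fbase seconds. */
typedef struct {
    int fincr;
    int fbase;
} frame_rate;

/* fps_x1000 is the frame rate multiplied by 1000, e.g. 25000 for
 * 25fps or 23996 for 23.996fps. */
static frame_rate make_frame_rate(int fps_x1000)
{
    frame_rate r;

    assert(fps_x1000 > 0);
    if (fps_x1000 % 1000 == 0) {
        /* integer framerate : fincr=1, fbase=framerate */
        r.fincr = 1;
        r.fbase = fps_x1000 / 1000;
    } else {
        /* non-integer framerate : keep the factor 1000 in fincr */
        r.fincr = 1000;
        r.fbase = fps_x1000;
    }
    return r;
}

/* About 10 seconds worth of frames (framerate*10), rounded to the
 * nearest integer. */
static int ten_second_key_interval(frame_rate r)
{
    return (10 * r.fbase + r.fincr / 2) / r.fincr;
}

/* Usage sketch for the two examples given in the text. */
static void demo_frame_rate(void)
{
    frame_rate pal  = make_frame_rate(25000);
    frame_rate film = make_frame_rate(23996);

    assert(pal.fincr == 1 && pal.fbase == 25);
    assert(film.fincr == 1000 && film.fbase == 23996);
    assert(ten_second_key_interval(pal) == 250);
}
```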
+ + - handle + +This is the returned internal encoder instance. + + + + Chapter 4 : the XVID_ENC_FRAME structure. + +-----------------------------------------------------------------+ typedef struct { - void * bitstream; // [in] bitstream ptr - int length; // [out] bitstream length (bytes) + int general; [in] + int motion; [in] + void *bitstream; [in] + int length; [out] + + void *image; [in] + int colorspace; [in] + + unsigned char *quant_intra_matrix; [in] + unsigned char *quant_inter_matrix; [in] + int quant; [in] + int intra; [in/out] + + HINTINFO hint; [in/out] +} +XVID_ENC_FRAME; - void * image; // [in] image ptr - int colorspace; // [in] source colorspace +Used in: + xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); - int quant; // [in] frame quantizer (vbr) - int intra; // [in] force intra frame (vbr only) - // [out] intra state -} XVID_ENC_FRAME; +This is the main structure to encode a frame, it gives hints to the +encoder on how to process an image. + - general flag member. -The main structure to encode a frame: image points to the picture, -in a format that is given by colorspace, e.g. XVID_CSP_RGB24 or -XVID_CSP_YV12. +The general flag member informs XviD on general algorithm choices made +by the library client. -If you set quant=0, then the ratecontrol chooses quantizer for you. -If quant!=0, then this value is used as quantizer, so make 1<=quant<=31. +Valid flags : -intra decides where the frame is going to be a keyframe or not. - intra=1 means: make it a keyframe - intra=0 means: don't make it a keyframe - intra=-1 means: let encoder decide (based on contents and max_key_interval) + - XVID_CUSTOM_QMATRIX : informs xvid to use the custom user + matrices. -So for an ordinary encoding step, you would set quant=0 and intra=-1. + - XVID_H263QUANT : informs xvid to use H263 quantization + algorithm. 
-The length of the MPEG4-bitstream is returned in length, and -if you set intra to -1, it now contains the encoder's decision: - 0 for non-keyframe, - 1 for keyframe because of a scene change, - 2 for keyframe because max_key_interval was reached. + - XVID_MPEGQUANT : informs xvid to use MPEG quantization + algorithm. -Used in: xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); + - XVID_HALFPEL : informs xvid to perform a half pixel motion + estimation. + + - XVID_ADAPTIVEQUANT : informs xvid to perform an adaptive + quantization. + + - XVID_LUMIMASKING : informs xvid to use a lumimasking algorithm. + + - XVID_LATEINTRA : ??? + + - XVID_INTERLACING : informs xvid to use the MPEG4 interlaced + mode. + + - XVID_TOPFIELDFIRST : ??? + + - XVID_ALTERNATESCAN : ??? + + - XVID_HINTEDME_GET : informs xvid to return Motion Estimation + vectors from the ME encoder algorithm. Used during a first pass. + + - XVID_HINTEDME_SET : informs xvid to use the user given motion + estimation vectors as hints for the encoder ME algorithms. Used + during a 2nd pass. + + - XVID_INTER4V : ??? + + - XVID_ME_ZERO : forces XviD to use the zero ME algorithm. + + - XVID_ME_LOGARITHMIC : forces XviD to use the logarithmic + ME algorithm. + + - XVID_ME_FULLSEARCH : forces XviD to use the full search ME + algorithm. + + - XVID_ME_PMVFAST : forces XviD to use the PMVFAST ME algorithm. + + - XVID_ME_EPZS : forces XviD to use the EPZS ME algorithm. + +ToDo : fill the void entries in flags, and describe briefly each ME +algorithm. + + - motion member. + +ToDo : add all the XVID_ME flags here and detail the effect of each +flag. + + - quant member. + +The quantizer value is the divisor applied to the DCT coefficients; it +zeroes the coefficients that are not important (according to the target bitrate, +not the image quality :-) + +Valid values : + + - 0 (zero) : the rate controller chooses the right quantizer + for you. Typically used in ABR encoding or in the first pass of a VBR + encoding session. 
+ + - != 0 : you force the encoder to use this specific + quantizer value. It is clamped to the interval + [1..31]. Typically used during the 2nd pass of a VBR encoding + session. + + - intra member. + +[in usage] +The intra value decides whether the frame is going to be a keyframe or +not. + +Valid values : + + - 1 : forces the encoder to create a keyframe. Mainly used during + a VBR 2nd pass. + + - 0 : forces the encoder not to create a keyframe. Mainly used + during a VBR 2nd pass. + + - -1 : lets the encoder decide (based on contents and + max_key_interval). Mainly used in ABR mode and during a 1st + VBR pass. + +[out usage] + +When intra was set to -1 on input, the encoder returns the effective keyframe state +of the frame. + + - 0 : the resulting frame is not a keyframe + + - 1 : the resulting frame is a keyframe (scene change). + + - 2 : the resulting frame is a keyframe (max_key_interval + reached) + + - quant_intra_matrix and quant_inter_matrix members. + +These are pointers to a pair of user quantization matrices. You +must set the general XVID_CUSTOM_QMATRIX flag to make sure XviD uses +them. + +When set to NULL, the default XviD matrices are used. + +NB : each time the matrices change, XviD must write a header into the +bitstream, so try not to change these matrices very often. This will +save space. + + + + Chapter 5 : The XVID_ENC_STATS structure + +-----------------------------------------------------------------+ --------------------------------------------------------------------------- typedef struct { @@ -151,15 +354,20 @@ } XVID_ENC_STATS; -In this structure the encoder return statistical data about the encoding -process, e.g. to be saved for two-pass-encoding. -quant is the quantizer chosen for this frame (if you let ratecontrol do it) -hlength is the length of the frame's header, including motion information etc. -kblks, mblks, ublks are unused at the moment. 
+Used in: + xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); + +In this structure the encoder returns statistical data about the +encoding process, e.g. to be saved for two-pass-encoding. quant is +the quantizer chosen for this frame (if you let ratecontrol do it). +hlength is the length of the frame's header, including motion +information etc. kblks, mblks, ublks are unused at the moment. + + -Used in: xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); + Chapter 6 : The xvid_encore function + +-----------------------------------------------------------------+ --------------------------------------------------------------------------- int xvid_encore(void * handle, int opt, @@ -167,15 +375,16 @@ void * param2); -XviD uses a single-function API, so everything you want to do is done by -this routine. The opt parameter chooses the behaviour of the routine: +XviD uses a single-function API, so everything you want to do is done +by this routine. The opt parameter chooses the behaviour of the +routine: -XVID_ENC_CREATE: create a new encoder, XVID_ENC_PARAM in param1, - a handle to the new encoder is returned in handle +XVID_ENC_CREATE: create a new encoder, XVID_ENC_PARAM in param1, a +handle to the new encoder is returned in handle. -XVID_ENC_ENCODE: encode one frame, XVID_ENC_FRAME-structure in param1, - XVID_ENC_STATS in param2 (or NULL, if you are not - interested in statistical data). +XVID_ENC_ENCODE: encode one frame, XVID_ENC_FRAME-structure in param1, +XVID_ENC_STATS in param2 (or NULL, if you are not interested in +statistical data). -XVID_DEC_DESTROY: shut down this encoder, do not use handle afterwards +XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards.
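
For the XVID_ENC_FRAME structure passed to an XVID_ENC_ENCODE call,
the handling of the quant member and the meaning of the intra value
returned by the encoder (Chapter 4) can be sketched as follows. Both
helpers are hypothetical and self-contained; they restate the rules of
this document and do not call the library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper : the quantizer the encoder would work with.
 * 0 means "let the rate controller choose"; any other value is
 * clamped to [1..31] as described in Chapter 4. */
static int effective_quant(int quant)
{
    if (quant == 0)  return 0;
    if (quant < 1)   return 1;
    if (quant > 31)  return 31;
    return quant;
}

/* Hypothetical helper : text for the intra value returned by the
 * encoder when intra was set to -1 before the call. */
static const char *describe_intra_result(int intra_out)
{
    switch (intra_out) {
    case 0:  return "not a keyframe";
    case 1:  return "keyframe (scene change)";
    case 2:  return "keyframe (max_key_interval reached)";
    default: return "value not described in this document";
    }
}

/* Usage sketch. */
static void demo_frame_members(void)
{
    assert(effective_quant(0) == 0);     /* rate controller decides */
    assert(effective_quant(40) == 31);   /* clamped to upper bound  */
    assert(strcmp(describe_intra_result(1),
                  "keyframe (scene change)") == 0);
}
```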