Diff of /trunk/xvidcore/doc/xvid-encoder.txt

revision 3, Fri Mar 8 02:46:11 2002 UTC revision 247, Thu Jun 27 14:49:05 2002 UTC
# Line 1  Line 1 
1  /*************************************************************  +--------------------------------------------------------------------+
2   * Short explanation for the XviD data structures and routines        Short explanation for the XviD data structures and routines
  *  
  *                       encoding part  
  *  
  * if you have further questions, visit http://www.xvid.org  
  *  
  **************************************************************/  
3    
4  /* these are structures/routines from xvid.h needed for encoding */                            The encoding part
5    
6  --------------------------------------------------------------------------         If you have further questions, visit http://www.xvid.org
7    +--------------------------------------------------------------------+
8    
9  #define API_VERSION ((1 << 16) | (0))  Document version :
10    $Id: xvid-encoder.txt,v 1.3 2002-06-27 14:49:05 edgomez Exp $
11    
12  This is the revision of the xvid.h file that you have in front of you. Check it against the library's version.  +--------------------------------------------------------------------+
13    | Abstract
14    +--------------------------------------------------------------------+
15    
16    This document presents the basic  structures and API of XviD. It tries
17    to explain how to use them  to obtain a simple profile compliant MPEG4
18    stream feeding the encoder with a sequence of frames.
19    
20    +-------------------------------------------------------------------+
21    | Document
22    +-------------------------------------------------------------------+
23    
24    
25    
26                         Chapter 1 : The XviD version
27     +-----------------------------------------------------------------+
28    
29    The  XviD version  is defined  at library  compilation time  using the
30    constant defined in xvid.h
31    
32    #define API_VERSION ((2 << 16) | (1))
33    
34    Where 2 stands for the major XviD version, and 1 for the minor version
35    number.
36    
37    The current version  of the API is 2.1 and  should be incremented each
38    time   a  user   defined  structure   is   modified  (XVID_INIT_PARAM,
39    XVID_ENC_PARAM ... we will discuss them later).
40    
41    When you're writing a program/library which uses the XviD library, you
42    must  check  your  XviD  API  version against  the  available  library
43    version.  We will  see how  to check  the version  number in  the next
44    chapter.
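
The way the major and minor numbers are packed into API_VERSION can be sketched as follows. This is only an illustration: API_MAJOR, API_MINOR and api_pack are hypothetical helper names that are not part of xvid.h, and the 16-bit split is inferred from the #define shown above.

```c
#include <assert.h>

/* Same definition as quoted above from xvid.h: major 2, minor 1. */
#define API_VERSION ((2 << 16) | (1))

/* Hypothetical helpers (not part of xvid.h): split a packed version. */
#define API_MAJOR(v) (((v) >> 16) & 0xffff)
#define API_MINOR(v) ((v) & 0xffff)

/* Hypothetical helper: pack a major/minor pair the same way. */
static int api_pack(int major, int minor)
{
    assert(major >= 0 && major <= 0x7fff);
    assert(minor >= 0 && minor <= 0xffff);
    return (major << 16) | minor;
}
```

With these helpers, API_MAJOR(API_VERSION) yields the major number and API_MINOR(API_VERSION) the minor number of the header the program was compiled against.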
45    
46    
47    
48                       Chapter 2 : The XVID_INIT_PARAM
49     +-----------------------------------------------------------------+
50    
 --------------------------------------------------------------------------  
51    
52  typedef struct  typedef struct
53  {  {
# Line 24  Line 56 
56          int core_build;         [out]          int core_build;         [out]
57  } XVID_INIT_PARAM;  } XVID_INIT_PARAM;
58    
 This is filled by xvid_init with the correct CPU flags for initialization  
 (auto-detect), unless you pass flag to it (cpu_flags!=0). Do not use that  
 unless you really know what you are doing.  
 api_version can (should) be checked against API_VERSION, to see if you  
 have the right core library.  
   
59  Used in:  xvid_init(NULL, 0, &xinit, NULL);  Used in:  xvid_init(NULL, 0, &xinit, NULL);
60    
61  --------------------------------------------------------------------------  This structure is  used and filled by the  xvid_init function depending
62    on the cpu_flags value.
63    
64    List of valid flags for the cpu_flags member :
65    
66     - XVID_CPU_MMX      : cpu feature
67     - XVID_CPU_MMXEXT   : cpu feature
68     - XVID_CPU_SSE      : cpu feature
69     - XVID_CPU_SSE2     : cpu feature
70     - XVID_CPU_3DNOW    : cpu feature
71     - XVID_CPU_3DNOWEXT : cpu feature
72     - XVID_CPU_TSC      : cpu feature
73     - XVID_CPU_IA64     : cpu feature
74     - XVID_CPU_CHKONLY  : command
75     - XVID_CPU_FORCE    : command
76    
77    In order to set a flag : xinit.cpu_flags |= desired_flag_constant;
78    
79    1st case : you call  xvid_init without setting the XVID_CPU_CHKONLY or
80    the XVID_CPU_FORCE  flag. The xvid_init function automatically detects
81    the host  cpu features and  fills the cpu_flags member.  The xvid_init
82    function also  performs all internal  function pointers initialization
83    according to the detected features and then returns XVID_ERR_OK.
84    
85    2nd case :  you call xvid_init setting the  XVID_CPU_CHKONLY flag, the
86    xvid_init function will  just detect the host cpu  features and return
87    XVID_ERR_OK without  initializing the internal  function pointers (NB:
88    The XviD library is not usable after such a call to xvid_init).
89    
90    3rd case  : you call  xvid_init with the cpu_flags  XVID_CPU_FORCE and
91    desired feature  flags set up  (eg : XVID_CPU_SSE |  XVID_CPU_MMX). In
92    this case you  force XviD to use the given cpu  features passed in the
93    cpu_flags member. Use this if you know what you're doing.
94    
95    NB for PowerPC  archs : the ppc arch has  no automatic detection,  the
96    library must  be compiled  for a specific  ppc target using  the right
97    Makefile  (the  cpu_flags member  is irrelevant  for these  archs). Use
98    Makefile.linuxppc   for   standard   ppc   optimized   functions   and
99    Makefile.linuxppc_altivec for altivec simd optimized functions.
100    
101    NB for IA64 archs : There are optimized ia64 assembly functions provided
102    in    the    library,    they     must    be    forced    using    the
103    XVID_CPU_FORCE|XVID_CPU_IA64 pair of flags.
104    
105    To check the  XviD library version against your  own XviD header file,
106    you just have to call the xvid_init function (no matter the cpu_flags)
107    and  compare   the  returned   xinit.api_version   integer  with  your
108    API_VERSION  number. The  core_build  member is  not relevant  at the
109    moment but is reserved for  future use (when XviD  will have reached a
110    certain stability in its API and releases).
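
The three ways of calling xvid_init and the version check described above can be sketched as follows. The sketch is deliberately standalone: the flag values and the structure layout below are placeholders chosen so that it compiles without xvid.h, they are not the values from the real header, and the helper function names are hypothetical. A real program includes xvid.h and passes the structure to xvid_init(NULL, 0, &xinit, NULL).

```c
#include <assert.h>

/* Stand-ins so the sketch compiles on its own. The numeric flag values
 * are placeholders and are NOT taken from xvid.h. */
#define API_VERSION      ((2 << 16) | (1))
#define XVID_CPU_MMX     (1 << 0)
#define XVID_CPU_SSE     (1 << 1)
#define XVID_CPU_IA64    (1 << 2)
#define XVID_CPU_CHKONLY (1 << 8)
#define XVID_CPU_FORCE   (1 << 9)

/* Assumed layout: only the three members named in this chapter. */
typedef struct {
    int cpu_flags;    /* [in/out] */
    int api_version;  /* [out] */
    int core_build;   /* [out] */
} XVID_INIT_PARAM;

/* 1st case: no command flag, xvid_init auto-detects and initializes. */
static int flags_autodetect(void)
{
    return 0;
}

/* 2nd case: detection only, the library is not usable afterwards. */
static int flags_check_only(void)
{
    return XVID_CPU_CHKONLY;
}

/* 3rd case: force the given feature flags. */
static int flags_forced(int features)
{
    assert((features & (XVID_CPU_CHKONLY | XVID_CPU_FORCE)) == 0);
    return XVID_CPU_FORCE | features;
}

/* After xvid_init has filled xinit, compare it with the header version. */
static int version_matches(const XVID_INIT_PARAM *xinit)
{
    return xinit->api_version == API_VERSION;
}
```

For example, xinit.cpu_flags = flags_forced(XVID_CPU_IA64) corresponds to the XVID_CPU_FORCE|XVID_CPU_IA64 pair mentioned in the IA64 note.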
111    
112    
113    
114                     Chapter 3 : XVID_ENC_PARAM structure
115     +-----------------------------------------------------------------+
116    
117    
118  typedef struct  typedef struct
119  {  {
120    int width, height;          int width, height;              [in]
121    int fincr;    // frame increment is relative to fbase          int fincr, fbase;               [in]
122    int fbase;    // so each frame takes "fincr/fbase" seconds          int rc_bitrate;                 [in]
123    int bitrate;  // the bitrate of the target encoded stream, in bits/second          int rc_reaction_delay_factor;   [in]
124    int rc_period;                // the intended rate control averaging period          int rc_averaging_period;        [in]
125    int rc_reaction_period;       // the reaction period for rate control          int rc_buffer;                  [in]
126    int rc_reaction_ratio;        // the ratio for down/up rate control          int max_quantizer;              [in]
127    int max_quantizer;    // the upper limit of the quantizer          int min_quantizer;              [in]
128    int min_quantizer;    // the lower limit of the quantizer          int max_key_interval;           [in]
129    int max_key_interval; // the maximum interval between key frames  
130    int motion_search;    // the quality of compression ( 1=fastest, 6=best )          void *handle;                   [out]
131    int lum_masking;      // lum masking on/off  }
132    int quant_type;       // 0=h.263, 1=mpeg4  XVID_ENC_PARAM;
   
   void * handle;        // [out] encoder instance handle  
133    
134  } XVID_ENC_PARAM;  Used in:    xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL);
135    
136  This structure has to be filled to create a new encoding instance:  This structure has to be filled to create a new encoding instance:
137    
138  width and height are the size of the image to be encoded.   - width and height.
139    
140    They have to be set to the size of the image to be encoded.
141    
142     - fincr and fbase (<0 forces default value 25fps - [25,1]).
143    
144  fincr and fbase are the MPEG-way of defining the framerate.  They  are the  MPEG-way of  defining the  framerate.  If  you  have an
145  If you have an integer framerate, say 24, 25 or 30fps, use  integer framerate, say 24,  25 or 30fps, use fincr=1, fbase=framerate.
 fincr=1, fbase=framerate.  
146  However, if framerate is non-integer, like 23.996fps you can  However, if framerate is non-integer, like 23.996fps you can
147  e.g. multiply with 1000, getting fincr=1000 and fbase=23996,  e.g. multiply  with 1000,  getting fincr=1000 and  fbase=23996, giving
148  giving you integer values again.  you integer values again.
149    
150  rc-parameters are for ratecontrol, you don't have to change them,   - rc_bitrate (<0 forces default value : 900000).
 good defaults are  
 rc_period = 2000    rc_reacton_period = 10    rc_reaction_ratio = 20  
   
 min_quantizer, max_quantizer limit the range of allowed quantizers.  
 normally quantizers range is [1..31], so min=1 and max=31.  
 !!! the HIGHER the quantizer, the LOWER the quality  !!!  
 !!! the HIGHER the quantizer, the HIGHER the compression ratio !!!  
   
 min_quant=1 is somewhat overkill, min_quant=2 is good enough  
 max_quant depends on what you encode, leave it with 31 or lower it  
 to something like 15 or 10 for better quality (but encoding with  
 very low bitrate might fail then).  
151    
152  max_key_interval is the maximum value of frames between two keyframe  This is the desired  target bitrate.  XviD will  try to  do its  best to
153    respect this setting but keep in mind XviD is still in development and
154    it has not been tuned for very low bitrates.
155    
156     - All other rc_xxxx parameters  are for the bit rate controller,  so
157       that it can respect  your rc_bitrate setting as well as possible
158       (<0 forces default values).
159    
160    The defaults are good enough and you should not change them.
161    
162    ToDo :  describe briefly their impact  on the bit  rate variations and
163    the rc_bitrate setting respect.
164    
165     - min_quantizer and max_quantizer (<0 forces default values : 1,31).
166    
167    These  2 members  limit  the range  of allowed  quantizers.  Normally,
168    the quantizer range is [1..31], so min=1 and max=31.
169    
170    NB : the HIGHER the quantizer, the LOWER the quality.
171         the HIGHER the quantizer, the HIGHER the compression ratio.
172    
173    min_quant=1 is somewhat overkill, min_quant=2 is good enough. max_quant
174    depends on what you encode, leave  it with 31 or lower it to something
175    like 15 or  10 for better quality (but encoding  with very low bitrate
176    might fail then).
177    
178     - max_key_interval (<0 forces default value : 10*framerate == 10s)
179    
180    This   is  the  maximum   number  of  frames  between   two  keyframes
181  (I-frames). Keyframes are also inserted dynamically at scene breaks.  (I-frames). Keyframes are also inserted dynamically at scene breaks.
182  It is important to have some keyframes, even in longer scenes, if you  It is important to have some keyframes, even in longer scenes, if you
183  want to skip position in the resulting file, because skipping is only  want to skip position in the resulting file, because skipping is only
184  possible from one keyframe to the next. However, keyframes are much larger  possible from  one keyframe to  the next. However, keyframes  are much
185  than non-keyframes, so do not use too many of them.  larger than non-keyframes, so do not use too many of them.  A value of
186  A value of framerate*10 is a good choice normally.  framerate*10 is a good choice normally.
   
 motion_search determines the quality of motion search done by the codec.  
 The better the search, the smaller the files (or the better the quality).  
 Since low modes (1-3) are hardly faster than high modes (4,5) a value of  
 5 is a good choice normally. 6 is possible, but a little slower. If you  
 want absolutely highest quality, use 6.  
   
 lum_masking stand for "luminance masking" which is an experimental feature.  
 It tries to compress better by using facts about the human eye.  
 You might try to switch it on and decide yourself, if you gain anything from it.  
187    
188  quant_type is technical, it changes the way coefficients are quantized.   - handle
189  Both values are okay, though a value of 0 might be faster.  
190    This is the returned internal encoder instance.
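
Filling the structure before the XVID_ENC_CREATE call can be sketched as follows. The structure below is a stand-in that copies the member list shown above so that the sketch compiles without xvid.h; set_framerate and fill_params are hypothetical helper names. The framerate helper applies the rule described above (integer rates use fincr=1, other rates are scaled by 1000), and negative values are used wherever the library defaults are wanted.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in copy of the member list above; a real program uses xvid.h. */
typedef struct {
    int width, height;
    int fincr, fbase;
    int rc_bitrate;
    int rc_reaction_delay_factor;
    int rc_averaging_period;
    int rc_buffer;
    int max_quantizer;
    int min_quantizer;
    int max_key_interval;
    void *handle;
} XVID_ENC_PARAM;

/* Hypothetical helper: express a framerate as the fincr/fbase pair. */
static void set_framerate(XVID_ENC_PARAM *p, double fps)
{
    assert(p != NULL && fps > 0.0);
    if ((double)(int)fps == fps) {
        /* Integer framerate: fincr=1, fbase=framerate. */
        p->fincr = 1;
        p->fbase = (int)fps;
    } else {
        /* Non-integer framerate: multiply by 1000 and round. */
        p->fincr = 1000;
        p->fbase = (int)(fps * 1000.0 + 0.5);
    }
}

/* Hypothetical helper: size and framerate set, defaults elsewhere. */
static void fill_params(XVID_ENC_PARAM *p, int width, int height, double fps)
{
    assert(p != NULL);
    p->width = width;
    p->height = height;
    set_framerate(p, fps);
    p->rc_bitrate = -1;               /* <0 : default bitrate */
    p->rc_reaction_delay_factor = -1; /* <0 : default */
    p->rc_averaging_period = -1;      /* <0 : default */
    p->rc_buffer = -1;                /* <0 : default */
    p->min_quantizer = 2;             /* 1 is somewhat overkill */
    p->max_quantizer = 31;
    p->max_key_interval = -1;         /* <0 : default, 10*framerate */
    p->handle = NULL;                 /* filled by the create call */
}
```

After fill_params(&xparam, 640, 480, 25.0), the structure would be passed to xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL) and the encoder instance read back from xparam.handle.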
191    
 Used in:    xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL);  
192    
 --------------------------------------------------------------------------  
193    
194                  Chapter 4 : the XVID_ENC_FRAME structure.
195     +-----------------------------------------------------------------+
196    
197  typedef struct  typedef struct
198  {  {
199          void * bitstream;       // [in] bitstream ptr          int general;            [in]
200          int length;             // [out] bitstream length (bytes)          int motion;             [in]
201            void *bitstream;        [in]
202            int length;             [out]
203    
204            void *image;            [in]
205            int colorspace;         [in]
206    
207            unsigned char *quant_intra_matrix;  [in]
208            unsigned char *quant_inter_matrix;  [in]
209            int quant;                          [in]
210            int intra;                          [in/out]
211    
212            HINTINFO hint;                      [in/out]
213    }
214    XVID_ENC_FRAME;
215    
216    Used in:
217      xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);
218    
219    This is  the main structure to encode  a frame, it gives  hints to the
220    encoder on how to process an image.
221    
222     - general flag member.
223    
224    The general flag member informs XviD on general algorithm choices made
225    by the library client.
226    
227    Valid flags :
228    
229        - XVID_CUSTOM_QMATRIX  :  informs  xvid  to use  the  custom  user
230          matrices.
231    
232        - XVID_H263QUANT   :  informs  xvid   to  use   H263  quantization
233          algorithm.
234    
235        - XVID_MPEGQUANT   :  informs  xvid   to  use   MPEG  quantization
236          algorithm.
237    
238        - XVID_HALFPEL  : informs  xvid  to perform  a  half pixel  motion
239          estimation.
240    
241          void * image;           // [in] image ptr      - XVID_ADAPTIVEQUANT  :  informs  xvid  to perform  an  adaptive
242          int colorspace;         // [in] source colorspace        quantization.
243    
244          int quant;              // [in] frame quantizer (vbr)      - XVID_LUMIMASKING : informs xvid to use a lumimasking algorithm.
         int intra;              // [in] force intra frame (vbr only)  
                                 // [out] intra state  
 } XVID_ENC_FRAME;  
245    
246        - XVID_LATEINTRA : ???
247    
248  The main structure to encode a frame: image points to the picture,      - XVID_INTERLACING  : informs  xvid  to use  the MPEG4  interlaced
249  in a format that is given by colorspace, e.g. XVID_CSP_RGB24 or        mode.
 XVID_CSP_YV12.  
250    
251  If you set quant=0, then the ratecontrol chooses quantizer for you.      - XVID_TOPFIELDFIRST : ???
 If quant!=0, then this value is used as quantizer, so make 1<=quant<=31.  
252    
253  intra decides whether the frame is going to be a keyframe or not.      - XVID_ALTERNATESCAN : ???
   intra=1 means: make it a keyframe  
   intra=0 means: don't make it a keyframe  
   intra=-1 means: let encoder decide (based on contents and max_key_interval)  
254    
255  So for an ordinary encoding step, you would set quant=0 and intra=-1.      - XVID_HINTEDME_GET  : informs  xvid to  return  Motion Estimation
256          vectors from the ME encoder algorithm. Used during a first pass.
257    
258  The length of the MPEG4-bitstream is returned in length, and      - XVID_HINTEDME_SET :  informs xvid to  use the user  given motion
259  if you set intra to -1, it now contains the encoder's decision:        estimation vectors as hints  for the encoder ME algorithms. Used
260    0 for non-keyframe,        during a 2nd pass.
   1 for keyframe because of a scene change,  
   2 for keyframe because max_key_interval was reached.  
261    
262  Used in:    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);      - XVID_INTER4V : forces XviD to search a vector for each 8x8 block
263          within the 16x16  Macro Block. This mode should  be used only if
264          the  XVID_HALFPEL mode is  activated (this  could change  in the
265          future).
266    
267        - XVID_ME_ZERO : forces XviD to use the zero ME algorithm.
268    
269        - XVID_ME_LOGARITHMIC  :  forces   XviD  to  use  the  logarithmic
270          ME algorithm.
271    
272        - XVID_ME_FULLSEARCH  : forces  XviD  to use  the  full search  ME
273          algorithm.
274    
275        - XVID_ME_PMVFAST : forces XviD to use the PMVFAST ME algorithm.
276    
277        - XVID_ME_EPZS : forces XviD to use the EPZS ME algorithm.
278    
279    ToDo :  fill the void entries  in flags, and describe  briefly each ME
280    algorithm.
281    
282     - motion member.
283    
284    Valid flags for  16x16 motion estimation (no XVID_INTER4V  flag in the
285    general flag).
286    
287        - PMV_ADVANCEDDIAMOND16  : XviD has  a modified  diamond algorithm
288          that performs a bit faster  than the original one. Use this flag
289          if  you want  to use  the  speed optimized  diamond search.  The
290          quality loss is  not big (better quality than  square search but
291          less than the normal diamond search).
292    
293        - PMV_HALFPELDIAMOND16 : switches the search algorithm from 1 or 2
294          full pixels precision to 1 or 2 half pixel precision.
295    
296        - PMV_HALFPELREFINE16  :  After normal  diamond  search, an  extra
297          halfpel refinement step is  performed.  Should always be used if
298          XVID_HALFPEL is  on, because it  gives a rather big  increase in
299          quality.
300    
301        - PMV_EXTSEARCH16 :  Normal PMVfast predicts one  start vector and
302          does diamond search around this position. EXTSEARCH means that 2
303          more  start vectors  are used:  (0,0) and  median  predictor and
304          diamond search  is done for  those, too.  Makes  search slightly
305          slower, but quality sometimes gets better.
306    
307        - PMV_EARLYSTOP16 :  PMVfast and EPZS stop search  if current best
308          is  below some dynamic  threshold.  No  diamond search  is done,
309          only halfpel  refinement (if active).  Without EARLYSTOP diamond
310          search is always done. That would be much slower, but not really
311          lead to better quality.
312    
313        - PMV_QUICKSTOP16   :  like  EARLYSTOP,   but  not   even  halfpel
314          refinement is  done. Normally worse  quality, so it  defaults to
315          off. Might be removed, too.
316    
317        - PMV_UNRESTRICTED16   :  "unrestricted  ME"   is  a   feature  of
318          MPEG4. It's not  implemented, so this flag is  ignored (not even
319          checked).
320    
321        - PMV_OVERLAPPING16 :  same as unrestricted.  Not implemented, nor
322          checked.
323    
324        - PMV_USESQUARES16  : Replace  the  diamond search  with a  square
325          search.
326    
327    
328    Valid flags  when using 4 vectors  mode prediction. They  have the same
329    meaning as their 16x16 counterparts, so we only give the list :
330    
331        - PMV_ADVANCEDDIAMOND8
332        - PMV_HALFPELDIAMOND8
333        - PMV_HALFPELREFINE8
334        - PMV_EXTSEARCH8
335        - PMV_EARLYSTOP8
336        - PMV_QUICKSTOP8
337        - PMV_UNRESTRICTED8
338        - PMV_OVERLAPPING8
339        - PMV_USESQUARES8
340    
341     - quant member.
342    
343    The DCT  coefficients are divided  by the quantizer value,  which sets
344    the less  important coefficients to zero  (important according  to the
345    target bitrate, not the image quality :-)
346    
347    Valid values :
348    
349         - 0 (zero) : the  rate controller chooses the  right quantizer
350           for you.  Typically used in ABR encoding or first pass of a VBR
351           encoding session.
352    
353         - !=  0  :  you  force  the  encoder   to use  this  specific
354           quantizer   value.     It   is   clamped    in   the   interval
355           [1..31]. Typically used  during the 2nd pass of  a VBR encoding
356           session.
357    
358     - intra member.
359    
360    [in usage]
361    The intra value  decides whether the frame is going to be a keyframe or
362    not.
363    
364    Valid values :
365    
366        - 1 : forces the encoder  to create a keyframe. Mainly used during
367          a VBR 2nd pass.
368    
369        - 0 :  forces the  encoder not to  create a keyframe.  Mainly used
370          during a VBR 2nd pass.
371    
372        - -1   :  lets  the  encoder   decide  (based   on   contents  and
373           max_key_interval). Mainly  used in ABR  mode and during  a 1st
374           VBR pass.
375    
376    [out usage]
377    
378    When first set to -1, the encoder returns the effective keyframe state
379    of the frame.
380    
381        - 0 : the resulting frame is not a keyframe
382    
383        - 1 : the resulting frame is a keyframe (scene change).
384    
385        - 2  : the resulting  frame is  a keyframe  (max_key_interval
386          reached).
387    
388     - quant_intra_matrix and quant_inter_matrix members.
389    
390    These are  pointers to  a pair  of user  quantization  matrices. You
391    must set the  general XVID_CUSTOM_QMATRIX flag to make  sure XviD uses
392    them.
393    
394    When set to NULL, the default XviD matrices are used.
395    
396    NB : each time the matrices  change, XviD must write a header into the
397    bitstream, so  try not to change these matrices very often.  This will
398    save space.
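
Preparing a frame for an ordinary ABR encoding step, and reading back the keyframe decision, can be sketched as follows. This is an illustration under assumptions: the flag values are placeholders (not the ones from xvid.h), the structure is a stand-in that omits the HINTINFO member, and setup_abr_frame, effective_quant and is_keyframe are hypothetical helper names. The flag choice follows the advice above (halfpel refinement together with XVID_HALFPEL, early stop enabled).

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values so the sketch compiles on its own; the real
 * values come from xvid.h and are NOT reproduced here. */
#define XVID_H263QUANT      (1 << 0)
#define XVID_HALFPEL        (1 << 1)
#define PMV_HALFPELREFINE16 (1 << 0)
#define PMV_EARLYSTOP16     (1 << 1)

/* Stand-in for XVID_ENC_FRAME (the HINTINFO member is left out). */
typedef struct {
    int general;
    int motion;
    void *bitstream;
    int length;
    void *image;
    int colorspace;
    unsigned char *quant_intra_matrix;
    unsigned char *quant_inter_matrix;
    int quant;
    int intra;
} XVID_ENC_FRAME;

/* Hypothetical helper: let the rate control and the encoder decide. */
static void setup_abr_frame(XVID_ENC_FRAME *f, void *image,
                            int colorspace, void *bitstream)
{
    assert(f != NULL);
    f->general = XVID_H263QUANT | XVID_HALFPEL;
    f->motion = PMV_HALFPELREFINE16 | PMV_EARLYSTOP16;
    f->bitstream = bitstream;
    f->length = 0;                 /* [out] set by the encoder */
    f->image = image;
    f->colorspace = colorspace;
    f->quant_intra_matrix = NULL;  /* NULL : default matrices */
    f->quant_inter_matrix = NULL;
    f->quant = 0;                  /* 0 : rate control chooses */
    f->intra = -1;                 /* -1 : encoder decides */
}

/* Mirrors the documented rule: 0 stays 0, anything else is clamped. */
static int effective_quant(int quant)
{
    if (quant == 0) return 0;
    if (quant < 1) return 1;
    if (quant > 31) return 31;
    return quant;
}

/* [out] usage of intra: 1 and 2 both mean the frame is a keyframe. */
static int is_keyframe(int intra_out)
{
    return intra_out == 1 || intra_out == 2;
}
```

For a VBR 2nd pass, the same structure would instead carry the stored quantizer in quant and the stored keyframe decision (0 or 1) in intra.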
399    
400    
401    
402                   Chapter 5 : The XVID_ENC_STATS structure
403     +-----------------------------------------------------------------+
404    
 --------------------------------------------------------------------------  
405    
406  typedef struct  typedef struct
407  {  {
# Line 151  Line 411 
411    
412  } XVID_ENC_STATS;  } XVID_ENC_STATS;
413    
414  In this structure the encoder returns statistical data about the encoding  Used in:
415  process, e.g. to be saved for two-pass-encoding.    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);
416  quant is the quantizer chosen for this frame (if you let ratecontrol do it)  
417  hlength is the length of the frame's header, including motion information etc.  In  this  structure the  encoder  returns statistical  data about  the
418  kblks, mblks, ublks are unused at the moment.  encoding process,  e.g. to be  saved for two-pass-encoding.   quant is
419    the quantizer  chosen for  this frame (if  you let ratecontrol  do it).
420    hlength  is  the  length  of  the  frame's  header,  including  motion
421    information etc.  kblks, mblks, ublks are unused at the moment.
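
Collecting these per-frame statistics during a first pass can be sketched as follows. The XVID_ENC_STATS layout below is an assumption built only from the member names mentioned above (the real definition is in xvid.h), and pass1_totals, pass1_add and pass1_average_quant are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout, using only the member names mentioned in the text. */
typedef struct {
    int quant;
    int hlength;
    int kblks, mblks, ublks;
} XVID_ENC_STATS;

/* Hypothetical accumulator for a first pass. */
typedef struct {
    int frames;
    int keyframes;
    long total_length;   /* sum of the frame lengths */
    long header_length;  /* sum of the hlength values */
    long quant_sum;      /* sum of the chosen quantizers */
} pass1_totals;

/* Add one encoded frame: its stats, its length and its intra result. */
static void pass1_add(pass1_totals *t, const XVID_ENC_STATS *s,
                      int length, int intra_out)
{
    assert(t != NULL && s != NULL);
    t->frames++;
    if (intra_out == 1 || intra_out == 2)
        t->keyframes++;
    t->total_length += length;
    t->header_length += s->hlength;
    t->quant_sum += s->quant;
}

/* Average quantizer chosen by the rate control so far. */
static double pass1_average_quant(const pass1_totals *t)
{
    assert(t != NULL);
    return t->frames > 0 ? (double)t->quant_sum / t->frames : 0.0;
}
```

In a real program, pass1_add would be called after each xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats) call with xframe.length and xframe.intra.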
422    
423    
424    
425  Used in:    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);                   Chapter 6 : The xvid_encore function
426     +-----------------------------------------------------------------+
427    
 --------------------------------------------------------------------------  
428    
429  int xvid_encore(void * handle,  int xvid_encore(void * handle,
430                  int opt,                  int opt,
# Line 167  Line 432 
432                  void * param2);                  void * param2);
433    
434    
435  XviD uses a single-function API, so everything you want to do is done by  XviD uses a single-function API, so  everything you want to do is done
436  this routine. The opt parameter chooses the behaviour of the routine:  by  this routine.  The  opt  parameter chooses  the  behaviour of  the
437    routine:
438    
439  XVID_ENC_CREATE:   create a new encoder, XVID_ENC_PARAM in param1,  XVID_ENC_CREATE:  create a  new encoder,  XVID_ENC_PARAM in  param1, a
440                     a handle to the new encoder is returned in handle  handle to the new encoder is returned in handle.
441    
442  XVID_ENC_ENCODE:   encode one frame, XVID_ENC_FRAME-structure in param1,  XVID_ENC_ENCODE:   encode one frame, XVID_ENC_FRAME-structure in param1,
443                     XVID_ENC_STATS in param2 (or NULL, if you are not  XVID_ENC_STATS  in param2  (or  NULL,  if you  are  not interested  in
444                     interested in statistical data).  statistical data).
445    
446  XVID_ENC_DESTROY:  shut down this encoder, do not use handle afterwards  XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards.
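
The create / encode / destroy sequence of this single-function API can be sketched as follows. To stay runnable without the library, the sketch calls a mock (mock_encore) that has the same argument shape as xvid_encore but does no encoding. Everything in it is an assumption for illustration: the option and error values are placeholders, the destroy option is assumed to be named XVID_ENC_DESTROY, and the mock types are not part of XviD.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder values; the real ones come from xvid.h. */
#define XVID_ENC_CREATE  0
#define XVID_ENC_ENCODE  1
#define XVID_ENC_DESTROY 2
#define XVID_ERR_OK      0
#define MOCK_ERR_FAIL    (-1)

typedef struct { int frames_encoded; } mock_encoder; /* private state */
typedef struct { void *handle; } mock_param;         /* XVID_ENC_PARAM stand-in */

/* Mock with the same shape as xvid_encore(handle, opt, param1, param2). */
static int mock_encore(void *handle, int opt, void *param1, void *param2)
{
    (void)param2; /* would receive XVID_ENC_STATS, or NULL */
    switch (opt) {
    case XVID_ENC_CREATE: {
        mock_param *p = param1;
        mock_encoder *enc;
        if (p == NULL)
            return MOCK_ERR_FAIL;
        enc = calloc(1, sizeof *enc);
        if (enc == NULL)
            return MOCK_ERR_FAIL;
        p->handle = enc;  /* the new instance is returned in handle */
        return XVID_ERR_OK;
    }
    case XVID_ENC_ENCODE:
        if (handle == NULL)
            return MOCK_ERR_FAIL;
        ((mock_encoder *)handle)->frames_encoded++;
        return XVID_ERR_OK;
    case XVID_ENC_DESTROY:
        free(handle);     /* handle must not be used afterwards */
        return XVID_ERR_OK;
    default:
        return MOCK_ERR_FAIL;
    }
}

/* Create an instance, encode n frames, destroy it.
 * Returns the number of frames encoded, or -1 if creation failed. */
static int run_session(int n)
{
    mock_param param = { NULL };
    int i, done;

    assert(n >= 0);
    if (mock_encore(NULL, XVID_ENC_CREATE, &param, NULL) != XVID_ERR_OK)
        return -1;
    for (i = 0; i < n; i++) {
        if (mock_encore(param.handle, XVID_ENC_ENCODE, NULL, NULL) != XVID_ERR_OK)
            break;
    }
    done = ((mock_encoder *)param.handle)->frames_encoded;
    mock_encore(param.handle, XVID_ENC_DESTROY, NULL, NULL);
    return done;
}
```

With the real library, the same three steps would use xvid_encore with an XVID_ENC_PARAM for the create call and an XVID_ENC_FRAME (plus optional XVID_ENC_STATS) for each encode call.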
447    

Legend:
Removed from v.3  
changed lines
  Added in v.247
