Diff of /branches/dev-api-4/xvidcore/doc/xvid-encoder.txt

revision 3, Fri Mar 8 02:46:11 2002 UTC revision 239, Mon Jun 24 17:03:03 2002 UTC
# Line 1  Line 1 
1  /*************************************************************  +--------------------------------------------------------------------+
2   * Short explanation for the XviD data structures and routines        Short explanation for the XviD data structures and routines
  *  
  *                       encoding part  
  *  
  * if you have further questions, visit http://www.xvid.org  
  *  
  **************************************************************/  
3    
4  /* these are structures/routines from xvid.h needed for encoding */                            The encoding part
5    
6  --------------------------------------------------------------------------         If you have further questions, visit http://www.xvid.org
7    +--------------------------------------------------------------------+
8    
9  #define API_VERSION ((1 << 16) | (0))  Document version :
10    $Id: xvid-encoder.txt,v 1.2 2002-06-24 17:03:03 edgomez Exp $
11    
12  This is the revision of the xvid.h file that you have in front of you. Check it against the library's version.  +--------------------------------------------------------------------+
13    | Abstract
14    +--------------------------------------------------------------------+
15    
16    This document presents the basic  structures and API of XviD. It tries
17    to explain how to use them  to obtain a simple profile compliant MPEG4
18    stream by feeding the encoder with a sequence of frames.
19    
20    +-------------------------------------------------------------------+
21    | Document
22    +-------------------------------------------------------------------+
23    
24    
25    
26                         Chapter 1 : The XviD version
27     +-----------------------------------------------------------------+
28    
29    The  XviD version  is defined  at library  compilation time  using the
30    constant defined in xvid.h
31    
32    #define API_VERSION ((2 << 16) | (1))
33    
34    Where 2 stands for the major XviD version, and 1 for the minor version
35    number.
36    
37    The current version  of the API is 2.1 and  should be incremented each
38    time   a  user   defined  structure   is   modified  (XVID_INIT_PARAM,
39    XVID_ENC_PARAM ... we will discuss them later).
40    
41    When you're writing a program/library which uses the XviD library, you
42    must  check  your  XviD  API  version against  the  available  library
43    version.  We will  see how  to check  the version  number in  the next
44    chapter.
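
The packing described in this chapter can be undone with a shift and a
mask. The sketch below is only an illustration: API_VERSION is redefined
locally with the value quoted above (a real program takes it from
xvid.h), and api_major, api_minor and api_version_matches are
hypothetical helper names that are not part of the XviD API.

```c
/* Value quoted in the text for API 2.1; real code gets this from xvid.h. */
#define API_VERSION ((2 << 16) | (1))

/* Hypothetical helper: major number stored in the upper 16 bits. */
static int api_major(int version) { return (version >> 16) & 0xFFFF; }

/* Hypothetical helper: minor number stored in the lower 16 bits. */
static int api_minor(int version) { return version & 0xFFFF; }

/* Hypothetical helper: non-zero when the version reported by the library
 * equals the one the caller was compiled against. */
static int api_version_matches(int library_version)
{
    return library_version == API_VERSION;
}
```

A caller would pass the api_version value reported by the library to
api_version_matches and refuse to continue when it returns zero.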
45    
46    
47    
48                       Chapter 2 : The XVID_INIT_PARAM
49     +-----------------------------------------------------------------+
50    
 --------------------------------------------------------------------------  
51    
52  typedef struct  typedef struct
53  {  {
# Line 24  Line 56 
56          int core_build;         [out]          int core_build;         [out]
57  } XVID_INIT_PARAM;  } XVID_INIT_PARAM;
58    
 This is filled by xvid_init with the correct CPU flags for initialization  
 (auto-detect), unless you pass flag to it (cpu_flags!=0). Do not use that  
 unless you really know what you are doing.  
 api_version can (should) be checked against API_VERSION, to see if you  
 have the right core library.  
   
59  Used in:  xvid_init(NULL, 0, &xinit, NULL);  Used in:  xvid_init(NULL, 0, &xinit, NULL);
60    
61  --------------------------------------------------------------------------  This structure is  used and filled by the  xvid_init function depending
62    on the cpu_flags value.
63    
64    List of valid flags for the cpu_flags member :
65    
66     - XVID_CPU_MMX      : cpu feature
67     - XVID_CPU_MMXEXT   : cpu feature
68     - XVID_CPU_SSE      : cpu feature
69     - XVID_CPU_SSE2     : cpu feature
70     - XVID_CPU_3DNOW    : cpu feature
71     - XVID_CPU_3DNOWEXT : cpu feature
72     - XVID_CPU_TSC      : cpu feature
73     - XVID_CPU_IA64     : cpu feature
74     - XVID_CPU_CHKONLY  : command
75     - XVID_CPU_FORCE    : command
76    
77    In order to set a flag : xinit.cpu_flags |= desired_flag_constant;
78    
79    1st case : you call  xvid_init without setting the XVID_CPU_CHKONLY or
80    the XVID_CPU_FORCE flag. The xvid_init function automatically detects
81    the host  cpu features and  fills the cpu_flags member.  The xvid_init
82    function also initializes all internal function pointers
83    according to the detected features and then returns XVID_ERR_OK.
84    
85    2nd case :  you call xvid_init setting the  XVID_CPU_CHKONLY flag, the
86    xvid_init function will  just detect the host cpu  features and return
87    XVID_ERR_OK without  initializing the internal  function pointers (NB:
88    The XviD library is not usable after such a call to xvid_init).
89    
90    3rd case  : you call  xvid_init with the cpu_flags  XVID_CPU_FORCE and
91    desired feature  flags set up  (eg : XVID_CPU_SSE |  XVID_CPU_MMX). In
92    this case you  force XviD to use the given cpu  features passed in the
93    cpu_flags member. Use this if you know what you're doing.
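
The three cases can be told apart by looking at the two command bits.
The sketch below assumes placeholder bit values for the flags (the real
constants live in xvid.h and may differ); init_mode and
classify_cpu_flags are hypothetical names used only for illustration.

```c
/* Placeholder bit values for illustration only; the real constants are
 * defined in xvid.h and may differ. */
#define XVID_CPU_MMX      0x00000001u
#define XVID_CPU_SSE      0x00000004u
#define XVID_CPU_CHKONLY  0x40000000u
#define XVID_CPU_FORCE    0x80000000u

/* Hypothetical names for the three cases described in the text. */
enum init_mode { INIT_AUTODETECT, INIT_CHECK_ONLY, INIT_FORCED };

/* Hypothetical helper: tells which case a cpu_flags value selects.
 * Checking CHKONLY before FORCE is a choice of this sketch; the text does
 * not say what happens when both commands are set. */
static enum init_mode classify_cpu_flags(unsigned int cpu_flags)
{
    if (cpu_flags & XVID_CPU_CHKONLY)
        return INIT_CHECK_ONLY;
    if (cpu_flags & XVID_CPU_FORCE)
        return INIT_FORCED;
    return INIT_AUTODETECT;
}
```

Feature bits without XVID_CPU_FORCE fall into the auto-detect case in
this sketch, which matches the wording of the 1st case.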
94    
95    NB for PowerPC  archs : the ppc arch has  no automatic detection, the
96    library must  be compiled  for a specific  ppc target using  the right
97    Makefile  (the  cpu_flags  is   irrelevant  for  these  archs).  Use
98    Makefile.linuxppc   for   standard   ppc   optimized   functions   and
99    Makefile.linuxppc_altivec for altivec simd optimized functions.
100    
101    NB for IA64 archs : Optimized ia64 assembly functions are provided
102    in the library; they must be forced using the
103    XVID_CPU_FORCE|XVID_CPU_IA64 pair of flags.
104    
105    To check the  XviD library version against your  own XviD header file,
106    you just have to call the xvid_init function (no matter the cpu_flags)
107    and  compare   the  returned   xinit.api_version   integer  with  your
108    API_VERSION number. The core_build member is not relevant at the
109    moment but is reserved for future use (once XviD has reached a
110    certain stability in its API and releases).
111    
112    
113    
114                     Chapter 3 : XVID_ENC_PARAM structure
115     +-----------------------------------------------------------------+
116    
117    
118  typedef struct  typedef struct
119  {  {
120    int width, height;          int width, height;              [in]
121    int fincr;    // frame increment is relative to fbase          int fincr, fbase;               [in]
122    int fbase;    // so each frame takes "fincr/fbase" seconds          int rc_bitrate;                 [in]
123    int bitrate;  // the bitrate of the target encoded stream, in bits/second          int rc_reaction_delay_factor;   [in]
124    int rc_period;                // the intended rate control averaging period          int rc_averaging_period;        [in]
125    int rc_reaction_period;       // the reaction period for rate control          int rc_buffer;                  [in]
126    int rc_reaction_ratio;        // the ratio for down/up rate control          int max_quantizer;              [in]
127    int max_quantizer;    // the upper limit of the quantizer          int min_quantizer;              [in]
128    int min_quantizer;    // the lower limit of the quantizer          int max_key_interval;           [in]
129    int max_key_interval; // the maximum interval between key frames  
130    int motion_search;    // the quality of compression ( 1=fastest, 6=best )          void *handle;                   [out]
131    int lum_masking;      // lum masking on/off  }
132    int quant_type;       // 0=h.263, 1=mpeg4  XVID_ENC_PARAM;
   
   void * handle;        // [out] encoder instance handle  
133    
134  } XVID_ENC_PARAM;  Used in:    xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL);
135    
136  This structure has to be filled to create a new encoding instance:  This structure has to be filled to create a new encoding instance:
137    
138  width and height are the size of the image to be encoded.   - width and height.
139    
140    They have to be set to the size of the image to be encoded.
141    
142     - fincr and fbase (<0 forces default value 25fps - [25,1]).
143    
144  fincr and fbase are the MPEG-way of defining the framerate.  They  are the  MPEG-way of  defining the  framerate.  If  you  have an
145  If you have an integer framerate, say 24, 25 or 30fps, use  integer framerate, say 24,  25 or 30fps, use fincr=1, fbase=framerate.
 fincr=1, fbase=framerate.  
146  However, if framerate is non-integer, like 23.996fps you can  However, if framerate is non-integer, like 23.996fps you can
147  e.g. multiply with 1000, getting fincr=1000 and fbase=23996,  e.g. multiply  with 1000,  getting fincr=1000 and  fbase=23996, giving
148  giving you integer values again.  you integer values again.
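
The rule above can be written as a small conversion routine. This is a
minimal sketch: fps_to_fraction is a hypothetical helper, not an XviD
function, and it assumes a positive frame rate that fits in an int after
scaling by 1000.

```c
/* Hypothetical helper: turns a frame rate into the fincr/fbase pair.
 * Integer rates give fincr=1; other rates are scaled by 1000 as in the
 * text. Rates are assumed to be positive and small enough to fit an int. */
static void fps_to_fraction(double fps, int *fincr, int *fbase)
{
    int whole = (int)fps;

    if ((double)whole == fps) {
        *fincr = 1;
        *fbase = whole;
    } else {
        *fincr = 1000;
        *fbase = (int)(fps * 1000.0 + 0.5);   /* round to nearest */
    }
}
```

Each frame then lasts fincr/fbase seconds, so both results describe the
same frame rate that was passed in.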
149    
150     - rc_bitrate (<0 forces default value : 900000).
151    
152    This is the  desired target bitrate. XviD will  try to do its  best to
153    respect this setting but keep in mind XviD is still in development and
154    it has not been tuned for very low bitrates.
155    
156     - All other rc_xxxx parameters are for the bit rate controller, which
157       uses them to respect your rc_bitrate setting as closely as it can.
158       (<0 forces default values)
159    
160    The defaults are good enough and you should not change them.
161    
162    ToDo :  describe briefly their impact  on the bit  rate variations and
163    the rc_bitrate setting respect.
164    
165     - min_quantizer and max_quantizer (<0 forces default values : 1,31).
166    
167    These  2 members  limit the  range  of allowed  quantizers.  Normally,
168    the quantizer range is [1..31], so min=1 and max=31.
169    
170  rc-parameters are for ratecontrol, you don't have to change them,  NB : the HIGHER the quantizer, the LOWER the quality.
171  good defaults are       the HIGHER the quantizer, the HIGHER the compression ratio.
 rc_period = 2000    rc_reacton_period = 10    rc_reaction_ratio = 20  
   
 min_quantizer, max_quantizer limit the range of allowed quantizers.  
 normally quantizers range is [1..31], so min=1 and max=31.  
 !!! the HIGHER the quantizer, the LOWER the quality  !!!  
 !!! the HIGHER the quantizer, the HIGHER the compression ratio !!!  
   
 min_quant=1 is somewhat overkill, min_quant=2 is good enough  
 max_quant depends on what you encode, leave it with 31 or lower it  
 to something like 15 or 10 for better quality (but encoding with  
 very low bitrate might fail then).  
172    
173  max_key_interval is the maximum value of frames between two keyframe  min_quant=1 is somewhat overkill, min_quant=2 is good enough. max_quant
174    depends on what you encode, leave  it with 31 or lower it to something
175    like 15 or  10 for better quality (but encoding  with very low bitrate
176    might fail then).
177    
178     - max_key_interval (<0 forces default value : 10*framerate == 10s)
179    
180    This  is  the  maximum  number  of  frames  between  two  keyframes
181  (I-frames). Keyframes are also inserted dynamically at scene breaks.  (I-frames). Keyframes are also inserted dynamically at scene breaks.
182  It is important to have some keyframes, even in longer scenes, if you  It is important to have some keyframes, even in longer scenes, if you
183  want to skip position in the resulting file, because skipping is only  want to skip position in the resulting file, because skipping is only
184  possible from one keyframe to the next. However, keyframes are much larger  possible from  one keyframe to  the next. However, keyframes  are much
185  than non-keyframes, so do not use too many of them.  larger than non-keyframes, so do not use too many of them.  A value of
186  A value of framerate*10 is a good choice normally.  framerate*10 is a good choice normally.
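
The "<0 forces default value" rule quoted for several members can be
summarised in code. The sketch below mirrors the documented defaults
only; it is not the library's implementation. demo_enc_param is a
stand-in for the relevant XVID_ENC_PARAM members and
apply_documented_defaults is a hypothetical helper. Reading "[25,1]" as
fbase=25 and fincr=1 is an assumption of this sketch.

```c
/* Stand-in for the XVID_ENC_PARAM members discussed above; real code uses
 * the structure from xvid.h. */
typedef struct {
    int fincr, fbase;
    int rc_bitrate;
    int max_quantizer, min_quantizer;
    int max_key_interval;
} demo_enc_param;

/* Hypothetical helper: replaces negative members by the defaults quoted in
 * the text. This mirrors the documentation, not the library's code. */
static void apply_documented_defaults(demo_enc_param *p)
{
    if (p->fincr < 0 || p->fbase < 0) {   /* default is 25 fps */
        p->fincr = 1;
        p->fbase = 25;
    }
    if (p->rc_bitrate < 0)
        p->rc_bitrate = 900000;
    if (p->min_quantizer < 0)
        p->min_quantizer = 1;
    if (p->max_quantizer < 0)
        p->max_quantizer = 31;
    /* The fincr > 0 guard only protects this sketch from dividing by 0. */
    if (p->max_key_interval < 0 && p->fincr > 0)
        p->max_key_interval = 10 * p->fbase / p->fincr;   /* about 10 s */
}
```

With every member set to -1 the helper yields 25 fps, 900000 bits per
second, quantizers 1 to 31 and a key interval of 250 frames.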
   
 motion_search determines the quality of motion search done by the codec.  
 The better the search, the smaller the files (or the better the quality).  
 Since low modes (1-3) are hardly faster than high modes (4,5) a value of  
 5 is a good choice normally. 6 is possible, but a little slower. If you  
 want absolutely highest quality, use 6.  
   
 lum_masking stand for "luminance masking" which is an experimental feature.  
 It tries to compress better by using facts about the human eye.  
 You might try to switch it on and decide yourself, if you gain anything from it.  
187    
188  quant_type is technical, is changes the way coefficient are quantized.   - handle
 Both values are okay, though a value of 0 might be faster.  
189    
190  Used in:    xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL);  This is the returned internal encoder instance.
191    
 --------------------------------------------------------------------------  
192    
193    
194                  Chapter 4 : the XVID_ENC_FRAME structure.
195     +-----------------------------------------------------------------+
196    
197  typedef struct  typedef struct
198  {  {
199          void * bitstream;       // [in] bitstream ptr          int general;            [in]
200          int length;             // [out] bitstream length (bytes)          int motion;             [in]
201            void *bitstream;        [in]
202            int length;             [out]
203    
204            void *image;            [in]
205            int colorspace;         [in]
206    
207            unsigned char *quant_intra_matrix;  [in]
208            unsigned char *quant_inter_matrix;  [in]
209            int quant;                          [in]
210            int intra;                          [in/out]
211    
212            HINTINFO hint;                      [in/out]
213    }
214    XVID_ENC_FRAME;
215    
216    Used in:
217      xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);
218    
219    This is the main structure  used to encode a frame; it gives hints to
220    the encoder on how to process an image.
221    
222     - general flag member.
223    
224          void * image;           // [in] image ptr  The general flag member informs XviD on general algorithm choices made
225          int colorspace;         // [in] source colorspace  by the library client.
226    
227          int quant;              // [in] frame quantizer (vbr)  Valid flags :
         int intra;              // [in] force intra frame (vbr only)  
                                 // [out] intra state  
 } XVID_ENC_FRAME;  
228    
229        - XVID_CUSTOM_QMATRIX  :  informs  xvid  to use  the  custom  user
230          matrices.
231    
232  The main structure to encode a frame: image points to the picture,      - XVID_H263QUANT   :  informs  xvid   to  use   H263  quantization
233  in a format that is given by colorspace, e.g. XVID_CSP_RGB24 or        algorithm.
 XVID_CSP_YV12.  
234    
235  If you set quant=0, then the ratecontrol chooses quantizer for you.      - XVID_MPEGQUANT   :  informs  xvid   to  use   MPEG  quantization
236  If quant!=0, then this value is used as quantizer, so make 1<=quant<=31.        algorithm.
237    
238  intra decides where the frame is going to be a keyframe or not.      - XVID_HALFPEL  : informs  xvid  to perform  a  half pixel  motion
239    intra=1 means: make it a keyframe        estimation.
   intra=0 means: don't make it a keyframe  
   intra=-1 means: let encoder decide (based on contents and max_key_interval)  
240    
241  So for an ordinary encoding step, you would set quant=0 and intra=-1.      - XVID_ADAPTIVEQUANT  :  informs  xvid  to perform  an  adaptive
242          quantization.
243    
244  The length of the MPEG4-bitstream is returned in length, and      - XVID_LUMIMASKING : informs xvid to use a lumimasking algorithm.
 if you set intra to -1, it now contains the encoder's decision:  
   0 for non-keyframe,  
   1 for keyframe because of a scene change,  
   2 for keyframe because max_key_interval was reached.  
245    
246  Used in:    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);      - XVID_LATEINTRA : ???
247    
248        - XVID_INTERLACING  : informs  xvid  to use  the MPEG4  interlaced
249          mode.
250    
251        - XVID_TOPFIELDFIRST : ???
252    
253        - XVID_ALTERNATESCAN : ???
254    
255        - XVID_HINTEDME_GET  : informs  xvid to  return  Motion Estimation
256          vectors from the ME encoder algorithm. Used during a first pass.
257    
258        - XVID_HINTEDME_SET :  informs xvid to  use the user  given motion
259          estimation vectors as hints  for the encoder ME algorithms. Used
260          during a 2nd pass.
261    
262        - XVID_INTER4V : ???
263    
264        - XVID_ME_ZERO : forces XviD to use the zero ME algorithm.
265    
266        - XVID_ME_LOGARITHMIC  :  forces   XviD  to  use  the  logarithmic
267          ME algorithm.
268    
269        - XVID_ME_FULLSEARCH  : forces  XviD  to use  the  full search  ME
270          algorithm.
271    
272        - XVID_ME_PMVFAST : forces XviD to use the PMVFAST ME algorithm.
273    
274        - XVID_ME_EPZS : forces XviD to use the EPZS ME algorithm.
275    
276    ToDo :  fill the void entries  in flags, and describe  briefly each ME
277    algorithm.
278    
279     - motion member.
280    
281    ToDo : add all the XVID_ME flags here and detail the effect of each
282    flag.
283    
284     - quant member.
285    
286    The quantizer value is the divisor applied to the DCT coefficients;
287    dividing zeroes the coefficients that are not important (according to
288    the target bitrate, not the image quality :-).
289    
290    Valid values :
291    
292         - 0 (zero) : the rate controller chooses the right quantizer
293           for you.  Typically used in ABR encoding or first pass of a VBR
294           encoding session.
295    
296         - !=  0  :  you  force  the  encoder  to use  this  specific
297           quantizer   value.     It   is   clamped   to   the   interval
298           [1..31]. Typically used during the 2nd pass of a VBR encoding
299           session.
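
The two cases for the quant member amount to a pass-through for zero and
a clamp for everything else. A minimal sketch, where effective_quant is a
hypothetical helper written from the description above and not taken
from the library:

```c
/* Hypothetical helper: the quantizer the encoder would end up using for a
 * given quant member. 0 is returned unchanged and means "let the rate
 * controller choose"; any other value is clamped into [1..31]. */
static int effective_quant(int quant)
{
    if (quant == 0)
        return 0;
    if (quant < 1)
        return 1;
    if (quant > 31)
        return 31;
    return quant;
}
```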
300    
301     - intra member.
302    
303    [in usage]
304    The intra value decides whether the frame is going to be a keyframe or
305    not.
306    
307    Valid values :
308    
309        - 1 : forces the encoder  to create a keyframe. Mainly used during
310          a VBR 2nd pass.
311    
312        - 0 :  forces the  encoder not to  create a keyframe.  Mainly used
313          during a VBR 2nd pass.
314    
315        - -1   :  lets  the  encoder   decide  (based   on   contents  and
316           max_key_interval). Mainly  used in ABR  mode and during  a 1st
317           VBR pass.
318    
319    [out usage]
320    
321    When first set to -1, the encoder returns the effective keyframe state
322    of the frame.
323    
324        - 0 : the resulting frame is not a keyframe
325    
326        - 1 : the resulting frame is a keyframe (scene change).
327    
328        - 2  : the resulting  frame is  a keyframe  (max_key_interval
329          reached)
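
Put together, the quant, intra and hinted ME settings described in this
chapter suggest one frame setup per pass. The sketch below is an
illustration only: demo_enc_frame stands in for the XVID_ENC_FRAME
members involved, the two flag values are placeholders (the real ones
are in xvid.h), and all function names are hypothetical. Requesting and
reusing motion hints is optional; it is shown here because the flag list
ties it to the two passes.

```c
/* Placeholder bit values for illustration; the real ones are in xvid.h. */
#define XVID_HINTEDME_GET 0x00002000
#define XVID_HINTEDME_SET 0x00004000

/* Stand-in for the XVID_ENC_FRAME members used here. */
typedef struct {
    int general;
    int quant;
    int intra;
} demo_enc_frame;

/* First pass: rate control picks the quantizer, the encoder picks the
 * frame type, and motion vectors are requested as hints. */
static void setup_first_pass(demo_enc_frame *f)
{
    f->general |= XVID_HINTEDME_GET;
    f->quant = 0;
    f->intra = -1;
}

/* Hypothetical helper: non-zero when the intra value returned by the
 * encoder (0, 1 or 2) denotes a keyframe. */
static int is_keyframe(int intra_out)
{
    return intra_out == 1 || intra_out == 2;
}

/* Second pass: force a quantizer chosen by the caller and replay the
 * frame type that the first pass reported. */
static void setup_second_pass(demo_enc_frame *f, int chosen_quant,
                              int stored_intra_out)
{
    f->general &= ~XVID_HINTEDME_GET;
    f->general |= XVID_HINTEDME_SET;
    f->quant = chosen_quant;
    f->intra = is_keyframe(stored_intra_out) ? 1 : 0;
}
```

The value returned in intra after the first pass has to be stored per
frame by the caller so that it can be passed back as stored_intra_out.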
330    
331     - quant_intra_matrix and quant_inter_matrix members.
332    
333    These are  pointers to a pair of  user quantization  matrices. You
334    must set the  general XVID_CUSTOM_QMATRIX flag to make  sure XviD uses
335    them.
336    
337    When set to NULL, the default XviD matrices are used.
338    
339    NB : each time the matrices  change, XviD must write a header into the
340    bitstream, so try not to change these matrices very often. This will
341    save space.
342    
343    
344    
345                   Chapter 5 : The XVID_ENC_STATS structure
346     +-----------------------------------------------------------------+
347    
 --------------------------------------------------------------------------  
348    
349  typedef struct  typedef struct
350  {  {
# Line 151  Line 354 
354    
355  } XVID_ENC_STATS;  } XVID_ENC_STATS;
356    
357  In this structure the encoder return statistical data about the encoding  Used in:
358  process, e.g. to be saved for two-pass-encoding.    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);
359  quant is the quantizer chosen for this frame (if you let ratecontrol do it)  
360  hlength is the length of the frame's header, including motion information etc.  In  this  structure the  encoder  returns  statistical data  about the
361  kblks, mblks, ublks are unused at the moment.  encoding process, e.g. to be saved for two-pass-encoding. quant is
362    the quantizer  chosen for  this frame (if  you let ratecontrol  do it).
363    hlength  is  the  length  of  the  frame's  header,  including  motion
364    information etc.  kblks, mblks, ublks are unused at the moment.
365    
366    
367    
368  Used in:    xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);                   Chapter 6 : The xvid_encore function
369     +-----------------------------------------------------------------+
370    
 --------------------------------------------------------------------------  
371    
372  int xvid_encore(void * handle,  int xvid_encore(void * handle,
373                  int opt,                  int opt,
# Line 167  Line 375 
375                  void * param2);                  void * param2);
376    
377    
378  XviD uses a single-function API, so everything you want to do is done by  XviD uses a single-function API, so  everything you want to do is done
379  this routine. The opt parameter chooses the behaviour of the routine:  by  this routine.  The  opt  parameter chooses  the  behaviour of  the
380    routine:
381    
382  XVID_ENC_CREATE:   create a new encoder, XVID_ENC_PARAM in param1,  XVID_ENC_CREATE:  create a  new encoder,  XVID_ENC_PARAM in  param1, a
383                     a handle to the new encoder is returned in handle  handle to the new encoder is returned in handle.
384    
385  XVID_ENC_ENCODE:   encode one frame, XVID_ENC_FRAME-structure in param1,  XVID_ENC_ENCODE:   encode one frame, XVID_ENC_FRAME-structure in param1,
386                     XVID_ENC_STATS in param2 (or NULL, if you are not  XVID_ENC_STATS  in param2  (or  NULL,  if you  are  not interested  in
387                     interested in statistical data).  statistical data).
388    
389  XVID_ENC_DESTROY:  shut down this encoder, do not use handle afterwards  XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards.
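
The call order implied by the options above is create, then encode once
per frame, then destroy. Because the real library is needed to run
xvid_encore, the sketch below uses a stand-in dispatcher with the same
four-argument shape. Everything prefixed with demo_ or DEMO_ is
hypothetical and exists only to make the sequence runnable; nothing is
actually encoded.

```c
#include <stddef.h>

/* Stand-in option codes; the real values come from xvid.h. */
enum { DEMO_ENC_CREATE, DEMO_ENC_ENCODE, DEMO_ENC_DESTROY };

/* Stand-in for XVID_ENC_PARAM: only the handle member matters here. */
typedef struct { void *handle; } demo_create_param;

/* Stand-in for XVID_ENC_FRAME: only the output length matters here. */
typedef struct { int length; } demo_frame;

static int demo_instance;   /* pretend encoder state: counts frames */

/* Hypothetical dispatcher shaped like xvid_encore(). Returns 0 on
 * success and -1 on misuse. */
static int demo_encore(void *handle, int opt, void *param1, void *param2)
{
    (void)param2;   /* statistics are optional and ignored by the stand-in */
    switch (opt) {
    case DEMO_ENC_CREATE:
        if (param1 == NULL)
            return -1;
        demo_instance = 0;
        ((demo_create_param *)param1)->handle = &demo_instance;
        return 0;
    case DEMO_ENC_ENCODE:
        if (handle != &demo_instance || param1 == NULL)
            return -1;
        demo_instance++;
        ((demo_frame *)param1)->length = 0;   /* nothing is really encoded */
        return 0;
    case DEMO_ENC_DESTROY:
        return handle == &demo_instance ? 0 : -1;
    default:
        return -1;
    }
}

/* Runs create -> encode `frames` times -> destroy.
 * Returns the number of frames processed, or -1 on error. */
static int demo_session(int frames)
{
    demo_create_param param = { NULL };
    demo_frame frame = { 0 };
    int i;

    if (demo_encore(NULL, DEMO_ENC_CREATE, &param, NULL) != 0)
        return -1;
    for (i = 0; i < frames; i++)
        if (demo_encore(param.handle, DEMO_ENC_ENCODE, &frame, NULL) != 0)
            return -1;
    if (demo_encore(param.handle, DEMO_ENC_DESTROY, NULL, NULL) != 0)
        return -1;
    return demo_instance;
}
```

As in the text, the handle is NULL for the create call, is read back from
the parameter structure, and must not be used after the destroy call.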
390    
