1 |
/************************************************************* |
+--------------------------------------------------------------------+ |
2 |
* Short explanation for the XviD data structures and routines |
Short explanation for the XviD data structures and routines |
|
* |
|
|
* encoding part |
|
|
* |
|
|
* if you have further questions, visit http://www.xvid.org |
|
|
* |
|
|
**************************************************************/ |
|
3 |
|
|
4 |
/* these are structures/routines from xvid.h needed for encoding */ |
The encoding part |
5 |
|
|
6 |
-------------------------------------------------------------------------- |
If you have further questions, visit http://www.xvid.org |
7 |
|
+--------------------------------------------------------------------+ |
8 |
|
|
9 |
#define API_VERSION ((1 << 16) | (0)) |
Document version : |
10 |
|
$Id: xvid-encoder.txt,v 1.2 2002-06-24 17:03:03 edgomez Exp $ |
11 |
|
|
12 |
This is the revision of the xvid.h file that you have in front of you. Check it against the library's version. |
+--------------------------------------------------------------------+ |
13 |
|
| Abstract |
14 |
|
+--------------------------------------------------------------------+ |
15 |
|
|
16 |
|
This document presents the basic structures and API of XviD. It tries |
17 |
|
to explain how to use them to obtain a simple profile compliant MPEG4 |
18 |
|
stream feeding the encoder with a sequence of frames. |
19 |
|
|
20 |
|
+-------------------------------------------------------------------+ |
21 |
|
| Document |
22 |
|
+-------------------------------------------------------------------+ |
23 |
|
|
24 |
|
|
25 |
|
|
26 |
|
Chapter 1 : The XviD version |
27 |
|
+-----------------------------------------------------------------+ |
28 |
|
|
29 |
|
The XviD version is defined at library compilation time using the |
30 |
|
constant defined in xvid.h |
31 |
|
|
32 |
|
#define API_VERSION ((2 << 16) | (1)) |
33 |
|
|
34 |
|
Where 2 stands for the major XviD version, and 1 for the minor version |
35 |
|
number. |
36 |
|
|
37 |
|
The current version of the API is 2.1 and should be incremented each |
38 |
|
time a user defined structure is modified (XVID_INIT_PARAM, |
39 |
|
XVID_ENC_PARAM ... we will discuss them later). |
40 |
|
|
41 |
|
When you're writing a program/library which uses the XviD library, you |
42 |
|
must check your XviD API version against the available library |
43 |
|
version. We will see how to check the version number in the next |
44 |
|
chapter. |
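The packing of the version constant can be illustrated with a small sketch. The API_VERSION value is the one quoted above; the helper names api_major, api_minor and print_api_version are hypothetical and are not part of xvid.h.

```c
#include <stdio.h>

/* Same packing as the constant quoted from xvid.h in this chapter. */
#define API_VERSION ((2 << 16) | (1))

/* Hypothetical helpers (not provided by xvid.h): the major number sits
 * in the upper 16 bits, the minor number in the lower 16 bits. */
int api_major(int version) { return (version >> 16) & 0xffff; }
int api_minor(int version) { return version & 0xffff; }

/* Print a packed version as "major.minor". */
void print_api_version(int version)
{
    printf("XviD API %d.%d\n", api_major(version), api_minor(version));
}
```

Calling print_api_version(API_VERSION) with the constant above would report version 2.1.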
45 |
|
|
46 |
|
|
47 |
|
|
48 |
|
Chapter 2 : The XVID_INIT_PARAM |
49 |
|
+-----------------------------------------------------------------+ |
50 |
|
|
|
-------------------------------------------------------------------------- |
|
51 |
|
|
52 |
typedef struct |
typedef struct |
53 |
{ |
{ |
56 |
int core_build; [out] |
int core_build; [out] |
57 |
} XVID_INIT_PARAM; |
} XVID_INIT_PARAM; |
58 |
|
|
|
This is filled by xvid_init with the correct CPU flags for initialization |
|
|
(auto-detect), unless you pass flag to it (cpu_flags!=0). Do not use that |
|
|
unless you really know what you are doing. |
|
|
api_version can (should) be checked against API_VERSION, to see if you |
|
|
have the right core library. |
|
|
|
|
59 |
Used in: xvid_init(NULL, 0, &xinit, NULL); |
Used in: xvid_init(NULL, 0, &xinit, NULL); |
60 |
|
|
61 |
-------------------------------------------------------------------------- |
This structure is used and filled by the xvid_init function depending |
62 |
|
on the cpu_flags value. |
63 |
|
|
64 |
|
List of valid flags for the cpu_flags member : |
65 |
|
|
66 |
|
- XVID_CPU_MMX : cpu feature |
67 |
|
- XVID_CPU_MMXEXT : cpu feature |
68 |
|
- XVID_CPU_SSE : cpu feature |
69 |
|
- XVID_CPU_SSE2 : cpu feature |
70 |
|
- XVID_CPU_3DNOW : cpu feature |
71 |
|
- XVID_CPU_3DNOWEXT : cpu feature |
72 |
|
- XVID_CPU_TSC : cpu feature |
73 |
|
- XVID_CPU_IA64 : cpu feature |
74 |
|
- XVID_CPU_CHKONLY : command |
75 |
|
- XVID_CPU_FORCE : command |
76 |
|
|
77 |
|
In order to set a flag : xinit.cpu_flags |= desired_flag_constant; |
78 |
|
|
79 |
|
1st case : you call xvid_init without setting the XVID_CPU_CHKONLY or |
80 |
|
the XVID_CPU_FORCE flag, the xvid_init function automatically detects |
81 |
|
the host cpu features and fills the cpu_flags member. The xvid_init |
82 |
|
function also performs all internal function pointers initialization |
83 |
|
according to detected features and then returns XVID_ERR_OK. |
84 |
|
|
85 |
|
2nd case : you call xvid_init setting the XVID_CPU_CHKONLY flag, the |
86 |
|
xvid_init function will just detect the host cpu features and return |
87 |
|
XVID_ERR_OK without initializing the internal function pointers (NB: |
88 |
|
The XviD library is not usable after such a call to xvid_init). |
89 |
|
|
90 |
|
3rd case : you call xvid_init with the cpu_flags XVID_CPU_FORCE and |
91 |
|
desired feature flags set up (eg : XVID_CPU_SSE | XVID_CPU_MMX). In |
92 |
|
this case you force XviD to use the given cpu features passed in the |
93 |
|
cpu_flags member. Use this if you know what you're doing. |
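The three cases above can be sketched as a small classifier. This is only an illustration: the SK_ bit values are placeholders (the real XVID_CPU_* constants come from xvid.h and may differ), classify_init_mode is a hypothetical helper, and giving the check-only flag priority when both command flags are set is an assumption the text does not confirm.

```c
/* Placeholder bit values for illustration only: the real XVID_CPU_*
 * constants are defined in xvid.h and may differ. */
#define SK_CPU_MMX      (1 << 0)
#define SK_CPU_SSE      (1 << 1)
#define SK_CPU_CHKONLY  (1 << 29)
#define SK_CPU_FORCE    (1 << 30)

/* The three ways of calling xvid_init described in the text. */
enum init_mode {
    INIT_AUTODETECT = 1, /* 1st case: detect features and initialize  */
    INIT_CHECK_ONLY = 2, /* 2nd case: detect only, library not usable */
    INIT_FORCED     = 3  /* 3rd case: use exactly the given features  */
};

/* Hypothetical helper: which case does a cpu_flags value select?
 * Assumption: the check-only command wins if both commands are set. */
enum init_mode classify_init_mode(int cpu_flags)
{
    if (cpu_flags & SK_CPU_CHKONLY)
        return INIT_CHECK_ONLY;
    if (cpu_flags & SK_CPU_FORCE)
        return INIT_FORCED;
    return INIT_AUTODETECT;
}
```

Feature flags without a command flag fall into the 1st case, since the text only mentions forcing when XVID_CPU_FORCE is set.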
94 |
|
|
95 |
|
NB for PowerPC archs : the ppc arch has no automatic detection, the |
96 |
|
library must be compiled for a specific ppc target using the right |
97 |
|
Makefile (the cpu_flags is irrelevant for these archs). Use |
98 |
|
Makefile.linuxppc for standard ppc optimized functions and |
99 |
|
Makefile.linuxppc_altivec for altivec simd optimized functions. |
100 |
|
|
101 |
|
NB for IA64 archs : There are optimized ia64 assembly functions provided |
102 |
|
in the library, they must be forced using the |
103 |
|
XVID_CPU_FORCE|XVID_CPU_IA64 pair of flags. |
104 |
|
|
105 |
|
To check the XviD library version against your own XviD header file, |
106 |
|
you just have to call the xvid_init function (no matter the cpu_flags) |
107 |
|
and compare the returned xinit.api_version integer with your |
108 |
|
API_VERSION number. The core_build member is not relevant at the |
109 |
|
moment but is reserved for future use (once XviD has reached a |
110 |
|
certain stability in its API and releases). |
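A minimal sketch of the version comparison described above. It does not call the library: init_param_sketch is a stand-in that holds only the members named in this chapter, and api_version_matches is a hypothetical helper. In a real program the structure would be the XVID_INIT_PARAM filled by xvid_init.

```c
#include <stdio.h>

/* Header-side constant, as quoted from xvid.h in chapter 1. */
#define API_VERSION ((2 << 16) | (1))

/* Stand-in for XVID_INIT_PARAM, limited to the members named in the
 * text; the real definition lives in xvid.h. */
typedef struct {
    int cpu_flags;
    int api_version; /* filled by the library during xvid_init */
    int core_build;  /* reserved, not relevant at the moment   */
} init_param_sketch;

/* Hypothetical helper: returns 1 when the version reported by the
 * library equals the header's API_VERSION, 0 otherwise. */
int api_version_matches(const init_param_sketch *xinit)
{
    if (xinit->api_version != API_VERSION) {
        fprintf(stderr, "XviD API mismatch: header %d.%d, library %d.%d\n",
                API_VERSION >> 16, API_VERSION & 0xffff,
                xinit->api_version >> 16, xinit->api_version & 0xffff);
        return 0;
    }
    return 1;
}
```

A client would typically refuse to continue when this check fails, since the structure layouts may differ between API versions.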
111 |
|
|
112 |
|
|
113 |
|
|
114 |
|
Chapter 3 : XVID_ENC_PARAM structure |
115 |
|
+-----------------------------------------------------------------+ |
116 |
|
|
117 |
|
|
118 |
typedef struct |
typedef struct |
119 |
{ |
{ |
120 |
int width, height; |
int width, height; [in] |
121 |
int fincr; // frame increment is relative to fbase |
int fincr, fbase; [in] |
122 |
int fbase; // so each frame takes "fincr/fbase" seconds |
int rc_bitrate; [in] |
123 |
int bitrate; // the bitrate of the target encoded stream, in bits/second |
int rc_reaction_delay_factor; [in] |
124 |
int rc_period; // the intended rate control averaging period |
int rc_averaging_period; [in] |
125 |
int rc_reaction_period; // the reaction period for rate control |
int rc_buffer; [in] |
126 |
int rc_reaction_ratio; // the ratio for down/up rate control |
int max_quantizer; [in] |
127 |
int max_quantizer; // the upper limit of the quantizer |
int min_quantizer; [in] |
128 |
int min_quantizer; // the lower limit of the quantizer |
int max_key_interval; [in] |
129 |
int max_key_interval; // the maximum interval between key frames |
|
130 |
int motion_search; // the quality of compression ( 1=fastest, 6=best ) |
void *handle; [out] |
131 |
int lum_masking; // lum masking on/off |
} |
132 |
int quant_type; // 0=h.263, 1=mpeg4 |
XVID_ENC_PARAM; |
|
|
|
|
void * handle; // [out] encoder instance handle |
|
133 |
|
|
134 |
} XVID_ENC_PARAM; |
Used in: xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL); |
135 |
|
|
136 |
This structure has to be filled to create a new encoding instance: |
This structure has to be filled to create a new encoding instance: |
137 |
|
|
138 |
width and height are the size of the image to be encoded. |
- width and height. |
139 |
|
|
140 |
|
They have to be set to the size of the image to be encoded. |
141 |
|
|
142 |
|
- fincr and fbase (<0 forces default value 25fps - [25,1]). |
143 |
|
|
144 |
fincr and fbase are the MPEG-way of defining the framerate. |
They are the MPEG-way of defining the framerate. If you have an |
145 |
If you have an integer framerate, say 24, 25 or 30fps, use |
integer framerate, say 24, 25 or 30fps, use fincr=1, fbase=framerate. |
|
fincr=1, fbase=framerate. |
|
146 |
However, if framerate is non-integer, like 23.996fps you can |
However, if framerate is non-integer, like 23.996fps you can |
147 |
e.g. multiply by 1000, getting fincr=1000 and fbase=23996, |
e.g. multiply by 1000, getting fincr=1000 and fbase=23996, giving |
148 |
giving you integer values again. |
you integer values again. |
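The rule above can be sketched as a small conversion routine. fps_to_fincr_fbase is a hypothetical helper, not part of the XviD API; it assumes a positive framerate and uses the fixed factor of 1000 from the example in the text.

```c
/* Hypothetical helper: express a framerate as the (fincr, fbase) pair.
 * Integer rates give fincr=1, fbase=rate; other rates are scaled by
 * 1000 as in the text. Assumes fps > 0. */
void fps_to_fincr_fbase(double fps, int *fincr, int *fbase)
{
    int whole = (int)fps;

    if ((double)whole == fps) {
        /* Integer framerate, e.g. 24, 25 or 30 fps. */
        *fincr = 1;
        *fbase = whole;
    } else {
        /* Non-integer framerate: scale by 1000 and round to nearest. */
        *fincr = 1000;
        *fbase = (int)(fps * 1000.0 + 0.5);
    }
}
```

With the values from the text, 25 fps yields fincr=1, fbase=25 and 23.996 fps yields fincr=1000, fbase=23996.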
149 |
|
|
150 |
|
- rc_bitrate (<0 forces default value : 900000). |
151 |
|
|
152 |
|
This is the desired target bitrate. XviD will try to do its best to |
153 |
|
respect this setting but keep in mind XviD is still in development and |
154 |
|
it has not been tuned for very low bitrates. |
155 |
|
|
156 |
|
- All other rc_xxxx parameters are for the bit rate controller in order |
157 |
|
to respect your rc_bitrate setting the best it can. (<0 forces |
158 |
|
default values) |
159 |
|
|
160 |
|
Defaults are good enough and you should not change them. |
161 |
|
|
162 |
|
ToDo : describe briefly their impact on the bit rate variations and |
163 |
|
the rc_bitrate setting respect. |
164 |
|
|
165 |
|
- min_quantizer and max_quantizer (<0 forces default values : 1,31). |
166 |
|
|
167 |
|
These 2 members limit the range of allowed quantizers. Normally, |
168 |
|
the quantizer range is [1..31], so min=1 and max=31. |
169 |
|
|
170 |
rc-parameters are for ratecontrol, you don't have to change them, |
NB : the HIGHER the quantizer, the LOWER the quality. |
171 |
good defaults are |
the HIGHER the quantizer, the HIGHER the compression ratio. |
|
rc_period = 2000 rc_reaction_period = 10 rc_reaction_ratio = 20 |
|
|
|
|
|
min_quantizer, max_quantizer limit the range of allowed quantizers. |
|
|
normally quantizers range is [1..31], so min=1 and max=31. |
|
|
!!! the HIGHER the quantizer, the LOWER the quality !!! |
|
|
!!! the HIGHER the quantizer, the HIGHER the compression ratio !!! |
|
|
|
|
|
min_quant=1 is somewhat overkill, min_quant=2 is good enough |
|
|
max_quant depends on what you encode, leave it with 31 or lower it |
|
|
to something like 15 or 10 for better quality (but encoding with |
|
|
very low bitrate might fail then). |
|
172 |
|
|
173 |
max_key_interval is the maximum value of frames between two keyframe |
min_quant=1 is somewhat overkill, min_quant=2 is good enough max_quant |
174 |
|
depends on what you encode, leave it with 31 or lower it to something |
175 |
|
like 15 or 10 for better quality (but encoding with very low bitrate |
176 |
|
might fail then). |
177 |
|
|
178 |
|
- max_key_interval (<0 forces default value : 10*framerate == 10s) |
179 |
|
|
180 |
|
This is the maximum value of frames between two keyframes |
181 |
(I-frames). Keyframes are also inserted dynamically at scene breaks. |
(I-frames). Keyframes are also inserted dynamically at scene breaks. |
182 |
It is important to have some keyframes, even in longer scenes, if you |
It is important to have some keyframes, even in longer scenes, if you |
183 |
want to skip position in the resulting file, because skipping is only |
want to skip position in the resulting file, because skipping is only |
184 |
possible from one keyframe to the next. However, keyframes are much larger |
possible from one keyframe to the next. However, keyframes are much |
185 |
than non-keyframes, so do not use too many of them. |
larger than non-keyframes, so do not use too many of them. A value of |
186 |
A value of framerate*10 is a good choice normally. |
framerate*10 is a good choice normally. |
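The documented defaults can be summarized in a short sketch. The library applies its own defaults internally; this only mirrors the values stated in this chapter. enc_param_sketch is a stand-in for part of XVID_ENC_PARAM, apply_documented_defaults is a hypothetical helper, and treating a zero fincr or fbase like a negative one (to avoid a division by zero) is an assumption.

```c
/* Stand-in for the XVID_ENC_PARAM members whose defaults are stated in
 * the text; the real structure is defined in xvid.h. */
typedef struct {
    int fincr, fbase;
    int rc_bitrate;
    int max_quantizer, min_quantizer;
    int max_key_interval;
} enc_param_sketch;

/* Hypothetical helper: replace negative members with the defaults the
 * text documents. */
void apply_documented_defaults(enc_param_sketch *p)
{
    /* Assumption: zero is treated like a negative value here. */
    if (p->fincr <= 0 || p->fbase <= 0) {
        p->fincr = 1;   /* 25 fps */
        p->fbase = 25;
    }
    if (p->rc_bitrate < 0)
        p->rc_bitrate = 900000;
    if (p->min_quantizer < 0)
        p->min_quantizer = 1;
    if (p->max_quantizer < 0)
        p->max_quantizer = 31;
    if (p->max_key_interval < 0) /* 10 seconds worth of frames */
        p->max_key_interval = 10 * p->fbase / p->fincr;
}
```

Members that already hold non-negative values are left untouched, so a caller can override only the settings it cares about.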
|
|
|
|
motion_search determines the quality of motion search done by the codec. |
|
|
The better the search, the smaller the files (or the better the quality). |
|
|
Since low modes (1-3) are hardly faster than high modes (4,5) a value of |
|
|
5 is a good choice normally. 6 is possible, but a little slower. If you |
|
|
want absolutely highest quality, use 6. |
|
|
|
|
|
lum_masking stands for "luminance masking" which is an experimental feature. |
|
|
It tries to compress better by using facts about the human eye. |
|
|
You might try to switch it on and decide yourself, if you gain anything from it. |
|
187 |
|
|
188 |
quant_type is technical, it changes the way coefficients are quantized. |
- handle |
|
Both values are okay, though a value of 0 might be faster. |
|
189 |
|
|
190 |
Used in: xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL); |
This is the returned internal encoder instance. |
191 |
|
|
|
-------------------------------------------------------------------------- |
|
192 |
|
|
193 |
|
|
194 |
|
Chapter 4 : the XVID_ENC_FRAME structure. |
195 |
|
+-----------------------------------------------------------------+ |
196 |
|
|
197 |
typedef struct |
typedef struct |
198 |
{ |
{ |
199 |
void * bitstream; // [in] bitstream ptr |
int general; [in] |
200 |
int length; // [out] bitstream length (bytes) |
int motion; [in] |
201 |
|
void *bitstream; [in] |
202 |
|
int length; [out] |
203 |
|
|
204 |
|
void *image; [in] |
205 |
|
int colorspace; [in] |
206 |
|
|
207 |
|
unsigned char *quant_intra_matrix; [in] |
208 |
|
unsigned char *quant_inter_matrix; [in] |
209 |
|
int quant; [in] |
210 |
|
int intra; [in/out] |
211 |
|
|
212 |
|
HINTINFO hint; [in/out] |
213 |
|
} |
214 |
|
XVID_ENC_FRAME; |
215 |
|
|
216 |
|
Used in: |
217 |
|
xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); |
218 |
|
|
219 |
|
This is the main structure to encode a frame, it gives hints to the |
220 |
|
encoder on how to process an image. |
221 |
|
|
222 |
|
- general flag member. |
223 |
|
|
224 |
void * image; // [in] image ptr |
The general flag member informs XviD on general algorithm choices made |
225 |
int colorspace; // [in] source colorspace |
by the library client. |
226 |
|
|
227 |
int quant; // [in] frame quantizer (vbr) |
Valid flags : |
|
int intra; // [in] force intra frame (vbr only) |
|
|
// [out] intra state |
|
|
} XVID_ENC_FRAME; |
|
228 |
|
|
229 |
|
- XVID_CUSTOM_QMATRIX : informs xvid to use the custom user |
230 |
|
matrices. |
231 |
|
|
232 |
The main structure to encode a frame: image points to the picture, |
- XVID_H263QUANT : informs xvid to use H263 quantization |
233 |
in a format that is given by colorspace, e.g. XVID_CSP_RGB24 or |
algorithm. |
|
XVID_CSP_YV12. |
|
234 |
|
|
235 |
If you set quant=0, then the ratecontrol chooses quantizer for you. |
- XVID_MPEGQUANT : informs xvid to use MPEG quantization |
236 |
If quant!=0, then this value is used as quantizer, so make 1<=quant<=31. |
algorithm. |
237 |
|
|
238 |
intra decides whether the frame is going to be a keyframe or not. |
- XVID_HALFPEL : informs xvid to perform a half pixel motion |
239 |
intra=1 means: make it a keyframe |
estimation. |
|
intra=0 means: don't make it a keyframe |
|
|
intra=-1 means: let encoder decide (based on contents and max_key_interval) |
|
240 |
|
|
241 |
So for an ordinary encoding step, you would set quant=0 and intra=-1. |
- XVID_ADAPTIVEQUANT : informs xvid to perform an adaptive |
242 |
|
quantization. |
243 |
|
|
244 |
The length of the MPEG4-bitstream is returned in length, and |
- XVID_LUMIMASKING : informs xvid to use a lumimasking algorithm. |
|
if you set intra to -1, it now contains the encoder's decision: |
|
|
0 for non-keyframe, |
|
|
1 for keyframe because of a scene change, |
|
|
2 for keyframe because max_key_interval was reached. |
|
245 |
|
|
246 |
Used in: xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); |
- XVID_LATEINTRA : ??? |
247 |
|
|
248 |
|
- XVID_INTERLACING : informs xvid to use the MPEG4 interlaced |
249 |
|
mode. |
250 |
|
|
251 |
|
- XVID_TOPFIELDFIRST : ??? |
252 |
|
|
253 |
|
- XVID_ALTERNATESCAN : ??? |
254 |
|
|
255 |
|
- XVID_HINTEDME_GET : informs xvid to return Motion Estimation |
256 |
|
vectors from the ME encoder algorithm. Used during a first pass. |
257 |
|
|
258 |
|
- XVID_HINTEDME_SET : informs xvid to use the user given motion |
259 |
|
estimation vectors as hints for the encoder ME algorithms. Used |
260 |
|
during a 2nd pass. |
261 |
|
|
262 |
|
- XVID_INTER4V : ??? |
263 |
|
|
264 |
|
- XVID_ME_ZERO : forces XviD to use the zero ME algorithm. |
265 |
|
|
266 |
|
- XVID_ME_LOGARITHMIC : forces XviD to use the logarithmic |
267 |
|
ME algorithm. |
268 |
|
|
269 |
|
- XVID_ME_FULLSEARCH : forces XviD to use the full search ME |
270 |
|
algorithm. |
271 |
|
|
272 |
|
- XVID_ME_PMVFAST : forces XviD to use the PMVFAST ME algorithm. |
273 |
|
|
274 |
|
- XVID_ME_EPZS : forces XviD to use the EPZS ME algorithm. |
275 |
|
|
276 |
|
ToDo : fill the void entries in flags, and describe briefly each ME |
277 |
|
algorithm. |
278 |
|
|
279 |
|
- motion member. |
280 |
|
|
281 |
|
ToDo : add all the XVID_ME flags here and detail the effect of each |
282 |
|
flag. |
283 |
|
|
284 |
|
- quant member. |
285 |
|
|
286 |
|
The quantizer value is used when the DCT coefficients are divided to |
287 |
|
zero those coefficients not important (according to the target bitrate |
288 |
|
not the image quality :-) |
289 |
|
|
290 |
|
Valid values : |
291 |
|
|
292 |
|
- 0 (zero) : Then the rate controller chooses the right quantizer |
293 |
|
for you. Typically used in ABR encoding or first pass of a VBR |
294 |
|
encoding session. |
295 |
|
|
296 |
|
- != 0 : Then you force the encoder to use this specific |
297 |
|
quantizer value. It is clamped in the interval |
298 |
|
[1..31]. Typically used during the 2nd pass of a VBR encoding |
299 |
|
session. |
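The two cases for the quant member can be sketched as follows. effective_quant is a hypothetical helper, and rate_control_choice stands for whatever value the rate controller would pick; neither name comes from xvid.h.

```c
/* Hypothetical helper: the quantizer actually used for a frame.
 * requested == 0 : the rate controller's choice is used.
 * requested != 0 : the value is forced, clamped to [1..31]. */
int effective_quant(int requested, int rate_control_choice)
{
    if (requested == 0)
        return rate_control_choice;
    if (requested < 1)
        return 1;
    if (requested > 31)
        return 31;
    return requested;
}
```

A two-pass client would pass 0 during the first pass and the quantizer it computed itself during the second.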
300 |
|
|
301 |
|
- intra member. |
302 |
|
|
303 |
|
[in usage] |
304 |
|
The intra value decides whether the frame is going to be a keyframe or |
305 |
|
not. |
306 |
|
|
307 |
|
Valid values : |
308 |
|
|
309 |
|
- 1 : forces the encoder to create a keyframe. Mainly used during |
310 |
|
a VBR 2nd pass. |
311 |
|
|
312 |
|
- 0 : forces the encoder not to create a keyframe. Mainly used |
313 |
|
during a VBR second pass |
314 |
|
|
315 |
|
- -1 : let the encoder decide (based on contents and |
316 |
|
max_key_interval). Mainly used in ABR mode and during a 1st |
317 |
|
VBR pass. |
318 |
|
|
319 |
|
[out usage] |
320 |
|
|
321 |
|
When first set to -1, the encoder returns the effective keyframe state |
322 |
|
of the frame. |
323 |
|
|
324 |
|
- 0 : the resulting frame is not a keyframe |
325 |
|
|
326 |
|
- 1 : the resulting frame is a keyframe (scene change). |
327 |
|
|
328 |
|
- 2 : the resulting frame is a keyframe (max_key_interval |
329 |
|
reached) |
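The returned intra values listed above can be decoded with a small sketch, for instance when logging a first pass. Both helper names are hypothetical and are not part of the XviD API.

```c
/* Hypothetical helper: 1 if the value returned in intra denotes a
 * keyframe (either reason), 0 otherwise. */
int is_keyframe(int intra_out)
{
    return intra_out == 1 || intra_out == 2;
}

/* Hypothetical helper: describe the value the encoder wrote back into
 * intra after a call made with intra = -1. */
const char *intra_result_name(int intra_out)
{
    switch (intra_out) {
    case 0:  return "not a keyframe";
    case 1:  return "keyframe (scene change)";
    case 2:  return "keyframe (max_key_interval reached)";
    default: return "unknown";
    }
}
```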
330 |
|
|
331 |
|
- quant_intra_matrix and quant_inter_matrix members. |
332 |
|
|
333 |
|
These are pointers to a pair of user quantization matrices. You |
334 |
|
must set the general XVID_CUSTOM_QMATRIX flag to make sure XviD uses |
335 |
|
them. |
336 |
|
|
337 |
|
When set to NULL, the default XviD matrices are used. |
338 |
|
|
339 |
|
NB : each time the matrices change, XviD must write a header into the |
340 |
|
bitstream, so avoid changing these matrices too often. This will |
341 |
|
save space. |
342 |
|
|
343 |
|
|
344 |
|
|
345 |
|
Chapter 5 : The XVID_ENC_STATS structure |
346 |
|
+-----------------------------------------------------------------+ |
347 |
|
|
|
-------------------------------------------------------------------------- |
|
348 |
|
|
349 |
typedef struct |
typedef struct |
350 |
{ |
{ |
354 |
|
|
355 |
} XVID_ENC_STATS; |
} XVID_ENC_STATS; |
356 |
|
|
357 |
In this structure the encoder returns statistical data about the encoding |
Used in: |
358 |
process, e.g. to be saved for two-pass-encoding. |
xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); |
359 |
quant is the quantizer chosen for this frame (if you let ratecontrol do it) |
|
360 |
hlength is the length of the frame's header, including motion information etc. |
In this structure the encoder returns statistical data about the |
361 |
kblks, mblks, ublks are unused at the moment. |
encoding process, e.g. to be saved for two-pass-encoding. quant is |
362 |
|
the quantizer chosen for this frame (if you let ratecontrol do it) |
363 |
|
hlength is the length of the frame's header, including motion |
364 |
|
information etc. kblks, mblks, ublks are unused at the moment. |
365 |
|
|
366 |
|
|
367 |
|
|
368 |
Used in: xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats); |
Chapter 6 : The xvid_encore function |
369 |
|
+-----------------------------------------------------------------+ |
370 |
|
|
|
-------------------------------------------------------------------------- |
|
371 |
|
|
372 |
int xvid_encore(void * handle, |
int xvid_encore(void * handle, |
373 |
int opt, |
int opt, |
375 |
void * param2); |
void * param2); |
376 |
|
|
377 |
|
|
378 |
XviD uses a single-function API, so everything you want to do is done by |
XviD uses a single-function API, so everything you want to do is done |
379 |
this routine. The opt parameter chooses the behaviour of the routine: |
by this routine. The opt parameter chooses the behaviour of the |
380 |
|
routine: |
381 |
|
|
382 |
XVID_ENC_CREATE: create a new encoder, XVID_ENC_PARAM in param1, |
XVID_ENC_CREATE: create a new encoder, XVID_ENC_PARAM in param1, a |
383 |
a handle to the new encoder is returned in handle |
handle to the new encoder is returned in handle. |
384 |
|
|
385 |
XVID_ENC_ENCODE: encode one frame, XVID_ENC_FRAME-structure in param1, |
XVID_ENC_ENCODE: encode one frame, XVID_ENC_FRAME-structure in param1, |
386 |
XVID_ENC_STATS in param2 (or NULL, if you are not |
XVID_ENC_STATS in param2 (or NULL, if you are not interested in |
387 |
interested in statistical data). |
statistical data). |
388 |
|
|
389 |
XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards |
XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards. |
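The single-function design can be illustrated with a toy dispatcher that follows the same calling convention. Nothing here is XviD code: sk_encore, the SK_ option and return codes, and both structures are placeholders invented for the sketch, and no real encoding happens. It only shows how one entry point can create, use and destroy an instance depending on opt.

```c
#include <stdlib.h>

/* Placeholder option and return codes: the real XVID_ENC_* and
 * XVID_ERR_* constants are defined in xvid.h. */
enum { SK_ENC_CREATE, SK_ENC_ENCODE, SK_ENC_DESTROY };
enum { SK_ERR_OK = 0, SK_ERR_FAIL = -1 };

/* Toy encoder instance and creation parameters (not XviD types). */
typedef struct { int frames_encoded; } sk_encoder;
typedef struct { void *handle; /* [out] new instance */ } sk_create_param;

/* Toy single entry point mirroring the shape of xvid_encore. */
int sk_encore(void *handle, int opt, void *param1, void *param2)
{
    (void)param2; /* statistics are optional and ignored in this sketch */

    switch (opt) {
    case SK_ENC_CREATE: {
        sk_create_param *p = param1;
        sk_encoder *enc;
        if (p == NULL)
            return SK_ERR_FAIL;
        enc = calloc(1, sizeof *enc);
        if (enc == NULL)
            return SK_ERR_FAIL;
        p->handle = enc; /* the handle is returned through param1 */
        return SK_ERR_OK;
    }
    case SK_ENC_ENCODE: {
        sk_encoder *enc = handle;
        if (enc == NULL || param1 == NULL)
            return SK_ERR_FAIL;
        enc->frames_encoded++; /* stands in for encoding one frame */
        return SK_ERR_OK;
    }
    case SK_ENC_DESTROY:
        free(handle); /* the handle must not be used afterwards */
        return SK_ERR_OK;
    default:
        return SK_ERR_FAIL;
    }
}
```

A client following this pattern calls the create option once, the encode option once per frame, and the destroy option once at the end.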
390 |
|
|