A quick explanation of an MPEG "GOP", or "Group Of Pictures":
GOP - Begins with an "I" frame, followed usually by a number of "P" and "B" frames (divx5 only uses B frames I believe)
- each GOP is independent: all frames needed for predictions are
contained within each GOP
- GOPs can be as small as a single I frame, or as large as desired, but usually no more than 15 frames in length.
- the longer the GOP, the more efficient, but less robust, the
coding
I frame - "Intra-coded" frames : average 7:1 reduction.
- like JPEG, every video frame is broken into blocks of 8x8
pixels of Y, R-Y, and B-Y (although I am not sure how
the "1/4 pixel" feature divx5 has plays into all this)
- blocks are grouped into "macroblocks" of 16x16
- macroblocks are grouped horizontally into slices which
have similar average block levels.
- multiple slices form a frame, and these frames are the
resulting "I" frames.
P frame - P frames are predicted based on prior I or P frames plus
the addition of data for changed macroblocks.
- average about 20:1 reduction, or roughly a third the size of I
frames (going by the 7:1 figure above). I don't think divx5 uses these; MPEG2 does, though.
B frame - Bidirectionally predicted frames, based on the appearance and positions of macroblocks in past and future frames.
- B frames require less data than P frames, averaging about
50:1 reduction.
- B frames require more decoder buffer memory because 2
frames are compared during the reconstruction process.
- B frames also require manipulation of the coding order:
frames moving from the coder to the decoder are NOT in
presentation sequence.
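To make the coding-order point concrete, here is a minimal sketch in Python. It is not taken from any real encoder; the function name and the frame labels are made up for illustration. It assumes each B frame simply waits for the next I or P frame, which is moved ahead of it in the stream.

```python
def display_to_coded_order(display):
    """Reorder frame types from presentation order to coding order.

    Each B frame needs the next I or P frame (its future reference)
    to be decoded first, so that reference is moved ahead of the B frames.
    """
    coded = []
    pending_b = []  # B frames waiting for their future reference
    for index, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            # I or P frame: emit it first, then the B frames that preceded it
            coded.append((index, frame_type))
            coded.extend(pending_b)
            pending_b = []
    # Trailing B frames with no future reference in this sequence
    coded.extend(pending_b)
    return coded


display = "IBBPBBP"
coded = display_to_coded_order(display)
print("display:", " ".join(f"{t}{i}" for i, t in enumerate(display)))
print("coded:  ", " ".join(f"{t}{i}" for i, t in coded))
```

The numbers after each letter are presentation positions, so the second line shows which frames jump ahead of the B frames that depend on them.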
Basically, the B frame will say something like "this frame is the same as the GOP's 'I' frame except for this one part; I will only contain the data needed to encode this one part, and combine it with the info from the I frame", in layman's terms of course. This gives DivX5 its optimal reduction capability. This also means, of course, that your P3500 media box in your living room might struggle with decoding a high-rate D5 encode (not sure about that; D5 is a more intense encoding/decoding process, but DVDs use I, P, and B frames, sooooooo...).
Oh, BTW, in MPEG2 at least, the GOP order is always IPBBPBBPBBIPBBPBB etc. (depending on your GOP size): it is always 1 I, 1 P, and 2 B's, and then you can stack more groups of "PBB" in that one GOP if needed (usually up to 15 total frames).
Note: none of this has any effect on 'fps'.
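For a rough feel of why a longer GOP is more efficient, here is a small Python sketch that builds an "I" plus repeated "PBB" pattern and averages the reduction ratios quoted above (7:1, 20:1, 50:1). Those ratios are the post's ballpark figures, not measurements, and the function names are invented for this example.

```python
# Average reduction ratios quoted in the post (rough figures, not measurements)
RATIO = {"I": 7.0, "P": 20.0, "B": 50.0}


def gop_pattern(length, b_per_p=2):
    """Build a GOP as 'I' followed by repeated 'P' + b_per_p 'B' groups, cut to length."""
    if length < 1:
        raise ValueError("a GOP needs at least its I frame")
    group = "P" + "B" * b_per_p
    repeats = length // len(group) + 1
    return ("I" + group * repeats)[:length]


def average_reduction(pattern):
    """Overall reduction ratio for a GOP, given the per-type averages above."""
    # Each frame costs 1/ratio of its uncompressed size
    compressed = sum(1.0 / RATIO[frame_type] for frame_type in pattern)
    return len(pattern) / compressed


for length in (1, 4, 15):
    pattern = gop_pattern(length)
    print(f"{length:2d} {pattern:15s} about {average_reduction(pattern):.1f}:1")
```

Calling gop_pattern(15, b_per_p=1) gives a "PB" chain instead of "PBB", and gop_pattern(15, b_per_p=0) gives P frames only, if you want to compare those layouts under the same assumed ratios.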
One thing though:
DivX5 doesn't use only B-frames; it uses P-frames as well, like its predecessor. Contrary to MPEG2, DivX5 uses a "PB" sequence chain instead of "PBB". The latter would result in better compression, but the way I understand it, the AVI format won't work correctly with more than one sequential B-frame.
Still, using "PB" instead of only P-frames results in a serious size decrease already, so it was definitely worth it.
To be entirely accurate, it uses "BP" grouping (i.e. frame n is B and frame n+1 is P), but it appears as "PB" because you can't decode a frame until the frames it is predicted from are decoded.
And so P frames have to be decoded before the B frames between them. 1/4 pixel accuracy is used in motion estimation to get the best fit for each macro-block. Say you have a panning camera, and from frame to frame the picture hasn't moved a whole number of pixels across the screen; then 1/4 pixel accuracy helps get the predicted macro-blocks into a much better position. This isn't used in I-frames, because motion estimation isn't used in I-frames: an I-frame is basically a JPEG-encoded frame using a different quant value (typically 16).
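The panning example can be put into numbers with a tiny Python sketch. It only shows how closely a motion vector can match a fractional shift at different precisions; it does not do any actual searching or interpolation the way a codec would, and the 2.3 pixel pan is a made-up value.

```python
def quantize(value, steps_per_pixel):
    """Round a displacement to the nearest 1/steps_per_pixel of a pixel."""
    return round(value * steps_per_pixel) / steps_per_pixel


# Hypothetical pan: the picture shifts 2.3 pixels per frame
true_shift = 2.3

for name, steps in (("full-pixel", 1), ("half-pixel", 2), ("quarter-pixel", 4)):
    vector = quantize(true_shift, steps)
    error = abs(true_shift - vector)
    print(f"{name:13s} vector={vector:.2f} error={error:.2f}")
```

The smaller the leftover error, the less residual data the frame has to carry to patch up the prediction.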
I don't think MPEG-2 uses more than 1 B frame, but I could be wrong.
The use of B-frames does improve the compression/quality of the codec by a considerable amount, but it requires a lot more motion estimation (a large portion of the encoding time), hence the much longer encoding times. B-frames are typically 1/2 the size of P-frames but require more work to encode and decode.
++++++++++++++++++++++++
L A M E R ! ! !