More questions about the Movie-JPEG format, please:
- Why was it decided to invent a format rather than use the existing
Motion-JPEG standard (which I didn't know about until Bolot sent me these URLs):
http://bmrc.berkeley.edu/research/cmt/versions/4.0/doc/cmtmjpeg/MJPEG_chunkfile.html
http://neptune.netcomp.monash.edu.au/cpe3013/MPEG/Reading/MJPEG/step1.htm
I developed and sold a commercial M-JPEG video codec for Windows platforms for many years, so I know HEAPS about M-JPEG formats and digital video in general. Hopefully this message can educate people about video formats, and point out the potholes.
The first issue is that there are a bunch of M-JPEG formats, not a single standard as with MPEG. The two most widely used M-JPEG formats are probably Microsoft AVI files using an OpenDML compatible codec and QuickTime using M-JPEG A or B. The links above describe a proprietary format that happens to use JPEG-like compression. Also note that even when companies claim to conform to a "standard", often they don't, so lots of compatibility issues come up with M-JPEG.
Unlike MPEG, the file format (AVI or QuickTime) is very distinct from the codec format (the compressed frame format). The thing to do would be to write Squeak code that understands one or both of these file formats, plus some code that implements codecs. File formats tend to be pretty stable; codecs change rapidly. The simplest codec format is uncompressed RGBA or YCrCb.
M-JPEG as a codec format has some significant limitations compared to newer formats like DV or i-Frame MPEG, including:
- there is no universal M-JPEG format
- it's quite tricky to get constant data rates using M-JPEG. Easily available JPEG code sets a "quality" factor before compression, which, depending on the frame contents, gives a large range of compressed frame sizes (pure random noise frames can actually be larger after compression)
- JPEG compression also has a single "quality" (the quantization factor) for the WHOLE image, which is one reason it's hard to generate constant data rates. This also degrades image quality for a given compressed size, because you can't allocate more bits to picture areas that have more detail; both MPEG and DV can change the quantization dynamically through the frame. For M-JPEG the easy strategy to get a constant data rate (or at least one not above some limit) is to compress the frame and then keep recompressing it (a binary search), adjusting the global quality until the frame is an OK size. That's not so good for performance, but predicting quality settings from previous frames helps, except at scene transitions that suddenly change the amount of picture detail. There are also some patented algorithms that estimate the correct quality setting from samples of the data
- movies for actual video display, as opposed to computer display, are also generally interlaced. Blindly compressing a "frame" rather than a field (every other scan line) will often not work so well, so M-JPEG typically compresses each field separately and concatenates the results as a "frame". DV and MPEG-2 have algorithmic support for picture areas where there is significant motion between the two fields: they have alternative DCTs (a normal one, and one that understands that alternating lines may not correlate) that get chosen on an 8x8 cell basis
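The "keep recompressing" strategy from the list above can be sketched as a binary search over the quality setting. This is only an illustration: fit_quality and fake_encode are made-up names, and fake_encode stands in for a real JPEG encoder so the example runs on its own. It also assumes compressed size never shrinks as quality goes up, which real encoders only approximately guarantee.

```python
from typing import Callable, Tuple

def fit_quality(encode: Callable[[int], bytes], max_bytes: int,
                lo: int = 1, hi: int = 100) -> Tuple[int, bytes]:
    """Find the highest quality whose encoded frame fits in max_bytes.

    encode(quality) is assumed to return a larger (or equal) result
    for a larger quality value.
    """
    # Fall back to the lowest quality, even if that is still too big.
    best_q, best = lo, encode(lo)
    lo += 1
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)              # one full recompression per step
        if len(data) <= max_bytes:
            best_q, best = mid, data    # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1                # too big: try a lower quality
    return best_q, best

def fake_encode(quality: int) -> bytes:
    # Stand-in for a JPEG encoder: size simply rises with quality.
    return bytes(1000 + 400 * quality)

quality, frame = fit_quality(fake_encode, max_bytes=20000)
print(quality, len(frame))
```

Each step costs a full compression pass, which is the performance problem mentioned above. Narrowing lo and hi around the previous frame's result would cut the number of passes, at the cost of guessing wrong on scene changes.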
For high quality video editing, nothing beats uncompressed fields. Most of the compression formats subsample the color resolution, which makes things like chroma keying not work so well. If you have to make multiple compression/decompression passes as you build up layers for the final output, most codecs also introduce ugly artifacts. The downside to uncompressed video editing is high data rates. CPU loads are actually less than with compressed formats, but merging two uncompressed full quality data streams and writing an output stream is a total disk data rate of about 62 MBytes/sec (for NTSC: 3 streams * 29.97 fps * 720x480 pixels * 2 bytes/pixel, assuming YCrCb color space). Seeking is also super simple on uncompressed data, as all frames are a fixed size.
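To check the arithmetic, here is a small sketch that works out the data rate and the fixed-size seek offset. The function names and the header_bytes parameter are made up for illustration; the numbers are the ones given above.

```python
WIDTH, HEIGHT = 720, 480
BYTES_PER_PIXEL = 2      # 4:2:2 YCrCb averages 2 bytes per pixel
FPS = 29.97              # NTSC, approximately

FRAME_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL   # 691,200 bytes per frame

def stream_rate(streams: int = 1) -> float:
    """Bytes per second for a number of simultaneous uncompressed streams."""
    return FRAME_BYTES * FPS * streams

def frame_offset(index: int, header_bytes: int = 0) -> int:
    """File offset of a frame when every frame is the same size."""
    return header_bytes + index * FRAME_BYTES

print(f"one stream:          {stream_rate(1) / 1e6:.1f} MBytes/sec")
print(f"two inputs + output: {stream_rate(3) / 1e6:.1f} MBytes/sec")
print(f"frame 100 starts at byte {frame_offset(100)}")
```

One stream comes to roughly 20.7 MBytes/sec, so two inputs plus one output is where the 62 MBytes/sec figure comes from (counting a MByte as a million bytes).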
Also note that NTSC or PAL video does NOT use square pixels. I should also add that fun things like gamma correction and color gamut mapping should be done to make high quality output. It's a LOT more complex than just taking your RGB animation and feeding it to a JPEG algorithm.
A BIG advantage of M-JPEG format is that it's almost totally free of patent issues. I've been told DV format is also mostly not a patent issue. MPEG on the other hand is a patent minefield.
A very viable way to edit video might be to keep a shadow file of metadata for an i-Frame MPEG file (or other video file format). Frames could be decompressed using standard MPEG decoder code. Frames could be assembled by seeking to the correct file offset based on the metadata. Output could be uncompressed or i-Frame MPEG. Deferring decompression of all the frames (or even bypassing decompression entirely, by copying the input to the output) is best, but often not possible. There would be a reasonably fast pre-edit step to parse the input file into metadata (no decompression, just finding the frame boundaries). Having pluggable file formats (MPEG flavors, QuickTime, AVI) and compression formats (MPEG I/II, DV, uncompressed, M-JPEG) would be best.
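As a rough sketch of that pre-edit step, the code below scans a raw MPEG video elementary stream for picture start codes (the byte sequence 00 00 01 00) and records where each picture begins, with no decompression. The names index_pictures and assemble are made up, and a lot is assumed away: the input is a bare video stream rather than a multiplexed file, and sequence and GOP headers are simply left attached to whatever picture precedes them, which a real editor would have to handle properly.

```python
from typing import List, Sequence, Tuple

PICTURE_START = b"\x00\x00\x01\x00"   # MPEG-1/2 video picture start code

def index_pictures(stream: bytes) -> List[Tuple[int, int]]:
    """Return (offset, length) for each picture without decoding anything."""
    starts = []
    pos = stream.find(PICTURE_START)
    while pos != -1:
        starts.append(pos)
        pos = stream.find(PICTURE_START, pos + len(PICTURE_START))
    # Each picture runs up to the next start code (or the end of the data).
    ends = starts[1:] + [len(stream)]
    return [(s, e - s) for s, e in zip(starts, ends)]

def assemble(stream: bytes, index: Sequence[Tuple[int, int]],
             order: Sequence[int]) -> bytes:
    """Copy still-compressed pictures to the output in a new order."""
    return b"".join(stream[off:off + n] for off, n in (index[i] for i in order))

# Tiny synthetic stream: a header followed by three fake "pictures".
demo = (b"HDR" + PICTURE_START + b"aaaa"
        + PICTURE_START + b"bb" + PICTURE_START + b"c")
shadow = index_pictures(demo)
print(shadow)
print(assemble(demo, shadow, [2, 0]))
```

The list of offsets and lengths is the sort of thing the shadow file would hold. Reordering by copying byte ranges only makes sense for i-Frame streams, where no picture depends on its neighbours.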
Other random thoughts on video:
- alpha is important, and generally ignored by most compressed formats; uncompressed+alpha is probably the ideal working video format
- gamma corrected YCrCb with 4:2:2 subsampling is closest to most native video formats, so doing effects in a YCrCbA color space might be desirable. I see 160 GByte disks were for sale at $299
- sound is a whole can of worms. Consumer DV cameras use what's called unlocked audio, which means the number of sound samples varies for each frame. This confuses timebase logic; generally you have to make the video run at the correct frame rate (about 29.97 fps for NTSC, but not exactly) and fix up the sound samples. Pro video devices often have a common clock for video and audio samples, so they can keep the two synchronized.
- all these details can be ignored if you just want to play little videos on your computer screen; if you want to produce video that shows up on the Discovery Channel, you have to get it right
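On the sound point above, a quick sketch shows why the sample count per frame can't be constant even in the ideal case. It assumes 48 kHz audio and takes the NTSC frame rate as exactly 30000/1001 fps; neither number comes from this message, and the function name is made up. It shows one way to spread samples over frames, not what any particular camera does.

```python
from fractions import Fraction
from math import floor

def samples_per_frame(n_frames: int, sample_rate: int = 48000,
                      fps: Fraction = Fraction(30000, 1001)) -> list:
    """Whole audio samples per video frame, with no long-term drift."""
    per_frame = Fraction(sample_rate) / fps    # 1601.6 at the defaults
    counts = []
    for n in range(n_frames):
        # Samples due by the end of frame n, minus those already assigned.
        counts.append(floor((n + 1) * per_frame) - floor(n * per_frame))
    return counts

counts = samples_per_frame(5)
print(counts, sum(counts))
```

With these numbers each frame gets either 1601 or 1602 samples, and only every fifth frame boundary lines up with a whole number of samples (8008 in total), so timebase code that assumes a fixed samples-per-frame figure will slowly drift.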
- Jan