2015-03-01

Comparison of the colorspace conversion quality in Baka Encoder and Avisynth

tl;dr: Assuming the x264 / x265 build and encoding parameters are identical, encoding an RGB video with Baka Encoder gives better output quality than MeGUI and other tools that rely on Avisynth or libav for colorspace conversion. Here is a fragment of the error distribution after encoding the sample image (contrast enhanced):

Comparison of the error distribution after encoding the sample image with Avisynth and Baka Encoder.

It is really difficult to stand out among other x264 / x265 frontends. Development of Baka Encoder was initially focused mostly on usability. Back then I encoded directly with shell scripts or used MeGUI as a frontend, so I tried to create a tool that would allow efficient manual editing of the encoding presets while providing a handy GUI to control the encoding process.

Old versions of Baka Encoder took the straightforward approach and let the internal x264 filters do the job of colorspace conversion and resampling. This was done with the help of the swscale library, which is embedded along with other libav components in many x264 builds. swscale provides acceptable quality for such transformations, but only in the YCbCr to RGB direction, as it was initially meant to be used for decoding purposes only (just like the rest of libav and ffmpeg). It could only perform RGB -> YCbCr conversion using basic 8-bit integer arithmetic and only with the bt601 matrix. A simple solution would be to go the MeGUI way and dump the burden of colorspace conversion on Avisynth, accepting only .avs scripts (or generating them) with a ConvertToYV12() call appended. But Avisynth in turn lacked support for high bit depth and for 4:4:4 and 4:2:2 subsampling (support for these has been added in recent unstable builds), and used the same basic integer arithmetic (with better rounding, however). I failed to find any library that implemented colorspace conversion properly, so I decided to write my own, with floating-point arithmetic and precise rounding.
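To illustrate why the arithmetic matters, here is a minimal Python sketch that converts one RGB pixel to TV-range 8-bit YCbCr in two ways: in floating point with a single rounding at the end, and with a widely used fixed-point approximation whose bt601 coefficients are pre-rounded to 8 fractional bits. This is not the actual code of Baka Encoder, swscale or Avisynth; the function names are made up for the example, and the integer coefficients are the common textbook ones, assumed here only to show the effect.

```python
import math

# Luma coefficients (Kr, Kb) defined by the BT.601 and BT.709 recommendations.
COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}


def round_half_up(x):
    """Round to the nearest integer, with halves going up."""
    return int(math.floor(x + 0.5))


def rgb_to_ycbcr_float(r, g, b, matrix="bt601"):
    """Full-range 8-bit RGB -> TV-range 8-bit YCbCr, rounding only once."""
    kr, kb = COEFFS[matrix]
    kg = 1.0 - kr - kb
    rf, gf, bf = r / 255.0, g / 255.0, b / 255.0
    y = kr * rf + kg * gf + kb * bf
    cb = (bf - y) / (2.0 * (1.0 - kb))
    cr = (rf - y) / (2.0 * (1.0 - kr))
    return (round_half_up(16 + 219 * y),
            round_half_up(128 + 224 * cb),
            round_half_up(128 + 224 * cr))


def rgb_to_ycbcr_int8(r, g, b):
    """Hypothetical fixed-point variant: bt601 coefficients scaled by 256
    and rounded to integers before the multiplication."""
    y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
    cb = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
    cr = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y, cb, cr


print(rgb_to_ycbcr_float(255, 0, 0))  # (81, 90, 240)
print(rgb_to_ycbcr_int8(255, 0, 0))   # (82, 90, 240)
```

For pure red the exact luma is 81.481, so the floating-point path gives 81 while the pre-rounded coefficients give 82. Errors of this size are what the test described in this article tries to measure.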

YCbCr colorspace visual representation.

Starting from version 1.2, Baka Encoder uses its own colorspace conversion and resampling implementation. In this article I will describe a straightforward test that compares the Baka Encoder colorspace conversion routine with the ConvertToYV12() function available in Avisynth. The idea behind the test is quite simple: take a proper RGB video, convert it to YCbCr with 4:2:0 subsampling, encode it losslessly with x264, convert it back to RGB and compare the result with the original. As the source video I decided to use two generated gradient images, one at 640x360 and another at 1920x1080, both containing a full hue gradient fading to pure black and to pure white. I suppose this is a good choice for a colorspace conversion test, as such a gradient covers most of the crucial colors, while having both SD and HD resolution images allows both the bt601 and bt709 matrices to be tested.
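A gradient of this kind is easy to generate. The Python sketch below is only an approximation of the idea: it assumes that hue runs along the horizontal axis and lightness runs from white at the top to black at the bottom, which may differ from the layout of the actual sample images. The function name is made up for the example.

```python
import colorsys


def hue_gradient(width, height):
    """Return rows of 8-bit (r, g, b) tuples.

    Assumed layout: hue varies along x, lightness goes from 1.0 (white)
    on the first row to 0.0 (black) on the last row, saturation is 1.0.
    """
    rows = []
    for yy in range(height):
        lightness = 1.0 - yy / (height - 1)
        row = []
        for xx in range(width):
            hue = xx / width
            rgb = colorsys.hls_to_rgb(hue, lightness, 1.0)
            row.append(tuple(int(round(c * 255)) for c in rgb))
        rows.append(row)
    return rows


# Tiny example; the images in the test are 640x360 and 1920x1080.
image = hue_gradient(6, 3)
print(image[1][0])  # pure red in the middle row: (255, 0, 0)
```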

Sample gradient image, SD version.

In order to make a video from this still image I'm going to use the following Avisynth script:

ImageSource(file_path, end = 25, use_DevIL=true)
AssumeFps(25)

It will output one second of RGB video consisting of 25 duplicated RGB frames. Baka Encoder will encode such a script as is, automatically handling colorspace conversion with the proper matrix (unless overridden in the configuration file). To use Avisynth for the conversion instead, a ConvertToYV12 call is appended:

ConvertToYV12(matrix="rec601")

for SD video and

ConvertToYV12(matrix="rec709")

for HD video.

After encoding all 4 videos using the lossless preset in Baka Encoder v1.3.2, the resulting .mp4 files are converted back to RGB with the help of Avisynth (again):

ffVideoSource(file_path)
ConvertToRGB(matrix="rec601")

or

ffVideoSource(file_path)
ConvertToRGB(matrix="rec709")

Combining the output images with the original ones and narrowing the input range to 0-7 in order to enhance visibility gives the following images (Avisynth is on the left; a darker color means less error):

Difference with the original of the SD image converted with Avisynth.
Difference with the original of the SD image converted with Baka Encoder.

Difference with the original of the HD image converted with Avisynth.
Difference with the original of the HD image converted with Baka Encoder.
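The way such difference images are built can be sketched in Python as follows. It assumes that "combining" means taking the absolute per-channel difference and that narrowing the input range to 0-7 means a linear stretch of errors 0..7 to 0..255 with clipping. The images above were produced with Avisynth, so this is an illustration of the idea, not the exact procedure.

```python
def enhanced_difference(original, decoded, max_error=7):
    """Per-channel absolute difference, stretched so that an error of
    max_error (or more) maps to 255 and zero error stays black."""
    result = []
    for row_a, row_b in zip(original, decoded):
        out_row = []
        for pixel_a, pixel_b in zip(row_a, row_b):
            out_row.append(tuple(
                min(abs(a - b), max_error) * 255 // max_error
                for a, b in zip(pixel_a, pixel_b)
            ))
        result.append(out_row)
    return result


original = [[(255, 0, 0), (10, 10, 10)]]
decoded = [[(254, 0, 0), (10, 30, 3)]]
print(enhanced_difference(original, decoded))
# [[(36, 0, 0), (0, 255, 255)]]
```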

And here are statistics collected from these images:

Colorspace conversion test results.
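Statistics of this kind can be collected with a few lines of Python. The sketch below assumes that the error of a pixel is its largest per-channel absolute difference; the measure used for the table may have been different, and the function name is made up for the example.

```python
from collections import Counter


def error_stats(original, decoded):
    """Return {error: share of pixels}, where the error of a pixel is
    assumed to be its largest per-channel absolute difference."""
    histogram = Counter()
    for row_a, row_b in zip(original, decoded):
        for pixel_a, pixel_b in zip(row_a, row_b):
            histogram[max(abs(a - b) for a, b in zip(pixel_a, pixel_b))] += 1
    total = sum(histogram.values())
    return {err: count / total for err, count in sorted(histogram.items())}


original = [[(0, 0, 0), (10, 10, 10)], [(5, 5, 5), (255, 0, 0)]]
decoded = [[(0, 0, 0), (10, 11, 10)], [(5, 5, 5), (250, 0, 3)]]
print(error_stats(original, decoded))
# {0: 0.5, 1: 0.25, 5: 0.25}
```

The share stored under key 0 corresponds to the pixels that preserved their exact values.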

The first thing to note is that, since RGB to YCbCr conversion (especially with TV range) is lossy in both directions and downsampling from 4:4:4 to 4:2:0 is lossy as well, the output video is lossy even though the lossless x264 preset was used. But at least we didn't get any losses from the codec itself. As the table above shows, only about 1/10 of the SD video pixels and about 1/5 of the HD video pixels preserved their exact values. This difference is mostly caused by the fact that in the HD video neighboring pixels tend to have closer values than in the SD variant, so chroma downsampling doesn't affect them as much. Another conspicuous thing is the colorful vertical stripes. They mark shifts in the accumulated error components that occur when crossing the borders of the hue subranges.
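That the conversion itself loses information, even without any chroma subsampling, can be checked with a small Python experiment. The sketch below converts 8-bit RGB to TV-range 8-bit YCbCr and back in floating point with the bt601 coefficients, then counts how many colors of a coarse RGB grid survive the round trip unchanged. It is an independent illustration, not the conversion code of Baka Encoder or Avisynth, and the grid step is an arbitrary choice.

```python
import math

KR, KB = 0.299, 0.114   # bt601 luma coefficients
KG = 1.0 - KR - KB


def _round(x):
    """Round to the nearest integer, with halves going up."""
    return int(math.floor(x + 0.5))


def to_ycbcr(rgb):
    """Full-range 8-bit RGB -> TV-range 8-bit YCbCr (4:4:4)."""
    r, g, b = (c / 255.0 for c in rgb)
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return (_round(16 + 219 * y), _round(128 + 224 * cb), _round(128 + 224 * cr))


def to_rgb(ycbcr):
    """TV-range 8-bit YCbCr -> full-range 8-bit RGB, clipped to 0..255."""
    y = (ycbcr[0] - 16) / 219.0
    cb = (ycbcr[1] - 128) / 224.0
    cr = (ycbcr[2] - 128) / 224.0
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return tuple(min(255, max(0, _round(c * 255))) for c in (r, g, b))


def count_exact(step=17):
    """Count grid colors that come back unchanged after the round trip."""
    levels = range(0, 256, step)
    total = exact = 0
    for r in levels:
        for g in levels:
            for b in levels:
                total += 1
                exact += to_rgb(to_ycbcr((r, g, b))) == (r, g, b)
    return exact, total


print(to_rgb(to_ycbcr((255, 0, 0))))  # (254, 0, 0): pure red is not preserved
print(count_exact())                  # (colors preserved, colors tested)
```

Black and white come back unchanged, but pure red already returns as (254, 0, 0), because TV-range YCbCr has fewer usable code values than full-range RGB has colors.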

More importantly, in both cases the colorspace conversion performed by Baka Encoder gave better results, improving about 1% of the SD image pixels and about 4% of the HD image pixels, and greatly reducing the number of pixels with high error. Furthermore, Baka Encoder offers proper support for high bit depth video, which allows it to show even better results when dealing with 16 bpp inputs and 10 bpp x264 / x265 outputs.

All the files used in this test are available for download: