HEVC (x265) encoding guide

The HEVC codec (High Efficiency Video Coding), also known as H.265, is the de facto standard for encoding video in bitrate-starved conditions. One may argue that you encounter bitrate-starved conditions almost anywhere, and for the most part this is true, but it becomes especially apparent when converting high-resolution content such as 4K (2160p) video. HEVC has a definite advantage over the older H.264: current encoders can reach roughly the same level of quality at about half the bitrate.

Anybody can encode videos with HEVC using the open-source software implementation known as x265. I used HandBrake with its bundled x265 encoder, version 2.5. All encodes were performed on a Ryzen 1800X running at 3.8 GHz, with the CPU otherwise idle. Keep in mind that the RAM footprint increases with the frame size of the video content. For 2160p videos, I’d recommend at least 8 GB of RAM.

Constant quality vs average bitrate

The debate over which to use is more a question of commitment. If you are aiming for a specific file size, calculating the bitrate is merely a matter of converting the target size to bits (bytes times eight), subtracting what the audio tracks need, and dividing the remainder by the length of the video in seconds. Using two passes also improves the final output quality, because the first pass analyzes the video and lets the second pass spend the available bits more sensibly. However, the end result may still not look as good as constant-quality encoding, because the encoder has to hit the calculated average bitrate over the entire length of the film. Complex scenes with high motion tend to receive less bitrate than they need, while still or monotonous scenes receive more than they need, compared to constant-quality mode. That is why I recommend using constant quality over average bitrate.
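The bitrate arithmetic for a target file size can be sketched in a few lines of Python. This is only a rough illustration: the function name, the 8 GiB target and the 640 kbit/s audio track are made-up values, and container overhead is ignored.

```python
def video_bitrate_kbps(target_size_mib, duration_s, audio_kbps=0.0):
    """Return the average video bitrate (kbit/s) that fits the target size."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    total_bits = target_size_mib * 1024 * 1024 * 8  # MiB -> bits
    total_kbps = total_bits / duration_s / 1000     # bits/s -> kbit/s
    video_kbps = total_kbps - audio_kbps            # reserve room for audio
    if video_kbps <= 0:
        raise ValueError("target size is too small for the audio alone")
    return video_kbps


# Hypothetical example: a 2-hour film in 8 GiB with one 640 kbit/s audio track.
print(round(video_bitrate_kbps(8 * 1024, 2 * 60 * 60, audio_kbps=640)))  # 8904
```

The result is the figure you would enter as the average video bitrate. Leave some headroom, because the container and any subtitle tracks take space as well.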

CRF values and presets explained

CRF (constant rate factor) values relate to file size in a roughly logarithmic way, with 0 applying the lowest compression (practically lossless) and 51 the highest. Quality that is nearly indistinguishable from the source is reached at around 10, while more practical values lie between 16 and 24. Lower CRF values should be applied to 720p content and higher values to 1080p and 2160p content, as file size can get out of hand pretty quickly. Also keep in mind which output device you are targeting. If the video is likely to be viewed on a small screen such as a tablet or smartphone, you can (and should!) get along with higher CRF values. Blocking becomes a factor at higher CRF, but it is hardly noticeable on a small screen. And if your video source has a high degree of action, a lower CRF value will preserve more detail. I shall include scenes with high motion for you to try out.

There are 10 presets (placebo, veryslow, slower, slow, medium, fast, faster, veryfast, superfast, ultrafast), all of which affect encoding time to a large degree and file size to a lesser degree. Placebo takes the longest and ultrafast is the quickest. Both of these are completely impractical, but if you happen to own a CPU cluster, then placebo is the way to go. Still, the difference is superficial. The presets apply different algorithms which vary in complexity, but for the average user it’s a question of CPU investment vs. time to spare. Picking the preset that gives an encoding frame rate you find bearable is the best choice. Encoding times rise steeply from slow to slower, as can be seen in the charts.

Tune grain or tune none?

One more setting has a very noticeable impact on encoding quality. The Encoder Tune allows for the following options: None, Film, Animation, Grain, Still Image, PSNR, SSIM, Zero Latency. Personally I use None or Grain, depending on the film source. As most modern content is shot digitally, film grain is virtually non-existent. However, 4K scans of older 35 mm or IMAX 70 mm reels preserve the film grain, which inherently results from the resolution of the silver-halide crystals. Most studios preserve this effect rather than applying de-noising algorithms, and some even add grain digitally. Encoding grainy content without the “grain” encoder tune setting will give less than satisfying results. The “grain” setting is not designed to eliminate or preserve film grain, but rather to prevent the artifacts that result from compressing it. If your source is grainy, use this setting, but take into account that file size will increase by at least 50%.
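For scripted or batch encodes, the three settings discussed so far (CRF, preset and tune) can be passed to HandBrake's command-line tool. The sketch below only builds and prints the command. It assumes HandBrakeCLI rather than the GUI used for this guide, and the file names and the chosen CRF, preset and tune values are placeholders, not recommendations.

```python
import shlex

# The ten presets discussed above, fastest first.
X265_PRESETS = ("ultrafast", "superfast", "veryfast", "faster", "fast",
                "medium", "slow", "slower", "veryslow", "placebo")


def handbrake_command(source, output, crf=20, preset="slow", tune=None):
    """Build a HandBrakeCLI argument list for a constant-quality x265 encode."""
    if not 0 <= crf <= 51:
        raise ValueError("CRF must be between 0 and 51")
    if preset not in X265_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    cmd = ["HandBrakeCLI", "-i", source, "-o", output,
           "-e", "x265",               # video encoder
           "-q", str(crf),             # constant quality (CRF) value
           "--encoder-preset", preset]
    if tune:  # e.g. "grain" for grainy film scans; leave out for tune None
        cmd += ["--encoder-tune", tune]
    return cmd


# Placeholder file names; run the printed command in a shell to start the encode.
cmd = handbrake_command("film.mkv", "film-hevc.mkv", crf=18, preset="slow", tune="grain")
print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to loop over several CRF and preset combinations and compare the results.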

The conclusion

The images below are a guideline and demonstrate the effect the presets have on image quality. You can play with different settings to get an idea of which is adequate to use. I shall add new scenes in the future, some with high detail or a high amount of motion. Check this guide from time to time, as I will update it regularly. The charts are based on a 30-second (720 frames) clip from the movie Guardians of the Galaxy Vol. 2 at 4K (2160p) resolution. As you can see, choosing the placebo setting takes almost 2 hours to encode, and using CRF 10 with tune grain yields a ridiculous 380 MiB file. Also take into account that you need extra space for the audio tracks.
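To put the clip figures into perspective, they can be scaled up linearly to a full film. This is a back-of-the-envelope sketch: the 120-minute running time is a hypothetical value, real content will not scale perfectly linearly, and the size and time figures are each scaled on their own, since they need not come from the same encode.

```python
def extrapolate(clip_seconds, clip_size_mib, clip_encode_hours, film_minutes):
    """Scale a sample clip's size and encode time linearly to a full film."""
    factor = film_minutes * 60 / clip_seconds    # how many clips fit in the film
    size_gib = clip_size_mib * factor / 1024     # MiB -> GiB
    encode_hours = clip_encode_hours * factor
    return size_gib, encode_hours


# 30-second clip, 380 MiB, roughly 2 hours to encode; hypothetical 120-minute film.
size_gib, hours = extrapolate(30, 380, 2, film_minutes=120)
print(f"{size_gib:.1f} GiB, {hours:.0f} hours")  # 89.1 GiB, 480 hours
```

Numbers of this magnitude are why the extreme settings are only useful for comparison, not for everyday encodes.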