Add color handling to VideoEncoder GPU #1125

Dan-Flores · 2025-12-11T06:18:30Z

This PR resolves this TODO:
// TODO-VideoEncoder: Enable configuration of color properties, similar to FFmpeg.

Support is added for the following parameters:

Color spaces: BT.601, BT.709, BT.2020
Color ranges: tv (limited), pc (full)

As a result, 6 ColorConversionMatrices are stored and utilized for NPP color conversion functions. To my understanding, these are the most commonly used or newer color spaces parameters (docs for AVColorSpace)

The testing caveats:

FFmpeg 4 and 6 fails frame equal assertions in test_nvenc_against_ffmpeg_cli. It only passes when using non-default color range and colorspace (specifically, not limited range and BT601). Since these tests pass on FFmpeg 7 and 8, I suspect there is some issue in FFmpeg 4 and 6. I've opened Enable color parameters in NVENC test on FFmpeg 4 and 6 #1140 to track this issue.
The av1_nvenc test is disabled on FFmpeg4, as the codec is not implemented.

…o gpu_pix_fmts

src/torchcodec/_core/CudaDeviceInterface.cpp

test/test_encoders.py

…o gpu_pix_fmts

Dan-Flores · 2025-12-18T14:48:30Z

src/torchcodec/_core/Encoder.cpp


  if (videoStreamOptions.pixelFormat.has_value()) {
+    // TODO-VideoEncoder: Enable pixel formats to be set by user
+    // and handled with the appropriate NPP function on GPU.


I moved this TODO from setupHardwareFrameContextForEncoding to here in initializeEncoder to centralize pixel_format handling.

The behavior is unchanged: If pixel_format argument is used while frames are on GPU, an error is raised.
The default usage of nv12 is moved into initializeEncoder` as well.

are we raising an error for pixel_format on gpu because of what nicolas mentioned below? that passing NV12 leads to yuv420?

Essentially yes. Because we do not understand the codec's behavior yet, we do not want the user to set or expect a pixel format.

…o gpu_pix_fmts

… into gpu_pix_fmts

NicolasHug · 2026-01-07T14:03:00Z

test/test_encoders.py

+        if ffmpeg_version == 4 and codec == "av1_nvenc":
+            pytest.skip("av1_nvenc is not supported on FFmpeg 4")


For this one, we might be able to use the skipif mark as part of the parametrization, which is our new preferred strategy to keep internal CI healthy.

But, I am curious: is this actually related to this PR? If av1_nvenc really isn't supported on FFmpeg, then how can our existing test (on main) pass?

These skips are examples of covariate parametrization, where we do not know both values in either parametrization line. I considered using the new pattern, but I believe it is impossible in these cases.

is this actually related to this PR?

It is not related to this PR, I likely did not locally test on FFmpeg 4 previously, because my installation of FFmpeg 4 does not support av1_nvenc.

Details

(ffmpeg4_cuda12_6) dev@gpu-dev-d63de2d5:~/torchcodec$ ffmpeg -encoders | grep nvenc ffmpeg version 4.4.2 Copyright (c) 2000-2021 the FFmpeg developers built with gcc 10.4.0 (conda-forge gcc 10.4.0-16) configuration: --prefix=/home/dev/.conda/envs/ffmpeg4_cuda12_6 --cc=/home/conda/feedstock_root/build_artifacts/ffmpeg_1660334214659/_build_env/bin/x86_64-conda-linux-gnu-cc --cxx=/home/conda/feedstock_root/build_artifacts/ffmpeg_1660334214659/_build_env/bin/x86_64-conda-linux-gnu-c++ --nm=/home/conda/feedstock_root/build_artifacts/ffmpeg_1660334214659/_build_env/bin/x86_64-conda-linux-gnu-nm --ar=/home/conda/feedstock_root/build_artifacts/ffmpeg_1660334214659/_build_env/bin/x86_64-conda-linux-gnu-ar --disable-doc --disable-openssl --enable-avresample --enable-demuxer=dash --enable-hardcoded-tables --enable-libfreetype --enable-libfontconfig --enable-libopenh264 --enable-gnutls --enable-libmp3lame --enable-libvpx --enable-pthreads --enable-vaapi --enable-gpl --enable-libx264 --enable-libx265 --enable-libaom --enable-libsvtav1 --enable-libxml2 --enable-pic --enable-shared --disable-static --enable-version3 --enable-zlib --pkg-config=/home/conda/feedstock_root/build_artifacts/ffmpeg_1660334214659/_build_env/bin/pkg-config libavutil 56. 70.100 / 56. 70.100 libavcodec 58.134.100 / 58.134.100 libavformat 58. 76.100 / 58. 76.100 libavdevice 58. 13.100 / 58. 13.100 libavfilter 7.110.100 / 7.110.100 libavresample 4. 0. 0 / 4. 0. 0 libswscale 5. 9.100 / 5. 9.100 libswresample 3. 9.100 / 3. 9.100 libpostproc 55. 9.100 / 55. 9.100 V....D h264_nvenc NVIDIA NVENC H.264 encoder (codec h264) V..... nvenc NVIDIA NVENC H.264 encoder (codec h264) V..... nvenc_h264 NVIDIA NVENC H.264 encoder (codec h264) V..... nvenc_hevc NVIDIA NVENC hevc encoder (codec hevc) V....D hevc_nvenc NVIDIA NVENC hevc encoder (codec hevc)

how can our existing test (on main) pass?

They do not!
The av1_nvenc test is skipped if IN_GITHUB_CI, so it would not have appeared there. Since this test is also marked as @needs_ffmpeg_cli, its not run internally either.

Got it thank you - I understand how that's relevant for this PR now.

Agreed the first skip (the one with color_space == "bt470bg" and color_range == "tv") is a covariate parametrization, but I think this one just above is not: the only thing that is parametrized over is codec, so I think we could use something like pytest.param("av1_nvenc", marks=pytest.mark.skipif(get_ffmpeg_major_version() == 4).

That's not very critical in any case, feel free to ignore.

test/test_encoders.py

NicolasHug · 2026-01-07T14:12:06Z

test/test_encoders.py

-            assert_tensor_close_on_at_least(ff_frame, enc_frame, percentage=96, atol=2)
+            assert_tensor_close_on_at_least(
+                ff_frame, enc_frame, percentage=percentage, atol=2
+            )


Out of curiosity can you try to validate an upper bound on the absolute difference? E.g. something like

assert_close(..., rtol=0, atol=5)

would be pretty good. Might have to be higher than 5. If it's something like 50 then it's not worth it, as it's not informative. But if we can get a low upper-bound, that gives confidence that we're not too far off.

The lowest bound I tested that passed was atol=40, which is pretty high. The current test tells us that 96% of the frames are within 2, which seems to be fairly high accuracy

NicolasHug · 2026-01-07T14:15:18Z

test/test_encoders.py

+            # Default values vary by codec, so we only assert when
+            # color_range and color_space are not None.
+            if color_range is not None:
+                color_range = ffmpeg_metadata["color_range"]


I think you meant an assert here.

Also, while this is functionally equivalent (because of the assert encoder_metadata[...] == ffmpeg_metadata[...] call above), what we logically want to assert here is that the encode_metadata respected the input color_range and the input color_space. So I'd suggest to write this as `assert encode_metadata[...] == color_range.

Same below

src/torchcodec/_core/CudaDeviceInterface.cpp

NicolasHug · 2026-01-07T14:45:28Z

src/torchcodec/_core/Encoder.cpp

    deviceInterface_->registerHardwareDeviceWithCodec(avCodecContext_.get());
    deviceInterface_->setupHardwareFrameContextForEncoding(
-        avCodecContext_.get());
+        avCodecContext_.get(), outPixelFormat_);


So, we have this invariant throughout the GPU encoder that the output pixel format is always AV_PIX_FMT_NV12. it's good practice to enforce this invariant in the APIs themselves. Here for example, there is no need for deviceInterface_->setupHardwareFrameContextForEncoding() to take the pixel format as a parameter, because it's an invariant that it should always be AV_PIX_FMT_NV12.

The same is true for all the other APIs of the device interface. There are different ways to enforce this invariant in the API, one simple one is to make outputPixelFormatForEncoding_ a public constexpr member.

I was aiming to follow up quickly on adding support for other pixel formats, but I see how this parameter is confusing at the moment. I will remove it from the APIs until it is needed.

NicolasHug · 2026-01-07T14:47:16Z

src/torchcodec/_core/CudaDeviceInterface.cpp

+          "GPU encoding expected to encode into nv12 pixel format, but got ",
+          av_get_pix_fmt_name(targetPixelFormat),
+          ". This should not happen, please report this to the TorchCodec repo");
+  }


Above: this is related to my other comment below about enforcing targetPixelFormat to be AV_PIX_FMT_NV12 through the APIs. It's great that you are raising on this, and indeed this should never happen and this is a torchcodec bug. But we can avoid risking the bug entirely by simply just not making targetPixelFormat a parameter! That's the value of "enforcing the invariant through the API".

NicolasHug

Thanks @Dan-Flores !

NicolasHug · 2026-01-09T13:48:13Z

test/test_encoders.py

+        if ffmpeg_version == 4 and codec == "av1_nvenc":
+            pytest.skip("av1_nvenc is not supported on FFmpeg 4")


Got it thank you - I understand how that's relevant for this PR now.

Agreed the first skip (the one with color_space == "bt470bg" and color_range == "tv") is a covariate parametrization, but I think this one just above is not: the only thing that is parametrized over is codec, so I think we could use something like pytest.param("av1_nvenc", marks=pytest.mark.skipif(get_ffmpeg_major_version() == 4).

That's not very critical in any case, feel free to ignore.

NicolasHug · 2026-01-09T13:49:37Z

test/test_encoders.py

+                # when color_range='tv' and color_space=None on FFmpeg 7/8.
+                # Since this failure is rare, I suspect its a bug related to these
+                # older container formats on newer FFmpeg versions.
+                if (


use if not ... rather than if: pass; else: ....

NicolasHug · 2026-01-09T13:54:24Z

src/torchcodec/_core/CudaDeviceInterface.h


  std::string getDetails() override;

+  constexpr static AVPixelFormat CUDA_PIXEL_FORMAT = AV_PIX_FMT_NV12;


Nit, name this CUDA_ENCODING_PIXEL_FORMAT (or any variant that is explicitly about encoding), because this file and class contain a lot of stuff about decoding. We don't want to imply that this is a "decoding pixel format".

NicolasHug · 2026-01-09T13:56:10Z

src/torchcodec/_core/Encoder.cpp

-        : AV_PIX_FMT_YUV420P;
+    if (frames_.device().is_cuda()) {
+      // Default to nv12 pixel format when encoding on GPU.
+      outPixelFormat_ = AV_PIX_FMT_NV12;


To avoid duplication of hard-coded constants, we can use the interface's CUDA_PIXEL_FORMAT field here (if it's public, which I think it is?)

NicolasHug · 2026-01-09T13:57:54Z

src/torchcodec/_core/CudaDeviceInterface.cpp

+// - In limited range, Y is offset by 16 to add the lower margin.
+// - In both color ranges, U,V are offset by 128 to be centered around 0.
+//
+// RGB to YUV conversion matrices to use in NPP color conversion functions


Thanks for the note, it will be super useful if we ever need to go back to this in the future.

Dan-Flores added 4 commits December 11, 2025 05:06

only 2 pixfmts, enable 6 color param combos

cd5f8aa

Merge branch 'main' of https://github.com/meta-pytorch/torchcodec int…

939240b

…o gpu_pix_fmts

comments

b538d13

comments2

0480f1f

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Dec 11, 2025

NicolasHug reviewed Dec 11, 2025

View reviewed changes

src/torchcodec/_core/CudaDeviceInterface.cpp Show resolved Hide resolved

test/test_encoders.py Show resolved Hide resolved

test/test_encoders.py Show resolved Hide resolved

Dan-Flores added 8 commits December 12, 2025 05:44

adjust test, fix pixel format checks

351a55d

keep plumbing, only use nv12

ffcf872

error sooner on gpu on any pixel format

a671f31

skip non-default color params on 4+6, skip av1 gpu on 4

a9ad8e0

Merge branch 'main' of https://github.com/meta-pytorch/torchcodec int…

261549d

…o gpu_pix_fmts

remove unused option

f4d777c

reduce diff

da3a6d7

add TODO, liink issue

1dc7690

Dan-Flores changed the title ~~Add pixel formats and color handling to VideoEncoder GPU~~ Add color handling to VideoEncoder GPU Dec 17, 2025

Dan-Flores added 2 commits December 17, 2025 15:31

restore None test case

ce86d61

reuse codecContext color params, no hardcoded defaults

daf2fda

Dan-Flores marked this pull request as ready for review December 18, 2025 14:48

Dan-Flores commented Dec 18, 2025

View reviewed changes

Dan-Flores added 5 commits January 6, 2026 15:49

3 decimal places

8c2bcee

Merge branch 'main' of https://github.com/meta-pytorch/torchcodec int…

193d7c9

…o gpu_pix_fmts

Merge branch 'gpu_pix_fmts' of https://github.com/Dan-Flores/torchcodec…

e83c130

… into gpu_pix_fmts

actually 4 decimal places

e63f118

restore higher decimal points for BT2020

5e8745f

NicolasHug reviewed Jan 7, 2026

View reviewed changes

Dan-Flores mentioned this pull request Jan 7, 2026

Extend color range note to explain limited range #1154

Closed

Dan-Flores added 3 commits January 8, 2026 05:37

add note, test adjustments

d1cbaeb

drop targetPixelFormat from CUDA api

ef03ad5

skip 3 failing checks on ffmpeg 7/8

40f4c66

NicolasHug approved these changes Jan 9, 2026

View reviewed changes

		if ffmpeg_version == 4 and codec == "av1_nvenc":
		pytest.skip("av1_nvenc is not supported on FFmpeg 4")


		std::string getDetails() override;

		constexpr static AVPixelFormat CUDA_PIXEL_FORMAT = AV_PIX_FMT_NV12;

Add color handling to VideoEncoder GPU #1125

Are you sure you want to change the base?

Add color handling to VideoEncoder GPU #1125

Conversation

Dan-Flores commented Dec 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Dan-Flores commented Dec 11, 2025 •

edited

Loading