-
Notifications
You must be signed in to change notification settings - Fork 80
Add color handling to VideoEncoder GPU #1125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| {-0.148f, -0.291f, 0.439f, 128.0f}, | ||
| // V = 0.439*R - 0.368*G - 0.071*B + 128 (BT.601 coefficients) | ||
| {0.439f, -0.368f, -0.071f, 128.0f}}; | ||
| // RGB to YUV conversion matrices to use in NPP color conversion functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share how these were derived? What were the original values that were used as reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These follow the pattern described in the note in CudaCommon, I can add a comment referencing that note here
torchcodec/src/torchcodec/_core/CUDACommon.cpp
Lines 43 to 44 in ee8ce04
| // Color space and color range | |
| // --------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The note is about YUV -> RGB so it's not 100% targeted to what the matrices are doing. But yes, add a ref to that note, it's still useful.
You asked me offline whether we should update the note to explain limited range: yes, we should :)
There's a TODO in the note for that, but I never had the chance to do it - and frankly I forgot the underlying logic lol. If you'd like to give it a go, please do it - in a follow-up PR.
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] | ||
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll want to be stricter here:
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] | |
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] | |
| assert encoder_metadata["color_range"] == ffmpeg_metadata["color_range"] == color_range | |
| assert encoder_metadata["color_space"] == ffmpeg_metadata["color_space"] == color_space |
| assert encoder_metadata["pix_fmt"] == "yuv420p" | ||
| assert ffmpeg_metadata["pix_fmt"] == "yuv420p" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we're not assert pix_fmt anymore, which makes it hard to verify that the changes in this PR are correct. IIRC, passing NV12 actually resulted in a yuv420p format at the end. We should try to undertand why that was the case. It may not add a lot of value to support both nv12 and yuv420p as parameter values if they're both the same thing (and if they both end-up being yuv420 anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the format changes occur based on codec implementation. By adding back this assertion, I observed a deprecated pixel format yuvj420p is set when pc (full) color range is used by h264_nvenc and hevc_nvenc, but not av1_nvenc.
I can incorporate pixel formats into my benchmarking PR, to see if there is some optimization to using nv12.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new assertions you added are good. But I personally still do not understand why passing NV12 actually ends up being reported yuv420.
Is it actually still NV12, and it's just FFmpeg that can't tell the difference? Or is it indeed yuv420?
Until we get a good understanding on that, I think we should refrain from allowing pixel_format with CUDA encoding. We have a surprising behavior that we cannot explain right now: passing NV12 leads to yuv420. If we're surprised, our users will be surprised too, and we won't have a good explanation to give them. It is safer to simply not expose this functionality just yet, and let them rely on the default behavior.
|
|
||
| if (videoStreamOptions.pixelFormat.has_value()) { | ||
| // TODO-VideoEncoder: Enable pixel formats to be set by user | ||
| // and handled with the appropriate NPP function on GPU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this TODO from setupHardwareFrameContextForEncoding to here in initializeEncoder to centralize pixel_format handling.
The behavior is unchanged: If pixel_format argument is used while frames are on GPU, an error is raised.
The default usage of nv12 is moved into initializeEncoder` as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we raising an error for pixel_format on gpu because of what nicolas mentioned below? that passing NV12 leads to yuv420?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially yes. Because we do not understand the codec's behavior yet, we do not want the user to set or expect a pixel format.
This PR resolves this TODO:
// TODO-VideoEncoder: Enable configuration of color properties, similar to FFmpeg.Support is added for the following parameters:
BT.601,BT.709,BT.2020tv(limited),pc(full)As a result, 6
ColorConversionMatricesare stored and utilized for NPP color conversion functions. To my understanding, these are the most commonly used or newer color spaces parameters (docs for AVColorSpace)The testing caveats:
test_nvenc_against_ffmpeg_cli. It only passes when using non-default color range and colorspace (specifically, not limited range and BT601). Since these tests pass on FFmpeg 7 and 8, I suspect there is some issue in FFmpeg 4 and 6. I've opened Enable color parameters in NVENC test on FFmpeg 4 and 6 #1140 to track this issue.av1_nvenctest is disabled on FFmpeg4, as the codec is not implemented.