From 39ab58c7451bf3d75960e5cc12097db1b1cb752f Mon Sep 17 00:00:00 2001 From: Good Guy Date: Fri, 27 Feb 2026 16:25:53 -0700 Subject: [PATCH] Credit Andrea - updated for experimental Vulkan --- parts/Configuration.tex | 2 +- parts/Tips.tex | 13 ++++++++++--- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/parts/Configuration.tex b/parts/Configuration.tex index 18fb0b3..96f1046 100644 --- a/parts/Configuration.tex +++ b/parts/Configuration.tex @@ -181,7 +181,7 @@ The main focus of the performance section is rendering parameters not available \item[Seconds to preroll renders] \index{pre-roll} some effects need a certain amount of time to settle in. Checking this option sets a number of seconds to render without writing to disk before the selected region is rendered. When using the render farm, you will sometimes need to preroll to get seamless transitions between the jobs. Every job in a render farm is prerolled by this value. This does not affect background rendering because background rendering uses a different preroll value. \item[Force single processor use] \index{force single processor} \CGG{} tries to use all processors on the system by default, but sometimes you will only want to use one processor, like in a render farm client. This forces only one processor to be used. The operating system usually uses the second processor for disk access. The value of this parameter is used in render farm clients. \item[Project SMP cpus] \index{SMP cpus} to restrict the number of processors utilized, change the count number. This number will be used for the plugin per load balance operation cpu limit, which uses smp-cpus to stripe your data. It does not affect the number of cpus used in any other \CGG{} operation besides plugins. 
On large cpu systems, it can come in handy to downgrade the number of cpus used for some plugins; otherwise it uses all of the processors and splits up the program into too many pieces which may add considerable overhead in high cpu count systems. - \item[Use HW Device] \index{HW Device} \CGG{} can use hardware timeline acceleration (decoding) via GPUs thanks to the \textit{OpenGL} video driver. By default the \textit{X11} video driver is used, which works only with the CPU. See \nameref{sub:video_out_section}. For h264, h265 (HEVC) and VP9 codecs you can use libraries specific to AMD, Nvidia and Intel graphics cards. For AMD and Intel set \textbf{vaapi}; for Nvidia set \textbf{vdpau} (also works with AMD thanks to a wrapper). \textbf{Cuda} does not accelerate decoding with Nvidia, but it does allow some plugins to run that are not available otherwise. \textbf{None} (default) uses the video driver set in the \texttt{Playback A/B} tab. + \item[Use HW Device] \index{HW Device} \CGG{} can use hardware timeline acceleration (decoding) via GPUs thanks to the \textit{OpenGL} video driver. By default the \textit{X11} video driver is used, which works only with the CPU. See \nameref{sub:video_out_section}. For h264, h265 (HEVC) and VP9 codecs you can use libraries specific to AMD, Nvidia and Intel graphics cards. For AMD and Intel set \textbf{vaapi}; for Nvidia set \textbf{vdpau} (also works with AMD thanks to a wrapper). \textbf{Cuda} does not accelerate decoding with Nvidia, but it does allow some plugins to run that are not available otherwise. \textbf{Vulkan} (experimental) provides hardware-agnostic acceleration for supported codecs. \textbf{None} (default) uses the video driver set in the \texttt{Playback A/B} tab. 
 \end{description} \subsection{Background Rendering section}% diff --git a/parts/Tips.tex b/parts/Tips.tex index c326457..dde7f8a 100644 --- a/parts/Tips.tex +++ b/parts/Tips.tex @@ -24,13 +24,21 @@ Besides the above hardware recommendations, this section covers tips for perform \label{sec:hardware_video_acceleration} \index{hardware!acceleration} -With certain newer, more powerful graphics boards and newer device drivers, there is the potential for enhanced \textit{decode} and \textit{encode} performance. Decode refers to loading and playing video in \CGG{}. The GPU, Graphics Processing Unit, on the graphics board is accessed via one of the following libraries: vdpau or vaapi. The hardware acceleration done by the graphics card increases performance by activating certain functions in connection with a few of the FFmpeg decoders. This use makes it possible for the graphics card to decode video, thus offloading the CPU. Decode operations are described here next. +With certain newer, more powerful graphics boards and newer device drivers, there is the potential for enhanced \textit{decode} and \textit{encode} performance. Decode refers to loading and playing video in \CGG{}. The GPU, Graphics Processing Unit, on the graphics board is accessed via one of the following libraries: vdpau or vaapi. The hardware acceleration done by the graphics card increases performance by activating certain functions in connection with a few of the FFmpeg decoders. This use makes it possible for the graphics card to decode video, thus offloading the CPU. + +NOTE: Hardware acceleration in Linux is constantly evolving and causes numerous problems (but if it does not work, \CGG{} automatically falls back to software mode on the CPU). It depends on the generation and brand of the video card, on whether proprietary or open source drivers are used, and on the Mesa user space drivers. 
Drivers change constantly, especially with the succession of new graphics card models, and what worked before may no longer work afterwards, or vice versa. Furthermore, being able to take advantage of hardware acceleration could mean having to compile \CGG{} yourself with specific flags enabled (for example, hardware-accelerated decoding and encoding of the \textit{ffv1} codec requires compiling with the \texttt{libplacebo} library). Another case is having to start the program with environment variables set (for example, for older AMD cards, before RDNA, use \texttt{RADV\_PERFTEST="video\_decode,video\_encode"} to get hardware acceleration via Vulkan, or use \texttt{ANV\_DEBUG="video-decode,\\video-encode"} for acceleration on Intel cards). It should also be noted that sometimes software decoding can be more efficient than hardware acceleration. You need to experiment to see what works best for your use case. + +Decode operations are described here next. The setting is located in: \texttt{Settings $\rightarrow$ Preferences... $\rightarrow$ Tab Performance $\rightarrow$ Use HW Device}. + Encode refers to rendering video and is described at the end of this section under \hyperref[sub:gpu_hardware_encoding]{GPU hardware encoding}. VDPAU, Video Decode and Presentation API for Unix, is an open source library to offload portions of the video decoding process and video post-processing to the GPU of graphics boards, such as Nvidia. It may also apply to Nouveau and Amdgpu boards (by wrapper), but that has not been verified. -VA-API, Video Acceleration API, is an open source library which provides both hardware accelerated video encoding and decoding for use mostly with Intel (and AMD) graphics boards. +VA-API, Video Acceleration API, is an open source library which provides both hardware accelerated video encoding and decoding for use mostly with Intel (and AMD) graphics boards. 
+ +Vulkan (experimental), like OpenGL and unlike VDPAU and VA-API, is hardware agnostic and could work with any type and brand of video card. Vulkan provides direct GPU access, with better performance, more efficient multithreading, and reduced driver overhead compared to the older OpenGL standard. It emulates OpenGL functionality. +Vulkan requires initialization of its device type. This is done automatically by \CGG{} for decoding but still needs to be made explicit in the encoding presets (\texttt{CIN\_HW\_DEV=vulkan}). AppImage will probably not allow for either VDPAU or VA-API hardware acceleration because the computer where AppImage is created will not have the same graphics card and/or vaapi/vdpau @@ -818,4 +826,3 @@ distortions. In the case of the pillarbox, we will leave H unchanged while calculating the new value of W. The formula $\frac{x}{y} = \frac{W}{H}$ is valid for any aspect ratio ($4:3; 16:9; 2.35:1$; etc). - -- 2.34.1