I have a high-end computer with a 12900K and an RTX 4090. I was using ncnn/Vulkan through VapourSynth with no issues. However, I updated SVP today and tried the new NVIDIA TensorRT option, and when I activate it a command-line window appears on screen and the video freezes to a black screen. I waited 10 minutes but nothing happened. I then closed the command-line window three times (it kept reappearing), after which MPC-HC showed an error message.
I checked the log referenced in the error message, and I am posting its contents here:
&&&& RUNNING TensorRT.trtexec [TensorRT v8501] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx --timingCacheFile=C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x1088_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine.cache --device=0 --saveEngine=C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x1088_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine --shapes=input:1x11x1088x1920 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT --useCudaGraph --noDataTransfers --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw
[01/19/2023-02:27:19] [i] === Model Options ===
[01/19/2023-02:27:19] [i] Format: ONNX
[01/19/2023-02:27:19] [i] Model: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx
[01/19/2023-02:27:19] [i] Output:
[01/19/2023-02:27:19] [i] === Build Options ===
[01/19/2023-02:27:19] [i] Max batch: explicit batch
[01/19/2023-02:27:19] [i] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/19/2023-02:27:19] [i] minTiming: 1
[01/19/2023-02:27:19] [i] avgTiming: 8
[01/19/2023-02:27:19] [i] Precision: FP32+FP16
[01/19/2023-02:27:19] [i] LayerPrecisions:
[01/19/2023-02:27:19] [i] Calibration:
[01/19/2023-02:27:19] [i] Refit: Disabled
[01/19/2023-02:27:19] [i] Sparsity: Disabled
[01/19/2023-02:27:19] [i] Safe mode: Disabled
[01/19/2023-02:27:19] [i] DirectIO mode: Disabled
[01/19/2023-02:27:19] [i] Restricted mode: Disabled
[01/19/2023-02:27:19] [i] Build only: Disabled
[01/19/2023-02:27:19] [i] Save engine: C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x1088_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine
[01/19/2023-02:27:19] [i] Load engine:
[01/19/2023-02:27:19] [i] Profiling verbosity: 0
[01/19/2023-02:27:19] [i] Tactic sources: cublas [OFF], cublasLt [OFF],
[01/19/2023-02:27:19] [i] timingCacheMode: global
[01/19/2023-02:27:19] [i] timingCacheFile: C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x1088_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine.cache
[01/19/2023-02:27:19] [i] Heuristic: Disabled
[01/19/2023-02:27:19] [i] Preview Features: Use default preview flags.
[01/19/2023-02:27:19] [i] Input(s): fp16:chw
[01/19/2023-02:27:19] [i] Output(s): fp16:chw
[01/19/2023-02:27:19] [i] Input build shape: input=1x11x1088x1920+1x11x1088x1920+1x11x1088x1920
[01/19/2023-02:27:19] [i] Input calibration shapes: model
[01/19/2023-02:27:19] [i] === System Options ===
[01/19/2023-02:27:19] [i] Device: 0
[01/19/2023-02:27:19] [i] DLACore:
[01/19/2023-02:27:19] [i] Plugins:
[01/19/2023-02:27:19] [i] === Inference Options ===
[01/19/2023-02:27:19] [i] Batch: Explicit
[01/19/2023-02:27:19] [i] Input inference shape: input=1x11x1088x1920
[01/19/2023-02:27:19] [i] Iterations: 10
[01/19/2023-02:27:19] [i] Duration: 3s (+ 200ms warm up)
[01/19/2023-02:27:19] [i] Sleep time: 0ms
[01/19/2023-02:27:19] [i] Idle time: 0ms
[01/19/2023-02:27:19] [i] Streams: 1
[01/19/2023-02:27:19] [i] ExposeDMA: Disabled
[01/19/2023-02:27:19] [i] Data transfers: Disabled
[01/19/2023-02:27:19] [i] Spin-wait: Disabled
[01/19/2023-02:27:19] [i] Multithreading: Disabled
[01/19/2023-02:27:19] [i] CUDA Graph: Enabled
[01/19/2023-02:27:19] [i] Separate profiling: Disabled
[01/19/2023-02:27:19] [i] Time Deserialize: Disabled
[01/19/2023-02:27:19] [i] Time Refit: Disabled
[01/19/2023-02:27:19] [i] NVTX verbosity: 0
[01/19/2023-02:27:19] [i] Persistent Cache Ratio: 0
[01/19/2023-02:27:19] [i] Inputs:
[01/19/2023-02:27:19] [i] === Reporting Options ===
[01/19/2023-02:27:19] [i] Verbose: Disabled
[01/19/2023-02:27:19] [i] Averages: 10 inferences
[01/19/2023-02:27:19] [i] Percentiles: 90,95,99
[01/19/2023-02:27:19] [i] Dump refittable layers:Disabled
[01/19/2023-02:27:19] [i] Dump output: Disabled
[01/19/2023-02:27:19] [i] Profile: Disabled
[01/19/2023-02:27:19] [i] Export timing to JSON file:
[01/19/2023-02:27:19] [i] Export output to JSON file:
[01/19/2023-02:27:19] [i] Export profile to JSON file:
[01/19/2023-02:27:19] [i]
[01/19/2023-02:27:19] [i] === Device Information ===
[01/19/2023-02:27:19] [i] Selected Device: NVIDIA GeForce RTX 4090
[01/19/2023-02:27:19] [i] Compute Capability: 8.9
[01/19/2023-02:27:19] [i] SMs: 128
[01/19/2023-02:27:19] [i] Compute Clock Rate: 2.58 GHz
[01/19/2023-02:27:19] [i] Device Global Memory: 24563 MiB
[01/19/2023-02:27:19] [i] Shared Memory per SM: 100 KiB
[01/19/2023-02:27:19] [i] Memory Bus Width: 384 bits (ECC disabled)
[01/19/2023-02:27:19] [i] Memory Clock Rate: 10.501 GHz
[01/19/2023-02:27:19] [i]
[01/19/2023-02:27:19] [i] TensorRT version: 8.5.1
[01/19/2023-02:27:20] [i] [TRT] [MemUsageChange] Init CUDA: CPU +436, GPU +0, now: CPU 13780, GPU 1771 (MiB)
My VapourSynth setup is VapourSynth Filter v1.4.5 # svp, with VapourSynth R61 (API R4.0).
If anybody could point me in the right direction, I would really appreciate it.
I am also adding here what was written in the command-line window:
&&&& RUNNING TensorRT.trtexec [TensorRT v8501] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx --timingCacheFile=C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x832_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine.cache --device=0 --saveEngine=C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x832_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine --shapes=input:1x11x832x1920 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT --useCudaGraph --noDataTransfers --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw
[01/19/2023-02:47:37] [i] === Model Options ===
[01/19/2023-02:47:37] [i] Format: ONNX
[01/19/2023-02:47:37] [i] Model: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx
[01/19/2023-02:47:37] [i] Output:
[01/19/2023-02:47:37] [i] === Build Options ===
[01/19/2023-02:47:37] [i] Max batch: explicit batch
[01/19/2023-02:47:37] [i] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/19/2023-02:47:37] [i] minTiming: 1
[01/19/2023-02:47:37] [i] avgTiming: 8
[01/19/2023-02:47:37] [i] Precision: FP32+FP16
[01/19/2023-02:47:37] [i] LayerPrecisions:
[01/19/2023-02:47:37] [i] Calibration:
[01/19/2023-02:47:37] [i] Refit: Disabled
[01/19/2023-02:47:37] [i] Sparsity: Disabled
[01/19/2023-02:47:37] [i] Safe mode: Disabled
[01/19/2023-02:47:37] [i] DirectIO mode: Disabled
[01/19/2023-02:47:37] [i] Restricted mode: Disabled
[01/19/2023-02:47:37] [i] Build only: Disabled
[01/19/2023-02:47:37] [i] Save engine: C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x832_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine
[01/19/2023-02:47:37] [i] Load engine:
[01/19/2023-02:47:37] [i] Profiling verbosity: 0
[01/19/2023-02:47:37] [i] Tactic sources: cublas [OFF], cublasLt [OFF],
[01/19/2023-02:47:37] [i] timingCacheMode: global
[01/19/2023-02:47:37] [i] timingCacheFile: C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x832_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine.cache
[01/19/2023-02:47:37] [i] Heuristic: Disabled
[01/19/2023-02:47:37] [i] Preview Features: Use default preview flags.
[01/19/2023-02:47:37] [i] Input(s): fp16:chw
[01/19/2023-02:47:37] [i] Output(s): fp16:chw
[01/19/2023-02:47:37] [i] Input build shape: input=1x11x832x1920+1x11x832x1920+1x11x832x1920
[01/19/2023-02:47:37] [i] Input calibration shapes: model
[01/19/2023-02:47:37] [i] === System Options ===
[01/19/2023-02:47:37] [i] Device: 0
[01/19/2023-02:47:37] [i] DLACore:
[01/19/2023-02:47:37] [i] Plugins:
[01/19/2023-02:47:37] [i] === Inference Options ===
[01/19/2023-02:47:37] [i] Batch: Explicit
[01/19/2023-02:47:37] [i] Input inference shape: input=1x11x832x1920
[01/19/2023-02:47:37] [i] Iterations: 10
[01/19/2023-02:47:37] [i] Duration: 3s (+ 200ms warm up)
[01/19/2023-02:47:37] [i] Sleep time: 0ms
[01/19/2023-02:47:37] [i] Idle time: 0ms
[01/19/2023-02:47:37] [i] Streams: 1
[01/19/2023-02:47:37] [i] ExposeDMA: Disabled
[01/19/2023-02:47:37] [i] Data transfers: Disabled
[01/19/2023-02:47:37] [i] Spin-wait: Disabled
[01/19/2023-02:47:37] [i] Multithreading: Disabled
[01/19/2023-02:47:37] [i] CUDA Graph: Enabled
[01/19/2023-02:47:37] [i] Separate profiling: Disabled
[01/19/2023-02:47:37] [i] Time Deserialize: Disabled
[01/19/2023-02:47:37] [i] Time Refit: Disabled
[01/19/2023-02:47:37] [i] NVTX verbosity: 0
[01/19/2023-02:47:37] [i] Persistent Cache Ratio: 0
[01/19/2023-02:47:37] [i] Inputs:
[01/19/2023-02:47:37] [i] === Reporting Options ===
[01/19/2023-02:47:37] [i] Verbose: Disabled
[01/19/2023-02:47:37] [i] Averages: 10 inferences
[01/19/2023-02:47:37] [i] Percentiles: 90,95,99
[01/19/2023-02:47:37] [i] Dump refittable layers:Disabled
[01/19/2023-02:47:37] [i] Dump output: Disabled
[01/19/2023-02:47:37] [i] Profile: Disabled
[01/19/2023-02:47:37] [i] Export timing to JSON file:
[01/19/2023-02:47:37] [i] Export output to JSON file:
[01/19/2023-02:47:37] [i] Export profile to JSON file:
[01/19/2023-02:47:37] [i]
[01/19/2023-02:47:37] [i] === Device Information ===
[01/19/2023-02:47:37] [i] Selected Device: NVIDIA GeForce RTX 4090
[01/19/2023-02:47:37] [i] Compute Capability: 8.9
[01/19/2023-02:47:37] [i] SMs: 128
[01/19/2023-02:47:37] [i] Compute Clock Rate: 2.58 GHz
[01/19/2023-02:47:37] [i] Device Global Memory: 24563 MiB
[01/19/2023-02:47:37] [i] Shared Memory per SM: 100 KiB
[01/19/2023-02:47:37] [i] Memory Bus Width: 384 bits (ECC disabled)
[01/19/2023-02:47:37] [i] Memory Clock Rate: 10.501 GHz
[01/19/2023-02:47:37] [i]
[01/19/2023-02:47:37] [i] TensorRT version: 8.5.1
[01/19/2023-02:47:38] [i] [TRT] [MemUsageChange] Init CUDA: CPU +448, GPU +0, now: CPU 13279, GPU 1771 (MiB)
[01/19/2023-02:47:39] [i] [TRT] [MemUsageChange] Init builder kernel library: CPU +430, GPU +116, now: CPU 14166, GPU 1887 (MiB)
[01/19/2023-02:47:39] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[01/19/2023-02:47:39] [i] Start parsing network model
[01/19/2023-02:47:39] [i] [TRT] ----------------------------------------------------------------
[01/19/2023-02:47:39] [i] [TRT] Input filename: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx
[01/19/2023-02:47:39] [i] [TRT] ONNX IR version: 0.0.8
[01/19/2023-02:47:39] [i] [TRT] Opset version: 16
[01/19/2023-02:47:39] [i] [TRT] Producer name: pytorch
[01/19/2023-02:47:39] [i] [TRT] Producer version: 1.12.0
[01/19/2023-02:47:39] [i] [TRT] Domain:
[01/19/2023-02:47:39] [i] [TRT] Model version: 0
[01/19/2023-02:47:39] [i] [TRT] Doc string:
[01/19/2023-02:47:39] [i] [TRT] ----------------------------------------------------------------
[01/19/2023-02:47:39] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/19/2023-02:47:39] [i] Finish parsing network model
[01/19/2023-02:47:39] [W] Could not read timing cache from: C:\Users\onurco\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.6.onnx.1920x832_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4090_3dcbe72f.engine.cache. A new timing cache will be generated and written.
[01/19/2023-02:47:40] [i] [TRT] [MemUsageChange] Init cuDNN: CPU +1083, GPU +406, now: CPU 14928, GPU 2293 (MiB)
[01/19/2023-02:47:40] [i] [TRT] Global timing cache in use. Profiling results in this builder pass will be stored.
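In case anyone wants to try reproducing this outside the player: as far as I can tell, the step that hangs is the trtexec engine build shown at the top of the logs. I assume the same invocation can be run by hand from a command prompt to see whether the build ever finishes, something like the line below (flags copied from the log above; I have only added quotes around the paths and pointed --saveEngine at a placeholder file name for the test, since SVP generates its own cache file names as shown in the log):
"C:\Program Files (x86)\SVP 4\rife\vsmlrt-cuda\trtexec" --onnx="C:\Program Files (x86)\SVP 4\rife\models\rife\rife_v4.6.onnx" --shapes=input:1x11x832x1920 --fp16 --device=0 --tacticSources=-CUBLAS,-CUBLAS_LT --useCudaGraph --noDataTransfers --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --saveEngine="C:\Users\onurco\AppData\Roaming\SVP4\cache\rife_v4.6_test.engine"
If it hangs there too, that would at least rule out MPC-HC and the VapourSynth filter as the cause.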