1 (edited by Cryptokid 17-10-2024 19:11:11)

Topic: RIFE Transcoding Fails After SVP Update

RIFE has been broken for me for the last two or more SVP updates now; I primarily use it for transcoding. After the update last month I could no longer transcode videos: they would instantly fail with an error. I'm not sure whether an earlier update I missed broke it first, but that's when I first noticed the issue. It happens with all resolutions and formats, and I haven't changed anything in my RIFE profile settings (performance boost on, model 4.9, etc.).

I'm using a GTX 1080 Ti, Windows 10, and Python 3.8.10.
I've checked, and I have the latest NVIDIA drivers available for my card.

13:44:03.687: ===== Starting mpv ======
13:44:03.687: Command line: C:\Program Files (x86)\SVP 4\mpv64\mpv.exe T:/Deep3D/upscales/test_ghq5.mp4 --o=T:/Deep3D/upscales/test_ghq5.SVP.temporary.mkv --no-audio --no-sub --no-sub-auto --input-ipc-server=mpvencodepipe --input-media-keys=no --no-msg-color --video-crop=0x0+0+0 --vf=vapoursynth:[C:\Users\Crypto\AppData\Roaming\SVP4\scripts\ffff.py]:4:8 --of=matroska --ovc=libx264 --ovcopts=preset=slower,crf=16,x264opts=opencl,threads=8
13:44:04.036: ● Video --vid=1 (h264 2560x1440 23.976 fps) [default]
13:44:04.036: ○ Audio --aid=1 (aac 2ch 48000 Hz) [default]
13:44:04.442: vstrt: failed to preload C:\Program Files (x86)\SVP 4\rife\vsmlrt-cuda\nvinfer_10.dll, errno 126
13:44:04.442: vstrt: failed to preload C:\Program Files (x86)\SVP 4\rife\vsmlrt-cuda\nvinfer_plugin_10.dll, errno 126
13:44:04.442: vstrt: failed to preload C:\Program Files (x86)\SVP 4\rife\vsmlrt-cuda\nvinfer_10.dll, errno 126
13:44:04.446: vstrt: TensorRT failed to load.
13:44:04.560: [vapoursynth] Script evaluation failed:
13:44:04.560: [vapoursynth] Python exception: vsmlrt: cannot load any filters
13:44:04.560: [vapoursynth]
13:44:04.560: [vapoursynth] Traceback (most recent call last):
13:44:04.560: [vapoursynth] File "src\\cython\\vapoursynth.pyx", line 3121, in vapoursynth._vpy_evaluate
13:44:04.560: [vapoursynth] File "src\\cython\\vapoursynth.pyx", line 3122, in vapoursynth._vpy_evaluate
13:44:04.560: [vapoursynth] File "C:\Users\Crypto\AppData\Roaming\SVP4\scripts\ffff.py", line 16, in <module>
13:44:04.560: [vapoursynth] from vsmlrt import Backend
13:44:04.560: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 76, in <module>
13:44:04.560: [vapoursynth] plugins_path: str = get_plugins_path()
13:44:04.561: [vapoursynth] ^^^^^^^^^^^^^^^^^^
13:44:04.561: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 71, in get_plugins_path
13:44:04.561: [vapoursynth] raise RuntimeError("vsmlrt: cannot load any filters")
13:44:04.561: [vapoursynth] RuntimeError: vsmlrt: cannot load any filters
13:44:04.561: [vapoursynth]
13:44:04.578: (!!!) Intermediate file may be broken: T:\Deep3D\upscales\test_ghq5.SVP.temporary.mkv
13:44:04.578: ===== mpv exited with code 62097 =====

I've tried re-installing SVP from scratch and re-updating the components; nothing fixes it except manually replacing vsmlrt with an older version.

I downloaded vsmlrt-windows-x64-cuda.v13.1.zip from GitHub and replaced the vsmlrt-cuda folder and vstrt.dll, which got things working again. Obviously, though, I don't want to keep swapping these back after every update.

I haven't tested every vsmlrt version between 13.1 and the current one, but several of the ones I tried had problems.
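In case it helps anyone stuck on the same workaround, here's a rough Python sketch of the file swap I've been doing by hand after each update. The paths and backup layout are just placeholders for my setup; adjust them to wherever you keep an unpacked copy of the older vsmlrt release.

```python
import shutil
from pathlib import Path

# Placeholder paths -- adjust to your own install and backup locations.
RIFE_DIR = Path(r"C:\Program Files (x86)\SVP 4\rife")
BACKUP_DIR = Path(r"T:\svp_backup\vsmlrt_v13.1")  # unpacked older release

def restore_old_vsmlrt(rife_dir: Path, backup_dir: Path) -> None:
    """Replace the updated vsmlrt-cuda folder and vstrt.dll with
    backed-up copies from an older vsmlrt release."""
    cuda_dst = rife_dir / "vsmlrt-cuda"
    if cuda_dst.exists():
        shutil.rmtree(cuda_dst)  # drop the folder the update installed
    shutil.copytree(backup_dir / "vsmlrt-cuda", cuda_dst)
    shutil.copy2(backup_dir / "vstrt.dll", rife_dir / "vstrt.dll")
```

Run it once after each SVP component update (with SVP closed, and as admin if your install is under Program Files).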

Re: RIFE Transcoding Fails After SVP Update

outdated NV drivers, probably?

Re: RIFE Transcoding Fails After SVP Update

My drivers are up to date. After making this post I tried to go back to my previous 13.1 fix, but now the transcode takes far longer than it did before, and I have no idea why. I've tried multiple combinations of files and can't get it to run properly again. Re-installing the components via SVP Manager had no effect, and restarting my PC didn't help either. I also made sure to clear the RIFE cache between each attempt.
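For reference, by "clearing the RIFE cache" I mean deleting the built TensorRT engine files so they get rebuilt from scratch. A minimal sketch of what I run, with the cache location assumed from the paths in my logs:

```python
from pathlib import Path

# Assumed location, taken from the engine paths in the logs above.
CACHE_DIR = Path.home() / "AppData" / "Roaming" / "SVP4" / "cache"

def clear_engine_cache(cache_dir: Path) -> int:
    """Delete cached TensorRT .engine files and their timing caches,
    forcing trtexec to rebuild them on the next run. Returns the
    number of files removed."""
    removed = 0
    for pattern in ("*.engine", "*.engine.cache"):
        for f in cache_dir.rglob(pattern):
            f.unlink()
            removed += 1
    return removed
```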

I then went back to the latest version installed by SVP, and also tried the latest version available on GitHub; both produced this log:

16:48:40.616: ===== Starting mpv ======
16:48:40.616: Command line: C:\Program Files (x86)\SVP 4\mpv64\mpv.exe T:/Deep3D/upscales/test.mp4 --o=T:/Deep3D/upscales/test.SVP.temporary.mkv --no-audio --no-sub --no-sub-auto --input-ipc-server=mpvencodepipe --input-media-keys=no --no-msg-color --video-crop=0x0+0+0 --vf=vapoursynth:[C:\Users\Crypto\AppData\Roaming\SVP4\scripts\ffff.py]:4:8 --of=matroska --ovc=libx264 --ovcopts=preset=slower,crf=16,x264opts=opencl,threads=8
16:48:40.687: ● Video --vid=1 (h264 2560x1440 23.976 fps) [default]
16:48:40.687: ○ Audio --aid=1 (aac 2ch 48000 Hz) [default]
16:48:41.016: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine is not writable
16:48:41.016: change engine path to C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine
16:48:41.074: &&&& RUNNING TensorRT.trtexec [TensorRT v100500] [b18] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx --timingCacheFile=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache --device=0 --saveEngine=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine --shapes=input:1x11x1440x2560 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT,-CUDNN,+EDGE_MASK_CONVOLUTIONS,+JIT_CONVOLUTIONS --useCudaGraph --noDataTransfers --noTF32 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --layerPrecisions=*:fp16 --layerOutputTypes=*:fp16 --precisionConstraints=obey --builderOptimizationLevel=3
16:48:41.076: [10/17/2024-16:48:41] [i] === Model Options ===
16:48:41.076: [10/17/2024-16:48:41] [i] Format: ONNX
16:48:41.076: [10/17/2024-16:48:41] [i] Model: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx
16:48:41.076: [10/17/2024-16:48:41] [i] Output:
16:48:41.076: [10/17/2024-16:48:41] [i] === Build Options ===
16:48:41.076: [10/17/2024-16:48:41] [i] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
16:48:41.076: [10/17/2024-16:48:41] [i] avgTiming: 8
16:48:41.076: [10/17/2024-16:48:41] [i] Precision: FP32+FP16 (obey precision constraints)
16:48:41.076: [10/17/2024-16:48:41] [i] LayerPrecisions: *:fp16
16:48:41.076: [10/17/2024-16:48:41] [i] Layer Device Types:
16:48:41.076: [10/17/2024-16:48:41] [i] Calibration:
16:48:41.076: [10/17/2024-16:48:41] [i] Refit: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Strip weights: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Version Compatible: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] ONNX Plugin InstanceNorm: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] TensorRT runtime: full
16:48:41.076: [10/17/2024-16:48:41] [i] Lean DLL Path:
16:48:41.076: [10/17/2024-16:48:41] [i] Tempfile Controls: { in_memory: allow, temporary: allow }
16:48:41.076: [10/17/2024-16:48:41] [i] Exclude Lean Runtime: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Sparsity: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Safe mode: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Build DLA standalone loadable: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Allow GPU fallback for DLA: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] DirectIO mode: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Restricted mode: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Skip inference: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Save engine: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine
16:48:41.076: [10/17/2024-16:48:41] [i] Load engine:
16:48:41.076: [10/17/2024-16:48:41] [i] Profiling verbosity: 0
16:48:41.076: [10/17/2024-16:48:41] [i] Tactic sources: cublas [OFF], cublasLt [OFF], cudnn [OFF], edge mask convolutions [ON], JIT convolutions [ON],
16:48:41.076: [10/17/2024-16:48:41] [i] timingCacheMode: global
16:48:41.076: [10/17/2024-16:48:41] [i] timingCacheFile: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache
16:48:41.076: [10/17/2024-16:48:41] [i] Enable Compilation Cache: Enabled
16:48:41.076: [10/17/2024-16:48:41] [i] errorOnTimingCacheMiss: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Preview Features: Use default preview flags.
16:48:41.076: [10/17/2024-16:48:41] [i] MaxAuxStreams: -1
16:48:41.076: [10/17/2024-16:48:41] [i] BuilderOptimizationLevel: 3
16:48:41.076: [10/17/2024-16:48:41] [i] MaxTactics: -1
16:48:41.076: [10/17/2024-16:48:41] [i] Calibration Profile Index: 0
16:48:41.076: [10/17/2024-16:48:41] [i] Weight Streaming: Disabled
16:48:41.076: [10/17/2024-16:48:41] [i] Runtime Platform: Same As Build
16:48:41.076: [10/17/2024-16:48:41] [i] Debug Tensors:
16:48:41.076: [10/17/2024-16:48:41] [i] Input(s): fp16:chw
16:48:41.076: [10/17/2024-16:48:41] [i] Output(s): fp16:chw
16:48:41.076: [10/17/2024-16:48:41] [i] Input build shape (profile 0): input=1x11x1440x2560+1x11x1440x2560+1x11x1440x2560
16:48:41.076: [10/17/2024-16:48:41] [i] Input calibration shapes: model
16:48:41.076: [10/17/2024-16:48:41] [i] === System Options ===
16:48:41.076: [10/17/2024-16:48:41] [i] Device: 0
16:48:41.076: [10/17/2024-16:48:41] [i] DLACore:
16:48:41.076: [10/17/2024-16:48:41] [i] Plugins:
16:48:41.076: [10/17/2024-16:48:41] [i] setPluginsToSerialize:
16:48:41.076: [10/17/2024-16:48:41] [i] dynamicPlugins:
16:48:41.076: [10/17/2024-16:48:41] [i] ignoreParsedPluginLibs: 0
16:48:41.076: [10/17/2024-16:48:41] [i]
16:48:41.076: [10/17/2024-16:48:41] [i] === Inference Options ===
16:48:41.076: [10/17/2024-16:48:41] [i] Batch: Explicit
16:48:41.076: [10/17/2024-16:48:41] [i] Input inference shape : input=1x11x1440x2560
16:48:41.076: [10/17/2024-16:48:41] [i] Iterations: 10
16:48:41.076: [10/17/2024-16:48:41] [i] Duration: 3s (+ 200ms warm up)
16:48:41.076: [10/17/2024-16:48:41] [i] Sleep time: 0ms
16:48:41.076: [10/17/2024-16:48:41] [i] Idle time: 0ms
16:48:41.076: [10/17/2024-16:48:41] [i] Inference Streams: 1
16:48:41.076: [10/17/2024-16:48:41] [i] ExposeDMA: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Data transfers: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Spin-wait: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Multithreading: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] CUDA Graph: Enabled
16:48:41.077: [10/17/2024-16:48:41] [i] Separate profiling: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Time Deserialize: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Time Refit: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] NVTX verbosity: 0
16:48:41.077: [10/17/2024-16:48:41] [i] Persistent Cache Ratio: 0
16:48:41.077: [10/17/2024-16:48:41] [i] Optimization Profile Index: 0
16:48:41.077: [10/17/2024-16:48:41] [i] Weight Streaming Budget: 100.000000%
16:48:41.077: [10/17/2024-16:48:41] [i] Inputs:
16:48:41.077: [10/17/2024-16:48:41] [i] Debug Tensor Save Destinations:
16:48:41.077: [10/17/2024-16:48:41] [i] === Reporting Options ===
16:48:41.077: [10/17/2024-16:48:41] [i] Verbose: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Averages: 10 inferences
16:48:41.077: [10/17/2024-16:48:41] [i] Percentiles: 90,95,99
16:48:41.077: [10/17/2024-16:48:41] [i] Dump refittable layers:Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Dump output: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Profile: Disabled
16:48:41.077: [10/17/2024-16:48:41] [i] Export timing to JSON file:
16:48:41.077: [10/17/2024-16:48:41] [i] Export output to JSON file:
16:48:41.077: [10/17/2024-16:48:41] [i] Export profile to JSON file:
16:48:41.077: [10/17/2024-16:48:41] [i]
16:48:41.077: [10/17/2024-16:48:41] [i] === Device Information ===
16:48:41.097: [10/17/2024-16:48:41] [i] Available Devices:
16:48:41.097: [10/17/2024-16:48:41] [i] Device 0: "NVIDIA GeForce GTX 1080 Ti" UUID: GPU-7d3bcf50-539f-7218-8f2d-187b2ec4c543
16:48:41.167: [10/17/2024-16:48:41] [i] Selected Device: NVIDIA GeForce GTX 1080 Ti
16:48:41.167: [10/17/2024-16:48:41] [i] Selected Device ID: 0
16:48:41.167: [10/17/2024-16:48:41] [i] Selected Device UUID: GPU-7d3bcf50-539f-7218-8f2d-187b2ec4c543
16:48:41.167: [10/17/2024-16:48:41] [i] Compute Capability: 6.1
16:48:41.167: [10/17/2024-16:48:41] [i] SMs: 28
16:48:41.167: [10/17/2024-16:48:41] [i] Device Global Memory: 11263 MiB
16:48:41.167: [10/17/2024-16:48:41] [i] Shared Memory per SM: 96 KiB
16:48:41.167: [10/17/2024-16:48:41] [i] Memory Bus Width: 352 bits (ECC disabled)
16:48:41.167: [10/17/2024-16:48:41] [i] Application Compute Clock Rate: 1.582 GHz
16:48:41.167: [10/17/2024-16:48:41] [i] Application Memory Clock Rate: 5.505 GHz
16:48:41.167: [10/17/2024-16:48:41] [i]
16:48:41.167: [10/17/2024-16:48:41] [i] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
16:48:41.167: [10/17/2024-16:48:41] [i]
16:48:41.167: [10/17/2024-16:48:41] [i] TensorRT version: 10.5.0
16:48:41.167: [10/17/2024-16:48:41] [i] Loading standard plugins
16:48:41.188: [10/17/2024-16:48:41] [i] [TRT] [MemUsageChange] Init CUDA: CPU +1, GPU +0, now: CPU 11677, GPU 1060 (MiB)
16:48:41.674: [10/17/2024-16:48:41] [i] [TRT] [MemUsageChange] Init builder kernel library: CPU +17, GPU +0, now: CPU 12022, GPU 1060 (MiB)
16:48:41.676: [10/17/2024-16:48:41] [i] Start parsing network model.
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] ----------------------------------------------------------------
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Input filename: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] ONNX IR version: 0.0.8
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Opset version: 16
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Producer name: pytorch
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Producer version: 2.2.0
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Domain:
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Model version: 0
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] Doc string:
16:48:41.712: [10/17/2024-16:48:41] [i] [TRT] ----------------------------------------------------------------
16:48:41.728: [10/17/2024-16:48:41] [i] Finished parsing network model. Parse time: 0.0520507
16:48:41.728: [10/17/2024-16:48:41] [i] Set shape of input tensor input for optimization profile 0 to: MIN=1x11x1440x2560 OPT=1x11x1440x2560 MAX=1x11x1440x2560
16:48:41.730: [10/17/2024-16:48:41] [i] Set layer /Split to precision fp16
... TRIMMED TO SAVE SPACE ...
16:48:41.736: [10/17/2024-16:48:41] [i] Set layer /Add_7 to precision fp16
16:48:41.736: [10/17/2024-16:48:41] [i] Skipped setting precisions for some layers. Check verbose logs for more details.
16:48:41.736: [10/17/2024-16:48:41] [i] Set output 0 of layer /Split to type fp16
... TRIMMED THIS TO SAVE SPACE ...
16:48:41.745: [10/17/2024-16:48:41] [i] Skipped setting output types for some layers. Check verbose logs for more details.
16:48:41.748: [TRT] Could not read timing cache from: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache. A new timing cache will be generated and written.
16:48:41.751: [10/17/2024-16:48:41] [E] Error[9]: IBuilder::buildSerializedNetwork: Error Code 9: API Usage Error (Target GPU SM 61 is not supported by this TensorRT release.)
16:48:41.751: [10/17/2024-16:48:41] [E] Engine could not be created from network
16:48:41.751: [10/17/2024-16:48:41] [E] Building engine failed
16:48:41.752: [10/17/2024-16:48:41] [E] Failed to create engine from model or file.
16:48:41.752: [10/17/2024-16:48:41] [E] Engine set up failed
16:48:41.752: &&&& FAILED TensorRT.trtexec [TensorRT v100500] [b18] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx --timingCacheFile=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache --device=0 --saveEngine=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine --shapes=input:1x11x1440x2560 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT,-CUDNN,+EDGE_MASK_CONVOLUTIONS,+JIT_CONVOLUTIONS --useCudaGraph --noDataTransfers --noTF32 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --layerPrecisions=*:fp16 --layerOutputTypes=*:fp16 --precisionConstraints=obey --builderOptimizationLevel=3
16:48:41.937: [vapoursynth] Script evaluation failed:
16:48:41.937: [vapoursynth] Python exception: trtexec execution fails, log has been written to C:\Users\Crypto\AppData\Local\Temp\trtexec_241017_164841.log
16:48:41.937: [vapoursynth]
16:48:41.937: [vapoursynth] Traceback (most recent call last):
16:48:41.937: [vapoursynth] File "src\\cython\\vapoursynth.pyx", line 3121, in vapoursynth._vpy_evaluate
16:48:41.937: [vapoursynth] File "src\\cython\\vapoursynth.pyx", line 3122, in vapoursynth._vpy_evaluate
16:48:41.937: [vapoursynth] File "C:\Users\Crypto\AppData\Roaming\SVP4\scripts\ffff.py", line 81, in <module>
16:48:41.937: [vapoursynth] smooth = interpolate(clip)
16:48:41.938: [vapoursynth] ^^^^^^^^^^^^^^^^^
16:48:41.938: [vapoursynth] File "C:\Users\Crypto\AppData\Roaming\SVP4\scripts\ffff.py", line 60, in interpolate
16:48:41.938: [vapoursynth] smooth = RIFE_imp(input_rife,multi=rife_num,model=rife_mnum,backend=trt_backend)
16:48:41.938: [vapoursynth] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16:48:41.938: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\helpers.py", line 23, in RIFE_imp
16:48:41.938: [vapoursynth] return RIFE(clip,multi,1.0,None,None,None,model_num,backend,ensemble,False,implementation)
16:48:41.938: [vapoursynth] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16:48:41.938: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 1184, in RIFE
16:48:41.938: [vapoursynth] output0 = RIFEMerge(
16:48:41.938: [vapoursynth] ^^^^^^^^^^
16:48:41.938: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 1051, in RIFEMerge
16:48:41.938: [vapoursynth] return inference_with_fallback(
16:48:41.939: [vapoursynth] ^^^^^^^^^^^^^^^^^^^^^^^^
16:48:41.939: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 2670, in inference_with_fallback
16:48:41.939: [vapoursynth] raise e
16:48:41.939: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 2647, in inference_with_fallback
16:48:41.939: [vapoursynth] ret = _inference(
16:48:41.939: [vapoursynth] ^^^^^^^^^^^
16:48:41.939: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 2514, in _inference
16:48:41.939: [vapoursynth] engine_path = trtexec(
16:48:41.939: [vapoursynth] ^^^^^^^^
16:48:41.939: [vapoursynth] File "C:\Program Files (x86)\SVP 4\rife\vsmlrt.py", line 2070, in trtexec
16:48:41.939: [vapoursynth] raise RuntimeError(f"trtexec execution fails, log has been written to {log_filename}")
16:48:41.939: [vapoursynth] RuntimeError: trtexec execution fails, log has been written to C:\Users\Crypto\AppData\Local\Temp\trtexec_241017_164841.log
16:48:41.939: [vapoursynth]
16:48:41.959: (!!!) Intermediate file may be broken: T:\Deep3D\upscales\test.SVP.temporary.mkv
16:48:41.959: ===== mpv exited with code 62097 =====

Trtexec log:

&&&& RUNNING TensorRT.trtexec [TensorRT v100500] [b18] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx --timingCacheFile=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache --device=0 --saveEngine=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine --shapes=input:1x11x1440x2560 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT,-CUDNN,+EDGE_MASK_CONVOLUTIONS,+JIT_CONVOLUTIONS --useCudaGraph --noDataTransfers --noTF32 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --layerPrecisions=*:fp16 --layerOutputTypes=*:fp16 --precisionConstraints=obey --builderOptimizationLevel=3
[10/17/2024-16:48:41] [i] === Model Options ===
[10/17/2024-16:48:41] [i] Format: ONNX
[10/17/2024-16:48:41] [i] Model: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx
[10/17/2024-16:48:41] [i] Output:
[10/17/2024-16:48:41] [i] === Build Options ===
[10/17/2024-16:48:41] [i] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[10/17/2024-16:48:41] [i] avgTiming: 8
[10/17/2024-16:48:41] [i] Precision: FP32+FP16 (obey precision constraints)
[10/17/2024-16:48:41] [i] LayerPrecisions: *:fp16
[10/17/2024-16:48:41] [i] Layer Device Types: 
[10/17/2024-16:48:41] [i] Calibration: 
[10/17/2024-16:48:41] [i] Refit: Disabled
[10/17/2024-16:48:41] [i] Strip weights: Disabled
[10/17/2024-16:48:41] [i] Version Compatible: Disabled
[10/17/2024-16:48:41] [i] ONNX Plugin InstanceNorm: Disabled
[10/17/2024-16:48:41] [i] TensorRT runtime: full
[10/17/2024-16:48:41] [i] Lean DLL Path: 
[10/17/2024-16:48:41] [i] Tempfile Controls: { in_memory: allow, temporary: allow }
[10/17/2024-16:48:41] [i] Exclude Lean Runtime: Disabled
[10/17/2024-16:48:41] [i] Sparsity: Disabled
[10/17/2024-16:48:41] [i] Safe mode: Disabled
[10/17/2024-16:48:41] [i] Build DLA standalone loadable: Disabled
[10/17/2024-16:48:41] [i] Allow GPU fallback for DLA: Disabled
[10/17/2024-16:48:41] [i] DirectIO mode: Disabled
[10/17/2024-16:48:41] [i] Restricted mode: Disabled
[10/17/2024-16:48:41] [i] Skip inference: Disabled
[10/17/2024-16:48:41] [i] Save engine: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine
[10/17/2024-16:48:41] [i] Load engine: 
[10/17/2024-16:48:41] [i] Profiling verbosity: 0
[10/17/2024-16:48:41] [i] Tactic sources: cublas [OFF], cublasLt [OFF], cudnn [OFF], edge mask convolutions [ON], JIT convolutions [ON], 
[10/17/2024-16:48:41] [i] timingCacheMode: global
[10/17/2024-16:48:41] [i] timingCacheFile: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache
[10/17/2024-16:48:41] [i] Enable Compilation Cache: Enabled
[10/17/2024-16:48:41] [i] errorOnTimingCacheMiss: Disabled
[10/17/2024-16:48:41] [i] Preview Features: Use default preview flags.
[10/17/2024-16:48:41] [i] MaxAuxStreams: -1
[10/17/2024-16:48:41] [i] BuilderOptimizationLevel: 3
[10/17/2024-16:48:41] [i] MaxTactics: -1
[10/17/2024-16:48:41] [i] Calibration Profile Index: 0
[10/17/2024-16:48:41] [i] Weight Streaming: Disabled
[10/17/2024-16:48:41] [i] Runtime Platform: Same As Build
[10/17/2024-16:48:41] [i] Debug Tensors: 
[10/17/2024-16:48:41] [i] Input(s): fp16:chw
[10/17/2024-16:48:41] [i] Output(s): fp16:chw
[10/17/2024-16:48:41] [i] Input build shape (profile 0): input=1x11x1440x2560+1x11x1440x2560+1x11x1440x2560
[10/17/2024-16:48:41] [i] Input calibration shapes: model
[10/17/2024-16:48:41] [i] === System Options ===
[10/17/2024-16:48:41] [i] Device: 0
[10/17/2024-16:48:41] [i] DLACore: 
[10/17/2024-16:48:41] [i] Plugins:
[10/17/2024-16:48:41] [i] setPluginsToSerialize:
[10/17/2024-16:48:41] [i] dynamicPlugins:
[10/17/2024-16:48:41] [i] ignoreParsedPluginLibs: 0
[10/17/2024-16:48:41] [i] 
[10/17/2024-16:48:41] [i] === Inference Options ===
[10/17/2024-16:48:41] [i] Batch: Explicit
[10/17/2024-16:48:41] [i] Input inference shape : input=1x11x1440x2560
[10/17/2024-16:48:41] [i] Iterations: 10
[10/17/2024-16:48:41] [i] Duration: 3s (+ 200ms warm up)
[10/17/2024-16:48:41] [i] Sleep time: 0ms
[10/17/2024-16:48:41] [i] Idle time: 0ms
[10/17/2024-16:48:41] [i] Inference Streams: 1
[10/17/2024-16:48:41] [i] ExposeDMA: Disabled
[10/17/2024-16:48:41] [i] Data transfers: Disabled
[10/17/2024-16:48:41] [i] Spin-wait: Disabled
[10/17/2024-16:48:41] [i] Multithreading: Disabled
[10/17/2024-16:48:41] [i] CUDA Graph: Enabled
[10/17/2024-16:48:41] [i] Separate profiling: Disabled
[10/17/2024-16:48:41] [i] Time Deserialize: Disabled
[10/17/2024-16:48:41] [i] Time Refit: Disabled
[10/17/2024-16:48:41] [i] NVTX verbosity: 0
[10/17/2024-16:48:41] [i] Persistent Cache Ratio: 0
[10/17/2024-16:48:41] [i] Optimization Profile Index: 0
[10/17/2024-16:48:41] [i] Weight Streaming Budget: 100.000000%
[10/17/2024-16:48:41] [i] Inputs:
[10/17/2024-16:48:41] [i] Debug Tensor Save Destinations:
[10/17/2024-16:48:41] [i] === Reporting Options ===
[10/17/2024-16:48:41] [i] Verbose: Disabled
[10/17/2024-16:48:41] [i] Averages: 10 inferences
[10/17/2024-16:48:41] [i] Percentiles: 90,95,99
[10/17/2024-16:48:41] [i] Dump refittable layers:Disabled
[10/17/2024-16:48:41] [i] Dump output: Disabled
[10/17/2024-16:48:41] [i] Profile: Disabled
[10/17/2024-16:48:41] [i] Export timing to JSON file: 
[10/17/2024-16:48:41] [i] Export output to JSON file: 
[10/17/2024-16:48:41] [i] Export profile to JSON file: 
[10/17/2024-16:48:41] [i] 
[10/17/2024-16:48:41] [i] === Device Information ===
[10/17/2024-16:48:41] [i] Available Devices: 
[10/17/2024-16:48:41] [i]   Device 0: "NVIDIA GeForce GTX 1080 Ti" UUID: GPU-7d3bcf50-539f-7218-8f2d-187b2ec4c543
[10/17/2024-16:48:41] [i] Selected Device: NVIDIA GeForce GTX 1080 Ti
[10/17/2024-16:48:41] [i] Selected Device ID: 0
[10/17/2024-16:48:41] [i] Selected Device UUID: GPU-7d3bcf50-539f-7218-8f2d-187b2ec4c543
[10/17/2024-16:48:41] [i] Compute Capability: 6.1
[10/17/2024-16:48:41] [i] SMs: 28
[10/17/2024-16:48:41] [i] Device Global Memory: 11263 MiB
[10/17/2024-16:48:41] [i] Shared Memory per SM: 96 KiB
[10/17/2024-16:48:41] [i] Memory Bus Width: 352 bits (ECC disabled)
[10/17/2024-16:48:41] [i] Application Compute Clock Rate: 1.582 GHz
[10/17/2024-16:48:41] [i] Application Memory Clock Rate: 5.505 GHz
[10/17/2024-16:48:41] [i] 
[10/17/2024-16:48:41] [i] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[10/17/2024-16:48:41] [i] 
[10/17/2024-16:48:41] [i] TensorRT version: 10.5.0
[10/17/2024-16:48:41] [i] Loading standard plugins
[10/17/2024-16:48:41] [i] [TRT] [MemUsageChange] Init CUDA: CPU +1, GPU +0, now: CPU 11677, GPU 1060 (MiB)
[10/17/2024-16:48:41] [i] [TRT] [MemUsageChange] Init builder kernel library: CPU +17, GPU +0, now: CPU 12022, GPU 1060 (MiB)
[10/17/2024-16:48:41] [i] Start parsing network model.
[10/17/2024-16:48:41] [i] [TRT] ----------------------------------------------------------------
[10/17/2024-16:48:41] [i] [TRT] Input filename:   C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx
[10/17/2024-16:48:41] [i] [TRT] ONNX IR version:  0.0.8
[10/17/2024-16:48:41] [i] [TRT] Opset version:    16
[10/17/2024-16:48:41] [i] [TRT] Producer name:    pytorch
[10/17/2024-16:48:41] [i] [TRT] Producer version: 2.2.0
[10/17/2024-16:48:41] [i] [TRT] Domain:           
[10/17/2024-16:48:41] [i] [TRT] Model version:    0
[10/17/2024-16:48:41] [i] [TRT] Doc string:       
[10/17/2024-16:48:41] [i] [TRT] ----------------------------------------------------------------
[10/17/2024-16:48:41] [i] Finished parsing network model. Parse time: 0.0520507
[10/17/2024-16:48:41] [i] Set shape of input tensor input for optimization profile 0 to: MIN=1x11x1440x2560 OPT=1x11x1440x2560 MAX=1x11x1440x2560
[10/17/2024-16:48:41] [i] Set layer /Split to precision fp16
... TRIMMED THIS TO SAVE SPACE ...
[10/17/2024-16:48:41] [i] Skipped setting precisions for some layers. Check verbose logs for more details.
... TRIMMED TO SAVE SPACE ...
[10/17/2024-16:48:41] [i] Set output 0 of layer /Add_7 to type fp16
[10/17/2024-16:48:41] [i] Skipped setting output types for some layers. Check verbose logs for more details.
[10/17/2024-16:48:41] [W] [TRT] Could not read timing cache from: C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache. A new timing cache will be generated and written.
[10/17/2024-16:48:41] [E] Error[9]: IBuilder::buildSerializedNetwork: Error Code 9: API Usage Error (Target GPU SM 61 is not supported by this TensorRT release.)
[10/17/2024-16:48:41] [E] Engine could not be created from network
[10/17/2024-16:48:41] [E] Building engine failed
[10/17/2024-16:48:41] [E] Failed to create engine from model or file.
[10/17/2024-16:48:41] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100500] [b18] # C:/Program Files (x86)/SVP 4/rife\vsmlrt-cuda\trtexec --onnx=C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx --timingCacheFile=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine.cache --device=0 --saveEngine=C:\Users\Crypto\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.9.onnx.2560x1440_fp16_no-tf32_trt-100500_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1080-Ti_a8bbf24b.engine --shapes=input:1x11x1440x2560 --fp16 --tacticSources=-CUBLAS,-CUBLAS_LT,-CUDNN,+EDGE_MASK_CONVOLUTIONS,+JIT_CONVOLUTIONS --useCudaGraph --noDataTransfers --noTF32 --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --layerPrecisions=*:fp16 --layerOutputTypes=*:fp16 --precisionConstraints=obey --builderOptimizationLevel=3

4 (edited by flowreen91 17-10-2024 21:02:17)

Re: RIFE Transcoding Fails After SVP Update

[10/17/2024-16:48:41] [E] Error[9]: IBuilder::buildSerializedNetwork: Error Code 9: API Usage Error (Target GPU SM 61 is not supported by this TensorRT release.)
Time to upgrade GPU xD

Re: RIFE Transcoding Fails After SVP Update

Is it even supposed to work on a non-RTX card?
Any benefits over the Vulkan version?

6 (edited by Cryptokid 17-10-2024 23:57:24)

Re: RIFE Transcoding Fails After SVP Update

I'd been using it for over a year and it worked fine as far as I could tell; I had a pretty consistent workflow with it. I'd never even tried the Vulkan version.

Upon further research, it seems Pascal support was dropped starting in TensorRT 8.6.1.

This explains why downgrading vsmlrt to v13.7.1 worked for me: it was the last version before the upgrade to 8.6.1.
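To save others the digging: the trtexec log prints the card's compute capability ("Compute Capability: 6.1" above, i.e. the SM 61 the error refers to), and newer TensorRT builds refuse anything below their minimum. A quick sketch of the check; the minimum capability to compare against is an assumption you should look up in NVIDIA's support matrix for whatever TensorRT version your vsmlrt build ships:

```python
def gpu_supported(compute_cap: str, min_cap: str) -> bool:
    """Return True if a GPU's compute capability (e.g. "6.1" for a
    1080 Ti) meets the minimum required by a given TensorRT build.
    Compares (major, minor) pairs numerically, so "10.0" > "9.5"."""
    major, minor = (int(x) for x in compute_cap.split("."))
    req_major, req_minor = (int(x) for x in min_cap.split("."))
    return (major, minor) >= (req_major, req_minor)

# A Pascal card fails whatever minimum rejects SM 61 in the log above;
# "7.5" here is only an illustrative threshold, not a confirmed value.
print(gpu_supported("6.1", "7.5"))
```

On newer drivers you can read your own card's value with nvidia-smi (I believe the query is --query-gpu=compute_cap --format=csv,noheader, but check your driver's nvidia-smi --help-query-gpu).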

What I can't figure out now is what broke in the process of reproducing this error for the post. Putting that same package back into my folder lets it work again, but it runs significantly slower than before (0.1-0.5 fps vs 1-5 fps previously). I'm not sure what could have caused this: all I did was rename those files temporarily to reproduce the error, and when I swapped them back, this new issue appeared.

Re: RIFE Transcoding Fails After SVP Update

> I had never even tried the vulkan version.

Maybe you should, since there are no RT cores in your card anyway.

Re: RIFE Transcoding Fails After SVP Update

I tried the Vulkan version, and it works with model 4.4, but when I add and use other models I hit a similar speed issue to the one described above: my conversion speed with 4.7 and 4.9 was abysmal, even though as recently as two days ago it was much faster.