Pezede wrote:

movie *3 :

https://i.imgur.com/cSYcCPx.jpeg

Thanks a lot! 4K HDR x3 in real time with RIFE!!! Unbelievable!!!

And I so wanted to save on RAM and CPU sad

Do you see any colour difference on the HDR screen watching this demo without interpolation and with RIFE interpolation?

grobalt wrote:

I replicated exactly the same settings as shown by Pezede for transcoding -> frame rate x5 for a 24p 1080p movie, lossless preset, etc.
Ryzen 5600x, 3800 MHz Memory -> starts with about 190fps

doing the same for a 4k uhd movie -> 32,8fps
same for the LG New York Demo clip, 32 fps

Is your 1080p also 1920x1080, because that matters too?

DragonicPrime wrote:
UHD wrote:
DragonicPrime wrote:

Just tried the updated instructions. Getting around 50fps at 4k with an RTX 4090 now. Between 170-190fps (it kept going up and down for some reason) at 1080p. These improvements are huge. Thanks for updating the instructions

Could you check if the video below can now be interpolated in real time with RIFE?
Second question: is it possible to preserve the 10 bit colour depth and HDR when interpolating in real time with RIFE the video below?
In other words, could you compare the colours of the video below played back without interpolation and with RIFE interpolation.

LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits

Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

4k HDR still doesn't work in real time. I just updated my previous message as well. 4K SDR seems to work with no problems in real time though. With 4K HDR, I only get around 35-40fps


Pezede, would you please check the above demo in real-time 4K HDR and report the fps? We would have an interesting comparison.
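
For comparing such reports, a rough real-time check may help. This is a minimal Python sketch, assuming the fps figures quoted in this thread are output frames per second and ignoring decode and display overhead; max_realtime_factor is a hypothetical helper, not part of SVP or mpv.

```python
import math


def max_realtime_factor(source_fps: float, throughput_fps: float) -> int:
    """Largest integer multiplier whose output rate fits in the measured throughput.

    Returns 1 when even x2 is too slow, i.e. no real-time interpolation.
    """
    if source_fps <= 0 or throughput_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, math.floor(throughput_fps / source_fps))


# The LG demo clip is 25 fps; throughput figures are the ones reported above.
print("4K HDR at 35 fps:", max_realtime_factor(25.0, 35.0))
print("4K SDR at 50 fps:", max_realtime_factor(25.0, 50.0))
```

By this measure, x3 on the 25 fps demo needs a sustained 75 fps, so a 35-40 fps result would not be enough even for x2.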

Pezede wrote:

I've done a reinstall of SVP and I'm now getting ~280fps on 1080P transcoding with the new guide, that seems almost miraculous...

I've gotten the console window and there are lock files in the rife folder so it seems to be used.

Hardware is 4090 paired with a 7950X and DDR5 6000 ram.

so CPU and RAM really matter!

DragonicPrime wrote:

On my second monitor which is just 1080p, MPV is downscaling it, so it runs perfectly

The question is whether MPV downscaling in real time to 1080p preserves the 10-bit colour depth and HDR when interpolating in real time with RIFE.

In other words, could you compare the colours of the video below played back without interpolation and with RIFE interpolation.

and what interpolation factor (x2, x3, x4) is possible with such a 1080p 10-bit HDR file:

LG 4K HDR Demo - New York.ts

compared to 1080p 8-bit?

grobalt wrote:

will do this soon .. currently generating the engine files for 4k/UHD resolution

If you can, please also test the file that I asked DragonicPrime about in the post above. I'm very curious, as I'm also planning to buy a 4090 graphics card. If two people succeed, it will be the best confirmation of the capabilities of these highest-performance graphics cards.

grobalt wrote:

I deleted SVP and started from scratch, just to check if the instructions are complete and everything is working.
There is a step missing.
After replacing generate.js and base.py, start SVP4, add the new option TensorRT etc.

Then the missing step:
Copy the Rife AI profile and select the AI Model "rife"
Enable the new Option TensorRT On


This is a good solution. Start from scratch and describe all the steps that are missing.

In other words, write instructions for a complete newcomer so that they do not get lost.

Chainik wrote:

GOOD NEWS EVERYONE!

updated instructions

should improve FPS on 4080-and-better (probably 4070/3080 too, dunno), when performance is bound by the system's RAM bandwidth, not GPU power
i.e. for 4K playback
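
To give a feel for why RAM bandwidth can become the limit at 4K, here is a rough Python sketch of per-frame sizes. It assumes planar frames without stride padding and says nothing about how many copies SVP, VapourSynth, or the driver actually make; frame_bytes and stream_mib_per_s are hypothetical helpers.

```python
# Bytes per sample for a few VapourSynth formats (10-bit is stored in 16-bit words).
BYTES_PER_SAMPLE = {"YUV420P8": 1, "YUV420P10": 2, "RGBH": 2, "RGBS": 4}


def frame_bytes(width: int, height: int, fmt: str) -> int:
    """Approximate size of one frame in bytes, ignoring stride padding."""
    bytes_per_sample = BYTES_PER_SAMPLE[fmt]
    if fmt.startswith("YUV420"):
        # Full-resolution luma plus two quarter-resolution chroma planes.
        samples = width * height * 3 // 2
    else:
        # Three full-resolution RGB planes.
        samples = width * height * 3
    return samples * bytes_per_sample


def stream_mib_per_s(width: int, height: int, fmt: str, fps: float) -> float:
    """Data rate of one copy of the stream in MiB/s."""
    return frame_bytes(width, height, fmt) * fps / 2**20


for fmt in BYTES_PER_SAMPLE:
    rate = stream_mib_per_s(3840, 2160, fmt, 50)
    print(f"{fmt}: {rate:.0f} MiB/s per copy at 50 fps")
```

A 4K RGBS frame is four times the size of a 1080p one, and every format conversion or host-to-GPU transfer moves it again, which is one plausible reason 4K is more sensitive to memory speed.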


Chainik wrote:

not sure what you're doing, but it's OK even on a 2060 laptop now big_smile

What exactly did you do? I'm very curious to know what solved the memory problems. 1080p real time with RIFE using 2060 laptop is impressive!

In my opinion, the more options are tested and the more test details are given, the better. See, for example, this rather old post:

blackmickey1007 wrote:

### Environment ###
Windows 10
DDR4-2933 48GiB
Nvidia RTX2070 8GiB
Nvidia Driver 511.79
CUDA Toolkit 11.3
cuDNN v8.2.1 (June 7th, 2021), for CUDA 11.x

### Software ###
Python 3.10.4
VapourSynth R58-RC2
PyTorch 1.11.0 (CUDA 11.3)
vs_rife v2.0.0
VapourSynth-RIFE-ncnn-Vulkan r3 (model: 4.0)

### Tools & Settings ###
GPU-Z 2.45.0
VapourSynth Editor r19-mod-5-AC2
VapourSynth threads: core.num_threads = 4
Decoder: lsmas.LWLibavSource(format="yuv420p8", prefer_hw=3)
Video: demo.mp4 [720p]

### Result ###
1. RIFE filter for VapourSynth (PyTorch CUDA) - vs_rife v2.0.0
    Interpolation: x2
    RIFE model: 4.0
    scale: 1.0
    FP16: False
   
    FPS: 54.115
    CUDA: ~50%
    PerfCap: VRel, VOp, Pwr

2. RIFE filter for VapourSynth (PyTorch CUDA) - vs_rife v2.0.0
    Interpolation: x2
    RIFE model: 4.0
    scale: 0.5
    FP16: False
   
    FPS: 69.997
    CUDA: ~40%
    PerfCap: VRel, VOp

3. RIFE filter for VapourSynth (PyTorch CUDA) - vs_rife v2.0.0
    Interpolation: x2
    RIFE model: 4.0
    scale: 0.5
    FP16: True
   
    FPS: 70.936
    CUDA: ~32%
    PerfCap: VRel, VOp

4. RIFE filter for VapourSynth (ncnn Vulkan) - VapourSynth-RIFE-ncnn-Vulkan r3
    Interpolation: x2
    RIFE model: 4.0
    GPU thread: 1
    tta: False
    uhd: False
    sc: True
   
    FPS: 27.356
    CUDA: ~1%
    Compute_1: 30%
    PerfCap: Idle
   
5. RIFE filter for VapourSynth (ncnn Vulkan) - VapourSynth-RIFE-ncnn-Vulkan r3
    Interpolation: x2
    RIFE model: 4.0
    GPU thread: 2
    tta: False
    uhd: False
    sc: True
   
    FPS: 92.956
    CUDA: ~15%
    Compute_1: ~94%
    PerfCap: VRel, VOp, Pwr
   
6. RIFE filter for VapourSynth (ncnn Vulkan) - VapourSynth-RIFE-ncnn-Vulkan r3
    Interpolation: x2
    RIFE model: 4.0
    GPU thread: 2
    tta: False
    uhd: True
    sc: True
   
    FPS: 92.366
    CUDA: ~15%
    Compute_1: ~94%
    PerfCap: VRel, VOp, Pwr
   
7. RIFE filter for VapourSynth (ncnn Vulkan) - VapourSynth-RIFE-ncnn-Vulkan r3
    Interpolation: x2
    RIFE model: 4.0
    GPU thread: 2
    tta: False
    uhd: False
    sc: False
   
    FPS: 87.083
    CUDA: ~15%
    Compute_1: ~94%
    PerfCap: VRel, VOp, Pwr
   
8. RIFE filter for VapourSynth (ncnn Vulkan) - VapourSynth-RIFE-ncnn-Vulkan r3
    Interpolation: x2
    RIFE model: 4.0
    GPU thread: 3
    tta: False
    uhd: False
    sc: True
   
    FPS: 90.645
    CUDA: ~15%
    Compute_1: ~94%
    PerfCap: Idle

https://www.svp-team.com/forum/viewtopi … 219#p80219

grobalt wrote:

with software transcoding ?

288 new fps (1080p) for 4090+13900k (TensorRT8.5+vs_threads=4+fp16) (rife46) (num_streams=10) (benchmark was done with vspipe file.py -p . instead of piping into ffmpeg and rendering to avoid cpu bottleneck)

164 new fps (1080p) for 4090+5950x (ncnn+2 threads+4 vs threads+ffmpeg (ultrafast)) (rife4.6)

Source: https://github.com/styler00dollar/VSGAN-tensorrt-docker


It is best to check and test all options.

Chainik wrote:

UHD
> We are now testing

you are not big_smile

I'm testing virtually without a proper graphics card, and this is even more difficult lol

grobalt wrote:

this screenshot shows software h264 transcoding. If I use the settings from the screenshot, my 6-core CPU is at 100%, the GPU with 2 threads is at about 35% utilization, and I get 19 fps (4k) or 41 fps (1080p; it starts above 50 but stabilizes at 41 after some time)

Use the 'ultrafast' preset and let us know if performance has improved:
https://trac.ffmpeg.org/wiki/Encode/H.264

We are now testing RIFE and looking for bottlenecks smile

DragonicPrime wrote:

Tested this out with an RTX 4090 and seem to be getting around 115fps on a 1080p video. So much better than the default implementation. Used to only get around 80fps with the default

It is good that there are more 4090 card owners on this forum. It will be easier to compare results smile

grobalt wrote:

Thanks, will investigate later tonight how to apply this smile do you read the github thread?

I hope a solution can be found. Looking at what the 3070Ti card can do, I'm very curious to see what will be achieved with the 4090. Bottlenecks will probably appear somewhere and if they can be identified then the potential for performance gains is huge. You are blazing a new trail, the next ones after you will find it easier wink

aloola wrote:

it did 10% better smile

Why not try even faster settings?

ultrafast
superfast
veryfast
faster
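
To make such a preset comparison repeatable, the list above can be wrapped in a small helper. This is only a sketch in Python: the file names are placeholders, x264_command is a hypothetical helper, and SVP builds its own ffmpeg command line, which may differ. The -c:v libx264 and -preset options themselves are standard ffmpeg/x264 options.

```python
from typing import List

# x264 presets from fastest to slowest ("placebo" omitted).
X264_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow",
]


def x264_command(src: str, dst: str, preset: str = "ultrafast") -> List[str]:
    """Build an ffmpeg argument list for a software H.264 encode."""
    if preset not in X264_PRESETS:
        raise ValueError(f"unknown x264 preset: {preset}")
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-preset", preset, dst]


# Print one command per preset suggested above, fastest first.
for preset in X264_PRESETS[:4]:
    print(" ".join(x264_command("input.mkv", f"out_{preset}.mkv", preset)))
```

Running the same clip through each preset and noting the fps would show how much of the limit comes from the CPU encoder rather than from RIFE.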

aloola wrote:

flownet_v4.6.pkl_NVIDIA GeForce RTX 3070 Ti_trt-8.5.2.2_1280x768_fp32_workspace-1073741824_scale-1.0_ensemble-False.pt


grobalt wrote:

flownet_v4.6.pkl_NVIDIA GeForce RTX 4090_trt-8.5.2.2_3840x2176_fp32_workspace-1073741824_scale-1.0_ensemble-False


aloola wrote:

clip = clip.resize.Bicubic(format=vs.RGBS, matrix_in_s="709")


I will be following this thread over the weekend. One last suggestion for today: change

vs.RGBS

to

vs.RGBH

This should force FP16 precision in vs-rife, which could as much as double the performance.
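
Both engine file names quoted above contain fp32, which suggests those engines were built for FP32 rather than FP16. A minimal Python sketch for reading such a name follows. The field layout is inferred only from the two names in this thread, and parse_engine_name is a hypothetical helper, not part of vs-rife.

```python
import re

# Pattern inferred from the two engine file names quoted in this thread.
ENGINE_RE = re.compile(
    r"^(?P<model>.+?\.pkl)_(?P<gpu>.+?)_trt-(?P<trt>[\d.]+)_"
    r"(?P<width>\d+)x(?P<height>\d+)_(?P<precision>fp16|fp32)_"
    r"workspace-(?P<workspace>\d+)_scale-(?P<scale>[\d.]+)_"
    r"ensemble-(?P<ensemble>True|False)(?:\.pt)?$"
)


def parse_engine_name(name: str) -> dict:
    """Split an engine file name into its fields; raise if it does not match."""
    match = ENGINE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised engine file name: {name}")
    fields = match.groupdict()
    for key in ("width", "height", "workspace"):
        fields[key] = int(fields[key])
    fields["scale"] = float(fields["scale"])
    fields["ensemble"] = fields["ensemble"] == "True"
    return fields


name = ("flownet_v4.6.pkl_NVIDIA GeForce RTX 4090_trt-8.5.2.2_3840x2176_fp32_"
        "workspace-1073741824_scale-1.0_ensemble-False")
info = parse_engine_name(name)
print(info["gpu"], info["width"], info["height"], info["precision"])
```

If the switch to vs.RGBH takes effect, one would expect a newly generated engine file whose name carries fp16 instead.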

grobalt wrote:

I am new to SVP but due to RIFE implementation I ordered a 4090. If you guide me how to test / benchmark I will show all results

I have identical plans to buy a NVIDIA GeForce RTX 4090 also because of RIFE. I hope we can work something out together to make the fastest RIFE filter work in real time.

grobalt wrote:

RTX 4090, real time: mpv crashes or does not even start

You are already the third person to confirm this problem. Thanks for the tests smile

aloola wrote:

the code for mpv.

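import vapoursynth as vs
from vsrife import RIFE

core = vs.core

# video_in is supplied by mpv's vapoursynth filter
clip = video_in
# convert to 32-bit float RGB for vs-rife
clip = clip.resize.Bicubic(format=vs.RGBS, matrix_in_s="709")
# x5 interpolation (factor_num/factor_den) with the TensorRT backend
clip = RIFE(clip, trt=True, factor_num=5, factor_den=1)
# convert back to 8-bit YUV 4:2:0 for output
clip = clip.resize.Bicubic(format=vs.YUV420P8, matrix_s="709")
clip.set_output()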

If I understood correctly, exactly the same settings, but without trt=True (or with trt=False) in your case allow for smooth real-time interpolation?
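
On the separate question of 10-bit colour depth and HDR raised in this thread: the script quoted above converts back to vs.YUV420P8 with a "709" matrix, so as written it outputs 8-bit video. Below is a hedged Python sketch of how the two resize calls might be parameterised for 10-bit HDR10 material. conversion_settings is a hypothetical helper. "2020ncl" is the BT.2020 non-constant-luminance matrix name accepted by VapourSynth's resize. Whether RIFE behaves well on PQ-encoded content, and whether mpv keeps the HDR metadata through the filter, is not established here.

```python
def conversion_settings(bit_depth: int, hdr: bool) -> dict:
    """Pick matrix and output format names for the two resize calls."""
    if bit_depth not in (8, 10):
        raise ValueError("only 8-bit and 10-bit sources are handled in this sketch")
    # Assumption: HDR10 sources use BT.2020 NCL, SDR HD sources use BT.709.
    matrix = "2020ncl" if hdr else "709"
    return {
        "matrix_in_s": matrix,                   # YUV -> RGB before RIFE
        "matrix_s": matrix,                      # RGB -> YUV after RIFE
        "output_format": f"YUV420P{bit_depth}",  # attribute name on the vapoursynth module
    }


settings = conversion_settings(bit_depth=10, hdr=True)
print(settings)

# Inside an mpv VapourSynth script this might be used as (untested assumption):
#   clip = clip.resize.Bicubic(format=vs.RGBS, matrix_in_s=settings["matrix_in_s"])
#   clip = RIFE(clip, trt=True, factor_num=2, factor_den=1)
#   clip = clip.resize.Bicubic(format=getattr(vs, settings["output_format"]),
#                              matrix_s=settings["matrix_s"])
```

Comparing the colours with and without interpolation, as asked above, would still be the real test.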

UHD wrote:

Can you post the settings script you used with mpv?

Something like here: https://github.com/HolyWu/vs-rife/issue … -967073164 but of course together with all the parameters you set for the vs-rife filter

If this vs-rife filter, used directly with mpv with the setting

trt=False

allows real-time interpolation of 720p files

and with the setting

trt=True

does not allow real-time interpolation of 720p files, this means that we should report the issue to HolyWu.

Can you post the settings script you used with mpv?

I still don't have a graphics card that allows me to test, but if someone else confirms the same problem then we can report it to HolyWu.

The only point I can see in adding this filter to SVP is that it can interpolate in real time, and faster than using RIFE-ncnn-Vulkan.

aloola wrote:

I tried vsrife trt before, but it doesn't work in real-time MPV, only when transcoding.

Have you tried directly using this filter https://github.com/HolyWu/vs-rife in real time with mpv even on video with lower resolution, for example 720p?

Thanks for the tests aloola smile

aloola wrote:

maybe you should try this? https://github.com/AmusementClub/vs-mlrt/wiki/RIFE
it works fine with mpc and mpv for me, 1080p x3 in realtime.

I think we should first try to find the cause of the problems. Particularly since, with real-time interpolation, every frame matters.

Going by the benchmark below, vs-rife using TensorRT should be about 36% faster than vs-mlrt using TensorRT - https://github.com/HolyWu/vs-rife/discussions/19 :

45.91 fps NVIDIA GeForce RTX 3050 (1080p, FP16, model 4.6, vs-rife using TensorRT)
33.66 fps NVIDIA GeForce RTX 3050 (1080p, FP16, model 4.6, vs-mlrt using TensorRT)

Knowing the performance of graphics cards:

Fourth-generation Tensor Cores - Peak FP16 using the Sparsity feature:

660.6 TFLOPS - NVIDIA GeForce RTX 4090
https://images.nvidia.com/aem-dam/Solut … ecture.pdf

Third-Generation Tensor Cores - Peak FP16 using the Sparsity feature:

174 TFLOPS - NVIDIA GeForce RTX 3070 Ti
https://www.anandtech.com/show/17204/nv … more-money
72.8 TFLOPS - NVIDIA GeForce RTX 3050
https://www.computerbase.de/2022-01/nvi … 3050-test/

we should get the following results, at least in theory with scaling proportional to the increase in performance:

109.73 fps NVIDIA GeForce RTX 3070 Ti (1080p, FP16, model 4.6, vs-rife using TensorRT)
416.60 fps NVIDIA GeForce RTX 4090 (1080p, FP16, model 4.6, vs-rife using TensorRT)
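
The arithmetic behind these projections, as a short Python sketch. It assumes throughput scales linearly with peak FP16 Tensor Core TFLOPS, which is only a theoretical upper bound; projected_fps is a hypothetical helper and all figures are the ones quoted above.

```python
# Peak FP16 Tensor Core TFLOPS (with sparsity), as quoted above.
TFLOPS = {"RTX 3050": 72.8, "RTX 3070 Ti": 174.0, "RTX 4090": 660.6}

# Measured RTX 3050 results at 1080p, FP16, model 4.6.
VS_RIFE_TRT_FPS = 45.91
VS_MLRT_TRT_FPS = 33.66


def projected_fps(gpu: str, base_gpu: str = "RTX 3050",
                  base_fps: float = VS_RIFE_TRT_FPS) -> float:
    """Scale a measured fps figure by the ratio of peak TFLOPS."""
    return base_fps * TFLOPS[gpu] / TFLOPS[base_gpu]


speedup_pct = (VS_RIFE_TRT_FPS / VS_MLRT_TRT_FPS - 1) * 100
print(f"vs-rife over vs-mlrt: {speedup_pct:.1f}%")
for gpu in ("RTX 3070 Ti", "RTX 4090"):
    print(f"{gpu}: {projected_fps(gpu):.2f} fps")
```

The 280-288 fps figures reported for the 4090 elsewhere in this thread are well below the linear projection, which fits the observation that CPU and RAM also limit performance.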

Chainik wrote:

=== RIFE / PyTorch+TensorRT installation ===

Huge thanks Chainik smile

ToasterPC wrote:

https://i.imgur.com/qayREAZ.png


Increase GPU threads to 2. This should roughly double the performance.

But even then, don't count on much, as the GeForce RTX 3070 Laptop GPU has very limited Tensor Cores capabilities compared to its desktop counterpart: https://en.wikipedia.org/wiki/GeForce_30_series

TensorRT can add another 50% performance:
https://github.com/HolyWu/vs-rife/discu … nt-4117604