Chainik wrote:

> Could you find a moment of time and add a new parameter to the RIFE configuration in SVP

yeah... but you do understand that x3 (or hypothetical x2.5) will be 2 times slower, and x4 will be 3 times slower, right?

Have you succeeded in adding the following parameter yet?
https://github.com/HolyWu/vs-rife/blob/ … t__.py#L20

Full list as per details in previous post:

x2    1h42m
x3    3h23m
x4    5h5m
x5    6h46m
x6    8h28m
x7    10h9m
x8    11h51m
x9    13h32m
x10  15h14m

Of course, 2 new interpolated frames at the x3 setting may not take exactly 2x the time of interpolating 1 frame. Some synergy effect might occur, so that 1.9x or 1.8x the time is enough. This needs to be verified in a real test.

Too bad I don't have a RIFE compatible graphics card yet, but I'll play around with theoretical calculations based on the dlr5668 result:

1920x1080 @23.976

42.5 fps: https://www.svp-team.com/forum/viewtopi … 699#p79699

so

21.25 new interpolated frames per sec.


90 minutes movie has
=90*60*23.976=129470.4 frames

re-encoding will take:

x5 (for 120Hz TV or monitor)
=4*129470.4/21.25=24371 seconds or 6h46m

x10
(for 240Hz TV or monitor)
=9*129470.4/21.25=54835 seconds or 15h14m
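The arithmetic above can be written as a tiny back-of-the-envelope script. It assumes, as in my calculation, that interpolation cost scales linearly with the number of NEW frames and that the x2 rate from dlr5668's result (42.5 fps output, i.e. 21.25 new frames/s) holds at higher factors — both are assumptions, not measurements:

```python
# Back-of-the-envelope re-encode time for RIFE xN interpolation.
# Assumes cost is linear in the number of NEW frames and that the
# measured x2 rate (42.5 fps output -> 21.25 new frames/s) holds.
def reencode_seconds(minutes, source_fps, factor, new_fps=21.25):
    source_frames = minutes * 60 * source_fps       # real frames in the movie
    new_frames = (factor - 1) * source_frames       # frames RIFE must synthesize
    return new_frames / new_fps

# 90-minute movie at 23.976 fps, factors x2..x10
for n in range(2, 11):
    s = reencode_seconds(90, 23.976, n)
    m = round(s / 60)
    print(f"x{n}: {m // 60}h{m % 60}m")
```

Running it reproduces the full list from x2 (about 1h42m) up to x10 (about 15h14m).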

smile

If I understand what is written here correctly: https://arxiv.org/pdf/2011.06294.pdf

Position 0 is the first real frame.
Position 1 is the second real frame.

Before RIFEm and RIFE 4.0, only position 0.5 (x2) was possible.
To get positions 0.25 and 0.75 we had to repeat the process again.

Now any position between 0 and 1 can be obtained in a single pass, on this principle:
0, 0.5, 1 (x2)
0, 0.333, 0.667, 1 (x3)
0, 0.25, 0.5, 0.75, 1 (x4)
0, 0.2, 0.4, 0.6, 0.8, 1 (x5)
... and so on
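The list above can be generated mechanically. Here is a tiny sketch (my own illustration, not vs-rife code) of the positions an arbitrary-timestep model is asked to produce for an integer factor N:

```python
# Positions (fractions of the interval between two real frames) for factor N.
# Positions 0.0 and 1.0 are the real frames themselves; the interior
# positions are the new frames the model must synthesize.
def positions(factor):
    return [k / factor for k in range(factor + 1)]

print(positions(4))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```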

For quality reasons, I would never use a method that omits real frame 1.
It is better to use Custom Resolution Utility (CRU): https://www.monitortests.com/forum/Thre … tility-CRU

In any case, trying to use x2.5 should raise an error:
https://github.com/HolyWu/vs-rife/blob/ … t__.py#L40

Chainik wrote:

> Could you find a moment of time and add a new parameter to the RIFE configuration in SVP

yeah... but you do understand that x3 (or hypothetical x2.5) will be 2 times slower, and x4 will be 3 times slower, right?

Thanks smile

If I understand correctly, only integer factors are possible:
https://github.com/HolyWu/vs-rife/blob/ … t__.py#L40

so x2, x3, x4, x5 should be possible, but not x2.5.

I also understand the mechanism: x3 means 2 new interpolated frames, so in theory it needs 2x the computing power, i.e. it takes 2x longer than interpolating 1 frame at the x2 setting.
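A hypothetical sketch of the kind of guard being described — the exact vs-rife check can't be quoted from the truncated link, so the function name and message here are my own illustration:

```python
# Hypothetical validation mirroring the described behaviour:
# the multiplier must be an integer >= 2, so x2.5 is rejected.
def check_factor(factor):
    if not isinstance(factor, int) or factor < 2:
        raise ValueError("factor must be an integer greater than or equal to 2")
    return factor

check_factor(3)      # fine: x3 is allowed
# check_factor(2.5)  # would raise ValueError
```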

dlr5668 wrote:

Sadly cant measure CUDA

https://i.imgur.com/83Dd84a.png


Check out my post here: https://www.svp-team.com/forum/viewtopi … 493#p79493 and especially lwk7454's reply here: https://www.svp-team.com/forum/viewtopi … 497#p79497

Am I reading this correctly? 42.5 FPS is actually achieved?

Chainik wrote:

=== RIFE / PyTorch installation ===

Many thanks Chainik!

Could you find a moment of time and add a new parameter to the RIFE configuration in SVP:

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L20

dlr5668 wrote:

3.8 model gives me 24 fps and this one 41
https://i.imgur.com/tkP4tFn.png


Many thanks for the tests! A huge improvement in the RIFE algorithm can be seen!

For comparison, here are the results on a GeForce RTX 3090 graphics card, obtained by lwk7454:


lwk7454 wrote:

Parameters:
Test-Time Augmentation: Enabled [sets RIFE filter for VapourSynth (PyTorch)]
re-encoding with x2 interpolation
RIFE model: 3.8
scale=1.0
Encoder: NVIDIA NVENC H.264

720p, FP16
FPS: 63.5
Cuda: 56%

720p, FP32
FPS: 69.8
Cuda: 58%

1080p, FP16
FPS: 26.9
Cuda: 62%

1080p, FP32
FPS: 28.1
Cuda: 66%

https://www.svp-team.com/forum/viewtopi … 526#p79526

We can see how similar results the 3.8 model gave for GeForce RTX 3090 and GeForce RTX 3070 Ti graphics cards:

28.1 FPS vs. 24 FPS

and how huge the difference is with the 4.0 model: 40.9 FPS!!!!!


dlr5668, could you please specify the CUDA load for these tests? This is very important data for determining whether the new model makes better use of the graphics card's capabilities or whether there is still room for optimization, as I described here: https://github.com/hzwer/arXiv2020-RIFE/issues/217

By the way, thanks for checking that FP32 performance is close to FP16. If you could specify exactly which precision each result used, that would be additional information I could paste into the link above.

It would be good if you could present the data in the same format as lwk7454; that would be consistent with what I presented here: https://github.com/hzwer/arXiv2020-RIFE/issues/217 and maybe the RIFE developer will come up with something extra to reach 50 FPS for 1080p, which would be enough for real-time interpolation!

I also have a request to everyone posting on this thread and reading this thread.

I have written a request for help in optimizing RIFE for the latest graphics cards. You can find the details at this link: https://github.com/hzwer/arXiv2020-RIFE/issues/217

I would like as many people as possible to do a simple test, described below. You can share your results either directly on the RIFE project page here: https://github.com/hzwer/arXiv2020-RIFE/issues/217 or here, and I will then paste them together there.

For the test, of course, you will have to wait until the 4.0 model is working with SVP.


And here are the details of the proposed test:

Fixed test parameters:

SVP & RIFE filter for VapourSynth (PyTorch)
re-encoding with x2 interpolation
RIFE model: 4.0
scale=1.0

Variable test parameters:

Math precision: FP16 and FP32

Test results:

re-encoding speed [FPS]
CUDA utilisation [%]

Video file:

original demo video from the creator of RIFE at: https://github.com/hzwer/arXiv2020-RIFE
720p (1280x720), 25FPS, 53 s 680 ms, 4:2:0 YUV, 8 bits
direct link: https://drive.google.com/file/d/1i3xlKb … sp=sharing

Chainik wrote:

> Now time to edit svp script

update SVP wink


Chainik, please if you find a moment of time add a new parameter to the RIFE configuration in SVP:

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L20


Yes, finally we can with vs-rife not only interpolate x2 but also x3, x4, x5, x6, x7, x8, x9, x10...

If anyone is surprised by x3, x5 - this is the first and most important change compared to previous models.

The second change is the correction of the bugs that lowered the quality in the previous models:
https://github.com/hzwer/Practical-RIFE/issues/3
https://github.com/hzwer/Practical-RIFE/issues/5

The third change is the even better performance of the 4.0 model, which is based on the 3.9 model with one bug eliminated: https://github.com/hzwer/arXiv2020-RIFE/issues/219

Added RIFE 3.9 model, big speed improvements (54% faster than 3.1, 100% faster than 2.3)

https://github.com/n00mkrad/flowframes/ … g.full.txt

Fadexz wrote:

This still relevant to you?
If so yeah I use Vulkan in mpv, it doesn't allow for any overlays and stuff in fullscreen because it takes control of the display or something like displays directly, I don't know much about it.
I just use whatever settings to make it run easiest as that's really all that matters, to make it actually run.
I just gave it a test and it doesn't crash now so must have already been fixed.

Yes, information from people who have successfully run real-time interpolation using the RIFE filter for VapourSynth (PyTorch) remains the most important thing to me, as this is how I will use the filter most often.

My main goal is to interpolate 1080p files in real time using SVP and the RIFE filter for VapourSynth (PyTorch), once I buy a sufficiently powerful graphics card. Therefore, in addition to real-time testing, I am also interested in the performance of each graphics card and looking for ways to optimise interpolation using the RIFE algorithm.

At the moment the priority is to implement the new 4.0 RIFE model to work with SVP. I have a request that you stay with us on this thread and help test this model.

I will come back to the topic of real-time interpolation and have some more questions for you. Thank you for your answer and also for not forgetting my questions.

Perhaps more algorithms than just RIFE can now be used in SVP: https://www.svp-team.com/forum/viewtopi … 553#p79553

I have a question. Does anyone read this forum and know enough to be able to test some very innovative interpolation model based on machine learning (artificial intelligence)?

This is a special model that could be either revolutionary or still need a lot of work to make it useful to the average interpolation enthusiast.

It differs significantly from RIFE or other top models of traditional AI frame interpolation such as: SoftSplat, XVFI, FLAVR, ABME, QVI and EQVI.

I have high hopes for it but also a lot of concerns.

The model requires at least:

NVIDIA GPU
CUDA 9.0
CuDNN 7.6.5
Pytorch 1.1.0

Only one model, and only for someone who has the REAL SKILLS to test it in practice, and of course the willingness.

Chainik wrote:

yeah, ok... so, wtf is "rvpV1_105661_G.pt" and why the hell you prefer it over rife? big_smile

This seems to be the CAIN model, but it can be changed to anything else:

# currently has a hardcoded JIT cain model, you can replace that with anything you want

https://github.com/styler00dollar/vs-vf … ference.py

I don't need CAIN, but there is another model that is interesting.

Perhaps more algorithms than just RIFE can now be used in SVP: https://www.svp-team.com/forum/viewtopi … 553#p79553

Nothing more?

Until now only RIFE models could be used with vs-rife.
Now we can probably use any algorithm based on machine learning (artificial intelligence)!

Isn't it beautiful? Everyone could choose something for himself and in one software tool.

For a programmer like you this is probably nothing new, and you could probably write something like this yourself. For a person like me, however, all this code is black magic. What matters to me is that the wider the choice, the faster the development of better interpolation methods, which also means better quality. And that is what all of us enthusiasts of smooth video should care about most.

Today, while doing my daily GitHub browsing for news related to AI frame interpolation I found something like this: https://github.com/styler00dollar/vs-vfi

The project is based on the code of the famous vs-rife: https://github.com/HolyWu/vs-rife

From what I understood, the new project (https://github.com/styler00dollar/vs-vfi) is supposed to allow using any machine-learning (AI) interpolation algorithm as a VapourSynth filter and to work with mpv.

Chainik and MAG79, how do you see it? Is it possible to implement this in SVP?

That would be a real revolution!
SVP that allows to use any interpolation algorithm based on machine learning (artificial intelligence)!

It would end such articles about SVP once and for all:

Best SVP (SmoothVideo Project) Alternative: AI Video Interpolation
https://www.dvdfab.cn/resource/dvd/svp-alternative

Many thanks for the tests and for commenting on them.

What can I add? It is a great pity that no one has done such tests before and shared the results publicly. I am very curious what the developer of RIFE will say about this. After all, he knows the RIFE code best and how it works. If anyone would like to perform such tests on another graphics card I highly encourage you to do it. In about a week I will post the results on the RIFE project website: https://github.com/hzwer/arXiv2020-RIFE/issues

Once again all results together:

FP32

FPS 63.28 - Cuda ~45% - v3.8
FPS 59.43 - Cuda ~50% - v3.1
FPS 58.22 - Cuda ~70% - v2.4
FPS 58.39 - Cuda ~70% - v2.3
FPS 55.98 - Cuda 87% - v1.8

FP16

FPS 61.32 - Cuda ~45% - v3.8
FPS 60.03 - Cuda ~40% - v3.1
FPS 57.06 - Cuda ~50% - v2.4
FPS 57.20 - Cuda ~55% - v2.3
FPS 58.87 - Cuda ~70% - v1.8


lwk7454, when it comes to re-encoding tests, I think we now have the full picture of the potential of today's most efficient consumer graphics card across all major RIFE models and both computational precisions.

I hope you'll continue to follow this thread and the place where I intend to share your results with a wider group of people interested in RIFE, along with its creator: https://github.com/hzwer/arXiv2020-RIFE/issues

Maybe someone will come up with ideas on how to increase the utilization of the graphics card's power? I'm especially counting on the creator of RIFE, who is very helpful, as you can see in this example: https://github.com/hzwer/arXiv2020-RIFE/issues/207

If you would like to post results and questions yourself, let us know. After all, it is thanks to your work that we know there is still a lot of untapped potential in the NVIDIA GeForce RTX 3090 graphics card.

2. Math precision: FP32 vs. FP16

All the tests whose results lwk7454 presented clearly show that for the 3.8 model, FP32 performance is higher than FP16 performance. This pattern holds both for interpolation with SVP+vsrife and for Flowframes.

A similar pattern for this model was observed by dlr5668 at scale=0.5:
https://www.svp-team.com/forum/viewtopi … 246#p79246

Now I think I know what Flowframes creator n00mkrad meant when he wrote on 21 August:

fp16 does not work with newer models

https://github.com/hzwer/arXiv2020-RIFE/issues/188

The question is: did FP16 work better with older models, and if so which ones?

cheesywheesy wrote:

I look forward to the retrained model. Would be great if it
works, because the fp16 speed is phenomenal (~x4).

https://github.com/hzwer/arXiv2020-RIFE/issues/188

This is even more interesting since the author of the above quote also uses an NVIDIA GeForce RTX 3090 graphics card:

I would love to benefit from my tensor cores (rtx 3090).

https://github.com/hzwer/arXiv2020-RIFE/issues/188

Now I don't know whether, by writing about x4 speed, the author meant the performance of the card itself or of some older RIFE model.

After the following quote from hzwer, the creator of RIFE, I assumed that the 3.8 model would be the fastest:

the v3.8 model has achieved an acceleration effect of more than 2X while surpassing the effect of the RIFEv2.4 model

https://github.com/hzwer/arXiv2020-RIFE/issues/176

It may be the fastest, but only at FP32 precision. It would be interesting to see whether older models were faster at FP16, and by how much. Especially since there are fairly frequent opinions that older models gave better quality:
https://github.com/hzwer/Practical-RIFE/issues/5
https://github.com/hzwer/Practical-RIFE/issues/3
https://github.com/hzwer/arXiv2020-RIFE/issues/176

In vs-rife we have 6 models available:
1.8
2.3
2.4
3.1
3.5
3.8
https://github.com/HolyWu/vs-rife/tree/master/vsrife

We have tested the last one. I know it would take time to test the other 5 models at FP16 and FP32, since that is as many as 10 tests, but I think it would be interesting to know exactly at what point FP16 performance degraded, and whether any model gives higher FP16 performance than 3.8 does at FP32. I obviously intend to pass the results of such tests on to the RIFE developer, with a question about possible future optimisation of model 4.

If that's too much to test, then I'm curious about performance in this order of importance: 2.4; 3.1; 2.3; 1.8; 3.5.

Fixed test parameters:

Test-Time Augmentation: Enabled [sets RIFE filter for VapourSynth (PyTorch)]
re-encoding with x2 interpolation
scale=1.0

Variable test parameters:

Math precision: FP16 and FP32
RIFE model: 2.4 optional: 3.1; 2.3; 1.8; 3.5

Test results:

re-encoding speed [FPS]
GPU utilisation [%]

Video file:

original demo video from the creator of RIFE at: https://github.com/hzwer/arXiv2020-RIFE
720p (1280x720), 25FPS, 53 s 680 ms, 4:2:0 YUV, 8 bits
direct link: https://drive.google.com/file/d/1i3xlKb … sp=sharing

lwk7454, I would be very grateful if you could find some time to run these tests.

Very interesting results, and again surprising. It turns out that theory is theory, and practice still shows best what is possible. Many thanks for the test results I requested, as well as for doing the additional tests, which are always welcome. In my requests I ask for a minimum that answers my questions. I know that each additional test takes additional time to complete; I respect your time and am hugely grateful that you are here sharing invaluable information.

However, the results show that each additional test adds important data to the discussion.

In fact, the Vulkan test results show the biggest and most surprising differences. It seemed that Compute_1: 100% and FPS: 25.1 was already the maximum, and here is such a surprise. Especially since Flowframes with Vulkan does not appear to load CUDA at all. Maybe it's a matter of optimization and the number of GPU threads this version allows, in contrast to the version dedicated to CUDA? In any case, this is just a curiosity, because for the Vulkan version to be suitable for real-time interpolation, more radical performance improvements are needed, for example by filling in the missing layer (Interp) in this project: https://github.com/atanmarko/ncnn-with-cuda

The comparison of 720p and 1080p tests (PyTorch) also brings something important.

The FPS for the 1080p file can be assumed to be identical for Flowframes and SVP+vsrife. For the 720p file, however, the advantage of more than 10% in FPS in favour of SVP+vsrife is already significant. It is difficult to deduce the cause. We can see that even the lower CUDA load in Flowframes does not translate into a performance increase. Maybe VRAM bandwidth is the bottleneck? Maybe the data I/O process?

What matters most to me is that using SVP+vsrife doesn't degrade performance, and that matters firstly because of real-time interpolation, and secondly because it shows us where to look for optimization.

Here I want to emphasize that the tests made are not only to satisfy my curiosity, or to provide a benchmark for testing other graphics cards, but above all to serve the development of the use of RIFE for real-time interpolation. The data we already have can be shared with the developer of RIFE and perhaps some optimizations can still be made to improve performance.

There is one more thing that I would like to share with the developer of RIFE, but before that it still needs to be tested:

1. Using the power of CUDA.

Let's take a look at our reference HD file:

original demo video from the creator of RIFE at: https://github.com/hzwer/arXiv2020-RIFE
720p (1280x720), 25FPS, 53 s 680 ms, 4:2:0 YUV, 8 bits
direct link: https://drive.google.com/file/d/1i3xlKb … sp=sharing

Here are the results we have:

real time playback with x2 interpolation and FP16 - Cuda ~40%
real time playback with x2 interpolation and FP32 - did not work at all

re-encoding with x2 interpolation and FP16 - FPS: 63.5 and Cuda: 56%
re-encoding with x2 interpolation and FP32 - FPS: 69.8 and Cuda: 58%

We also have this quote from n00mkrad the creator of Flowframes about using the power of CUDA:

RIFE and other interpolation networks usually have a usage of 80-95%.

https://github.com/JihyongOh/XVFI/issues/7

Looking at all this immediately raises the question: is the RIFE interpolation implemented in Flowframes more optimized than using SVP with RIFE filter for VapourSynth (PyTorch version)?

lwk7454, I have a request: could you please download Flowframes (there is a free version or a donation option at https://nmkd.itch.io/flowframes ) and run comparison tests on the 720p file mentioned above?

All parameters would of course be identical to the tests with SVP and the RIFE filter for VapourSynth (PyTorch version), i.e.:

RIFE CUDA/PyTorch version
re-encoding with x2 interpolation
RIFE model: 3.8
scale=1.0

Variable test parameters:

FP16 - "Fast Mode" checked in "AI Specific Settings"
FP32 - "Fast Mode" unchecked in "AI Specific Settings"

Take a look here: https://www.youtube.com/watch?v=vApZg5EO2j4&t=33s

RIFE CUDA Fast Mode: Utilizes Half-Precision (fp16) to speed things up and reduce VRAM usage, but can be unstable

https://github.com/n00mkrad/flowframes

Test results:

re-encoding speed [FPS]
GPU utilisation [%]

Thank you! This is what it was all about! Very interesting results and a bit surprising.

The first thing that comes to mind after reviewing the results is how powerful the NVIDIA GeForce RTX 3090 is. Nearly 70FPS for HD video using an interpolation algorithm based on machine learning (artificial intelligence) and without downscaling (scale=1.0) is pretty impressive.

The second thing that comes to mind after reviewing the results is how much of the NVIDIA GeForce RTX 3090 graphics card's potential is still unfulfilled: 56-66% CUDA utilization and worse FP16 results compared to FP32.

I think a lot of people don't realize what potential there is in RIFE and don't even look at this thread. Yes, I know that not everyone has a top-of-the-line graphics card (I don't have one yet either), but it is now up to all of us to make progress in the practical use of machine-learning algorithms for real-time interpolation. Up to us, because firstly we can show that we need them, and secondly we can signal where there is still room for improvement in efficiency and quality.

The third thing that comes to mind after reviewing the results is how big the differences are when it comes to real-time interpolation during playback versus during re-encoding: the percentage usage of CUDA power and the possibility of using FP32.

In upcoming posts later today and over the next few days, I will outline the areas that I think would be worth testing based on the results that lwk7454 presented and explain why.

I'm back.

Today I'd like to propose tests that are related to two problems that arose when testing frame interpolation during real-time playback using the RIFE filter for VapourSynth (PyTorch):

1.

lwk7454 wrote:

FP32 did not work at all, images were all frozen and all nodes were at 0%.

2.

lwk7454 wrote:

FP16 1080p: Cuda jumps between 35% - 51%, SVP index N/A

My proposition is to run the same tests again as last time, but this time not during real-time playback but during re-encoding of video files.

The re-encoding speed, expressed in frames per second, will give us a very precise measure of the potential of the fastest graphics card on the market today: the NVIDIA GeForce RTX 3090. A value below 48 FPS for the 1080p file would tell us that the current RIFE 3.8 model is too slow for real-time interpolation without downscaling (scale=1.0).

Comparing the results of a 720p and 1080p file will give us an answer as to whether the load increases linearly with the number of pixels and give us benchmarks for comparison with other graphics cards.
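Both reference numbers here are plain arithmetic from the file specs already listed, nothing more:

```python
# Real-time threshold: x2 interpolation of the 23.976 fps 1080p file
# must be re-encoded faster than the output frame rate.
target_fps = 23.976 * 2

# Pixel-count ratio 1080p vs 720p: the expected slowdown if RIFE cost
# scales linearly with the number of pixels.
pixel_ratio = (1920 * 1080) / (1280 * 720)

print(target_fps)   # 47.952
print(pixel_ratio)  # 2.25
```

So "above 48 FPS" is just the rounded x2 output rate of the 23.976 fps source, and a linear model predicts 1080p running about 2.25x slower than 720p.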

Tests at FP32 precision can confirm where the problem lies. I hope everything will work and we will see how performance drops relative to FP16.

Percentage load tests on the graphics card will show us if it is possible to interpolate at full power.

And here are the details of the proposed 4 tests:

Fixed test parameters:

Test-Time Augmentation: Enabled [sets RIFE filter for VapourSynth (PyTorch)]
re-encoding with x2 interpolation
RIFE model: 3.8
scale=1.0

Variable test parameters:

Math precision: FP16 and FP32
Video files: 720p and 1080p

Test results:

re-encoding speed [FPS]
GPU utilisation [%]

Video files:

Test file 1:
original demo video from the creator of RIFE at: https://github.com/hzwer/arXiv2020-RIFE
720p (1280x720), 25FPS, 53 s 680 ms, 4:2:0 YUV, 8 bits
direct link: https://drive.google.com/file/d/1i3xlKb … sp=sharing

Test file 2:
video tested earlier on this thread by dlr5668
source.mkv
1080p (1920x1080), 23.976 FPS, 35 s 994 ms, 4:2:0 YUV, 8 bits
https://www.playbook.com/s/vadash/wcVXb … Ky5jaQMkLs



lwk7454, I would be very grateful to you if you could find some time for these 4 tests. I am very curious about the results.


Here are the tests done for RIFE interpolation in Flowframes. The last update was 9 months ago, so it's not the fastest 3.8 model, and we don't know whether FP32 or FP16 was used. Graphics card drivers and the Windows OS have also changed since then:

https://github.com/n00mkrad/flowframes/ … chmarks.md

Thanks lwk7454 for another test and showing how hardware scheduling in Windows 11 with RIFE works in practice.


lwk7454 wrote:

FP16 720p: Cuda ~40%, SVP index 1.0
FP16 1080p: Cuda jumps between 35% - 51%, SVP index N/A

The result surprised me a bit. I expected it would be 100% and everything would be clear. That's why such tests are worth doing. It looks like, for some reason, 50% CUDA utilization is the ceiling. Or maybe CUDA is not the bottleneck? It's probably not the amount of VRAM either, because the 3090 has plenty; maybe it's VRAM bandwidth, similar to when RAM bandwidth is the bottleneck in the native SVP algorithm: https://www.svp-team.com/forum/viewtopic.php?id=6349
Or maybe, as you wrote earlier, there is still potential for optimization? After all, 40% or even 46% GPU utilization for 720p gives hope for the future of 1080p.


lwk7454 wrote:

FP32 did not work at all, images were all frozen and all nodes were at 0%

And here's another surprise. Looks like there is a problem somewhere in the software.


lwk7454 wrote:

Turns out only Cuda node was used.

Here is confirmation that the RIFE filter for VapourSynth (PyTorch) indeed uses CUDA, while the RIFE filter for VapourSynth (ncnn) does not: https://www.svp-team.com/forum/viewtopi … 96&p=2 There is hope, though, that the latter filter will also use CUDA, as Chainik mentioned in https://github.com/nihui/rife-ncnn-vulkan/issues/22 pointing to this unfinished project: https://github.com/atanmarko/ncnn-with-cuda Maybe some talented interpolation enthusiast will fill in the one missing layer (Interp) of that project?

Until then, however, it is worth focusing on what currently gives us the most potential for real-time interpolation, namely the RIFE filter for VapourSynth (PyTorch).

I would be very interested to see the results of a few more simple tests, which may show where the cause of the FP32 problems lies and what the potential is for interpolating 1080p files. However, I will be away from the forum for 2 days and will try to describe it all on 3 November. I wouldn't want to ask for something when I won't have the opportunity to thank you and comment on the results. I will try to get it all worded properly by November 3.

Thanks again for the tests, and I encourage others to check how real-time RIFE interpolation performs on their graphics cards, other media players, or other Windows versions.

lwk7454, I think it would now be worth checking again what the actual GPU load is (most loaded nodes) with the already-tested reference 720p file at FP16, and whether 100% node saturation is reached at FP32.

Please also check whether 100% node saturation is reached with the 1080p file that was tested earlier in this thread.

source.mkv
Video: 1080p (1920x1080), 23.976 FPS, 35 s 994 ms, 4:2:0 YUV, 8 bits
https://www.playbook.com/s/vadash/wcVXb … Ky5jaQMkLs

If you have a link to any other interesting interpolation test demo (typical 720p or 1080p, 23.976 FPS, 8 bits) that anyone can download now or in the future for comparison testing, let us know.

Fixed test parameters:

Test-Time Augmentation: Enabled [sets RIFE filter for VapourSynth (PyTorch)]
real time playback with x2 interpolation
RIFE model: 3.8
scale=1.0

Variable test parameters:

Math precision: FP16 and FP32
Video files: 720p and 1080p

Test results:
% GPU utilisation - most loaded GPU nodes

The results can be compared to other existing graphics cards, future graphics cards and future better and more efficient RIFE models!