That difference looks like it is caused by using the wrong function (matrix) for the color space conversion.
For example:
YUV --(BT.709 conversion) --> RGB --(BT.601 conversion) --> YUV
This can be checked using the following script.
import vapoursynth as vs

core = vs.core
core.num_threads = 4

# VpsFilterSource is the source clip provided by VapourSynth Filter
input_clip = VpsFilterSource
input_YUV = input_clip.resize.Point(format=vs.YUV420P16)

# Round trip: YUV -> RGB -> YUV, using the same matrix in both directions
clip_RGB = input_YUV.resize.Point(format=vs.RGBS, matrix_in_s="709")
clip_YUV = clip_RGB.resize.Point(format=vs.YUV420P16, matrix_s="709")

# expr: |x - y| on the Y plane only; the empty string copies U and V unchanged.
# abs is needed because integer output would clamp negative differences to 0.
expr = 'x y - abs'
Diff = core.std.Expr([input_YUV, clip_YUV], [expr, ''])
Diff = core.std.PlaneStats(Diff)

# Show the difference statistics (PlaneStats of the Y plane) on the output.
# If PlaneStatsMax is 0, the Y planes are exactly the same; non-zero means there is a difference.
clip_YUV = core.std.CopyFrameProps(clip_YUV, Diff)
clip_YUV = core.text.FrameProps(clip_YUV)
clip_YUV.set_output()

Incorrect conversion (BT.709 on the way in, BT.601 on the way out):
clip_RGB = input_YUV.resize.Point(format=vs.RGBS, matrix_in_s="709")
clip_YUV = clip_RGB.resize.Point(format=vs.YUV420P16, matrix_s="470bg")

Precision loss (from going through 8-bit integer RGB instead of floating point):
clip_RGB = input_YUV.resize.Point(format=vs.RGB24, matrix_in_s="709")
clip_YUV = clip_RGB.resize.Point(format=vs.YUV420P16, matrix_s="709")
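The matrix mismatch in the "Incorrect conversion" case can also be illustrated without VapourSynth. The following is only a minimal sketch: it works on a single normalized pixel (Y in [0, 1], U and V in [-0.5, 0.5]), ignores limited/full range and quantization, and the helper names (`yuv_to_rgb`, `rgb_to_yuv`, `round_trip`) and the sample pixel are made up for the illustration. Only the luma coefficients are taken from BT.709 and BT.601.

```python
# Luma coefficients (Kr, Kb) of the two matrices discussed above.
BT709 = (0.2126, 0.0722)
BT601 = (0.299, 0.114)  # selected with "470bg" in the script above


def yuv_to_rgb(y, u, v, coeffs):
    """Convert one normalized YUV pixel to RGB with the given matrix."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * v
    b = y + 2.0 * (1.0 - kb) * u
    g = (y - kr * r - kb * b) / kg
    return r, g, b


def rgb_to_yuv(r, g, b, coeffs):
    """Convert one normalized RGB pixel to YUV with the given matrix."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    u = (b - y) / (2.0 * (1.0 - kb))
    v = (r - y) / (2.0 * (1.0 - kr))
    return y, u, v


def round_trip(yuv, decode, encode):
    """YUV -> RGB with `decode`, then RGB -> YUV with `encode`."""
    return rgb_to_yuv(*yuv_to_rgb(*yuv, decode), encode)


pixel = (0.5, 0.1, -0.2)  # arbitrary sample value, not from a real video
same = round_trip(pixel, BT709, BT709)
mixed = round_trip(pixel, BT709, BT601)

print("input:     ", pixel)
print("709 -> 709:", [round(c, 4) for c in same])
print("709 -> 601:", [round(c, 4) for c in mixed])
```

With matching matrices the pixel comes back unchanged apart from floating-point rounding, while the mixed round trip shifts both luma and chroma, which is the kind of difference the PlaneStats check above should make visible.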

My request is about the video resolution, not the color space.
The most common video format is YUV420, which has three planes (Y, U, and V), where the width and height of the U and V planes are each only half those of the Y plane.
In a 2160p YUV420 video, the Y plane is 2160p, while the U plane and the V plane are each 1080p.
In the YUV444 format, the Y, U, and V planes all have the same resolution.
The RGB format also has three planes (R, G, and B), and all three have the same resolution.
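The plane sizes described above can be worked out with a few lines of Python. This is just a sketch; `plane_sizes` is a hypothetical helper, and the subsampling arguments are log2 factors, the same convention as the `subsampling_w` and `subsampling_h` fields of a VapourSynth format (1, 1 for YUV420; 0, 0 for YUV444).

```python
def plane_sizes(width, height, subsampling_w, subsampling_h):
    """Return the (width, height) of the Y, U and V planes.

    subsampling_w / subsampling_h are log2 factors:
    1, 1 means 4:2:0 (chroma halved in both directions),
    0, 0 means 4:4:4 (chroma at full resolution).
    """
    chroma = (width >> subsampling_w, height >> subsampling_h)
    return {"Y": (width, height), "U": chroma, "V": chroma}


print("2160p YUV420:", plane_sizes(3840, 2160, 1, 1))
print("1080p YUV420:", plane_sizes(1920, 1080, 1, 1))
print("1080p YUV444:", plane_sizes(1920, 1080, 0, 0))
```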
RIFE only supports RGB format, but VapourSynth Filter does not support RGB format output, so the RGB result must be converted back to a YUV format.
If you convert from RGB to YUV420, the U and V planes are reduced to half the width and height, and that chroma information is lost.
If you want to keep the full resolution, you should convert to YUV444 instead.
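To make the loss concrete, here is a toy example on a 4x4 chroma plane. It is only a sketch under simplifying assumptions: the helpers `downsample_2x` and `upsample_2x` are made up, and they use plain 2x2 averaging and sample repetition, whereas a real resizer would use a proper kernel. The effect is the same in kind: detail finer than a 2x2 block cannot survive a trip through YUV420.

```python
def downsample_2x(plane):
    """Halve width and height by averaging each 2x2 block."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def upsample_2x(plane):
    """Double width and height by repeating each sample."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A 4x4 chroma plane with per-pixel detail (a checkerboard of 0 and 100).
chroma = [[(x + y) % 2 * 100 for x in range(4)] for y in range(4)]

kept_444 = chroma                             # YUV444 stores every chroma sample
via_420 = upsample_2x(downsample_2x(chroma))  # YUV420 stores one sample per 2x2 block

print("original first row:   ", chroma[0])
print("via YUV420 first row: ", via_420[0])
print("identical after 420?  ", via_420 == chroma)
```

In the script itself, keeping the chroma would presumably come down to requesting `format=vs.YUV444P10` rather than a 4:2:0 format in the final resize call.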
Now
             4K HDR Video ---> scale(power saving) ---> RGBS Convert(RIFE) ---> YUV420P10 Convert(Output)
Format:        YUV420                YUV444                   RGBS                  YUV420
Y or R:         2160p                 1080p                  1080p                   1080p
U or G:         1080p                 1080p                  1080p                    540p    
V or B:         1080p                 1080p                  1080p                    540p
Depth:          10bit                 10bit                  32bit                   10bit
Wanted
             4K HDR Video ---> scale(power saving) ---> RGBS Convert(RIFE) ---> YUV444P10 Convert(Output)
Format:        YUV420                YUV444                   RGBS                  YUV444
Y or R:         2160p                 1080p                  1080p                   1080p
U or G:         1080p                 1080p                  1080p                   1080p    
V or B:         1080p                 1080p                  1080p                   1080p
Depth:          10bit                 10bit                  32bit                   10bit