Topic: ONNX model / TensorRT errors
[W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[W] Could not read timing cache from: C:\Users\......\AppData\Roaming\SVP4\cache\Program Files (x86)/SVP 4/rife\models\rife\rife_v4.4.onnx.min64x64_opt2560x1440_max2560x1440_fp16_trt-8502_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-GTX-1070_a8b3b7a9.engine.cache.
[TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-pro … l#env-vars
Is something wrong with NVIDIA TensorRT?
What should I do?
I have read the "Module loading" entry for CUDA_MODULE_LOADING in the CUDA environment variables table:

Variable: CUDA_MODULE_LOADING
Values: DEFAULT, LAZY, EAGER
Description: Specifies the module loading mode for the application. When set to EAGER, all kernels from a cubin, fatbin, or PTX file are fully loaded upon the corresponding cuModuleLoad* API call. This is the same behavior as in all preceding CUDA releases. When set to LAZY, loading of a specific kernel is delayed until a CUfunction handle is extracted with the cuModuleGetFunction API call. This mode lowers initial module loading latency and decreases initial module-related device memory consumption, at the cost of higher latency for the cuModuleGetFunction API call. The default behavior is EAGER, though this may change in future CUDA releases.
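As far as I can tell, that description maps onto the driver API sequence below. This is only a minimal sketch I put together to make sense of it — the file name kernels.ptx and the kernel name myKernel are made up, and error handling is reduced to a macro:

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) { \
        fprintf(stderr, "%s failed (%d)\n", #call, (int)r); exit(1); } } while (0)

int main(void) {
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));
    CHECK(cuCtxCreate(&ctx, 0, dev));

    /* EAGER (the old default): every kernel in the file is fully
       loaded right here, during cuModuleLoad. */
    CHECK(cuModuleLoad(&mod, "kernels.ptx"));

    /* LAZY: loading of this one kernel is deferred until here, so the
       cuModuleLoad above becomes cheaper and uses less device memory,
       at the cost of this call becoming slower. */
    CHECK(cuModuleGetFunction(&fn, mod, "myKernel"));

    CHECK(cuModuleUnload(mod));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}

So with LAZY, the work (and the device memory for the kernels) moves from cuModuleLoad to cuModuleGetFunction, which is apparently why the warning says enabling it "can significantly reduce device memory usage".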
But I still don't understand what this means in practice!
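If I'm reading the warning right, enabling lazy loading just means setting the CUDA_MODULE_LOADING environment variable to LAZY before anything in the process initializes CUDA — for SVP I assume setting it as a Windows user/system environment variable before launch would have the same effect. A minimal sketch of the idea in code:

#include <stdlib.h>
#include <cuda.h>

int main(void) {
    /* The variable is read when CUDA initializes, so it must be set
       before the first CUDA call made by the process. */
#ifdef _WIN32
    _putenv_s("CUDA_MODULE_LOADING", "LAZY");
#else
    setenv("CUDA_MODULE_LOADING", "LAZY", 1);
#endif
    cuInit(0);
    /* ... rest of the application ... */
    return 0;
}

Is that the right approach?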