I have a program using ort-genai in which GPU memory accumulates ~1 GB per inference iteration under the DirectML execution provider, despite what appears to be proper OgaGenerator cleanup. Memory ...
I've been looking into memory usage when reading large (wide) Parquet files with the Arrow-based API. Two configuration options that greatly reduce memory use are enabling a buffered stream and ...