Here is some code that makes RAM usage grow on every iteration without the memory being garbage collected:
float[] values = new float[2000 * 64 * 64];
for (int i = 0; i < 100; i++)
{
    // Build a new input tensor from the same array on every iteration.
    TensorFloat input = new TensorFloat(new TensorShape(1, 2000, 64, 64), values);
    /* do stuff */
    input.Dispose();
}
The problem is that when the RAM reaches the limit it starts slowing everything down.
I can release the memory by calling System.GC.Collect() every so often, but this is quite slow, so I’d rather either (1) reuse the same RAM each time, (2) not use the RAM in the first place, or (3) free the RAM automatically without the slowdown.
This problem comes up quite a lot with any model that has a fairly big input and that you want to run multiple times. Some inputs can be as big as 100 MB. It mainly comes up if your inputs need to be sent from the CPU/RAM each time, which is not always the case.
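To put rough numbers on the example above, here is a small back-of-the-envelope sketch in plain C# (no Sentis calls). It assumes 4 bytes per float and, as a worst case, that none of the 100 inputs is reclaimed. The class and method names are only illustrative.

```csharp
using System;

public static class InputMemoryEstimate
{
    // Bytes needed for one float32 tensor of the given shape.
    public static long BytesPerInput(params int[] shape)
    {
        long elements = 1;
        foreach (int dim in shape)
            elements *= dim;              // long arithmetic avoids int overflow
        return elements * sizeof(float);  // sizeof(float) is 4
    }

    public static void Main()
    {
        long perInput = BytesPerInput(1, 2000, 64, 64);  // 32,768,000 bytes
        long worstCase = perInput * 100;                 // 3,276,800,000 bytes
        Console.WriteLine($"Per input: {perInput / 1e6:F1} MB");
        Console.WriteLine($"100 unreclaimed inputs: {worstCase / 1e9:F2} GB");
    }
}
```

So each input in the loop is about 32.8 MB, and 100 of them add up to roughly 3.3 GB if nothing is freed in between, which is why the slowdown shows up so quickly.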
Is there a better way of doing this that doesn’t increase the RAM each time? If not, here are a few API suggestions that might solve the issue.
- Can we just send the input values straight to the GPU instead of making a copy of the values in RAM? e.g. something like
input = new TensorFloat(..., device=GPU)
- Could we reuse the input tensor instead of disposing it and creating a new one? e.g. something like
input.UpdateValues(values)
i.e. reusing the same bit of RAM?
Or a combination of the two, e.g. we may want to create the tensor directly on the GPU and then keep updating that same bit of GPU memory.
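To make the reuse suggestion concrete, here is a minimal sketch of the pattern in plain C#. ReusableInput and its UpdateValues method are hypothetical stand-ins written for this post, not Sentis types, and the sketch only covers the CPU side: the storage is allocated once and refilled on each iteration instead of being recreated.

```csharp
using System;

// Hypothetical stand-in for a tensor whose backing storage can be refilled.
// This is NOT a Sentis type; it only illustrates the reuse pattern.
public sealed class ReusableInput
{
    private readonly float[] storage;

    public ReusableInput(int length)
    {
        storage = new float[length];   // allocated once, up front
    }

    public int Length => storage.Length;

    public float this[int index] => storage[index];

    // Copies new values into the existing storage instead of allocating.
    public void UpdateValues(float[] values)
    {
        if (values.Length != storage.Length)
            throw new ArgumentException("Length mismatch", nameof(values));
        Array.Copy(values, storage, values.Length);
    }
}

public static class ReuseDemo
{
    public static void Main()
    {
        float[] values = new float[2000 * 64 * 64];
        var input = new ReusableInput(values.Length);

        for (int i = 0; i < 100; i++)
        {
            values[0] = i;               // pretend the data changes each run
            input.UpdateValues(values);  // no new allocation in the loop
            /* do stuff */
        }

        Console.WriteLine(input[0]);     // prints 99
    }
}
```

With this shape the loop performs 100 copies but only one large allocation, so there is nothing for the garbage collector to clean up between runs.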
This may just be an issue with the latest version, 1.2.0-exp.2 (Unity version: 2021.3.30f1), because I don’t recall it being a problem before. If I remember correctly, calling Dispose used to free the RAM, although I’m not certain.
I guess part of the cause is that it has to keep things in RAM in order to queue them up. So is there a way to tell it to wait for the queue to empty so that the memory can then be reused? e.g. maybe creating a new input should block until the previous one has been loaded onto the GPU before continuing. Just thinking out loud.