BISTs with high memory consumption

Some built-in self tests (BISTs) are failing on the 32-bit Windows GitHub runners with the error “out of memory or dimension too large for Octave's index type”.

This might be because the algorithms used by the tested functions use memory inefficiently. Or it might be that the tests themselves require an excessive amount of memory.

Should the BISTs be reworked to depend less on the capabilities (like total available memory) of the machine they are running on? I.e. should we use smaller matrices in the tests?

The failing tests are:

>>>>> processing C:\msys64\mingw32\share\octave\7.0.0\m\image\rgb2ind.m
***** test
 ## this should have more than 65536 unique colors
 rgb = rand (1000, 1000, 3);
 [ind, map] = rgb2ind (rgb);
 assert (class (ind), "double");
 assert (class (map), "double");

 ## and this should have between 255 and 65536 unique colors
 rgb = rand (20, 20, 3);
 [ind, map] = rgb2ind (rgb);
 assert (class (ind), "uint16");
 assert (class (map), "double");

 ## and this certainly less than 256 unique colors
 rgb = rand (10, 10, 3);
 [ind, map] = rgb2ind (rgb);
 assert (class (ind), "uint8");
 assert (class (map), "double");
!!!!! test failed
out of memory or dimension too large for Octave's index type
>>>>> processing C:\msys64\mingw32\share\octave\7.0.0\m\plot\util\copyobj.m
***** testif HAVE_MAGICK; any (strcmp ("gnuplot", available_graphics_toolkits ()))
 toolkit = graphics_toolkit ();
 graphics_toolkit ("gnuplot");
 unwind_protect
   h1 = figure ("visible", "off", "paperposition", [0.25, 2.5, 8.0, 6.0]);
   x = 0:0.1:2*pi;
   y1 = sin (x);
   y2 = exp (x - 1);
   ax = plotyy (x,y1, x-1,y2, @plot, @semilogy);
   xlabel ("X");
   ylabel (ax(1), "Axis 1");
   ylabel (ax(2), "Axis 2");
   axes (ax(1));
   text (0.5, 0.5, "Left Axis", ...
         "color", [0 0 1], "horizontalalignment", "center");
   axes (ax(2));
   text (4.5, 80, "Right Axis", ...
         "color", [0 0.5 0], "horizontalalignment", "center");
   s1 = hdl2struct (h1);
   h2 = struct2hdl (s1);
   s2 = hdl2struct (h2);
   png1 = [tempname() ".png"];
   png2 = [tempname() ".png"];
   unwind_protect
     print (h1, png1);
     [img1, map1, alpha1] = imread (png1);
     print (h2, png2);
     [img2, map2, alpha2] = imread (png2);
   unwind_protect_cleanup
     unlink (png1);
     unlink (png2);
   end_unwind_protect
   assert (img1, img2);
   assert (map1, map2);
   assert (alpha1, alpha2);
 unwind_protect_cleanup
   close (h1);
   close (h2);
   graphics_toolkit (toolkit);
 end_unwind_protect
!!!!! test failed
out of memory or dimension too large for Octave's index type
>>>>> processing C:\msys64\mingw32\share\octave\7.0.0\m\plot\util\hgsave.m
***** testif HAVE_MAGICK; (have_window_system () && __have_feature__ ("QT_OFFSCREEN") && strcmp ("qt", graphics_toolkit ())) || strcmp ("gnuplot", graphics_toolkit ());
 h1 = figure ("visible", "off", "paperposition", [0.25, 2.5, 8.0, 6.0]);
 unwind_protect
   x = 0:0.1:2*pi;
   y1 = sin (x);
   y2 = exp (x - 1);
   ax = plotyy (x,y1, x-1,y2, @plot, @semilogy);
   xlabel ("X");
   ylabel (ax(1), "Axis 1");
   ylabel (ax(2), "Axis 2");
   axes (ax(1));
   text (0.5, 0.5, "Left Axis", ...
         "color", [0 0 1], "horizontalalignment", "center");
   axes (ax(2));
   text (4.5, 80, "Right Axis", ...
         "color", [0 0.5 0], "horizontalalignment", "center");
   ftmp = [tempname() ".ofig"];
   png1 = [tempname() ".png"];
   png2 = [tempname() ".png"];
   unwind_protect
     hgsave (h1, ftmp);
     print (h1, png1);
     [img1, map1, alpha1] = imread (png1);
     h2 = hgload (ftmp);
     print (h2, png2);
     [img2, map2, alpha2] = imread (png2);
   unwind_protect_cleanup
     unlink (ftmp);
     unlink (png1);
     unlink (png2);
   end_unwind_protect
   assert (img1, img2);
   assert (map1, map2);
   assert (alpha1, alpha2);
 unwind_protect_cleanup
   close (h1);
   close (h2);
 end_unwind_protect
!!!!! test failed
out of memory or dimension too large for Octave's index type
>>>>> processing C:\msys64\mingw32\share\octave\7.0.0\m\sparse\gmres.m
***** test
 dim = 100;
 A = spdiags ([[1./(2:2:2*(dim-1)) 0]; 1./(1:2:2*dim-1); ...
 [0 1./(2:2:2*(dim-1))]]', -1:1, dim, dim);
 A = A'*A;
 b = rand (dim, 1);
 [x, resvec] = gmres (@(x) A*x, b, dim, 1e-10, dim,...
                      @(x) x./diag (A), [], []);
 assert (x, A\b, 1e-9*norm (x, Inf));
 [x, flag] = gmres (@(x) A*x, b, dim, 1e-10, 1e6,...
                    @(x) diag (diag (A)) \ x, [], []);
 assert (x, A\b, 1e-9*norm (x, Inf));
 [x, flag] = gmres (@(x) A*x, b, dim, 1e-10, 1e6,...
                    @(x) x ./ diag (A), [], []);
 assert (x, A\b, 1e-7*norm (x, Inf));
!!!!! test failed
out of memory or dimension too large for Octave's index type

I’m definitely in favor of reducing the memory usage in the tests where possible.

For the first test, I think we could use something like

rgb = nchoosek (0:80, 3) / 80;
rgb = reshape (rgb, [1, rows(rgb), 3]);

instead of

rand (1000, 1000, 3)

OTOH, if we are specifically trying to test 64-bit indexing then we will probably have large memory requirements. Maybe we should have a separate set of tests for that.

I agree that we should have dedicated tests if we want to check 64-bit indexing. (I haven’t checked if those tests do that anyway.)

Your proposed change for the first test looks good to me. IIUC, it would reduce the size of the first matrix by a factor of almost 12, which could be significant.
Would you like to do the change yourself? If you prefer not to, I could also do that change for you.
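As a quick sanity check of the proposed replacement, here is a minimal sketch. The variable names n_colors and ratio are only introduced for illustration. It confirms that the nchoosek-based image still has more than 65536 unique colors and estimates how much smaller it is than rand (1000, 1000, 3):

```octave
## Proposed replacement for rand (1000, 1000, 3) in the rgb2ind test.
rgb = nchoosek (0:80, 3) / 80;

## Every row is a distinct 3-combination of 0:80, so every color is unique.
n_colors = rows (unique (rgb, "rows"));
assert (n_colors, nchoosek (81, 3));
assert (n_colors > 65536);

## Size ratio in elements (both arrays are of class double).
ratio = (1000 * 1000 * 3) / numel (rgb);
printf ("%d unique colors, %.1fx fewer elements\n", n_colors, ratio);

## Shape it like an image: 1 row, one column per color, 3 channels.
rgb = reshape (rgb, [1, rows(rgb), 3]);
```

The ratio comes out a little under 12, which matches the estimate above.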

The second and third test will probably also work when using lower-res images.

For the last test, see also bug #57591.
I don’t see immediately which matrix is exceedingly large in that test. The sparse matrix A with less than 300 set elements doesn’t look very large. Also, the 100×100 full matrix (A*x assuming x is full) doesn’t look too bad to me.
Having written that, I didn’t check what gmres actually does. Maybe, these matrices “explode” to much larger ones in its code.
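To put rough numbers on the matrix sizes in the gmres test, one could run something like the sketch below. It only inspects the test's input matrix, not anything gmres allocates internally:

```octave
dim = 100;
A = spdiags ([[1./(2:2:2*(dim-1)) 0]; 1./(1:2:2*dim-1); ...
              [0 1./(2:2:2*(dim-1))]]', -1:1, dim, dim);
printf ("tridiagonal A: nnz = %d\n", nnz (A));

## The product of two tridiagonal matrices has at most 5 diagonals.
A = A'*A;
printf ("A'*A:          nnz = %d\n", nnz (A));
assert (nnz (A) <= 5*dim);

## Even a dense 100x100 double matrix would only need 80000 bytes.
printf ("dense equivalent: %d bytes\n", dim * dim * 8);
```

Either way, the inputs stay far below anything that should exhaust memory by themselves.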

“test gmres” seems to be leaking some memory. The RES memory size of Octave seems to increase with the number of “test gmres” runs. Try

for jj=1:100; test gmres; endfor

a few times…


I think the following change should avoid the memory leak you are seeing with the test:

http://hg.savannah.gnu.org/hgweb/octave/rev/d2294eff5180

I pushed the following changeset for the rgb2ind test:

http://hg.savannah.gnu.org/hgweb/octave/rev/df8982134c3b

Memory leak fix does not seem to apply to stable…

The change wasn’t intended for the stable branch. I found that problem using the address sanitizer and running the test. I’ll try the same with stable and see whether there is another obvious leak.

@dasergatskov I tried running test gmres in a loop on my system with the current stable sources and I’m not seeing any increase in process size while watching with top. I did see a leak with the default branch but that was fixed by my change for numeric tokens.

This is what I got with 6.2.92. RES is what “top” RES memory shows. The run is

for jj=1:100 ; test gmres ; endfor

I started octave as ./run-octave -q -f; the data point at N=0 is the initial reading.

# N	RES
0	92496
1	229176
2	303248
3	369500
4	432108
5	502292

Here is the plot:
[image: mem1]

I tried compiling with ASAN; it does not find anything. It is possible the issue is with either sparse libs or openblas or some other lib.

It does look like the problem is due to openblas (0.3.15) (I used OMP interface).
When using reference blas I see a RES increase from 92304 to 109800 after the first run, but then it stays the same.

I also do not see the leak if I run with openblas, but set OMP_NUM_THREADS=1.
If I set it to 8 (which is still less than the number of actual CPU cores), I see the leak again, but it is smaller.

Hmm. But if I use a pure blas benchmark

a=randn(1000);for jj=1:100; inv(a)*a; endfor 

I do not see the leak with openblas. So is it suitesparse (5.4.0) that has a problem with blas threads?

I reviewed the second failing image test just to get an idea about what was going on. The test plot is an RGB image that is 900x1200x3 of type uint8, so memory consumption is 3.24e6 bytes for each of img1 and img2. Because figures 1 and 2 are still open during the test, it is possible that there is additional memory usage of that same size for each of h1 and h2. Overall, that suggests up to ~13 MB of memory is used. How much memory should a GitHub runner have? This doesn’t sound like a lot of memory to me.

If we want to cut down on memory I see two choices:

  1. Close figures immediately after printing, although it is uncertain how much this actually saves. Tests with memory seemed to suggest .15 MB per figure.

  2. Call print with -rXX where XX is a lower resolution such as 50. This would cut memory usage to .36 MB per figure.
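The image sizes quoted above can be reproduced with a little arithmetic. This is a sketch that assumes the default raster resolution of print is 150 DPI, which is consistent with the 900x1200 image mentioned earlier. The variable names are only for illustration:

```octave
paper_in = [8.0, 6.0];     # width, height in inches from "paperposition"
for dpi = [150, 50]
  px = paper_in * dpi;     # [width, height] in pixels
  bytes = prod (px) * 3;   # uint8 RGB: one byte per channel
  printf ("-r%d: %dx%dx3 uint8 = %.2f MB\n", dpi, px(2), px(1), bytes / 1e6);
endfor
```

In the tests themselves, option 2 would presumably amount to calls like print (h1, png1, "-r50").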

While running “test gmres” in a loop, I see Octave’s RES memory consumption swing by about 0.5g
(it peaked at ~1.4g and settled at 0.84g after the loop finished).

The specifications of GitHub’s free runners (“free” as in “free beer”) are here:
About GitHub-hosted runners - GitHub Docs
The Windows runners should have 7 GiB of RAM. IIRC, 32-bit programs cannot address more than 4 GiB of memory. That could be decreased further by addresses reserved for hardware (e.g. graphics card). I don’t know how much that would effectively be on these runners. But the total memory available for Octave is probably not less than 3.5 GiB.
While that isn’t very much, it should be more than enough for the matrix sizes you cited.

Given those sizes, the images themselves are probably not causing the issue.
I didn’t look at it in any detail. Just guessing: Could it be the struct objects s1 and s2 in the test for copyobj? (Btw, is there a missing assert (s1, s2) in the test? Or could we just remove s2 completely?) IIRC, I read somewhere that struct objects have rather large memory overhead in Octave…

“out of memory or dimension too large for Octave’s index type” is not an OS error but Octave’s own diagnostic.
It looks to me like there is some kind of memory corruption, due to a race condition or something similar, that puts junk (a very large integer) into an Octave array index.

IIUC, Octave shows that error message (also) if malloc failed. (That is quite unlikely with over-provisioning on Linux. But more likely on Windows.)

The error seems to be reproducible 100% of the time (at least with 32-bit Octave on the Windows GitHub runners). So, I’m not sure if it is due to a race condition.

OK. On 32-bit there is an issue with memory fragmentation. Also, printing to png by default involves ghostscript, which puts additional load on system memory (and potentially adds race condition issues).
Could/should we change the code here so that printing to png is done with -svgconvert? That would eliminate ghostscript from the process.

IIUC, on a 64-bit Windows, each 32-bit process has that limit of 4-x GiB. So, unless that reduces the total free physical memory significantly, spawning ghostscript shouldn’t reduce the total memory available for Octave.
But you are right, IIRC, memory fragmentation is a thing for 32-bit processes (also on a 64-bit Windows).

Nevertheless, IIUC, we are trying to move towards -svgconvert as the default printing toolchain. (Is that correct?) So we could try switching to that toolchain anyway to give it a bit more coverage.

This isn’t a bad idea, although does it apply when the output is PNG? I thought -svgconvert was useful when using the -painters renderer which is used for page-oriented formats like PDF. For image formats like PNG or JPEG I think the toolchain goes through OpenGL and ghostscript.

Given the sizes of the files, ~3.25 MB, I’m surprised that an instance with gigabytes of memory is failing. It could be due to memory fragmentation, I suppose.

Actually, I’m wondering if the problem is gnuplot? The copyobj.m test forces the use of gnuplot because the test was created before we had offscreen rendering capabilities with Qt. Given that we don’t actively support, nor recommend, that programmers use gnuplot I think we should change all the BIST tests to stop using gnuplot.

Edited: 7/3/21

I checked in a change to copyobj.m such that the BIST test only uses the “qt” toolkit, not gnuplot. See octave: 8c60542cf30c. If this stops the issue with a failing test then I think we should extend this universally to the BIST tests and avoid gnuplot entirely.