Release candidate available

The eigs test passes with OpenBLAS if I set OPENBLAS_NUM_THREADS to 8 (the number of actual cores) or less. The simple benchmark “a=randn(4000); tic; inv(a)*a; toc” also runs faster than with the default (I assume 16 threads). Perhaps we should change it in the install somehow.

test subplot fails invariably.

P.S. I just checked: with 4 threads the run takes about 5% longer than with 8 threads (1.8 s vs. 1.7 s), and it takes 2.1 s with the default (16?) threads. So underestimating the number of threads looks better to me than overestimating.

You could compile an .oct file from this:
openblas_get_num_threads.cc (1.6 KB)

For me:

>> mkoctfile openblas_get_num_threads.cc
>> openblas_get_num_threads ()
ans = 4
>>

I meant to ask whether the install program can figure out the number of actual cores (rather than threads) and set OPENBLAS_NUM_THREADS accordingly. I am pretty sure OpenBLAS uses 2x the core count on a CPU with hyperthreading. Perhaps this is why you do not see this problem on an i5 CPU (number of threads == number of cores).

Before we try to do that, it would be nice to know where we currently are.
Does OpenBLAS use the “hyper”-threads by default?

@dasergatskov and @mmuetzel I may be wrong, but the default of OpenBLAS is to use the number of available threads. If the number of cores is not equal to the number of threads, performance is degraded.

My machine has 4 threads but only 2 cores.
If I call octave without any other argument I get poor performance in matrix-matrix multiplication and all 4 virtual-cores are used.
However, if I call
OPENBLAS_NUM_THREADS=2 octave
the performance is MUCH better.
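
For anyone who wants to script the same thing, here is a minimal Python sketch of starting a program with OPENBLAS_NUM_THREADS set only for that process. The helper names `env_with_openblas_threads` and `launch_octave` are made up for illustration, and the sketch assumes an `octave` executable is on the PATH.

```python
import os
import subprocess


def env_with_openblas_threads(num_threads, base_env=None):
    """Return a copy of the environment with OPENBLAS_NUM_THREADS set."""
    # Copy so the caller's (or the current process's) environment is untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return env


def launch_octave(num_threads):
    """Start Octave with the given thread limit (assumes `octave` is on PATH)."""
    return subprocess.run(["octave"], env=env_with_openblas_threads(num_threads))
```

Calling `launch_octave(2)` would then be roughly equivalent to the shell line above, without changing the variable for anything else.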

And concerning Octave 6.2.90: no problems with the build process.
“make check” passed with 0 failures.

Yes.
I am away from my Win10 laptop at the moment, but on my linux desktop (Ryzen w/ 16 cpu cores) I see

octave:1> openblas_get_num_threads
ans = 32

When you are back to your Win10 laptop, could you please check what you get for these commands?

>> getenv('NUMBER_OF_PROCESSORS')
ans = 4
>> system('wmic CPU Get NumberOfCores')
NumberOfCores
4

ans = 0
>> system('wmic CPU Get NumberOfEnabledCore')  # Only available since Windows 10
NumberOfEnabledCore
4

ans = 0
>> system('wmic CPU Get NumberOfLogicalProcessors')
NumberOfLogicalProcessors
4

ans = 0

>> getenv('NUMBER_OF_PROCESSORS')
ans = 16
>> system('wmic CPU Get NumberOfCores')
NumberOfCores
8

ans = 0
>> system('wmic CPU Get NumberOfEnabledCore')
NumberOfEnabledCore
8

ans = 0
>> system('wmic CPU Get NumberOfLogicalProcessors')
NumberOfLogicalProcessors
16

ans = 0
>>
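
Output like the above could be turned into a thread count along these lines. This is only a sketch in Python, not what the installer or octave.vbs actually does. The function names are hypothetical, and it assumes wmic prints a header line followed by one numeric row per CPU package.

```python
def physical_cores_from_wmic(output):
    """Sum the numeric rows printed by `wmic CPU Get NumberOfCores`."""
    total = 0
    for line in output.splitlines():
        field = line.strip()
        # Skip the header and blank lines; keep only the numeric rows.
        if field.isdigit():
            total += int(field)
    if total == 0:
        raise ValueError("no core count found in wmic output")
    return total


def openblas_threads(physical_cores, logical_processors):
    """Prefer the physical core count, never exceeding the logical count."""
    return max(1, min(physical_cores, logical_processors))


sample = "NumberOfCores\n8\n\n"
cores = physical_cores_from_wmic(sample)
print(openblas_threads(cores, 16))
```

With the numbers from the second transcript (8 cores, 16 logical processors) this picks 8; with the first one (4 and 4) it picks 4.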

Thanks.
I assume openblas_get_num_threads () returns 16?

Yes.

>> cd Downloads/
>> mkoctfile openblas_get_num_threads.cc
>> openblas_get_num_threads
ans = 16
>>

BTW I installed octave-2021-06-08-00-28-default-w64-installer.exe from octave.space.
With that build, test eigs passes and test subplot fails. So the eigs failure appears to be a bug in OpenBLAS.

p.s. Fedora 34 has openblas 0.3.15

Hmm… We might want to update OpenBLAS in that case.
However, that change would have a bigger impact (than the updated Octave Forge package).

@jwe and others: Do we want to do that anyway? If I understand correctly, some numeric operations are failing on some CPUs with the version of OpenBLAS that we are currently packaging.
Is that enough of a reason for an update?
If we do, should we create a new candidate of the Windows installer?

Anyway, I played around a little bit with the .vbs script.
@dasergatskov: Could you replace the octave.vbs in Octave’s installation folder with the one here?
octave.vbs (3.3 KB)
Does that set OPENBLAS_NUM_THREADS correctly?

@dasergatskov: In the meantime, could you please show a screenshot of the figure after these commands?

close all
figure ();
for ii = 1:9
  hax(ii) = subplot (3,3,ii);
endfor
subplot (2,1,1);

For me, it looks like this:

It appears to work:

>> getenv('NUMBER_OF_PROCESSORS')
ans = 16
>> cd Downloads/
>> openblas_get_num_threads
ans = 8
>>

Here it is with qt

and with gnuplot

Does it make a difference if you run the commands line by line or in a script?

@mmuetzel: If it fixes some bugs, then it’s fine with me to update the OpenBLAS version. I also don’t mind creating a new release candidate but it may take a day or two for me to create and upload.

It does not make any difference. I also tried adding pause() here and there, but the result is the same.
Also “demo subplot” seems to run fine.

I grafted the OpenBLAS update to the release branch here:
mxe-octave: 580d6e6b0025

IIUC, subplot should delete any axes that the new one overlaps. For some reason that doesn’t seem to be happening for you.
Is the extent calculated correctly?
For me:

>> close all
>> figure ();
>> for ii = 1:9
  hax(ii) = subplot (3,3,ii);
endfor
>> get(hax(4), 'position')
ans =

   0.1300   0.4255   0.1818   0.1840

>> hax_new = subplot (2,1,1);
>> get(hax_new, 'position')
ans =

   0.1300   0.5999   0.7750   0.3251

Also: Is this a regression? Does it work correctly in Octave 6.2 or Octave 5.2?
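
To make the overlap question concrete, here is a small Python sketch that checks whether two axes rectangles in [left bottom width height] form intersect. It is a plain rectangle-intersection test, not Octave's actual subplot logic (which may compare different properties), and the function name `rects_overlap` is made up.

```python
def rects_overlap(a, b):
    """Return True if two (left, bottom, width, height) rectangles intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # The rectangles intersect when their x ranges and y ranges both overlap.
    x_overlap = ax < bx + bw and bx < ax + aw
    y_overlap = ay < by + bh and by < ay + ah
    return x_overlap and y_overlap


old_axes = (0.1300, 0.4255, 0.1818, 0.1840)  # hax(4) from the transcript above
new_axes = (0.1300, 0.5999, 0.7750, 0.3251)  # subplot (2,1,1)
print(rects_overlap(old_axes, new_axes))
```

With the positions shown above, the top edge of hax(4) is at 0.4255 + 0.1840 = 0.6095, which is just above the bottom edge of the new axes at 0.5999, so by this test the two rectangles do overlap.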