Q=A\B --- Matlab vs Octave handling of near-singular matrices

Dear all,

I frequently need to solve
Q=A\B

where RCOND is really really bad (see example)

typical matrices:
A.mat (437 Bytes)
B.mat (235 Bytes)

Matlab result for Q=A\B:
13
-0.00339999999999897
1.89583333333254e-07
6.80920474620880e-49
-4.52149219027753e-52
1.13879271303334e-55
-1.28389900313649e-59
5.45139902882326e-64
“Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 8.275234e-067.”

GNU Octave result for Q=A\B:
2.7663e-104
-7.441e-55
3.6253e-59
1.16e-55
1.7042e-52
-8.341e-56
1.3639e-59
-7.4494e-64
“matrix singular to machine precision, rcond = 8.27523e-67”

The Matlab solution is correct, and the GNU Octave solution is totally wrong.
Any ideas?

Many thanks,
Ken

My system

  • OS: Windows 10 PRO (version 1909)
  • Octave version: Version 6.2.0
  • Installation method: Downloaded and installed “octave-6.2.0-w64-installer.exe” from the official download page

I can confirm your results with Octave 6.3.0 on Windows 10.
If the backslash (\) operator in Octave is implemented similarly to the one in Matlab, it should have used an LU decomposition for your input (see the flow chart in their documentation).
However, if I try to solve the LSE manually using an LU decomposition, the solution is quite different (and closer to what Matlab returns):

load A
load B
[L, U, P] = lu (A);   % permuted LU factorization: P*A = L*U
Y = L \ (P*B);        % forward substitution with the lower factor
QQ = U \ Y;           % back substitution with the upper factor
A*QQ - B              % residual; should be close to zero

The last command returns the following for me:

>> A*QQ - B
ans =

            0
   2.0504e-15
  -1.9016e-18
   5.8842e-14
  -7.1665e-14
   1.5700e-13
   3.5298e-14
  -3.1693e-15

That is much closer to zero than what we get with the Q returned directly by the \ operator.

Could that be a bug in the implementation of mldivide in Octave?

It is, and a pretty old one at that:
https://savannah.gnu.org/bugs/?48564

I don’t think that is the same bug…

Is the behavior you are seeing due to this documented difference between Octave and Matlab?

https://wiki.octave.org/Differences_between_Octave_and_Matlab#Solvers_for_singular.2C_under-_and_over-determined_matrices

I’m not sure if this is just a case where Octave returns a different solution from Matlab.
In this case, Octave’s solution seems to be completely off. The returned value doesn’t seem to be a solution at all:

>> load A

>> load B

>> Q=A\B;
warning: matrix singular to machine precision, rcond = 8.27523e-67

>> A*Q-B

ans =

  -1.3000e+01
   6.8000e-03
  -4.5500e-06
   3.2614e-04
  -3.0760e-04
  -5.2615e-03
   1.8686e-02
  -1.7742e-02

>>

The errors in the solution using an LU decomposition are several orders of magnitude smaller. (See the other comment above.)

Edit: FWIW, I see the same result with OpenBLAS and reference BLAS.

If you use the reference BLAS, do you see the same error?

Could this be a bug in LAPACK?

Octave begins by calling Matrix::solve. That function calls Matrix::fsolve for the given A matrix, which calls DGETRF and then DGECON in the block of code beginning here:

http://hg.savannah.gnu.org/hgweb/octave/file/b24567df50ab/liboctave/array/dMatrix.cc#l1481

DGECON returns with info != 0 so Matrix::solve then calls Matrix::lssolve, which calls DGELSD:

http://hg.savannah.gnu.org/hgweb/octave/file/b24567df50ab/liboctave/array/dMatrix.cc#l1911

I believe this code path works for other matrices. Is it not correct, at least for the intended purpose of finding a minimum norm solution?

FWIW, their documentation: LAPACK: dgelsd (netlib.org)

DGELSD computes the minimum-norm solution to a real linear least
squares problem:
minimize 2-norm(| b - A*x |)
using the singular value decomposition (SVD) of A. A is an M-by-N
matrix which may be rank-deficient.

The solution we get clearly doesn’t satisfy this.
I’m not an expert in that field at all, but it looks like the algorithm didn’t converge.
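
One way to see what a minimum-norm least-squares solution should look like is to build it directly from Octave’s own svd. This is only a rough sketch of what DGELSD is documented to compute, not a reproduction of its internals, and the rank cutoff below is my own choice for illustration:

load A
load B
[U, S, V] = svd (A);
s = diag (S);
tol = eps * s(1);                                % rank cutoff (my assumption)
r = sum (s > tol);                               % singular values treated as nonzero
Qsvd = V(:,1:r) * ((U(:,1:r)' * B) ./ s(1:r));   % truncated-SVD minimum-norm solution
norm (A*Qsvd - B)                                % residual norm of that solution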

The smallest singular value of A is much smaller than the other ones:

>> s=svd(A,0)
s =

   7.5743e+65
   4.1206e+60
   3.8731e+55
   3.9451e+50
   6.6539e+49
   3.2132e+45
   1.1259e+43
   2.5598e-08

Does that mean anything?
I’m reading:

[in] RCOND
RCOND is DOUBLE PRECISION
RCOND is used to determine the effective rank of A.
Singular values S(i) <= RCOND*S(1) are treated as zero.
If RCOND < 0, machine precision is used instead.

IIUC, we are setting rcond to -1.0.
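
To put numbers on that: with rcond = -1.0 the cutoff becomes machine precision times S(1), and applying that criterion to the singular values listed above drops more than just the tiny last one. A quick check in Octave (whether “machine precision” here means eps or eps/2 does not change the count):

s = svd (A);
tol = eps * s(1)    % the cutoff implied by RCOND < 0, roughly 1.7e50
sum (s > tol)       % effective rank: only 4 of the 8 singular values exceed the cutoff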

Dear all,

I’m amazed about the in-depth discussion I initiated. I guess it is the point where a master in engineering knocks on the door of the math classroom LOL

I just want to thank all of you for the input so far. The workaround with the LU decomposition seems to work out well.

Thanks a lot
Ken

According to the DGELSD documentation, INFO should be > 0 in this case:

      INFO is INTEGER
      = 0:  successful exit
      < 0:  if INFO = -i, the i-th argument had an illegal value.
      > 0:  the algorithm for computing the SVD failed to converge;
            if INFO = i, i off-diagonal elements of an intermediate
            bidiagonal form did not converge to zero.

but I get INFO = 0 with OpenBLAS.
Attached is the OpenBLAS test I used.

c.
esempio.cc (2.5 KB)

Thanks for the test.
I get the same singular values and solution with it using reference LAPACK that I also got in Octave.
So, it very much looks like what is happening inside Octave is what @jwe described.

I also see INFO = 0.

As we established, the solution returned by DGELSD isn’t a (least-squares) solution to the input LSE. I have no idea why.

Is this a bug in LAPACK (see @jwe’s comment)? Or does DGELSD “just not work” for certain input? If it is the latter, how do we know that it didn’t work (if INFO returns 0)?

Can you give a hint on the correct flags to compile esempio.cc? I tried, but failed, with

g++ -llapack -lblas esempio.cc 
/tmp/ccGiL1zl.o: In function `main':
esempio.cc:(.text+0x2e2): undefined reference to `dgelsd_'
esempio.cc:(.text+0x4b8): undefined reference to `dgelsd_'
collect2: error: ld returned 1 exit status

g++ esempio.cc -lblas -llapack

(Actually I do not think you need -lblas, but the source file should be first.)

I just used gdb to trace the operations of Q = A \ B and it does begin by calling Matrix::solve in dMatrix.cc at line 1601. But the MatrixType is 13, or Rectangular, so it immediately skips to lssolve. The function in its entirety is

Matrix
Matrix::solve (MatrixType& mattype, const Matrix& b, octave_idx_type& info,
               double& rcon, solve_singularity_handler sing_handler,
               bool singular_fallback, blas_trans_type transt) const
{
  Matrix retval;
  int typ = mattype.type ();

  if (typ == MatrixType::Unknown)
    typ = mattype.type (*this);

  // Only calculate the condition number for LU/Cholesky
  if (typ == MatrixType::Upper || typ == MatrixType::Permuted_Upper)
    retval = utsolve (mattype, b, info, rcon, sing_handler, true, transt);
  else if (typ == MatrixType::Lower || typ == MatrixType::Permuted_Lower)
    retval = ltsolve (mattype, b, info, rcon, sing_handler, true, transt);
  else if (transt == blas_trans || transt == blas_conj_trans)
    return transpose ().solve (mattype, b, info, rcon, sing_handler,
                               singular_fallback);
  else if (typ == MatrixType::Full || typ == MatrixType::Hermitian)
    retval = fsolve (mattype, b, info, rcon, sing_handler, true);
  else if (typ != MatrixType::Rectangular)
    (*current_liboctave_error_handler) ("unknown matrix type");

  // Rectangular or one of the above solvers flags a singular matrix
  if (singular_fallback && mattype.type () == MatrixType::Rectangular)
    {
      octave_idx_type rank;
      retval = lssolve (b, info, rank, rcon);
    }

  return retval;
}

The code picks up again at Matrix::lssolve at line 1913. As noted, INFO is 0 after each call to DGELSD, which implies that there were no issues.

It does seem as though a minimum norm solution is being calculated, but this is just such a difficult matrix (condition number = 2.96e73) that the result is not very good. When the condition number is this high, the answer is going to be very sensitive to even small subtleties in the algorithm.
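
For reference, that condition number is just the ratio of the largest to the smallest singular value shown earlier in the thread:

s = svd (A);
s(1) / s(end)    % 2-norm condition number, roughly 3e73; equivalently cond (A)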

The minimum-norm solution (minimizing 2-norm(| b - A*x |)) can be calculated another way via pinv (A) * B. Could someone with access to Matlab try this code:

load A.mat
load B.mat
Q = pinv (A) * B
res = B - A*Q
norm (res)

Results in Octave are

octave:42> load A.mat
load B.mat
Q = pinv (A) * B
res = B - A*Q
norm (res)
Q =

   1.2743e-113
    2.8239e-64
    6.8920e-69
    4.6853e-64
    1.3903e-60
    2.7499e-57
   -8.7906e-61
    7.0421e-65

res =

   1.3000e+01
  -6.8000e-03
   4.5500e-06
   4.0074e-03
  -2.4890e-02
   4.7996e-02
  -1.5694e-02
  -2.7326e-02

ans = 13.000

The norm is pretty high and is driven by the first entry in the residuals. Note that this looks pretty similar to what mldivide calculates.

octave:52> Q2 = A \ B
warning: matrix singular to machine precision, rcond = 8.27523e-67
Q2 =

   2.7663e-104
   -7.4409e-55
    3.6253e-59
    1.1600e-55
    1.7042e-52
   -8.3410e-56
    1.3639e-59
   -7.4494e-64

octave:53> res2 = B - A*Q2
res2 =

   1.3000e+01
  -6.8000e-03
   4.5500e-06
  -3.2614e-04
   3.0760e-04
   5.2615e-03
  -1.8686e-02
   1.7742e-02

octave:54> norm (res2)
ans = 13.000

As an exploration, I disabled the check for a singular matrix in Matrix::fsolve. Attached is a small patch which just comments out the check on the reciprocal condition number.

disable_rcond.diff (778 Bytes)

The MatrixType code identifies A as a singular matrix, but that can be overridden with the matrix_type function. With the patch above, I get an LU decomposition and the “correct” answer.

load A.mat
load B.mat
AFull = matrix_type (A, 'full');   % override the automatic "singular" classification
Q = AFull \ B                      % with the patch applied, this takes the LU path

The code uses a test for singularity which is

rcond + 1.0 == 1.0

This is true whenever rcond is small enough to be absorbed by the addition, i.e., roughly when rcond is below eps (1.0) / 2, which corresponds to a condition number greater than about 2 / eps, or 9.0e15. Given that the original reporter is working with a matrix whose condition number is roughly 3e73, it will clearly be tagged as singular.
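
The effect of that test can be checked directly at the Octave prompt, using the rcond value reported in the warning:

rc = 8.27523e-67;
rc + 1.0 == 1.0              % true: rc is absorbed by the addition, so A is flagged singular
eps (1.0) / 2 + 1.0 == 1.0   % true: anything at or below eps/2 is absorbed
eps (1.0) + 1.0 == 1.0       % false: eps itself survives the addition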

Maybe we should do what Matlab does. The wording of the warning is

“Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 8.275234e-067.”

This correctly says that trouble may result, but it goes ahead and treats the matrix as an ordinary matrix and attempts an LU decomposition.

Octave, on the other hand, is definitive that this is a singular matrix:

“matrix singular to machine precision, rcond = 8.27523e-67”

It then goes on to choose an entirely different algorithm (minimum norm) to solve the problem.

If we wanted to, we could use the INFO output from DGETRF to detect truly singular matrices, but otherwise process matrices with large condition numbers.

INFO    (output) INTEGER
        = 0:  successful exit
        < 0:  if INFO = -i, the i-th argument had an illegal value
        > 0:  if INFO = i, U(i,i) is exactly zero. The factorization
              has been completed, but the factor U is exactly singular,
              and division by zero will occur if it is used to solve a
              system of equations.
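
In Octave terms, such a check might look roughly like the following sketch (my assumption being that an exactly-zero pivot in U is what DGETRF reports via INFO > 0):

[LL, UU, PP] = lu (A);
any (diag (UU) == 0)   % true only if the factor U is exactly singular
% For this A the manual LU solve earlier in the thread succeeded, so no pivot is
% exactly zero and the matrix would be treated as merely ill-conditioned.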

Of course, this would be a huge paradigm change for Octave and would need some robust debate about the pros and cons of such a move.


I’m not a mathematician. From a very personal user point of view, I’d assume that the return value of mldivide is at least somewhat close to an actual solution (maybe in a least-squares sense).
In this particular case, the solution determined using an LU decomposition is much closer to an “actual” solution than the minimum-norm solution that we currently return.

The Wiki article @jwe linked gives arguments why the current implementation is preferable to the alternative.
I’m not sure if those arguments are very convincing from an application point of view. A solution that is (at least) close to an actual solution would be more important to me personally than e.g. stability under certain transformations.
But it might well be that I’m lacking the necessary background and just don’t understand those arguments correctly.

Matlab r2020b:

load A.mat
load B.mat
Q = pinv (A) * B

Q =

   1.0e-56 *

    0.0000
   -0.0000
    0.0000
    0.0000
    0.0001
    0.2750
   -0.0001
    0.0000

res = B - A*Q

res =

   13.0000
   -0.0068
    0.0000
    0.0040
   -0.0249
    0.0480
   -0.0157
   -0.0273

norm (res)

ans =

   13.0002

Thanks Philip. So Octave is no worse than Matlab when explicitly asked for a minimum norm solution for this particular matrix.

That points to the fundamental issue here: should Octave identify near-singular matrices as being singular and change the solution algorithm?

@mmuetzel: Was there another article besides this one: Differences between Octave and Matlab - Octave?

On reading that, I see that there are three cases:

  1. singular matrices
  2. under-determined matrices
  3. over-determined matrices

I don’t know if this is too much of a Solomonic “split the baby” approach, but it may be possible to separate truly singular matrices (case 1) from cases 2 & 3 and use a different approach for them. This would probably require the introduction of another MatrixType of Singular to differentiate it from cases 2 & 3, which use the enum type Rectangular.
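
At the user level, that separation might look something like the sketch below. This is a purely hypothetical helper (the name solve_split is made up), reusing the LU workaround and pinv from earlier in the thread, just to illustrate the idea:

function Q = solve_split (A, B)
  % Hypothetical illustration of the proposed split:
  % exactly singular -> minimum-norm solution; otherwise -> plain LU solve.
  [LL, UU, PP] = lu (A);
  if (any (diag (UU) == 0))
    Q = pinv (A) * B;            % case 1: truly singular, fall back to minimum norm
  else
    Q = UU \ (LL \ (PP * B));    % merely ill-conditioned: ordinary LU solve
  endif
endfunction

With the matrices from this thread, that helper would take the LU branch.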

I generally think Octave users are practical people, rather than pure mathematicians. In that case, it would be more useful if Octave returned something that was sort of close to the right answer, possibly with a warning, rather than insist on mathematical correctness.