Improving correctness and reducing maintenance burden

After spending a significant amount of time over the past couple of days looking for ways to fix the problems reported in bug #61132 while keeping the space optimization of Octave’s special range type, I’m beginning to come to the realization that it is time for us to abandon some features in order to provide more correct (and/or Matlab-compatible) behavior and to ease the burden of maintenance.

The specific problem that pushed me over the edge was trying to support things like

uint8(10) : -2 : uint8(0)

The current templated implementation of ranges (introduced about a year ago to support integer ranges) assumes that all values in the range can be stored using the same type. So the above expression to create a uint8 range stores the base, limit, and increment as uint8 objects. But that doesn’t work for unsigned ranges that have a negative increment (needed for compatibility). So now we may need a different type for the increment? Or do we just store the increment as unsigned and record that the range is descending? If so, then how do we support the function that returns the increment of the range? Etc.

No solution I could think of seemed good. Each thing I considered meant more code or more complicated code, or both, and those things are likely opportunities for even more bug reports and maintenance trouble in the future. OTOH, if the colon operator just created a matrix of the appropriate type, this problem would have been fixed quickly.

At this point, I’d rather have Octave be easier to maintain and working correctly (with more tests!) than save memory.

The same arguments apply for the permutation and diagonal matrix types.

Personally, I favor simplicity in this case. The specific “magic” about Octaves ranges compared to Matlab was that it only saves “start, increment, end” until it is used for a vector operation and alike, right?

For example in Octave 6.3.0

>> a = 1:10
a =

    1    2    3    4    5    6    7    8    9   10

>> b = [1:10]
b =

    1    2    3    4    5    6    7    8    9   10

>> whos
Variables visible from the current scope:

variables in scope: top scope

   Attr Name        Size                     Bytes  Class
   ==== ====        ====                     =====  =====
        a           1x10                        24  double
        b           1x10                        80  double

Total is 20 elements using 104 bytes

In this case there cannot be an overwhelming storage advantage compared to “double” or “int64” for those three values. Furthermore the syntax used to specify the type of the boundaries feels clumsy and error prone. Which effective type is used for uint8(0):int64(5)? Is this documented? Maybe a final typecast uint8(0:5) is more expressive and less error prone.

True for small ranges, but a range is always the base, limit, increment, and possibly a few other values. The array is sizeof(element)*numel bytes which can be quite large.

It is documented for Matlab but not yet in the Octave docs. The rules are fairly straightforward.

If we don’t have the storage optimization, that requires creating and then copying a possibly large array.

After posting yesterday, I took a break and I thought of some possible ways to manage backward compatibility issues vs. creating ranges from unsigned integer base/limit values with a negative double value for the increment. So I’ll give it one more try.

1 Like