Should isindex ("char") return true?

The function isindex() returns true/false depending on whether the input can be used as an index on a matrix. It is true that character arrays can be used as index objects:

x = 1:1000;
x('char')
ans =

    99   104    97   114

Nevertheless, should we really be encouraging that sort of usage?

In my mind, if you are using a string as an index it is because something has gone wrong in the code, and it is not a good coding paradigm. I would prefer to change isindex to return false for isindex ('char'). The workaround, for those few instances where the programmer fully intends to use a string as an index, is simply to convert the character array to a numeric array with double().

As a test, I made the proposed change to isindex and then ran make check and there are no failures. Nothing in core Octave was relying on this slightly strange behavior.

No objections, isindex is a pure Octave extension and I think this is more likely a programming mistake or at least bad style which is difficult to trace.

Maybe it is worth looking at the root of the trouble, if you did not already do it. In https://hg.savannah.gnu.org/hgweb/octave/file/ee39aef7c417/libinterp/corefcn/utils.cc#l1680

idx_vector idx = args(0).index_vector (true);

It seems that Octave is able to convert a string into an index vector. This seems to be implemented in libinterp/octave-value/ov-ch-mat.cc (line 61). This implementation was added in 2007 but the usage might have changed in the meantime. Maybe just removing this implementation which is not used according to Doxygen might fall back to a more desirable default behavior of an error message in
libinterp/octave-value/ov-base.cc (line 233)?

The above is no solution, as it disables the : colon operator to work, e.g. in Octave and Matlab R2020b

>> sv = 1:4
sv =

   1   2   3   4

>> sv(:)
ans =

   1
   2
   3
   4

>> sv(':')
ans =

   1
   2
   3
   4

With this argument, maybe isindex (':') and isindex (":") must remain true?

Second problem Matlab compatibility:

>> sv = 1:200;
>> sv('a')

ans =

    97

>> sv('ab')

ans =

    97    98

>> sv("a")
Unable to use a value of type string as an index.
 
>> sv("ab")
Unable to use a value of type string as an index.

Maybe once Octave implements the string class, this behavior is automatically forbidden for double quoted strings at least.

I’m not worried about this. Just because Matlab does something doesn’t mean it is a good idea. They allow indexing with a character matrix, but it is hard to see why that makes sense and isn’t just an accident of implementation.

Per the original post, I think using a character matrix or string probably represents a mistake on the programmer’s part and it would be better to have the programmer address that question rather than silently accepting it. They can address it by just adding double(...) around the char matrix or string. If someone is using this construct and wants to port their code from Matlab to Octave it really isn’t that hard. And the reverse, running Octave code on Matlab will automatically work.

I had looked at the implementation of isindex and my first thought was actually just to change that function to disregard string objects. Attached is the diff I used which passes all Octave BIST tests. isindex.diff (997 Bytes)

Alternatively, if we wanted to maintain conformity with the XXX_value routines which convert an octave_value to the type XXX we could add a second input to index_vector() which forces string conversion.

For example, here is the prototype in ov.h for the function which converts to octave_idx_type

  octave_idx_type
  idx_type_value (bool req_int = false, bool frc_str_conv = false) const;

and here is the prototype for index_vector

  idx_vector index_vector (bool require_integers = false) const

After my trouble yesterday, I am sure we are opening Pandora’s box if we just add/change something without much deeper thought. We literally change the Octave language here. I am also no fan of following Matlab’s adventurous language journey. But several painful lessons in the past showed not to be too different in fundamental features. Indexing I see as such and we cannot even expect a performance gain from it :thinking:

First thing that is clear to me now, that isindex in it’s current implementation works perfectly. Everything Octave permits to be an index expression (stupid or not) is correctly identified as such. The function just tries to convert it’s argument to an index. Nothing to improve here.

>> a = 1:200;
>> a(":");
>> a(':8')
ans =

   58   56

>> a("3:8")
ans =

   51   58   56

>> isindex (':')
ans = 1
>> isindex (":")
ans = 1
>> isindex (':8')
ans = 1
>> isindex ("3:8")
ans = 1

With your patch we artificially change the game. We give the user a wrong answer. We are saying “what you give us is not a valid index”, but all we want to say is “what you give us is not good to be used as index as it will be a pain in the lower body part to identify problems with such index expressions”. Furthermore, when the user ignores good programming practice to check with isindex before firing the unchecked expression at the array, she will have no trouble and perhaps file a bug report that isindex is broken :sweat_smile:

Second bigger topic is changing what Octave accepts as index expression.

This I do not fully understand. octave_value::idx_type_value is a dispatcher function for octave_value::int_value or octave_value::int64_value

I do not think this has anything to do with converting string expression, only double conversions libinterp/octave-value/ov-base.cc, but maybe you only suggested this as a template.

Also in this case you had to add this parameter to all 18 constructors of index_vector and everywhere those constructors are called :scream: :dizzy_face:

I think this had to be fixed somewhere in the parser level, when the index_vector constructor is called, the individual chars are probably already implicitly converted to integers, as a string is mostly an Array<char> to my knowledge.