Refactoring octave_value function objects (backward incompatible change)

I would be interested in feedback for some changes that I would like to make in the way function objects are handled in the octave_value class hierarchy in the interpreter.

Currently, we have

  • octave_function (base class)
  • octave_builtin (for builtin functions)
  • octave_user_code (base class for scripts and user-defined functions)
  • octave_user_script
  • octave_user_function

Unlike most other octave_value objects, these classes are not just wrappers for other objects (like numeric arrays) but actually provide the definitions for the function objects. Consequently, the octave_value accessor functions for these objects are different from others in that they just return the “rep” pointer (but cast to the actual type instead of octave_base_value). For example, accessing an array vs. a user-defined function:

NDArray nda = aval.array_value ();
octave_user_function *uf = fval.user_function_value ();

In the first case, the accessor returns the object that is managed by the “rep” object. In the second, the accessor just returns the “rep” object directly (but with the actual type instead of as a pointer to octave_base_value).
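To make the difference between the two accessor styles concrete, here is a minimal mock in standard C++. None of these `mock_*` names are real Octave classes, and the mock uses std::shared_ptr for the rep only to keep the example short (the real octave_value manages its rep differently). It is a sketch of the pattern described above, not the actual implementation.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for octave_base_value.
struct mock_base_value
{
  virtual ~mock_base_value () = default;
};

// Wrapper style: the rep *holds* a separate data object.
struct mock_matrix_rep : mock_base_value
{
  std::vector<double> m_data;
};

// Function style: the rep *is* the function definition.
struct mock_user_function : mock_base_value
{
  int m_nargin = 0;
};

// Hypothetical stand-in for octave_value.
class mock_value
{
public:
  explicit mock_value (std::shared_ptr<mock_base_value> rep)
    : m_rep (std::move (rep)) { }

  // Returns a copy of the object managed by the rep.
  std::vector<double> array_value () const
  {
    auto *p = dynamic_cast<const mock_matrix_rep *> (m_rep.get ());
    if (p)
      return p->m_data;
    return std::vector<double> ();
  }

  // Returns the rep itself, downcast to its actual type.  The pointer
  // is only valid while some mock_value still owns the rep.
  mock_user_function * user_function_value () const
  {
    return dynamic_cast<mock_user_function *> (m_rep.get ());
  }

private:
  std::shared_ptr<mock_base_value> m_rep;
};
```

The caller of `array_value` gets an independent object, while the caller of `user_function_value` gets a non-owning pointer into the value's internals.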

Using the bare pointer makes object lifetime hard to manage. It also seems strange, because you could get the same thing by casting the result of a get_rep method if that is what you really want to do. Either way, direct access to the pointer to the “rep” object, and having the “rep” object directly store the representation of the function, seems undesirable to me because it tightly couples the function object to the octave_value hierarchy.

I propose that we create a set of objects for managing builtin and user-defined functions and scripts that use std::shared_ptr to manage the object lifetime and make passing them around easy, then wrap those objects in the octave_value class hierarchy if needed. Whether the wrapping is needed is not really clear to me: functions aren’t really first-class objects in Octave and should probably be stored in function handle objects instead, but that’s something to deal with later.

Then we would have something like

user_function uf = fval.user_function_value ();

and the user_function object would be independent of the octave_value class hierarchy.
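A rough sketch of what that could look like, assuming a handle class that shares ownership of a separate definition object. The class `user_function_rep`, the `mock_octave_value` wrapper, the `use_count` helper, and the function `extract_and_outlive` are all hypothetical names made up for illustration; only `user_function` and `user_function_value` come from the proposal above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real function definition
// (parse tree, parameter lists, and so on).
class user_function_rep
{
public:
  explicit user_function_rep (std::string nm) : m_name (std::move (nm)) { }
  const std::string& name () const { return m_name; }

private:
  std::string m_name;
};

// Value-semantic handle, cheap to copy, independent of octave_value.
class user_function
{
public:
  user_function () = default;

  explicit user_function (const std::string& nm)
    : m_rep (std::make_shared<user_function_rep> (nm)) { }

  bool is_defined () const { return static_cast<bool> (m_rep); }

  std::string name () const
  {
    return m_rep ? m_rep->name () : std::string ();
  }

  // Exposed only so the example can show shared ownership.
  long use_count () const { return m_rep.use_count (); }

private:
  std::shared_ptr<user_function_rep> m_rep;
};

// Minimal mock of an octave_value that wraps a function handle.
class mock_octave_value
{
public:
  explicit mock_octave_value (user_function f) : m_fcn (std::move (f)) { }

  // Returns the handle by value; the caller shares ownership.
  user_function user_function_value () const { return m_fcn; }

private:
  user_function m_fcn;
};

// The extracted function stays valid after the wrapper is destroyed.
user_function extract_and_outlive ()
{
  mock_octave_value fval (user_function ("myfcn"));
  return fval.user_function_value ();
}
```

Because the accessor returns a handle that shares ownership, the caller no longer has to reason about how long the originating value stays alive, which is the lifetime problem with the bare pointer.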

I propose to make this backward-incompatible change in version 7. It may cause some trouble, but would help with proper support of anonymous function handles and handles to nested functions.

This change will affect any code that attempts to extract function objects from octave_value objects. We’ve already seen some issues with people trying to do that for function handles (which generally doesn’t work as expected anyway, because of the way function handle resolution may happen).

It might be possible to make the transition easier by using different method names for the new actions (but what? something like get_user_function, which is inconsistent with the other accessors?) along with deprecating and eventually removing the current user_function_value accessor methods. But that would take multiple releases and several years to complete. If necessary, I can do that, but I would prefer to avoid that lengthy process.

Comments?

Not much to comment on, except that it does seem a little weird to even derive a function from an octave_value object. Conceptually, I think of octave_value as a polymorphic data container for things like scalars, arrays, structs, etc. But, maybe that is just me.

In any case, no objections to making the API more consistent.

Storing function objects in octave_value objects makes sense if functions are “first-class objects” (treated like any other value). I think I had some idea early on that they should be, but they are not in Matlab. It was also convenient for symbol resolution to be able to look up the meaning of a symbol and receive an octave_value object whether the result was a variable value or a function to call, especially since you can’t tell just by looking at the program text whether an expression like x(a) will be a variable reference or a function call.