Input handling code / best practices?

In another thread a new contributor asked if someone would be able to help him with input parsing, etc., for a rather cumbersome function. I realized I had been wondering something along the same lines on my last bug fix: code for input parsing seems to take as much of my time as anything else, especially when trying to match odd combinations of variable inputs and dependent code paths. And each time I feel like I’m starting from scratch, deciding rather inconsistently between ifs, switches, for i=1:nargin loops, etc. It seems Octave code doesn’t make heavy use of inputParser functionality, which might streamline some of that, but I have no experience there.

Would it make sense to try to capture input handling strategy, sample code, etc. somewhere? Are there function templates or best practices that would benefit from a page on the wiki or something? Are there enough similar cases to create said templates, or is it just messy because that’s the nature of the beast? Would it help later code maintenance / newbie introduction if such things were more consistent?

This is a major challenge, and you’ve identified some good potential steps.

Architecturally, good code in Octave follows the pattern of

  1. Validate inputs
  2. Calculate results

This avoids Garbage-In/Garbage-Out, which is always a problem but a particular one when computer science is not your first discipline. Many people who use Octave are brilliant in their own field of chemistry, physics, geology, biology, etc. But they use Octave as a tool and they just want an answer. Without sufficient input validation they may not realize that the function they are calling was not designed to handle negative numbers, complex values, or values outside a periodic range such as -pi to +pi.
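As a rough illustration of the validate-then-calculate pattern, here is a minimal sketch with hand-rolled checks. The function name halfangle_sin and its accepted range are made up for this example and are not taken from any Octave function.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical example function: step 1 validates, step 2 calculates.
function y = halfangle_sin (theta)
  % 1. Validate inputs
  if (nargin != 1)
    error ("halfangle_sin: exactly one input is required");
  endif
  if (! (isnumeric (theta) && isreal (theta)))
    error ("halfangle_sin: THETA must be a real numeric array");
  endif
  if (any (theta(:) < -pi | theta(:) > pi))
    error ("halfangle_sin: THETA must lie in the range [-pi, pi]");
  endif

  % 2. Calculate results
  y = sin (theta / 2);
endfunction
```

With checks like these, a call such as halfangle_sin (4) stops with an error instead of quietly returning a number the function was never designed to produce.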

So there’s no debate that input validation is important; the question is how best to achieve it. For reference, Matlab has struggled with the same issue. At first, Matlab probably relied on hand-rolled code for every function. Then they started offering helper functions such as narginchk and nargoutchk for testing the number of inputs/outputs, and routines such as validatestring and validateattributes for checking actual values. Octave supports these functions for legacy code, but they don’t see much use.
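For anyone who hasn’t used these helpers, here is a small sketch of how they might be combined in one function. The function scaled_root and its "linear" / "percent" options are hypothetical and exist only for illustration; narginchk, validateattributes, and validatestring are the real helper functions.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical example function built on the validation helpers.
function y = scaled_root (x, mode)
  narginchk (1, 2);              % accept one or two inputs
  if (nargin < 2)
    mode = "linear";             % default for the optional input
  endif

  % Class and attribute checks in a single call
  validateattributes (x, {"numeric"}, {"real", "nonnegative"});
  % Returns the full option name, so an abbreviation such as "per" is accepted
  mode = validatestring (mode, {"linear", "percent"});

  y = sqrt (x);
  if (strcmp (mode, "percent"))
    y = 100 * y;
  endif
endfunction
```

The error messages come from the helpers themselves, which keeps the function short at the cost of less control over the wording.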

Next, Matlab introduced the inputParser class as a way of standardizing input validation. In some ways this is pretty good, but I don’t think it caught on much outside of Matlab’s own programmers and their core m-file functions. It does have some downsides. First, it is object-oriented programming, and this is a big lift for users who are not computer science majors. If you’re writing your own function it is far easier to just throw in an if test to check the range of an input (if (x < 0 || x > 1) ...) rather than code up your own input parser and then apply it. Octave supports this style as well. On a performance note, I don’t think classdef in Octave is very fast, and it involves a call out to another function. For best performance the input validation does need to be in the m-file itself. But, as is often the case, performance and code readability / maintainability are in opposition.
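For comparison, this is roughly what the inputParser style looks like. It is only a sketch: smooth_opts and its "window" and "method" parameters are invented for the example, and the validators are deliberately simple.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical example function using inputParser.
function r = smooth_opts (x, varargin)
  p = inputParser ();
  p.FunctionName = "smooth_opts";   % used as a prefix in error messages
  p.addRequired ("x", @(v) isnumeric (v) && isvector (v));
  p.addParameter ("window", 3, @(v) isscalar (v) && v > 0);
  p.addParameter ("method", "mean", @ischar);
  p.parse (x, varargin{:});
  r = p.Results;                    % struct with fields x, window, method
endfunction
```

A call like smooth_opts ([1 2 3], "window", 5) returns a struct with window set to 5 and method left at its default. The declarations are compact, but every call pays for constructing and running the parser object, which is the performance concern mentioned above.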

More recently, around 2019, Matlab introduced yet another syntax for validating function inputs (see Function Argument Validation - MATLAB & Simulink). I don’t know if this will become popular, but it is a nice simple way of specifying what the input should look like (data type, dimensions, and values). It should be high performance because I believe this is part of the interpreter. No one in the Octave community has discussed implementing this yet, but it is likely to be a significant effort.

I think some guidelines and possibly some templates for certain situations would be useful to many people.


Yes, I agree that this will be a significant amount of work. I could probably adapt the parser to handle the syntax fairly easily, but that is just a small part of the job. Making it actually execute will take much more effort and, as was recently pointed out in a (short) discussion here or on one of the lists, turning the argument validation block into a no-op is not really possible because it can provide default values or alter inputs, so ignoring the block can lead to incorrect results later.