Find start index of group of nonzero elements in vector

I have a 1-D vector that is mostly zeros, with some groups of nonzero elements. I want to find the start and end index of each group of nonzero elements. Google directs me to a solution for exactly this problem in Matlab, under the heading “how-to-find-start-and-end-points-of-non-zero-elements-in-a-vector”. The solution using “findstr()” works in Octave, but the documentation at Function Reference: findstr says that the function is scheduled for deprecation and should not be used in new code.
The Matlab solution with “strfind()” does not work in Octave; it reports “error: strfind: PATTERN must be a string or cell array of strings”. Here is the code:
TestVec = [0 0 0 3 4 5 0 0 1 2 0 0 0 7 8 9 0 0];
mask = logical(TestVec);
starts = strfind([false, mask], [0 1]);
stops = strfind([mask, false], [1 0]);

I can confirm that your example works correctly in Matlab R2021a but fails in Octave 6.2.

We could probably adjust the input validation of strfind and allow it to operate on logical input. Could you please open a feature request on the bug tracker for this?
GNU Octave - Bugs: Browse Items [Savannah]

In the meantime, you could work around this by casting the input of strfind to char:

TestVec = [0 0 0 3 4 5 0 0 1 2 0 0 0 7 8 9 0 0];
mask = logical(TestVec);
starts = strfind (char ([false, mask]), char ([0 1]));
stops = strfind (char ([mask, false]), char ([1 0]));

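Another option is simple arithmetic, assuming that your non-zero values are all positive:

TestVec = [0 0 0 3 4 5 0 0 1 2 0 0 0 7 8 9 0 0];
t1 = TestVec > 0;  % not defined in the post as quoted; presumably the mask of positive elements
t2=[t1(2:end) t1(1)];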

I can confirm the code from mmuetzel does the job I wanted to do.
This code works on my example, but does not quite work in the special case where the first and last elements are both non-zero, such as
TestVec = [3 4 5 0 0 1 2 0 7 8 9];
Meanwhile I did file the item on the bug tracker, in case the goal is better Matlab compatibility.

Thank you.
Link to the bug report: GNU Octave - Bugs: bug #60521, compatibility of strfind() [Savannah]

Matlab might accept a logical input but it is not documented:

Published solutions rely on this behavior, for example How to find start and end points of non-zero elements in a vector - MATLAB Answers - MATLAB Central

What about the following then?

 starts = find (diff ([false, mask]) == 1)
 stops = find (diff ([mask, false]) == -1)

Yes, that is also a good way to do it, which does work in Octave.
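For reference, here is a minimal, self-contained sketch of the diff-based approach, run on the edge case mentioned earlier in which the first and last elements are both non-zero. The variable name lengths is only illustrative and is not part of the original suggestion.

```octave
% Edge case from earlier in the thread: groups touch both ends of the vector.
TestVec = [3 4 5 0 0 1 2 0 7 8 9];
mask = logical (TestVec);

% A 0 -> 1 transition marks the first element of a group.
starts = find (diff ([false, mask]) == 1);
% A 1 -> 0 transition marks the last element of a group.
stops = find (diff ([mask, false]) == -1);

% Illustrative extra: the number of elements in each group.
lengths = stops - starts + 1;
```

Padding the mask with false on the appropriate side is what lets groups at either end of the vector be detected.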

Or something like this:

t1 = diff (TestVec != 0)
starts = find (t1 > 0) + 1
stops = find (t1 < 0)

I use loads of this sort of code to detect missing data and slope changes in long time series. Applying “diff ()” twice can also be very helpful in such cases.
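As a rough illustration of the double-diff idea, the sketch below locates slope changes in a short, made-up piecewise-linear series. The data and the variable names y, d2 and kinks are assumptions chosen for the example, not taken from real time series.

```octave
% Made-up series: rises with slope 1, stays flat, then falls with slope -1.
y = [0 1 2 3 3 3 2 1 0];

% The second difference is zero wherever the slope is constant.
d2 = diff (y, 2);

% d2(i) compares the slopes on either side of y(i+1), hence the +1.
kinks = find (d2 != 0) + 1;
```

With noisy data, an exact comparison against zero would probably need to be replaced by a tolerance.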

The correction is simple; only the definition of t1 changes:

t1=[0 TestVec>0 0];
t2=[t1(2:end) t1(1)];
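The remaining lines of this approach are not shown in the thread, so the following is only one plausible way to finish it, assuming the start and stop indices are read off the difference of the two shifted vectors. The names d, starts and stops are assumptions.

```octave
% Edge case: the first and last elements are both non-zero.
TestVec = [3 4 5 0 0 1 2 0 7 8 9];

% Pad with a zero on each side so that groups at the ends still show a transition.
t1 = [0 TestVec>0 0];
t2 = [t1(2:end) t1(1)];  % t1 shifted left by one element (circularly)

% d(i) = t1(i+1) - t1(i), and t1(i+1) corresponds to TestVec(i).
d = t2 - t1;
starts = find (d == 1);      % 0 -> 1 transition: TestVec(i) opens a group
stops = find (d == -1) - 1;  % 1 -> 0 transition: TestVec(i-1) closes a group
```

Because of the padding, the wrapped-around last element of t2 is always zero and never produces a spurious transition.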

When strfind is updated in a later version of Octave, it should also accept double-precision input. In Matlab:

a = 1:10; a=a+.1
a =
1.1000 2.1000 3.1000 4.1000 5.1000 6.1000 7.1000 8.1000 9.1000 10.1000
strfind(a, [2.1 3.1])
ans =
This works perfectly in Matlab but not in Octave. Is there a workaround I haven’t found?

@Erwin55 you should start a new thread since your new comment is hardly related to the original one.

The “str” in strfind stands for string. The fact that the example you provide works in Matlab is undocumented (see this for the officially accepted first input argument and that for the second). So it is highly improbable that Octave will ever implement this.

You’re right, I will start a new thread, because I think this is very helpful behavior even if it’s not documented. A clever workaround would be highly appreciated.
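One possible workaround for the numeric-pattern case, sketched here without strfind, is to keep a list of candidate start positions and discard those that fail to match the pattern element by element. The names v, p and idx are assumptions for illustration, and the comparison is exact, so values that differ by floating-point rounding will not match.

```octave
v = [1 2 3 2 3 4];  % vector to search (any numeric row vector)
p = [2 3];          % pattern to look for

% Every position where the pattern could still fit is an initial candidate.
idx = 1:(numel (v) - numel (p) + 1);

% Keep only the candidates whose k-th element equals p(k).
for k = 1:numel (p)
  idx = idx(v(idx + k - 1) == p(k));
end
```

After the loop, idx holds the start index of each occurrence of p in v, analogous to what strfind returns for character input.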