How to speedup loop with ode45 inside it

Code : loop which repeatedly calls ode45
Problem : slow (er than matlab)
System : Debian Linux (Octave 5.2), Windows (Octave 6.x)

Simulation with a loop and ode 45 inside it executed much slower in Octave (5.2 and 6.x) compared to Matlab. Before optimizing the differential equation I was integrating, I decided to profile the code to see if my differential equation code or the discrete-time-controller code was the bottleneck.

Profiling indicated that most of the execution time is spent inside functions other than the differential equation functions specific to my problem.
@inputParser_xxxxx and runge_kutta_45_dorpri are consuming most of the time.

I am simulating a system which has both discrete time sub-system (digital control) and continuous time system (plant being controlled). The fixed-time-step part is implemented by the while loop and the continuous time part is implemented by the ode45.

Is there a way to speed up my simulation ? If the functions at the top of the profile were user written, I could have attempted to convert them to C++ mkoct files.

Below is the code and the profiling result. The code is a minimum working example to illustrate the problem. Even for much more complicated (non linear, time varying) differential equations, I am still getting similar results.

CODE

clc;
clear;

fprintf('example profiled running...\n');

endtime = 1000;
stepsize = 0.5; % the fixed step used by the controller

t = 0;
ii = 1;
% reserve space so that it doesn't slow down the program
tl = zeros(ceil(endtime/stepsize), 1);
yl = zeros(ceil(endtime/stepsize), 2);

ic = [1, 0];

control = 0.1;

profile on;

while(t < endtime)
% run the non linear time varying differential equation
% using a variable time step method.
% example shown here is a linear system.
% but profiling results are the same.
[tt, yy] = ode45(@(t,x)[0,1; -1, control]*x/2, [0, stepsize], ic);

% collect results for plotting later
yl(ii, :) = yy(end, :);
tl(ii) = t + tt(end, :);

ic = yy(end, :);

if(norm(ic) > 5)
  % this is where the controller logic would have been written in the real application
  control = -0.05;
end

ii = ii+1;

t = t+stepsize;
end

profile off;
profshow;

plot(tl, yl);

fprintf('example profiled ran...\n');

Profiling result

   #                  Function Attr     Time (s)   Time (%)        Calls

  75     runge_kutta_45_dorpri            14.695      15.45        67522
  23        @inputParser/parse            14.132      14.86        11200
  73        integrate_adaptive            13.456      14.15         5600
  29  @inputParser/add_missing             7.984       8.40        22400
  55   @inputParser/is_argname             6.316       6.64       128800
  76                     feval             4.414       4.64       410732
  57            anonymous@:0:0             3.895       4.10       550732
  56 @inputParser/validate_arg             3.865       4.06       128800
  63               AbsRel_norm             3.263       3.43        84322
  35                    unique             2.574       2.71        28000
  30                   setdiff             2.478       2.61        22400
   2                     ode45             2.419       2.54         5600
  24                fieldnames             1.296       1.36        84000
  60              odemergeopts             1.146       1.20         5600
  62         starting_stepsize             1.064       1.12         5600
   5                    odeset             1.040       1.09        11200
  31              validsetargs             1.032       1.08        22400
  27            __fieldnames__             0.934       0.98        84000
  52                  binary *             0.827       0.87      1225452
  41                      sort             0.782       0.82        22400

I have posted this question to the GNU mailing list also.

sorry you didn’t get a reply on the help list. I saw your post and started looking at it but didn’t get to far into it. my first question was which function is using inputparser, and why is it such a large part of your overhead.

instead of profshow, have you tried profexplore? it lets you step down through the levels of the execution instead of just giving one summary table. I’d be interested to see which function is calling inputParser, because a quick look at ode45 and runge_kutta_45_dorpri call inputParser.

ok, nevermind i see it now. looking at how many times it was called matches with odeset, and odeset does use inputparser. does odeset need to get called so many times? i thought that function was simply for setting options, and performance is usually a low concern. are they changing that much?

can you try whether using the attached version of odeset improves the timing?
c.

odeset.m (11.1 KB)

If I understand correctly what you are trying to do, you are using ode45 in al loop to advance
your system from t to t+dt then modifying some parameters depending on the state at t+dt and repeating the loop …

why not just express your parameter as a function of the state and solve the whole system at once?

what I mean is something like :

ic = [1, 0];
endtime = 1000;
clear control
function res = control (x)
persistent c = 0.1;
if norm (x) > 5
  c = -0.05;
endif
res = c;
endfunction

[tl, yl] = ode45(@(t,x)[0,1; -1, control(x)]*x/2, [0, endtime], ic);
plot (tl, yl)

is this what you want to do?

the simulation with this code takes
Elapsed time is 0.628414 seconds.
for me, while your code is
Elapsed time is 52.7719 seconds.

and the plot looks the same to me (see below…)
speedup is 83x is that sufficient?
c.

untitled

The proposed approach with only one ode45 call does not really simulate a discrete-time control of a continuous-time system wehre the control input is only updated at the sampling instances and remains constant during the sampling intervals.
I think the only way to simulate a discrete-time control is a loop as shown in the original post.

Thank you for your replies.

have you tried profexplore?

I will try it out and report back.

can you try whether using the attached version of odeset improves the timing?

I will try this out out and report back.

why not just express your parameter as a function of the state and solve the whole system at once?

This was my first approach. It involves two approximations from my real-life system. One is that the controller is discrete-time and putting my controller in ode45 makes it a continuous time controller. In fact, I tuned most of my controller parameters this way. Second approximation is that my controller has it own states (which I didn’t show in the example code). Some of those states are linear, smooth states. But there are some “stiff” states like controller-mode status-flag, on/off relays etc. So if I should simulate my controller along with the plant, I would have to use a stiff solver. I need to make an assessment on the accuracy of the achieved result compared to the set-point also. This is mostly decided by the size of the sampling interval and controller-mode switching dynamics. So, I would prefer to have those effects also simulated.

The proposed approach with only one ode45 call does not really simulate a discrete-time control of a continuous-time system where the control input is only updated at the sampling instances and remains constant during the sampling intervals. I think the only way to simulate a discrete-time control is a loop as shown in the original post.

I agree with this opinion. For initial controller tuning, I have used the continuous time approximation of the controller. For my further studies, I would prefer a simulation where the discrete-time effects would be included in the simulation.

If there is no practical way to speed up my simulation, I may drop the ode45 all together and do a vanilla Euler integration with plant made into a C++ version for running the inner Euler loop with a small time-step.

I used the supplied odeset.m. But file diff shows me that the files are same except for commented lines. The profile also shows similar results.

modified odeset

'odeset' is a function from the file /home/user/Documents/odeset.m
example profiled running...
                     Function Attr     Time (s)   Time (%)        Calls
   9        @inputParser/parse             4.860      16.26         4000
  72     runge_kutta_45_dorpri             4.165      13.94        20000
  70        integrate_adaptive             3.829      12.81         2000
  18  @inputParser/add_missing             2.740       9.17         8000
  50   @inputParser/is_argname             2.151       7.20        46000
  52 @inputParser/validate_arg             1.351       4.52        46000
  73                     feval             1.229       4.11       122000
  53            anonymous@:0:0             1.159       3.88       172000
  60               AbsRel_norm             0.917       3.07        26000
  25                    unique             0.844       2.82        10000
  19                   setdiff             0.835       2.79         8000
   2                     ode45             0.802       2.68         2000
  13                fieldnames             0.406       1.36        30000
  57              odemergeopts             0.363       1.21         2000
  59         starting_stepsize             0.355       1.19         2000
   5                    odeset             0.348       1.16         4000
  21              validsetargs             0.342       1.14         8000
  16            __fieldnames__             0.316       1.06        30000
  34                      sort             0.271       0.91         8000
  47                  binary *             0.236       0.79       368000

'odeset' is a function from the file /home/user/Documents/odeset.m
example profiled running...
                    Function Attr     Time (s)   Time (%)        Calls
   9        @inputParser/parse             4.627      16.33         4000
  72     runge_kutta_45_dorpri             4.024      14.20        20000
  70        integrate_adaptive             3.598      12.70         2000
  18  @inputParser/add_missing             2.621       9.25         8000
  50   @inputParser/is_argname             2.061       7.27        46000
  52 @inputParser/validate_arg             1.297       4.58        46000
  73                     feval             1.180       4.17       122000

Inbuilt odeset

'odeset' is a function from the file /usr/share/octave/5.2.0/m/ode/odeset.m
example profiled running...
  #                  Function Attr     Time (s)   Time (%)        Calls
  20        @inputParser/parse             4.632      16.36         4000
  75     runge_kutta_45_dorpri             3.966      14.01        20000
  73        integrate_adaptive             3.607      12.74         2000
  28  @inputParser/add_missing             2.620       9.25         8000
  55   @inputParser/is_argname             2.058       7.27        46000
  56 @inputParser/validate_arg             1.295       4.57        46000
  76                     feval             1.173       4.14       122000
  57            anonymous@:0:0             1.107       3.91       172000
  63               AbsRel_norm             0.860       3.04        26000
  34                    unique             0.788       2.78        10000
  29                   setdiff             0.771       2.72         8000
   2                     ode45             0.744       2.63         2000
  23                fieldnames             0.383       1.35        30000
  60              odemergeopts             0.352       1.24         2000
  62         starting_stepsize             0.334       1.18         2000
   5                    odeset             0.318       1.12         4000
  30              validsetargs             0.313       1.11         8000
  26            __fieldnames__             0.292       1.03        30000
  41                      sort             0.242       0.85         8000
  52                  binary *             0.222       0.78       368000

'odeset' is a function from the file /usr/share/octave/5.2.0/m/ode/odeset.m
example profiled running...
   #                  Function Attr     Time (s)   Time (%)        Calls
   9        @inputParser/parse             4.638      16.36         4000
  72     runge_kutta_45_dorpri             3.971      14.01        20000
  70        integrate_adaptive             3.610      12.74         2000
  18  @inputParser/add_missing             2.629       9.28         8000
  50   @inputParser/is_argname             2.061       7.27        46000
  52 @inputParser/validate_arg             1.298       4.58        46000

I may have attached the wrong file …
these are the changes I was suggesting to odeset.m :

---------------8<-----------------

--- a/odeset.m	2021-05-01 05:23:06.000000000 +0200
+++ b/odeset.m	2021-01-25 13:41:55.000000000 +0100
@@ -153,8 +153,7 @@
function odestruct = odeset (varargin)

  persistent p;
-  persistent pdef;
-  
+
  if (isempty (p))
    ## FIXME: Add an inexact match option once it is available in inputParser.
    ## See bug #49364.
@@ -184,16 +183,8 @@
    p.KeepUnmatched = true;
  endif

-
-  if (isempty (pdef))
-    p.parse ();
-    pdef = p.Results;
-  endif
-  
  if (nargin == 0 && nargout == 0)
    print_options ();
-  elseif (nargin == 0)
-    odestruct = pdef;
  else
    p.parse (varargin{:});
    odestruct = p.Results;

---------------8<-----------------

this shoud cut the calls to InputParser in half.
to completely get rid of the InputParser calls
this further change to ode45 would be needed

---------------8<-----------------

--- a/ode45.m	2021-05-01 05:43:21.000000000 +0200
+++ b/ode45.m	2021-01-20 23:31:00.000000000 +0100
@@ -157,9 +157,7 @@
                                                 trange(1), trange(end));

  ## FIXME: Refine is not correctly implemented yet
-  ## FIXME: odeset is slow, so assign field directly 
-  ## defaults = odeset (defaults, "Refine", 4);
-  defaults.Refine = 4;  
+  defaults = odeset (defaults, "Refine", 4);

  persistent ode45_ignore_options = ...
    {"BDF", "InitialSlope", "Jacobian", "JPattern",

---------------8<-----------------

this latter change must be applied to ode45 “in place” rather than
creating a local version of ode45.m …

with those changes the result of your profiled run for me

is (notice InputParser is never used):

---------------8<-----------------
example profiled running…
# Function Attr Time (s) Time (%) Calls
---------------------------------------------------------------------------------
69 runge_kutta_45_dorpri 4.779 26.51 21714
67 integrate_adaptive 4.488 24.90 2172
60 anonymous@/private/tmp/pippo.m:0:0 1.715 9.51 136798
71 feval 1.459 8.10 132454
56 AbsRel_norm 1.088 6.03 28229
2 ode45 0.877 4.86 2172
53 odemergeopts 0.471 2.61 2172
54 isfield 0.447 2.48 30408
55 starting_stepsize 0.372 2.06 2172
97 pippo 0.303 1.68 1
51 binary * 0.286 1.59 399534
52 rmfield 0.207 1.15 6516
57 max 0.180 1.00 110744
22 binary + 0.145 0.80 232427
66 min 0.140 0.77 69483
62 binary / 0.102 0.56 191115
68 kahan 0.097 0.54 21714
48 odedefaults 0.096 0.53 2172
50 abs 0.087 0.48 175884
5 odeset 0.079 0.44 2172
example profiled ran…
---------------8<-----------------

instead of changing odeset.m you could also apply this change to your script

---------------8<-----------------
o = odeset ();
while(t < endtime)
% run the non linear time varying differential equation
% using a variable time step method.
% example shown here is a linear system.
% but profiling results are the same.
[tt, yy] = ode45(@(t,x)[0,1; -1, control]*x/2, [0, stepsize], ic, o);
---------------8<-----------------

the patch to ode45 is still needed though …

This is the timing I get with those changes :

%% original version
%% Elapsed time is 37.2565 seconds.

%% creating options struct
%% Elapsed time is 30.6186 seconds.

%% patching ode45
%% Elapsed time is 15.4584 seconds.

I see tho point here, but I think the start-and-stop approach would not simulate this system properly either, I’d really look at a way
of setting up a single set of equations to represent the controlled system and the controller.

that may lead to a differential algebraic problem that needs to be solved via an appropriate solver such as ode15i …

HTH,
c.

1 Like

The start-stop-approach shouldn’t be a problem as long as the initial values are carefully set up for each simulation of the next sampling interval. I also used this approach several times.

With only one activation of ode45 for the whole simulation, I am not seeing a way to make sure that the function with the system dynamics (including the controller) is really called exactly at the sampling tmes. In addition, the controller would have to check the time passed by ode45 (which can also decrease for step size tuning), detect the current sampling interval and use the buffered control signal from the beginning of that interval.

For such cases, it would be nice if ode45 could be configured to additionally call the dynamics function at specific times provided by the user and that a flag indicates such a “sampling” call.

1 Like

With the modified odeset and ode45, there is an improvement is the execution. I feel this is not a permanent solution. However, I will use this method for the current problem.
Thank you.

  51    integrate_adaptive             1.527      20.46         2000
  54 runge_kutta_45_dorpri             1.504      20.16         6812
   2                 ode45             0.772      10.35         2000
  22          odemergeopts             0.523       7.00         2000
  41           AbsRel_norm             0.490       6.57        12812
  56                 feval             0.462       6.20        42872

Why not? I think the patch to ode45 would be worth pushing to the repository (adding similar changes to at least ode23 as well) …
I’m not sure about the change to odeset as a workaround exists, but that might go too …
could you open an issue on the bug tracker referencing this discussion?
c.

If the time range is passed as a list of time instants rather than start-stop values, ode45 is guaranteed to have a time step
at that instants, so what you propose would not be difficult to implement …

But I am concerned about compatibility, does Matlab have anything like that?
how does matlab/simulink handle mixed discrete/continuous systems?

c.

If the time range is passed as a list of time instants rather than start-stop values, ode45 is guaranteed to have a time step at that instants

If I am not mistaken, the integration is done at time instants independent of the input time vector. The values at the specified time instants are then later interpolated to the input time vector values.

could you open an issue on the bug tracker referencing this discussion?

Would this be considered as a bug ? It seems to be more of a request for performance improvement.

… matlab/simulink handle mixed discrete/continuous systems?

Simulink has specific blocks for forcing the sampling rate of specific signals. I would assume that a block is probably called when the time reaches sample boundary of any input signal to that block. With Matlab, I assume the internal odexxxxxx calls are pre-compiled.

My change to ode45 just removes a call to odeset that was totally unneeded. Indeed, It was already marked as FIXME as It Is there to work around an issue with the “refine” option. So this part can be considered a bug in my opinion.

The change to odeset Is indeed a performance improvement that speeds up the common case of a call with no imput. I’m not sure whether this is really needed as one can avoid the multiple calls to odeset by saving the result of the first call and passing it to the solver explicitely …

Anyway feature requests and bugs are discussed on the same tracker.

c.

1 Like

I agree to @octavecontrib that the times given might be interpolated and not really used as points for integrating the ode. Or are we wrong here?
Regarding compatibility: Yes, this is definitely an issue. Therefore, please understand my post about this ode45 extension merely as brainstorming rather than a real feature request.

You are right, it used to be as I described until some point :

https://hg.savannah.gnu.org/hgweb/octave/file/b7ac1e94266e/scripts/ode/private/integrate_adaptive.m

but it was then changed (apparently by myself :wink: ) here :

https://hg.savannah.gnu.org/hgweb/octave/file/a22d8a2eb0e5/scripts/ode/private/integrate_adaptive.m

BTW at that time there also were options to perform the integration with a constant time step :

https://hg.savannah.gnu.org/hgweb/octave/file/a22d8a2eb0e5/scripts/ode/private/integrate_const.m
https://hg.savannah.gnu.org/hgweb/octave/file/a22d8a2eb0e5/scripts/ode/private/integrate_n_steps.m

those were removed for compatibility …
c.