As I expected, eliminating that add function and moving its code into the loop where it was called sped things up a lot. I deleted the function Add_an_individual_to_CommunityAgents_v9 completely and changed the main function's for-loop as follows:
iii=0;
for ii=1:SummedAgeGrPopul_gender(iAgeGr,genderX)
iii=iii+1;
AgentsCOMM(ID+1,1)=ID+1; % Identification number
AgentsCOMM(ID+1,2)=SampledAges(iii); % Date of birth
AgentsCOMM(ID+1,3)=genderX; % 1: Male or 2: Female
AgentsCOMM(ID+1,4)=1; % 1: alive 0: dead
AgentsCOMM(ID+1,5)=nan; % At the moment Race is NaN
AgentsCOMM(ID+1,6)=0; % Time of hospitalization as patient
AgentsCOMM(ID+1,7)=0; % Time of discharge from hospital as patient
AgentsCOMM(ID+1,8)=0; % Antibiotic history (0, 1, 2, or 3)
AgentsCOMM(ID+1,9)=1; % initialized with the susceptible status
AgentsCOMM(ID+1,10)=0; % 0 currently NOT in a hospital; 1 in hospital
AgentsCOMM(ID+1,11)=0; % ID of HCF person most recently admitted
AgentsCOMM(ID+1,12)=0; % ID of ward of the HCF
AgentsCOMM(ID+1,13)=iComm; % ID of the community a person lives in
AgentsCOMM(ID+1,14)=0; % Clinical Test result (the most recent)
AgentsCOMM(ID+1,15)=0; % Time of the most recent test result
ID=ID+1;
end
I renamed that version SyntheticIndividuals_for_ABM_TEST.m. Here is the performance, with each run done in a fresh Octave instance:
Original:
octave:1> SyntheticIndividuals_for_ABM_debug
Elapsed time is 34.7479 seconds.
Improved:
octave:1> SyntheticIndividuals_for_ABM_TEST
Elapsed time is 1.71566 seconds.
That function was causing unnecessary copying. Written that way, the work becomes quadratic instead of linear: the entire array AgentsCOMM had to be copied from the main function SyntheticIndividuals_for_ABM_debug into the helper function Add_an_individual_to_CommunityAgents_v9, and then copied back. Every added row therefore moved the whole matrix back and forth, so adding N rows copied on the order of N^2 rows' worth of data.
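To make the count explicit, here is a rough derivation. It assumes the k-th call copies a matrix of about k rows into the helper and the same matrix back out, which is an idealization. If the matrix is preallocated at its full size, each call moves all N rows each way and the total is closer to 2N^2. Either way the growth is quadratic.

```latex
\text{rows copied} \;\approx\; \sum_{k=1}^{N} 2k \;=\; N(N+1) \;\sim\; N^2
```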
Take-home lesson: Avoid function calls to grow an array like that. Inline it instead.
Technique lesson: If you see a slowdown that bad, experiment with the size of the problem, for example by changing 75K to 10K, 20K, etc. If you see quadratic behavior in what should be mostly linear, suspect that an entire array is being copied back and forth for some reason. That helps localize the problem.
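A minimal sketch of that kind of size experiment, for anyone who wants to try it on a toy case. Everything here is an assumption for illustration: fill_row_via_call is a hypothetical stand-in for the real helper, the sizes are arbitrary, and the matrices are preallocated with zeros. Timings depend on the machine and the Octave version, so treat the fitted coefficients as a hint rather than a measurement.

```octave
1;  % marks this file as a script, so a function can be defined below

% Hypothetical stand-in for the helper: writes one row, returns the matrix.
function A = fill_row_via_call (A, k)
  A(k, :) = k;
end

sizes = [1000 2000 3000 4000];   % illustrative problem sizes (assumed)
t_call = zeros (size (sizes));
t_inline = zeros (size (sizes));

for s = 1:numel (sizes)
  n = sizes(s);

  % Variant 1: every row is written through a function call.
  A = zeros (n, 15);
  tic;
  for k = 1:n
    A = fill_row_via_call (A, k);
  end
  t_call(s) = toc;

  % Variant 2: the same write, inlined in the loop.
  B = zeros (n, 15);
  tic;
  for k = 1:n
    B(k, :) = k;
  end
  t_inline(s) = toc;
end

% Fit a quadratic to each timing series; a large leading coefficient
% relative to the linear one hints at whole-array copying.
x = sizes / 1000;
p_call = polyfit (x, t_call, 2)
p_inline = polyfit (x, t_inline, 2)
```

Both variants produce the same matrix, so any difference in how the timings scale comes from how the rows are written, not from what is written.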
Edit:
I get into the time analysis in more detail here.
This is the original code behavior:
octave:17> pos = 0; x = y = [];
octave:18> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 10K
Elapsed time is 0.550006 seconds.
octave:19> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 20K
Elapsed time is 1.69111 seconds.
octave:20> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 30K
Elapsed time is 3.27898 seconds.
octave:21> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 40K
Elapsed time is 5.56754 seconds.
octave:22> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 50K
Elapsed time is 9.40509 seconds.
octave:23> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 60K
Elapsed time is 16.3135 seconds.
octave:24> SyntheticIndividuals_for_ABM_debug ; pos += 1; x(pos) = pos; y(pos) = toc; ## 70K
Elapsed time is 26.7577 seconds.
octave:25> [x; y]
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.5500 1.6911 3.2795 5.5683 9.4059 16.3146 26.7588
octave:26> p = polyfit (x, y, 2), z = polyval (p, x); [x; y; z]
p =
0.9073 -3.1871 3.6833
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.5500 1.6911 3.2795 5.5683 9.4059 16.3146 26.7588
1.4035 0.9384 2.2878 5.4519 10.4306 17.2240 25.8320
octave:27> p = polyfit (x, y, 3), z = polyval (p, x); [x; y; z]
p =
0.1516 -0.9124 3.0301 -1.7757
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.5500 1.6911 3.2795 5.5683 9.4059 16.3146 26.7588
0.4937 1.8482 3.1977 5.4519 9.5208 16.3142 26.7419
As you can see, the time performance for the original is strongly quadratic and weakly cubic, so this is an indication that something weird is going on.
Here is the same analysis for the improved version, where the only change is inlining the code of the Add_an_individual_to_CommunityAgents_v9 function:
octave:30> pos = 0; x = y = [];
octave:31> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 10K
Elapsed time is 0.225497 seconds.
octave:32> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 20K
Elapsed time is 0.445472 seconds.
octave:33> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 30K
Elapsed time is 0.66836 seconds.
octave:34> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 40K
Elapsed time is 0.889586 seconds.
octave:35> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 50K
Elapsed time is 1.11439 seconds.
octave:36> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 60K
Elapsed time is 1.35935 seconds.
octave:37> SyntheticIndividuals_for_ABM_TEST ; pos += 1; x(pos) = pos; y(pos) = toc; ## 70K
Elapsed time is 1.56989 seconds.
octave:38> [x; y]
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.2255 0.4455 0.6684 0.8896 1.1152 1.3603 1.5709
octave:39> p = polyfit (x, y, 2), z = polyval (p, x); [x; y; z]
p =
8.6829e-04 2.1851e-01 5.1089e-03
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.2255 0.4455 0.6684 0.8896 1.1152 1.3603 1.5709
0.2245 0.4456 0.6684 0.8930 1.1194 1.3474 1.5772
octave:40> p = polyfit (x, y, 1), z = polyval (p, x); [x; y; z]
p =
2.2545e-01 -5.3106e-03
ans =
1.0000 2.0000 3.0000 4.0000 5.0000 6.0000 7.0000
0.2255 0.4455 0.6684 0.8896 1.1152 1.3603 1.5709
0.2201 0.4456 0.6710 0.8965 1.1220 1.3474 1.5729
It is strongly linear and the quadratic coefficient is tiny, which is much more plausible behavior.
All of the above was done on a single thread of a single core, no parallel computation or funny business.
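As a cross-check, a different technique from the polynomial fits above is to fit a straight line in log-log space: the slope estimates the exponent of the growth. The sketch below reuses the timings recorded in the two sessions. The variable names are mine, and the author did not run this. The slope should come out near 2 for the original and near 1 for the improved version.

```octave
% Timings copied from the two sessions above (x is in units of 10K).
x = 1:7;
y_original = [0.5500 1.6911 3.2795 5.5683 9.4059 16.3146 26.7588];
y_improved = [0.2255 0.4455 0.6684 0.8896 1.1152 1.3603 1.5709];

% If t is roughly c * x^a, then log(t) = log(c) + a * log(x),
% so the first coefficient of a degree-1 fit estimates the exponent a.
p_original = polyfit (log (x), log (y_original), 1);
p_improved = polyfit (log (x), log (y_improved), 1);

printf ("original exponent estimate: %.2f\n", p_original(1));
printf ("improved exponent estimate: %.2f\n", p_improved(1));
```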