Numeric vector indexing slows down when the vector is a struct field. Why?

In the speed comparison below, the only difference between Version 1 and Version 2 is that the vector being updated is a field of a struct. I thought struct fields were supposed to behave essentially the same as individual variables, and that dot indexing was just a way of giving a common prefix to their names, in this case "c.". Why, then, does Version 2 run 6 times more slowly?
clear, close all
N_range = 1e5:1e5:1e6;
times = zeros(1, length(N_range));

%% Version 1: index a plain numeric vector
N_i = 1;
for N = N_range
    y = zeros(1, N, 'uint32');
    tic
    for i = 2:N
        y(i) = y(i-1)*rand;
    end
    times(N_i) = toc;
    N_i = N_i + 1;
end
times0 = times;

%% Version 2: index the same vector stored as a struct field
N_i = 1;
for N = N_range
    c.y = zeros(1, N, 'uint32');
    tic
    for i = 2:N
        c.y(i) = c.y(i-1)*rand;
    end
    times(N_i) = toc;
    N_i = N_i + 1;
end
times1 = times;

plot(1:N_i-1, times0, 1:N_i-1, times1);
legend('Version 1', 'Version 2', 'location', 'northwest')

2 Comments

"...and that dot indexing was just a way of giving a common prefix to their names"
No, dot indexing is not a "prefix to their names". Dot indexing is indexing. Which means your comparison is
  • indexing one numeric array vs
  • indexing into one struct array and then indexing into a numeric array.
In other words, with a structure MATLAB must first find the location of the nested numeric array in memory before it can index into it, and that dereferencing takes a finite non-zero amount of time. Calling SUBSREF/SUBSASGN requires some time regardless of the array type.
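To make the two levels of indexing concrete, here is a minimal sketch. The variable names and values are illustrative and not from the thread, and the explicit `subsref` call is only meant to show the two-step structure of the access, not to claim that this is what MATLAB executes internally for built-in types.

```matlab
% Sketch only: names and values here are illustrative, not from the thread.
s = struct('y', uint32([10 20 30]));

% One level of indexing on a plain numeric array.
y = s.y;
a = y(3);

% Two levels of indexing: first the field '.y', then the element '(3)'.
b = s.y(3);

% The same two-level access spelled out as an explicit SUBSREF call.
S = substruct('.', 'y', '()', {3});
c = subsref(s, S);

% All three forms should return the same element.
disp(isequal(a, b, c))
```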
Let's also compare against cell indexing:
R = 1e5:1e5:1e6;
T = nan(1, numel(R));

% Version 1: plain numeric array
for k = 1:numel(R)
    n = R(k);
    y = zeros(1, n, 'uint32');
    tic
    for ii = 2:n
        y(ii) = y(ii-1)*rand;
    end
    T(k) = toc;
end
times1 = T;

% Version 2: numeric array inside a struct field
for k = 1:numel(R)
    n = R(k);
    s = struct('y', zeros(1, n, 'uint32'));
    tic
    for ii = 2:n
        s.y(ii) = s.y(ii-1)*rand;
    end
    T(k) = toc;
end
times2 = T;

% Version 3: numeric array inside a cell
for k = 1:numel(R)
    n = R(k);
    c = {zeros(1, n, 'uint32')};
    tic
    for ii = 2:n
        c{1}(ii) = c{1}(ii-1)*rand;
    end
    T(k) = toc;
end
times3 = T;

plot(R, times1, R, times2, R, times3);
legend('1 x indexing', '2 x indexing (struct)', '2 x indexing (cell)', 'location', 'northwest')
By the way, you get the same results as @Stephen23 showed if you put the work into functions -- with the idea being that functions would be optimized by the Execution Engine even if scripts are not optimized.
R = 1e5:1e5:1e6;
T = nan(1, numel(R));

% Version 1: plain numeric array
for k = 1:numel(R)
    n = R(k);
    T(k) = ver1(n);
end
times1 = T;

% Version 2: numeric array inside a struct field
for k = 1:numel(R)
    n = R(k);
    T(k) = ver2(n);
end
times2 = T;

% Version 3: numeric array inside a cell
for k = 1:numel(R)
    n = R(k);
    T(k) = ver3(n);
end
times3 = T;

plot(R, times1, R, times2, R, times3);
legend('1 x indexing', '2 x indexing (struct)', '2 x indexing (cell)', 'location', 'northwest')

function time = ver1(n)
    y = zeros(1, n, 'uint32');
    tic
    for ii = 2:n
        y(ii) = y(ii-1)*rand;
    end
    time = toc;
end

function time = ver2(n)
    s = struct('y', zeros(1, n, 'uint32'));
    tic
    for ii = 2:n
        s.y(ii) = s.y(ii-1)*rand;
    end
    time = toc;
end

function time = ver3(n)
    c = {zeros(1, n, 'uint32')};
    tic
    for ii = 2:n
        c{1}(ii) = c{1}(ii-1)*rand;
    end
    time = toc;
end


Accepted Answer

Walter Roberson on 21 March 2025
"I thought struct fields were supposed to behave essentially the same as individual variables, and that dot indexing was just a way of giving a common prefix to their names"
If we temporarily neglect potential effects of JIT compilation from the execution engine:
Direct variable use requires examining the symbol table to find the variable name, then examining the index given, validating the index, performing the indexing and building a new anonymous scalar variable containing the value, then using the anonymous scalar in the expression.
Structure dot indexing requires examining the symbol table to find the structure name, then examining the field name given, looking up the field name to get an offset into the struct, looking at the offset to find an anonymous variable, then examining the index given, validating the index, performing the indexing and building a new anonymous scalar variable containing the value, then using the anonymous scalar in the expression.
The main differences are the time taken to look up the field name in the struct, and the extra level of indirection: the struct variable does not hold the numeric data itself, so the field's array has to be located before it can be indexed.
The effects of JIT compilation in these two cases are unknown. We can suspect that iterative indexing will be optimized. We can guess that structure indexing with a fixed field name may be well optimized, but unfortunately we just do not know.
The extra time taken to look up the field name and find the referenced value could account for the difference in timing.

6 Comments

I guess that makes sense, but I am rather surprised the JIT hasn't covered this yet.
It's been this way for a long time, which is why I got in the habit of assigning struct variables into temp variables before loops like this. E.g., cy = c.y, then index into cy. And yes, it would be nice if the JIT could figure this out instead of the user writing workaround code.
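The temp-variable workaround described above can be sketched as follows. The size `n` is an arbitrary illustrative value; `c` and `cy` follow the names used in the comment.

```matlab
% Sketch of the temp-variable workaround; n is an arbitrary illustrative size.
n = 1e5;
c.y = zeros(1, n, 'uint32');

cy = c.y;                   % pull the field into a plain variable once
for i = 2:n
    cy(i) = cy(i-1)*rand;   % the loop now indexes only a numeric array
end
c.y = cy;                   % write the result back to the struct once
```

The field lookup then happens twice in total rather than twice per iteration, at the cost of the extra lines around the loop.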
But I thought there had been a big push in recent releases to accelerate dot-indexed operations for user-defined classes. Yet, dot-indexed struct operations were left out in the cold? According to the tests below, the same indexing performs an order of magnitude better if I just use a field-containing class, instead of a struct.
clear, close all
N_range = 1e5:1e5:1e6;
times = zeros(1, length(N_range));

%% Version 0: plain numeric vector
N_i = 1;
for N = N_range
    y = zeros(1, N, 'uint32');
    tic
    for i = 2:N
        y(i) = y(i-1)*rand;
    end
    times(N_i) = toc;
    N_i = N_i + 1;
end
times0 = times;

%% Version 1: vector stored as a struct field
N_i = 1;
c.y = [];
for N = N_range
    c.y = zeros(1, N, 'uint32');
    tic
    for i = 2:N
        c.y(i) = c.y(i-1)*rand;
    end
    times(N_i) = toc;
    N_i = N_i + 1;
end
times1 = times;

%% Version 2: vector stored as a class property
N_i = 1;
c = myclass;
for N = N_range
    c.y = zeros(1, N, 'uint32');
    tic
    for i = 2:N
        c.y(i) = c.y(i-1)*rand;
    end
    times(N_i) = toc;
    N_i = N_i + 1;
end
times2 = times;

plot(1:N_i-1, times0, 1:N_i-1, times1, 1:N_i-1, times2);
legend('Numeric', 'Struct', 'Class', 'location', 'northwest')
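The definition of `myclass` is not shown in the thread. To reproduce the test, a minimal definition along the following lines should be enough. This is an assumption: the original may differ, for example it could be a handle class or declare a default value for `y`.

```matlab
% myclass.m: hypothetical definition; the class used in the thread is not shown.
% Assumed here to be a plain value class with a single public property.
classdef myclass
    properties
        y   % vector that the timing loop indexes into
    end
end
```

Save it as `myclass.m` on the MATLAB path before running the script above.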
You are correct, we have been working hard to optimize dot-indexing on class properties, and the better performance of class dot-indexing in this case is an outcome of that work.
Field names of structs are more difficult for the JIT-compiler to deal with than properties of a class. We can pull the property names out of a classdef file and they won't change as long as that file doesn't change, so the lookup can be done at JIT-compile time. It's harder in general for the compiler to guarantee that field names of a struct are valid, and in current MATLAB it's a runtime operation.
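As a rough illustration of why a struct's field names are a run-time matter, the sketch below adds and removes fields while the code runs. The names `s`, `name` and `z` are illustrative and not from the thread, and the example says nothing about how the JIT compiler actually handles such code.

```matlab
% Illustrative only: a struct's set of fields is decided at run time.
s = struct('y', zeros(1, 4, 'uint32'));

name = 'z';                 % field name held in a variable
s.(name) = 1;               % dynamic field name adds field 'z' at run time
s = rmfield(s, 'y');        % existing fields can also be removed

disp(fieldnames(s))         % list the fields the struct has now
disp(isfield(s, 'y'))       % check whether 'y' still exists
```

A class defined in a classdef file, by contrast, declares its properties up front, which matches the point above about the lookup being possible at JIT-compile time.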
"We can pull the property names out of a classdef file and they won't change as long as that file doesn't change"
Are we to understand that addprop has been removed?
The addprop function still exists and works as intended for classes that inherit from the dynamicprops mixin. The JIT-compiler doesn't do any acceleration for properties added using addprop. I've never measured their performance but I'd expect it to be similar to the performance of structure fields.

Release: R2024b

Asked: 21 March 2025

Commented: 24 June 2025
