Get index of table column

조회 수: 65 (최근 30일)
Jasper van Casteren
Jasper van Casteren 2018년 6월 4일
답변: Peter Perkins 2018년 6월 4일
I have the following problem: - for a large table, the column names are concatenated. - I have a loop in which I have to assign a lot of values to the correct columns The loop is now slow. I think it would be much faster if I get the column numbers first, then use the column indices in the loop.
I know I can find the indices by ColIndex = find(T.Properties.VariableNames, 'MyColName', 1) But find is such a slow function. I would really like to use the fast index-from-names algorithm that the table is using itself. Is there really no function like Index = T.ColumnIndex('MyColName')? Seems so obvious.
  댓글 수: 1
Paolo
Paolo 2018년 6월 4일
Can you provide a minimal working example?

댓글을 달려면 로그인하십시오.

답변 (4개)

Jasper van Casteren
Jasper van Casteren 2018년 6월 4일
Sorry, I meant to write ColIndex = find(strcmp(T.Properties.VariableNames, 'MyColName'), 1)

KSSV
KSSV 2018년 6월 4일
Read about strcmp

Jasper van Casteren
Jasper van Casteren 2018년 6월 4일
I have data triplets for multiple scenarios: X,Y,Z that I want store in Table, using the concatenation of a variable name and a scenario name. It is a lot of data.
The alternatives are to use direct column names (first function), or to first find the column indiced (second function).
This is not the actual code, but shows my problem). I suspect that the find slow. I have a lot of columns (few hundred), and looking up the index seems to be very slow.
function T = AddScenario1(T, D, scenName, colNames)
for i=1:size(D,1);
col = [scenName, colNames{D(i,1)}];
T{D(i,2),col} = T{D(i,2),col} + D(i,3);
end
end
function T = AddScenario2(T, D, scenName, colNames)
% assert : 1<=D(:,1)<=length(colnames)
idx = zeros(length(colNames), 1);
for i=1:length(colNames)
varnm = [scenName, colNames{D(i,1)}];
idx(i) = find(strcmp(T.Properties.VariableNames, varnm), 1);
end
% assert : 1<=D(:,1)<=size(colnames)
for i=1:size(D,1)
col = idx(D(i,1));
T{D(i,2),col} = T{D(i,2),col} + D(i,3);
end
end

Peter Perkins
Peter Perkins 2018년 6월 4일
I guess you have a table like this:
>> t = array2table(rand(10,6),'VariableNames',{'A_X' 'A_Y' 'A_Z' 'B_X' 'B_Y' 'B_Z'})
t =
10×6 table
A_X A_Y A_Z B_X B_Y B_Z
_______ _______ ________ ________ _______ _______
0.81472 0.15761 0.65574 0.70605 0.43874 0.27603
0.90579 0.97059 0.035712 0.031833 0.38156 0.6797
0.12699 0.95717 0.84913 0.27692 0.76552 0.6551
0.91338 0.48538 0.93399 0.046171 0.7952 0.16261
0.63236 0.80028 0.67874 0.097132 0.18687 0.119
0.09754 0.14189 0.75774 0.82346 0.48976 0.49836
0.2785 0.42176 0.74313 0.69483 0.44559 0.95974
0.54688 0.91574 0.39223 0.3171 0.64631 0.34039
0.95751 0.79221 0.65548 0.95022 0.70936 0.58527
0.96489 0.95949 0.17119 0.034446 0.75469 0.22381
And then you have a function whose purpose is to do something for one scenario, A, say, and one or more of the X,Y,Z variables for that scenario? I'm guessing.
You're looping over rows. You should try to vectorize. It's not clear what is in your D, so this is all just speculation, but for example, if D have the same name in every element of its first column, you would not need the loop at all. One vectorized statement. Even if D has different names in each element of it's first column, you should be able to partially vectorize.
You are using braces to read/write only one variable at a time. If you are committed to writing loops like that, use dot subscripting. t{rows,'Var'} -> t.Var(rows). I guess you'll actually need t.(vanrName)(rows).
Also, i'm not sure why it would be faster to do the name lookup yourself rather than letting the table do it. Unless you can do this once, outside of your loop, I don't see how you'd be saving any work. If you do want to do the name lookup, just leave the find out. Don't need it, a logical subscript will work on its own.
In any case, creating a mapping from nested names to variable indices ought to be done outside the loop. If you stick with that "flat" organization, and you've confirmed that the name lookup is really a bottleneck, you should be able to create a nested scalar struct to map your nested names to the table's variables.
>> map = struct('A',struct('X',1,'Y',2,'Z',3),'B',struct('X',4,'Y',5,'Z',6))
map =
struct with fields:
A: [1×1 struct]
B: [1×1 struct]
>> map.A.Y
ans =
2
But probably a better way to organize your data would be to make each scenario an Nx3 table, and nest them in another table:
>> t = table( ...
array2table(rand(10,3),'VariableNames',{'X' 'Y' 'Z'}), ...
array2table(rand(10,3),'VariableNames',{'X' 'Y' 'Z'}), ...
'VariableNames',{'A' 'B'})
t =
10×2 table
A B
X Y Z X Y Z
_____________________________ _______________________________
0.75127 0.84072 0.35166 0.075854 0.16218 0.45054
0.2551 0.25428 0.83083 0.05395 0.79428 0.083821
0.50596 0.81428 0.58526 0.5308 0.31122 0.22898
0.69908 0.24352 0.54972 0.77917 0.52853 0.91334
0.8909 0.92926 0.91719 0.93401 0.16565 0.15238
0.95929 0.34998 0.28584 0.12991 0.60198 0.82582
0.54722 0.1966 0.7572 0.56882 0.26297 0.53834
0.13862 0.25108 0.75373 0.46939 0.65408 0.99613
0.14929 0.61604 0.38045 0.011902 0.68921 0.078176
0.25751 0.47329 0.56782 0.33712 0.74815 0.44268
>> t.A.Y(3:4) = 3:4
t =
10×2 table
A B
X Y Z X Y Z
_____________________________ _______________________________
0.75127 0.84072 0.35166 0.075854 0.16218 0.45054
0.2551 0.25428 0.83083 0.05395 0.79428 0.083821
0.50596 3 0.58526 0.5308 0.31122 0.22898
0.69908 4 0.54972 0.77917 0.52853 0.91334
0.8909 0.92926 0.91719 0.93401 0.16565 0.15238
0.95929 0.34998 0.28584 0.12991 0.60198 0.82582
0.54722 0.1966 0.7572 0.56882 0.26297 0.53834
0.13862 0.25108 0.75373 0.46939 0.65408 0.99613
0.14929 0.61604 0.38045 0.011902 0.68921 0.078176
0.25751 0.47329 0.56782 0.33712 0.74815 0.44268
You could also nest them in a scalar struct, but a nested table allows you to select rows across all scanarios easily. Not sure if that's important.

카테고리

Help CenterFile Exchange에서 Data Type Conversion에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by