Creating a loop to use all columns in a dataset array

Hello, I have a dataset with 10 columns. The first column contains the names of the variables and the next 9 contain the data for the variables across 9 periods. For each period, I want to take the data that are lower than that period's median. I use the following command for the first period: s1=data(data.mv1<median(data.mv1),{'Name','mv1',}), where mv1 is the header of the first period's column and s1 is a new dataset that contains only the variables I want. My question is: can I write a for loop that will automatically do this for all 9 periods, giving me s1,s2,...,s9?

Comments: 1

Jan on 26 Jul 2011
There must be one or two typos in "s1=data(data.mv1<median(data.mv1),{'Name','mv1',})"
Please fix them by editing your question.


Accepted Answer

Oleg Komarov on 26 Jul 2011
% Example data: a names column and two period columns
data = dataset({('a':'z').','names'},{rand(26,2),'mv1','mv2'});
% EDIT
% Retrieve column names/variables (from second onwards)
varnames = data.Properties.VarNames(2:end);
nV = numel(varnames);
% Preallocate
C = cell(1,nV);
% Loop per each column (variable)
for v = 1:nV
    idx = data.(varnames{v}) < median(data.(varnames{v}));
    % Option 1: keep only the values below the median
    % C{v} = data.(varnames{v})(idx);
    % Option 2: keep names and values; column v+1 because varnames skips column 1
    C{v} = data(idx,[1 v+1]);
end

Comments: 5

Thank you for your answer. A few brief questions: if I use C{v} I get a 'Cell contents assignment to a non-cell array object.' error, whereas if I use a new letter (for example s{v}) it works fine. Is there any way, though, for s{v} to display the names of the variables along with the data? Also, your code creates one cell array that contains all the new datasets I wanted. How do I split them so that I have a separate element for each s{v}? Again, thank you for your time.
C{v} gave an error because I forgot to change the preallocation to a cell array. Now it's fixed.
I added another line in the loop, but I don't know if it does what you're asking. Choose the first or the new line.
Can you elaborate on "how do I split..."?
The line you added does display the names like I wanted, but the variable n is not defined. I assumed you wanted to write 'v+1' instead of 'n', so I tried that and it works. The end result of the code is a cell array with dimensions 1*nV, meaning each cell of the first row (up to cell nV) contains a dataset, namely C{v}. My question is: how do I extract these datasets to the MATLAB workspace so I can use them as variables?
Sorry for the distraction, it should be v (why v+1?).
Why do you want them extracted to the workspace? I suggest keeping them in the cell array, which is easier to reference. Read http://matlab.wikia.com/wiki/FAQ#How_can_I_create_variables_A1.2C_A2.2C....2CA10_in_a_loop.3F
If I use v, the first dataset doesn't show the first period's data but instead repeats the names of the variables. Agreed, it's much easier if I keep them in the cell array; unfortunately that came to me right after I posted. Thanks again for all your help.
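
A minimal sketch of working with the results while they stay in the cell array, as suggested above. It assumes the example data from the accepted answer, uses column v+1 as worked out in the comments, and requires the dataset class from the Statistics Toolbox. The names k and nBelow are made up for illustration.

```matlab
% Example data as in the accepted answer: 26 names and two periods
data = dataset({('a':'z').','names'},{rand(26,2),'mv1','mv2'});

varnames = data.Properties.VarNames(2:end);   % period columns only
nV = numel(varnames);
C = cell(1,nV);
for v = 1:nV
    idx = data.(varnames{v}) < median(data.(varnames{v}));
    C{v} = data(idx,[1 v+1]);                 % names plus period v
end

% C{k} plays the role of s1, s2, ...: an index replaces the numbered names
for k = 1:nV
    nBelow = size(C{k},1);                    % number of rows below the median
    fprintf('%s: %d rows below the median\n', varnames{k}, nBelow);
end

% A single result can still be pulled out under its own name when needed
s1 = C{1};
```

Indexing with C{k} keeps every later step inside an ordinary loop, which is the main reason to prefer it over separate workspace variables.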


More Answers (2)

Jan on 26 Jul 2011

Votes: 1

It would be easy if you did not use symbols like 's1' and 'mv1', which have an index inside the name. Better to use an index as an index: s{1}, s{2}, ... and mv{1}, mv{2}, ...

Titus Edelhofer on 26 Jul 2011
Hi,
I guess this leads to the question of how to get data.mv1 when the name mv1 is stored in a variable.
Note that data.mv1 is the same as data.('mv1'). So if you have e.g.
header = {'mv1', ...};
then you could do
for i=1:length(header)
    col = data.(header{i});
    % do your median thing; a dataset array needs two subscripts (rows, columns)
    dataNew = data(col<median(col), :);
end
Hope this helps,
Titus
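
The equivalence of data.mv1 and data.('mv1') can be checked directly. A small sketch, assuming the dataset class from the Statistics Toolbox and made-up example data; the names isSame and below are only for illustration and do not appear in the thread.

```matlab
% Made-up example data: a name column and two period columns
data = dataset({('a':'e').','Name'},{[5 1; 3 4; 9 2; 7 8; 1 6],'mv1','mv2'});

% Static and dynamic field access return the same column
isSame = isequal(data.mv1, data.('mv1'));

% With the column names in a cell array, the loop needs no numbered variables
header = {'mv1','mv2'};
below = cell(1,numel(header));
for i = 1:numel(header)
    col = data.(header{i});
    % Rows below the median, keeping the name column and the current period
    below{i} = data(col < median(col), {'Name', header{i}});
end
```

Here below{i} holds one reduced dataset per period, so the original one-period command becomes the loop body with header{i} in place of the fixed name.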

Comments: 2

Is your <header> a cell array? When trying to run your code I get a 'Dataset array subscripts must be two-dimensional.' error. Also, shouldn't there be a subscript on dataNew, like dataNew{i}?
Yes, header is a cell array containing the names. And yes, there should be some subscript, depending on what you want to do further with the reduced data ...



Asked: 26 Jul 2011
