how to select particular rows from a matrix

조회 수: 3 (최근 30일)
ektor
ektor 2017년 3월 6일
댓글: ektor 2017년 3월 7일
Dear all,
I have a panel data set(countries, years) of dimension 20000 by 200. My goal is
1) to choose those countries that correspond to a specific period (1970-2013) AND
2) for which period there are no NaN values Is there a way to do that? Here is a sample matrix Many thanks in advance.
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};

채택된 답변

Image Analyst
Image Analyst 2017년 3월 6일
Try converting it to a table to make it easier:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
Echoed to the command window, you can see the steps:
t =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ ___ ___ ___ ___
'Albania' 1919 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1920 1 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1921 0 0 0 0 0 0 1 0 NaN 1 0 0
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Albania' 1923 0 0 0 0 0 0 0 0 NaN 1 0 0
'Albania' 1924 1 0 1 0 0 0 2 1 NaN 1 2 0
'Albania' 1933 0 0 0 0 0 0 0 0 NaN 1 0 0
yearsInRange =
7×1 logical array
0
0
1
1
1
0
0
nonNanRows =
7×1 logical array
0
0
0
1
0
0
0
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
>>
  댓글 수: 3
Image Analyst
Image Analyst 2017년 3월 7일
Sure it will. Here is proof:
A={ 'country' 'year' 'v1' 'v2' 'v3' 'v4' 'v5' 'v6' 'v7' 'v8' 'v9' 'v10' 'v11' 'v12'
'Albania' [1919] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1920] [1] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1921] [0] [0] [0] [0] [0] [ 0] [1] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1922] [0] [0] [0] [0] [0] [ 0] [1] [0] [ 42] [ 1] [ 0] [ 0]
'Albania' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Albania' [1924] [1] [0] [1] [0] [0] [ 0] [2] [1] [ NaN] [ 1] [ 2] [ 0]
'Albania' [1933] [0] [0] [0] [0] [0] [ 0] [0] [0] [ NaN] [ 1] [ 0] [ 0]
'Canada' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 11] [ 1] [ 2] [ 0]
'USA' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 12] [ 1] [ 0] [ 0]
'Germany' [1922] [1] [0] [1] [0] [0] [ 0] [2] [1] [ 13] [ 1] [ 2] [ 0]
'Belgium' [1923] [0] [0] [0] [0] [0] [ 0] [0] [0] [ 14] [ 1] [ 0] [ 0]
};
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
And look at all the countries it found:
newTable =
country year v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
_________ ____ __ __ __ __ __ __ __ __ __ ___ ___ ___
'Albania' 1922 0 0 0 0 0 0 1 0 42 1 0 0
'Canada' 1922 1 0 1 0 0 0 2 1 11 1 2 0
'USA' 1923 0 0 0 0 0 0 0 0 12 1 0 0
'Germany' 1922 1 0 1 0 0 0 2 1 13 1 2 0
'Belgium' 1923 0 0 0 0 0 0 0 0 14 1 0 0
so explain why you said that - why you think it's getting countries outside the range or not including countries inside the year range. For example you can give me a cell array that proves what you said.
ektor
ektor 2017년 3월 7일
So I was confused! Yes, it is fine. Many thanks!

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Rik
Rik 2017년 3월 6일
You can use ismember to select the country, then you can convert those rows (excluding the first column) to a matrix with cell2mat. With logical indexing you can select the correct range of years, and with isnan you can find out the rows with NaN values.
I think you should be able to figure out what to do with this pointer. If you have trouble writing the code, don't hesitate to reply.
  댓글 수: 2
ektor
ektor 2017년 3월 6일
Dear Rik,thank you for this. I am not sure how to choose the years. For example, I am interested in the 1945-2015 time period. However, if I set, for example
x>=1945&x<=2015
I might get counties with time period 1989-2009. But I want countries over 1945-2015. My logical indexing is wrong. Any suggestion on that?
Rik
Rik 2017년 3월 7일
I would say that your data is not complete then, because there is nothing wrong with your logical indexing.
I also like to add that the solution from Image Analyst is more elegant and will be easier to understand if you come back to your code after a year. You do need to make sure that the all machines you run your code on have 2013b or later, because older versions don't have the table data structure.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Resizing and Reshaping Matrices에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by