how to select particular rows from a matrix
    조회 수: 3 (최근 30일)
  
       이전 댓글 표시
    
Dear all,
I have a panel data set(countries, years) of dimension 20000 by 200. My goal is
1) to choose those countries that correspond to a specific period (1970-2013) AND
2) for which period there are no NaN values Is there a way to do that? Here is a sample matrix Many thanks in advance.
 A={  'country'     'year'  'v1'   'v2' 'v3'     'v4'    'v5' 'v6'  'v7'    'v8'    'v9'    'v10'       'v11'  'v12'
    'Albania'    [1919]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
    'Albania'    [1920]    [1]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
    'Albania'    [1921]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ NaN]    [  1]    [  0]    [  0]
    'Albania'    [1922]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ NaN]    [  1]    [  0]    [  0]
    'Albania'    [1923]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
    'Albania'    [1924]    [1]    [0]    [1]    [0]    [0]    [ 0]    [2]    [1]    [ NaN]    [  1]    [  2]    [  0]
 'Albania'    [1933]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
 };
댓글 수: 0
채택된 답변
  Image Analyst
      
      
 2017년 3월 6일
        Try converting it to a table to make it easier:
A={  'country'     'year'  'v1'   'v2' 'v3'     'v4'    'v5' 'v6'  'v7'    'v8'    'v9'    'v10'       'v11'  'v12'
  'Albania'    [1919]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1920]    [1]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1921]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1922]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ 42]    [  1]    [  0]    [  0]
  'Albania'    [1923]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1924]    [1]    [0]    [1]    [0]    [0]    [ 0]    [2]    [1]    [ NaN]    [  1]    [  2]    [  0]
  'Albania'    [1933]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  };
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
Echoed to the command window, you can see the steps:
t = 
     country     year    v1    v2    v3    v4    v5    v6    v7    v8    v9     v10    v11    v12
    _________    ____    __    __    __    __    __    __    __    __    ___    ___    ___    ___
    'Albania'    1919    0     0     0     0     0     0     0     0     NaN    1      0      0  
    'Albania'    1920    1     0     0     0     0     0     0     0     NaN    1      0      0  
    'Albania'    1921    0     0     0     0     0     0     1     0     NaN    1      0      0  
    'Albania'    1922    0     0     0     0     0     0     1     0      42    1      0      0  
    'Albania'    1923    0     0     0     0     0     0     0     0     NaN    1      0      0  
    'Albania'    1924    1     0     1     0     0     0     2     1     NaN    1      2      0  
    'Albania'    1933    0     0     0     0     0     0     0     0     NaN    1      0      0  
yearsInRange =
  7×1 logical array
   0
   0
   1
   1
   1
   0
   0
nonNanRows =
  7×1 logical array
   0
   0
   0
   1
   0
   0
   0
newTable = 
     country     year    v1    v2    v3    v4    v5    v6    v7    v8    v9    v10    v11    v12
    _________    ____    __    __    __    __    __    __    __    __    __    ___    ___    ___
    'Albania'    1922    0     0     0     0     0     0     1     0     42    1      0      0  
>>
댓글 수: 3
  Image Analyst
      
      
 2017년 3월 7일
				Sure it will. Here is proof:
A={  'country'     'year'  'v1'   'v2' 'v3'     'v4'    'v5' 'v6'  'v7'    'v8'    'v9'    'v10'       'v11'  'v12'
  'Albania'    [1919]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1920]    [1]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1921]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1922]    [0]    [0]    [0]    [0]    [0]    [ 0]    [1]    [0]    [ 42]    [  1]    [  0]    [  0]
  'Albania'    [1923]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Albania'    [1924]    [1]    [0]    [1]    [0]    [0]    [ 0]    [2]    [1]    [ NaN]    [  1]    [  2]    [  0]
  'Albania'    [1933]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ NaN]    [  1]    [  0]    [  0]
  'Canada'    [1922]    [1]    [0]    [1]    [0]    [0]    [ 0]    [2]    [1]    [ 11]    [  1]    [  2]    [  0]
  'USA'    [1923]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ 12]    [  1]    [  0]    [  0]
  'Germany'    [1922]    [1]    [0]    [1]    [0]    [0]    [ 0]    [2]    [1]    [ 13]    [  1]    [  2]    [  0]
  'Belgium'    [1923]    [0]    [0]    [0]    [0]    [0]    [ 0]    [0]    [0]    [ 14]    [  1]    [  0]    [  0]
  };
t = cell2table(A(2:end,:), 'VariableNames', A(1,:))
yearsInRange = t.year >= 1921 & t.year <= 1923
nonNanRows = ~isnan(t.v9)
% Find out where both are true:
% year in range, and v9 column is not nan
rowsToKeep = yearsInRange & nonNanRows;
% Extract only those rows
newTable = t(rowsToKeep,:)
And look at all the countries it found:
newTable = 
     country     year    v1    v2    v3    v4    v5    v6    v7    v8    v9    v10    v11    v12
    _________    ____    __    __    __    __    __    __    __    __    __    ___    ___    ___
    'Albania'    1922    0     0     0     0     0     0     1     0     42    1      0      0  
    'Canada'     1922    1     0     1     0     0     0     2     1     11    1      2      0  
    'USA'        1923    0     0     0     0     0     0     0     0     12    1      0      0  
    'Germany'    1922    1     0     1     0     0     0     2     1     13    1      2      0  
    'Belgium'    1923    0     0     0     0     0     0     0     0     14    1      0      0
so explain why you said that - why you think it's getting countries outside the range or not including countries inside the year range. For example you can give me a cell array that proves what you said.
추가 답변 (1개)
  Rik
      
      
 2017년 3월 6일
        You can use ismember to select the country, then you can convert those rows (excluding the first column) to a matrix with cell2mat. With logical indexing you can select the correct range of years, and with isnan you can find out the rows with NaN values.
I think you should be able to figure out what to do with this pointer. If you have trouble writing the code, don't hesitate to reply.
댓글 수: 2
  Rik
      
      
 2017년 3월 7일
				I would say that your data is not complete then, because there is nothing wrong with your logical indexing.
I also like to add that the solution from Image Analyst is more elegant and will be easier to understand if you come back to your code after a year. You do need to make sure that the all machines you run your code on have 2013b or later, because older versions don't have the table data structure.
참고 항목
카테고리
				Help Center 및 File Exchange에서 Resizing and Reshaping Matrices에 대해 자세히 알아보기
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!