Ignore missing data in a table group

Question

0 개 추천

I have a table with some missing data. I am using findgroups and splitapply to do some calculations on columns of the table, but when a group has a missing value the calculation returns missing.

I would like to ignore the missing value in these calculations but without removing the enitre row- other colums have valid data.

>> T = readtable('messy.csv','TreatAsEmpty',{'.','NA'})
T =
  21×5 table
       A         B          C          D       E  
    ________    ____    __________    ____    ____
    {'afe1'}       3    {'yes'   }       3       3
    {'egh3'}     NaN    {'no'    }       7       7
    {'wth4'}       3    {'yes'   }       3       3
    {'atn2'}      23    {'no'    }      23      23
    {'arg1'}       5    {'yes'   }       5       5
    {'jre3'}    34.6    {'yes'   }    34.6    34.6
    {'wen9'}     234    {'yes'   }     234     234
    {'ple2'}       2    {'no'    }       2       2
    {'dbo8'}       5    {'no'    }       5       5
    {'oii4'}       5    {'yes'   }       5       5
    {'wnk3'}     245    {'yes'   }     245     245
    {'abk6'}     563    {0×0 char}     563     563
    {'pnj5'}     463    {'no'    }     463     463
    {'wnn3'}       6    {'no'    }       6       6
    {'oks9'}      23    {'yes'   }      23      23
    {'wba3'}     NaN    {'yes'   }     NaN      14
    {'pkn4'}       2    {'no'    }       2       2
    {'adw3'}      22    {'no'    }      22      22
    {'poj2'}     -99    {'yes'   }     -99     -99
    {'bas8'}      23    {'no'    }      23      23
    {'gry5'}     NaN    {'yes'   }     NaN      21
    
    >> [G,gen]=findgroups(T(:,[3])); %find groups based on column C
    
    >> gen
gen =
  2×1 table
       C   
    _______
    {'no' }
    {'yes'}
    
    %% find mean of columns E and D based on Groups in column C ('no' or 'yes')
    >> MeanE=splitapply(@mean,T(:,5),G)
MeanE =
          61.4444444444444
          44.4181818181818
    % works 
          
    >> MeanD=splitapply(@mean,T(:,4),G) 
MeanD =
          61.4444444444444
                       NaN
    % Does not work for 'yes' group due to NaN values in column 4
    
    >> MeanD=splitapply(@mean,rmmissing(T(:,4)),G);
    %fails because rmmissing(T(:,4)) and G are differnet sizes
    

I would like to be able to ignore the NaN values in column D when calcualting the mean. I can't seem to make it work with ismissing or rmmissing, and feel like this should be simpler than splitting them manually.

댓글 수: 2
없음 표시 없음 숨기기

Johan 2021년 10월 27일

MATLAB Online에서 열기

I'm not used to working with table but maybe instead of using @mean you can define a function and use this in your splitapply call.

mymean = @(x) mean(x,'omitnan');
splitapply(mymean,T(:,4),G) 

Marcus Glover 2021년 10월 27일

Thank you, omitnan works for me.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Follow Question

Answer 1

Ive J 2021년 10월 27일

MATLAB Online에서 열기

0 개 추천

As Johan also suggested you may use omitnant flag. Also, consider using groupsummary and groupfilter :

m = groupsummary(T, 'C', @(x)mean(x, 'omitnan'), {'D', 'E'})
        C         GroupCount    fun1_D    fun1_E
    __________    __________    ______    ______
    {0×0 char}         1           563       563
    {'no'    }         9        61.444    61.444
    {'yes'   }        11          50.4    44.418

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

Marcus Glover 2021년 10월 27일

편집: Marcus Glover 2021년 10월 27일

Thanks! My only reluctance to use groupsummary is that it seems (to me anyway...) that I lose the group indexing so I have to use findgroups anyway to work with the individual group members- ie if I wanted to make a scatterplot of B vs E for only 'no' group members or something.

댓글을 달려면 로그인하십시오.

Ignore missing data in a table group

댓글 수: 2
없음 표시 없음 숨기기

채택된 답변

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

릴리스

태그

Community Treasure Hunt

Ignore missing data in a table group

댓글 수: 2 없음 표시 없음 숨기기

채택된 답변

댓글 수: 1 이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

릴리스

태그

참고 항목

Community Treasure Hunt

댓글 수: 2
없음 표시 없음 숨기기

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기