agrupacion de datos de una tabla

Question

Camilo Cárdenas 2024년 7월 17일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2138257-agrupacion-de-datos-de-una-tabla

댓글: Camilo Cárdenas 2024년 9월 14일

Holas,

tengo una tabla 'tData' de tamano: 152709x14, que resume senal de varias variables de un instrumento de medicion (FTIR) y quisiera disminuir el ruido agrupando los valores de la senial en grupos de a 10 es decir obtener una nueva tabla 'tDataReduced' de un tamano 15270x14. la tabla tData cuenta con una columna de texto con el nombre de cada serie de medicion llamada 'MeasID', una columna de fecha y hora llamada 'Time' y 12 columnas con valores numericos.

Para este proposito, Estoy usando este codigo:

% (1) Crear un vector de índices para agrupar las filas en grupos de 10

indices = repelem(1:ceil(height(tData)/10), 10, 1);

indices = indices(1:height(tData));

% (2) Calcular el promedio de cada grupo de 10 filas

tDataReduced = grpstats(tData, indices, 'mean');

% (3) Eliminar la columna de índices generada por grpstats

tDataReduced(:, 'GroupCount') = [];

%%%%%%%%%%%%%%%%%%%5

Pero se tranca en el paso (2) indicando este error:

%*****************

Index exceeds the number of array elements. Index must not exceed 14.

Error in dsgrpstats (line 97)

[group,glabel,groupname] = mgrp2idx(a_data(groupvars),a_nobs);

Error in grpstats (line 144)

[varargout{1:nargout}] = dsgrpstats(x,group,whichstats,varargin{:});

%******************

Algunba idea que me pueda ayudar?

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Voss 2024년 7월 17일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2138257-agrupacion-de-datos-de-una-tabla#answer_1487087

편집: Voss 2024년 7월 17일

MATLAB Online에서 열기

% first, I construct a table similar to yours
N = 152709;
MeasID = cellstr(char(64+randi(26,N,4)));
Time = datetime(randi(30,N,6));
C = num2cell(rand(N,12),1);
tData = table(MeasID,Time,C{:})
tData = 152709x14 table
     MeasID             Time              Var3       Var4        Var5        Var6        Var7        Var8       Var9       Var10      Var11       Var12        Var13       Var14  
    ________    ____________________    ________    _______    _________    _______    ________    ________    _______    _______    _______    _________    _________    ________

    {'BAIX'}    14-Oct-0016 15:04:08     0.74662    0.66584      0.97243    0.87197     0.16557     0.47469     0.2506    0.60395    0.86371      0.39017      0.17175     0.61027
    {'KQVE'}    11-Mar-0013 23:01:14     0.89935    0.57533      0.97518    0.15691     0.81941    0.024631    0.28559    0.67214     0.9678      0.10555      0.23931     0.75911
    {'RTDP'}    28-Jun-0004 01:03:30     0.68013    0.83434      0.19336    0.39673     0.24662     0.66457    0.03651    0.77372    0.96092       0.9078       0.1652     0.45785
    {'EFAT'}    20-Mar-0014 02:26:13    0.053861     0.5456      0.27471    0.38658     0.71067     0.52239    0.88472    0.69064     0.4962      0.68826      0.27238     0.66459
    {'GYEY'}    24-Sep-0017 04:13:15     0.07359    0.74494      0.63175    0.76147     0.32292    0.081403    0.45976    0.33457    0.97788      0.36092       0.6186     0.93396
    {'BTAL'}    11-Nov-0023 16:26:29     0.87689    0.78009     0.036542    0.78547     0.63962     0.87256    0.94112    0.98786     0.1229      0.80758      0.72292     0.15229
    {'OBIW'}    17-Sep-0018 17:20:20     0.65785    0.75491       0.4657    0.62428     0.48063     0.82777     0.5978    0.52455    0.37039    0.0096658      0.88045     0.43789
    {'YQDL'}    25-Apr-0006 09:15:15     0.97658    0.85495     0.079595     0.2679     0.61404     0.23152    0.61481    0.17822    0.81922      0.65674      0.81724     0.11674
    {'MHVL'}    19-Jul-0022 23:13:20    0.032466    0.38461      0.20308    0.89468     0.10013     0.49899    0.91968    0.45019    0.32341      0.30873      0.15687      0.7115
    {'ARYA'}    01-Mar-0023 19:11:10     0.17789    0.91077      0.30519    0.52604    0.095028     0.36208     0.5468    0.75843    0.33732      0.37849      0.62022     0.16637
    {'IOLX'}    27-Feb-0030 15:12:06     0.41864    0.95644      0.13891    0.90773    0.014836     0.90455    0.65606    0.63802    0.17448       0.6043      0.63695     0.85284
    {'KHJC'}    27-Dec-0009 16:01:19     0.63461    0.27975      0.88259    0.85227     0.46034     0.96396    0.32063    0.88355    0.39404     0.042222      0.70572     0.91728
    {'TTTW'}    01-Apr-0024 19:20:08     0.96205    0.88056    0.0078747    0.58999     0.70797     0.12374    0.30285    0.22808    0.73668      0.76896    0.0025572    0.093245
    {'COOO'}    05-Oct-0001 14:19:10     0.33551    0.87829      0.22246    0.64423     0.59508      0.9902    0.93756    0.99927    0.32911       0.1336      0.59499     0.74522
    {'SFNO'}    04-Jul-0022 21:03:26     0.29299    0.54507      0.78344    0.95124     0.29257     0.72571    0.13398    0.41114    0.74129       0.9658      0.98474     0.31675
    {'ORFM'}    06-Jun-0009 22:02:08     0.51373    0.65547     0.099982    0.92044      0.6864     0.11025    0.18933    0.47621    0.93311      0.76635      0.42722    0.073502
% second, use grpstats to calculate the mean of each group, with grouping
% done according to indices, over all the numeric and datetime table variables
indices = repelem(1:ceil(height(tData)/10), 10, 1);
indices = indices(1:height(tData));
tData.group_idx = indices(:);
vars = tData.Properties.VariableNames;
vars(ismember(vars,{'MeasID','group_idx'})) = [];
tDataReduced = grpstats(tData,'group_idx','mean','DataVars',vars)
tDataReduced = 15271x15 table
          group_idx    GroupCount         mean_Time          mean_Var3    mean_Var4    mean_Var5    mean_Var6    mean_Var7    mean_Var8    mean_Var9    mean_Var10    mean_Var11    mean_Var12    mean_Var13    mean_Var14
          _________    __________    ____________________    _________    _________    _________    _________    _________    _________    _________    __________    __________    __________    __________    __________

    1         1            10        07-Feb-0016 05:55:29     0.51752      0.70514      0.41375       0.5672      0.41946      0.45606      0.55374      0.59743       0.62398       0.46139       0.46649       0.50105  
    2         2            10        13-Feb-0015 18:09:59     0.63774      0.70607       0.3648      0.77554      0.34877      0.55199      0.49512      0.64596       0.47208       0.52363       0.54743       0.42753  
    3         3            10        11-May-0013 06:47:28      0.4609      0.52306      0.40594       0.5143      0.51193      0.36895      0.65375      0.51091       0.44099       0.43514        0.5454       0.41105  
    4         4            10        05-Oct-0016 04:06:59     0.50106      0.48854      0.38952       0.5806      0.55353      0.52183      0.46437        0.449       0.35994       0.48053       0.50099       0.56911  
    5         5            10        02-Jul-0014 12:40:20     0.42546      0.47703      0.60594      0.48473      0.49456      0.52169      0.45005      0.50737       0.59695       0.49354       0.53942       0.51315  
    6         6            10        24-Jun-0015 05:20:08     0.54365      0.42454      0.50526       0.3918      0.54657      0.34156      0.50776      0.64644       0.41142       0.52169       0.37793       0.54506  
    7         7            10        26-May-0018 12:10:02       0.223      0.62658      0.57745      0.43137      0.64486      0.59084      0.27219      0.47831        0.4504       0.52788        0.4444       0.61095  
    8         8            10        26-Sep-0015 06:05:38     0.36107      0.45844      0.43259      0.44789      0.53122      0.51577      0.43466      0.44741       0.30144       0.48877       0.44208       0.53721  
    9         9            10        20-Jul-0014 02:30:31     0.52693      0.52128      0.62287      0.56108      0.55619      0.40734      0.56754      0.45903       0.51274       0.53433       0.40313       0.53245  
    10       10            10        11-Apr-0013 12:28:32     0.55496      0.37775      0.49104      0.64185        0.583      0.50644      0.59117      0.57325       0.58854       0.48762       0.53708       0.50528  
    11       11            10        14-Dec-0015 20:58:18     0.47345      0.56358      0.50469      0.42712      0.58192      0.44643      0.51888      0.55781       0.35132       0.53893       0.49807       0.61024  
    12       12            10        30-Oct-0018 05:14:32     0.67559      0.48449      0.42596      0.46722      0.57329      0.58917      0.52009       0.4463       0.46092       0.36197       0.61369       0.59372  
    13       13            10        16-Oct-0018 01:22:52      0.3809      0.47299      0.43709      0.60162      0.40392      0.67956       0.4347      0.37734       0.33699        0.4328       0.42047       0.29406  
    14       14            10        12-Apr-0019 05:11:28     0.44723      0.49497      0.48745      0.65644       0.5099      0.52495      0.58134      0.47569       0.41992       0.39552       0.56101       0.49381  
    15       15            10        18-Mar-0013 10:42:37     0.45406      0.57539      0.54544      0.45573       0.6396      0.59733      0.57942      0.46537       0.51966       0.34679       0.54309       0.41337  
    16       16            10        24-Mar-0022 02:02:55     0.48386      0.29842      0.38997      0.69434      0.35066      0.49995      0.55025      0.47631       0.65821       0.46389       0.43511       0.71093  

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

Camilo Cárdenas 2024년 7월 18일

이동: Voss 2024년 7월 18일

Again: Thank you very much!

Here with I close the question!

Regards, Camilo

Camilo Cárdenas 2024년 9월 14일

hello I would like to add an additional comment to my original question.

I have found that with the “movmean” function (I am using Matlab version 2021b) it does in one line and much faster exactly what I want. Maybe it will be of help to anyone who goes through this question.

댓글을 달려면 로그인하십시오.

Answer 2

Camilo Cárdenas 2024년 7월 19일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/2138257-agrupacion-de-datos-de-una-tabla#answer_1488046

MATLAB Online에서 열기

Hi Voss,

I realized that there is now an error if the variable “Time” is not excluded.

Additionally, taking into account that the variable “MeasID” groups the data by measurement, and if the amount of data of a measurement is not a multiple of the N value used to reduce the table (10 in this case), then data from two different measurements would be combined. That's why I share here the solution that I applied, sure there is a faster way, but now it is working.

%%  Reduce el numero de filas agrupandolas en grupos de 10
disp('.... comenzando')
N = 10;
indices = repelem(1:ceil(height(tData)/N), N, 1);
indices = indices(1:height(tData));
tData.grp_idx = indices(:);
vars = tData.Properties.VariableNames;
vars(ismember(vars,{'MeasID', 'Time', 'grp_idx'})) = [];  %segun el forum si se quita Time de aqui tambien se promediaria pero no funciona
tReduced = tData(1:0, :); %tabla vacia con mismas columnas de tData para almacenar los datos ya agrupados
IDs = unique(tData.MeasID);
for ii = 1 : length(IDs)
    ID_value = char(IDs(ii));
    tTemp = tData(strcmp(tData.MeasID, ID_value), :); %tabla del ii-esimo MeasID
    for kk = tTemp.grp_idx(1) : tTemp.grp_idx(end) % kk representa el numero de grupo que se esta trabajando
        tTempGrp = tTemp(tTemp.grp_idx == kk, :); % tabla temporal agrupada (de acuerdo del mismo indice grp_idx)
        tTempRed = grpstats(tTempGrp,'grp_idx','mean','DataVars',vars); %reduccion de la tabla temporal agrupada (promedio del grupo)
        tTempRed.Properties.RowNames = {};
        tTempRed.mean_MeasID = tTempGrp.MeasID(1);
        tTempRed.mean_Time   = mean(tTempGrp.Time);
        newColumnNames = replace(tTempRed.Properties.VariableNames, 'mean_', '');
        tTempRed.Properties.VariableNames = newColumnNames;
        tTempRed       = removevars(tTempRed, 'GroupCount');
        tTempRed       = movevars(tTempRed, {'MeasID', 'Time'}, 'Before', 1);
        tReduced       = vertcat(tReduced, tTempRed);
    end
end
disp('.... ###########')
%%

Saludos!

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

agrupacion de datos de una tabla

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

추가 답변 (1개)

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

agrupacion de datos de una tabla

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 4 이전 댓글 2개 표시이전 댓글 2개 숨기기

추가 답변 (1개)

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 4
이전 댓글 2개 표시이전 댓글 2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기