Splitting up large arrays based on datetimes without using loops

Hi, I have a large dataset of 10-minute samples and an accompanying datetime array spanning many years, and I want to apply certain functions to each month. Is there a way to operate on each individual month without using nested loops? I want to calculate the skewness and kurtosis for every month and every column in the dataset, then store the results so I can run control charts on them and update them at a later date. Thanks in advance!

3 Comments

What's wrong with nested loops? Without knowing how the data are represented in your "dataset", it is hard to suggest code for processing it. I'd expect findgroups and splitapply to solve this problem without explicit loops.
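A minimal sketch of the findgroups/splitapply idea, for illustration only. The data here is synthetic, and the variable names (t, X, centered, kurt) are assumptions rather than anything from the original dataset. As far as I know, findgroups and splitapply need R2015b or later. The kurtosis is written out from its moment definition so the sketch does not depend on the Statistics Toolbox.

```matlab
% Synthetic data (assumption): 10-minute samples over two years, three columns.
t = (datetime(2015,1,1):minutes(10):datetime(2016,12,31,23,50,0))';
X = randn(numel(t), 3);

% One group per calendar month. Grouping on year AND month keeps
% January 2015 separate from January 2016.
[g, yr, mo] = findgroups(year(t), month(t));

% Column-wise moment-based kurtosis; bsxfun is used so this also runs
% on releases without implicit expansion.
centered = @(x) bsxfun(@minus, x, mean(x, 1));
kurt = @(x) mean(centered(x).^4, 1) ./ mean(centered(x).^2, 1).^2;

% Each group is passed as a rows-by-3 block and returns a 1-by-3 row,
% so the result has one row per month and one column per data column.
monthlyKurt = splitapply(kurt, X, g);
```

Here yr and mo label the rows of monthlyKurt, so they can be stored alongside the results for later updates.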
Hi Jan. Primarily, nested loops are slow and cumbersome to code. I tried changing the datetime format to "yyyyMM" and using accumarray, but it only returns zeros! See the code snippet below:
d = temp_struct.timestamps.(turbines{i_wec});
d.Format = 'yyyyMM';
temp_subs = datenum(d);
temp_vals = temp_struct.(atrib_name{i_atrib}).(turbines{i_wec})';
test = accumarray(temp_subs, temp_vals,[], @kurtosis);
I think this should work, but I'm not sure why it doesn't. Note that temp_vals is a 52704x1 array of doubles. In this instance "test" is a 736696x1 array of doubles, all zero! I'm not sure why it's so much bigger either.
Apparently you already have a kurtosis function. One way to debug calls to ACCUMARRAY (assuming you have already checked that the indices are fine) is to output a cell array of grouped values:
groups = accumarray(temp_subs, temp_vals,[], @(x){x});
so you can check what is passed to your aggregation function. If all groups are empty, there is an issue with your IND and/or VAL inputs. If the groups make sense, the issue is with your aggregation function.
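A likely explanation for the oversized, mostly zero result: accumarray sizes its output by the largest subscript and fills every unused slot with zero, and serial date numbers around the start of 2017 are about 736,000, which matches the 736696 length. Setting d.Format only changes how the datetimes are displayed, so datenum(d) still returns full serial date numbers. The sketch below builds compact group indices with unique instead. It uses synthetic data, and the names key, months, skew and kurt are assumptions for illustration. The statistics are written from their moment definitions (which I believe match the toolbox defaults) so it runs without the Statistics Toolbox.

```matlab
% Synthetic stand-in for one turbine (assumption): 10-minute samples over 2016.
d = (datetime(2016,1,1):minutes(10):datetime(2016,12,31,23,50,0))';
vals = randn(numel(d), 1);

% Build a numeric year-month key, e.g. 201601, then map it to 1..nMonths.
key = year(d)*100 + month(d);
[months, ~, subs] = unique(key);

% Moment-based skewness and kurtosis for a column vector.
skew = @(x) mean((x - mean(x)).^3) / mean((x - mean(x)).^2)^1.5;
kurt = @(x) mean((x - mean(x)).^4) / mean((x - mean(x)).^2)^2;

% One output element per month, with no empty zero-filled slots.
monthlySkew = accumarray(subs, vals, [], skew);
monthlyKurt = accumarray(subs, vals, [], kurt);
result = table(months, monthlySkew, monthlyKurt);
```

Because subs only runs from 1 to the number of distinct months, the output has one row per month, and the months vector records which month each row belongs to.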


Accepted Answer

Peter Perkins on 21 Sep 2017
Here's how you would do this using a table and varfun:
>> t = table(datetime(2017,1,randi(365,20,1)),randn(20,1),'VariableNames',{'Date' 'Value'})
t =
  20×2 table
       Date          Value
    ___________    ________
    05-Mar-2017      2.1778
    23-May-2017      1.1385
    31-Oct-2017     -2.4969
    21-Oct-2017     0.44133
    23-Jan-2017     -1.3981
    [snip]
>> t.Month = month(t.Date)
t =
  20×3 table
       Date          Value     Month
    ___________    ________    _____
    05-Mar-2017      2.1778      3
    23-May-2017      1.1385      5
    31-Oct-2017     -2.4969     10
    21-Oct-2017     0.44133     10
    23-Jan-2017     -1.3981      1
    [snip]
>> varfun(@mean,t,'GroupingVariable','Month','InputVariables','Value')
ans =
  10×3 table
    Month    GroupCount    mean_Value
    _____    __________    __________
      1          2           -0.3667
      2          1           0.32321
      3          3           0.41779
      4          1          -0.48094
      5          4          -0.12632
      6          3           0.97795
      7          1            0.1644
      8          2           0.65163
     10          2           -1.0278
     12          1          0.085189
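For data spanning several years, grouping on month alone would pool every January together. A hedged variation on the example above, using synthetic data and hypothetical names (A, B, YearMonth, kurt), groups on a combined year-month key and applies a moment-based kurtosis to more than one column:

```matlab
% Synthetic table (assumption): two years of daily values in two columns.
Date = (datetime(2016,1,1):caldays(1):datetime(2017,12,31))';
t = table(Date, randn(numel(Date),1), randn(numel(Date),1), ...
    'VariableNames', {'Date' 'A' 'B'});

% Combined key, e.g. 201601, so the same month in different years
% lands in different groups.
t.YearMonth = year(t.Date)*100 + month(t.Date);

% Moment-based kurtosis, written out to avoid a toolbox dependency.
kurt = @(x) mean((x - mean(x)).^4) / mean((x - mean(x)).^2)^2;

% One row per year-month, one kurtosis column per input variable.
monthly = varfun(kurt, t, 'GroupingVariables', 'YearMonth', ...
    'InputVariables', {'A' 'B'});
```

Swapping in a skewness handle gives the second statistic, and the resulting tables can be saved and appended to as new months arrive.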

More Answers (1)

Steven Lord on 20 Sep 2017

If you have your data stored in a timetable, use retime. Specify @skewness or @kurtosis as the aggregation method, assuming you have Statistics and Machine Learning Toolbox available. If you don't, you will need to write your own functions to compute those statistics and specify those as the aggregation method when you call retime.
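A rough sketch of the retime approach for readers on R2016b or later, where timetable and retime are available. The data is synthetic and the names (Time, A, B, skew, kurt) are assumptions. The two function handles are the hand-written kind of replacement mentioned above for when the toolbox is not installed.

```matlab
% Synthetic timetable (assumption): 10-minute samples over 2016, two columns.
Time = (datetime(2016,1,1):minutes(10):datetime(2016,12,31,23,50,0))';
tt = timetable(Time, randn(numel(Time),1), randn(numel(Time),1), ...
    'VariableNames', {'A' 'B'});

% Moment-based statistics; each receives one variable's values for one month.
skew = @(x) mean((x - mean(x)).^3) / mean((x - mean(x)).^2)^1.5;
kurt = @(x) mean((x - mean(x)).^4) / mean((x - mean(x)).^2)^2;

% Aggregate into monthly bins with a custom function handle.
monthlySkew = retime(tt, 'monthly', skew);
monthlyKurt = retime(tt, 'monthly', kurt);
```

Each result is a timetable with one row per month, stamped with the start of that month.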

1 Comment

@Steven I only have MATLAB 2015, so that solution won't work. :(



Asked: 20 Sep 2017
Answered: 21 Sep 2017
