Partitioning data for Time Series TCN model Training, Validation, and Testing

Question

Isabelle Museck, 5 June 2024

https://kr.mathworks.com/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing

Answered: Krishna, 6 June 2024

Hello there, I am trying to build a TCN model to predict a continuous variable. I have time series data in which I am using 3 input features (accelerometer measurements in the x, y, and z directions) to estimate/predict a continuous variable. I have accelerometer data from 10 different trials stored in a 10x1 cell array, and each cell holds the three accelerometer measurements over time for that trial in a 500x3 table. The target continuous variable I am trying to predict is similarly stored in a 10x1 cell array, with each cell containing a 500x1 table, named "Taget", which holds the true value of the predicted variable over time. If I am trying to build a TCN model with this data, what is the best way to partition the data for training, testing (10%), and validation (10%)? I think I need to use the tspartition function but am not sure how to use it for this type of data. Do I need to combine the data from all 10 trials into one large table and then partition? Or should I partition each trial separately, train the model on a single trial, then retrain the model on the next trial, and so on? Any help would be greatly appreciated!

Answer 1

Krishna, 6 June 2024

https://kr.mathworks.com/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing#answer_1468256

Hello Isabelle,

Based on your description, I think you're seeking the correct method for dividing your time series data into training, testing, and validation sets. I can share an effective approach that I have personally utilized.

You've mentioned having 10 observations, each comprising both input and output data. Specifically, the input data is a time series sequence of 500 steps with 3 features, and the output data is a sequence of 500 steps for a single variable. Your data can therefore be organized as a 1x10 cell array of sequences, where each sequence is a 500x4 array holding the 3 inputs and the 1 output.
To partition this data into training, testing, and validation sets, you can use the cvpartition function, partitioning over the 10 sequences rather than over individual time steps. Note that a holdout partition from cvpartition produces only two sets at a time, so you need to use it twice. First, divide the data into a training set and a combined testing/validation set. Then split the latter into separate testing and validation sets. After this, trainData would contain 8 sequences (80 percent), and the validation and test sets would contain 1 sequence each (10 percent each).
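The two-stage split described above can be sketched in a language-neutral way. The snippet below is plain Python, not MATLAB, and it does not call cvpartition; it only illustrates the index logic of holding out 2 of the 10 trials and then dividing those 2 into validation and test. The helper name `holdout_split` and the variable names are hypothetical, chosen for this illustration.

```python
import random


def holdout_split(indices, n_holdout, rng):
    """Randomly hold out n_holdout items; return (kept, held_out)."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return sorted(shuffled[n_holdout:]), sorted(shuffled[:n_holdout])


rng = random.Random(0)       # fixed seed so the split is repeatable
trial_ids = list(range(10))  # one index per trial, not per time step

# Stage 1: keep 8 trials for training, hold out 2.
train_ids, held_out_ids = holdout_split(trial_ids, 2, rng)

# Stage 2: divide the 2 held-out trials into 1 validation and 1 test trial.
val_ids, test_ids = holdout_split(held_out_ids, 1, rng)

print(len(train_ids), len(val_ids), len(test_ids))  # 8 1 1
```

Splitting by trial index keeps every 500-step sequence intact, so no time steps from a validation or test trial leak into training.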
Once the data is partitioned, split the training data into Xtrain, which holds the 500x3 input sequences, and Ytrain, which holds the 500x1 output sequences.
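As a rough illustration of that last step, the sketch below separates each 500x4 sequence into a 500x3 input part and a 500x1 output part. It is written in plain Python with nested lists rather than MATLAB cell arrays, the training data is a made-up placeholder, and `split_inputs_targets`, `x_train`, and `y_train` are hypothetical names standing in for the Xtrain and Ytrain mentioned above. It assumes the output is stored in the last column.

```python
# Placeholder data (assumption): 8 training trials, each with 500 rows of
# [acc_x, acc_y, acc_z, target].
n_trials, n_steps = 8, 500
train_data = [
    [[t * 1.0, t * 2.0, t * 3.0, t * 4.0] for t in range(n_steps)]
    for _ in range(n_trials)
]


def split_inputs_targets(sequence, n_features=3):
    """Split one sequence into its input columns and its output column."""
    inputs = [row[:n_features] for row in sequence]
    targets = [row[n_features:] for row in sequence]
    return inputs, targets


x_train, y_train = [], []
for sequence in train_data:
    inputs, targets = split_inputs_targets(sequence)
    x_train.append(inputs)
    y_train.append(targets)

print(len(x_train), len(x_train[0]), len(x_train[0][0]))  # 8 500 3
print(len(y_train), len(y_train[0]), len(y_train[0][0]))  # 8 500 1
```

The same separation would be applied to the validation and test sequences so that all three sets share one layout.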

Please go through the following documentation to learn more:

https://in.mathworks.com/help/stats/cvpartition.html

Hope this helps.
