# Training data and Training target in Neural Networks

MAT-Magic 4 Feb 2020
Commented: Mahesh Taparia 10 Feb 2020
Hi,
I have a signal in the form of a vector (1*25000). I want to split this signal into four parts, x_train, y_train, x_test and y_test (using a 70-30% training/testing split), in MATLAB. Can anyone help me split this signal vector into these four parts?
Thanks
MAT-Magic 5 Feb 2020
Thanks for the answer. I am confused because I only have a recorded respiratory signal in this 1*25000 array, which, as I understand it, is unlabeled data without any training target. But to train a neural network on training data, I need the corresponding training target, which I do not have right now.
Anyway, I did it in the following way. Can you please review the code and tell me whether I am on the right track?
signal = data(1:25000);
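m = numel(signal); % number of samples; size(signal,1) would return 1 for a 1*25000 row vector
P = 0.70; % fraction of samples used for training
idx = randperm(m); % random permutation of the sample indices
train_data = signal(idx(1:round(P*m))); %% 17500 samples
test_data = signal(idx(round(P*m)+1:end)); %% 7500 samples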
%% For training data:
colnr_1 = 2;
rownr_1 = 17500/2;
mat_1 = reshape(train_data, [rownr_1, colnr_1]);
x_train = mat_1(:,1);
y_train = mat_1(:,2);
%% For testing data:
colnr_2 = 2;
rownr_2 = 7500/2;
mat_2 = reshape(test_data, [rownr_2, colnr_2]);
x_test = mat_2(:,1);
y_test = mat_2(:,2);
Looking forward to your feedback. Please correct me if I am wrong anywhere. Thanks.
Please go through the URL below; it might be related to my problem.

Greg Heath 9 Feb 2020
You cannot make any intelligent decisions until you have examined a plot of the data!!!
(WRONG!!! Plotting the data first is the ultimate beginning decision!!!)
Hope this helps.
Greg
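
As a minimal sketch of the plotting step suggested above, the recorded vector can be drawn against its sample index. The recording itself is not available in this thread, so `signal` below is a synthetic stand-in (a noisy sine) and is purely an assumption; replace it with the real data.

```matlab
% Synthetic stand-in for the recorded respiratory signal (assumption:
% the real data would come from data(1:25000) as in the question).
rng(0);                                   % reproducible noise
signal = sin(2*pi*(1:25000)/100) + 0.1*randn(1, 25000);

% Plot amplitude against sample index; the sampling rate is not given
% in the thread, so no time axis is constructed here.
figure;
h = plot(signal);
xlabel('Sample index');
ylabel('Amplitude');
title('Recorded signal (synthetic stand-in)');
```

Looking at the plot before choosing a split or a model shows whether the signal is periodic, drifting, or noisy, which affects whether a random sample-wise split makes sense at all.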

Mahesh Taparia 7 Feb 2020
Hi
You have correctly divided the data using randperm. Since you do not have ground truth, you are taking the last 8750 samples of the training data as ground truth, as in the following code:
mat_1 = reshape(train_data, [rownr_1, colnr_1]);
x_train = mat_1(:,1);
y_train = mat_1(:,2);
which is incorrect. Select the correct ground truth.
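
For comparison, here is a rough sketch of what a 70-30 split could look like if one label per sample were available. The labels `y` below are invented purely for illustration (they are not part of the original data), and `x` is a synthetic stand-in for the signal. The point is only that the same permuted indices are applied to both inputs and targets, so each target stays paired with its input.

```matlab
% Synthetic stand-ins (assumptions, not the poster's data).
rng(0);                      % reproducible example
N = 25000;
x = randn(N, 1);             % stand-in for the recorded signal
y = double(x > 0);           % hypothetical labels, one per sample

P = 0.70;                    % training fraction
idx = randperm(N);           % shuffle the sample indices once
nTrain = round(P * N);       % 17500 training samples

trainIdx = idx(1:nTrain);
testIdx  = idx(nTrain+1:end);

% Index inputs and targets with the SAME indices so pairs stay aligned.
x_train = x(trainIdx);   y_train = y(trainIdx);
x_test  = x(testIdx);    y_test  = y(testIdx);
```

With real data, `y` would have to come from annotation or some other source of ground truth; it cannot be obtained by reshaping the signal itself.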
Mahesh Taparia 10 Feb 2020
Hi
You mentioned earlier that your dataset is unlabeled. y_train should hold the labels of x_train, so taking half of the data (which is signal amplitude) as y_train is not meaningful.
Supervised learning needs ground truth, so collect the labels. Otherwise, you can try an unsupervised learning approach such as clustering.
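
One possible way to try the clustering suggestion is sketched below: cut the signal into fixed-length windows, compute a few simple features per window, and cluster the windows with `kmeans`. The window length, the choice of features, and the number of clusters are all assumptions made for illustration, and `signal` is a synthetic stand-in. `kmeans` and `zscore` require the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the unlabeled recording (assumption).
rng(0);
signal = sin(2*pi*(1:25000)/100) + 0.1*randn(1, 25000);

winLen = 250;                              % hypothetical window length
nWin   = floor(numel(signal) / winLen);    % number of whole windows

% Reshape into one window per row (reshape fills column-wise, so transpose).
segments = reshape(signal(1:nWin*winLen), winLen, nWin).';

% Simple per-window features: mean, standard deviation, peak-to-peak range.
features = [mean(segments, 2), ...
            std(segments, 0, 2), ...
            max(segments, [], 2) - min(segments, [], 2)];

k = 2;                                     % hypothetical number of clusters
clusterIdx = kmeans(zscore(features), k);  % one cluster label per window
```

The resulting `clusterIdx` groups windows by similarity only; whether the groups correspond to anything physiologically meaningful would still need to be checked against the plotted signal.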

