Testing unlabeled data on a trained model

Dear Matlab community,
I need to know if there's a way to test the reliability of predictions made by classifying new (unlabeled) data using an already trained model.
This is what I did:
1) Create a dataset with labeled data, with 2 predictors and 3 response variables (training set);
2) Fit and validate a Multiclass Support Vector Machine classifier using the training set;
3) Use the obtained model to make predictions on a new dataset with unlabeled data (test set).
I would like to know which classification metrics, if any, can establish the reliability of this classification, given that the new data is unlabeled.
Thanks.

Comments: 4

The reliability of predictions made by a trained model is generally assessed using a labeled test set.
I suggest you split your labeled dataset into training, validation and test sets. The training set is used to fit the model, the validation set is used to tune the model's hyperparameters, and the test set, which the model never sees during fitting or tuning, gives you the performance or reliability of your final trained model.
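The three-way split described above can be sketched as follows. This is a minimal illustration in plain Python rather than MATLAB, and the 60/20/20 fractions, the `stratified_split` helper and the example labels are assumptions made for the example, not values from this thread. In MATLAB, `cvpartition` with the 'HoldOut' option serves a similar purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, valid, test) index lists that preserve class proportions.

    `fractions` is a hypothetical 60/20/20 split; the test set receives
    whatever remains after the training and validation shares are taken.
    """
    rng = random.Random(seed)

    # Group observation indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, valid, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize order within each class
        n_train = round(fractions[0] * len(indices))
        n_valid = round(fractions[1] * len(indices))
        train += indices[:n_train]
        valid += indices[n_train:n_train + n_valid]
        test += indices[n_train + n_valid:]
    return train, valid, test


# Hypothetical labeled dataset: 3 classes with 50 observations each.
labels = ["A"] * 50 + ["B"] * 50 + ["C"] * 50
train_idx, valid_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx), len(test_idx))
```

Splitting within each class keeps the class balance roughly the same in all three sets, which matters when some classes are rare.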
Amanda on 29 Oct 2020
I used 7-fold cross-validation together with the hyperparameter optimization option to train the model. Next, I used that model (which is already trained and validated) to make predictions on new unlabeled data. For these predicted values, I wanted to know:
  1. whether there is some kind of statistical validation to test the reliability of the predictions,
  2. or whether the reliability of the predictions is based only on the performance of the trained model.
Thanks for your reply! :)
If your labeled training data and the unlabeled test data are highly correlated (that is, they come from similar conditions and cover a similar range of predictor values), the best thing you can do is hold out a small partition of the labeled data as test data to get a quantitative measure of reliability. The high correlation should then ensure similar performance on your unlabeled test data.
Apart from this, I don't think there is any reliable way to measure the performance of your model on real data without ground truth.
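As a rough sketch of the quantitative measure mentioned above, the snippet below computes a confusion matrix, overall accuracy and per-class recall on a held-out labeled partition. It is written in plain Python for illustration; the `evaluate` helper and the label values are hypothetical, and the predictions stand in for the output of the trained model. In MATLAB, `confusionmat` produces the same kind of matrix.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return (classes, confusion, accuracy, recall) for a held-out set.

    confusion[i][j] counts observations whose true class is classes[i]
    and whose predicted class is classes[j].
    """
    classes = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    confusion = [[counts[(t, p)] for p in classes] for t in classes]

    # Accuracy: fraction of observations on the diagonal.
    correct = sum(confusion[i][i] for i in range(len(classes)))
    accuracy = correct / len(true_labels)

    # Recall per class: correct predictions divided by true members.
    recall = {}
    for i, cls in enumerate(classes):
        row_total = sum(confusion[i])
        recall[cls] = confusion[i][i] / row_total if row_total else float("nan")
    return classes, confusion, accuracy, recall


# Hypothetical held-out labels and the model's predictions for them.
y_true = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
y_pred = ["A", "A", "B", "B", "B", "B", "C", "C", "A", "C"]

classes, confusion, accuracy, recall = evaluate(y_true, y_pred)
print(classes)
print(confusion)
print(accuracy, recall)
```

These numbers only transfer to the unlabeled data to the extent that the held-out partition resembles it, which is the similarity assumption stated above.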
Amanda on 29 Oct 2020
Thank you so much for the answer! This is just what I wanted to know.

Answers (0)

Asked: 26 Oct 2020
Commented: 29 Oct 2020