selectFeatures

Version 2.2.0.0 (18.5 KB) by Elliot Layden
Improved sequential feature selection for linear or quadratic discriminant analysis.
Downloads: 118
Updated: 2018/4/25


MATLAB's sequentialfs.m provides a fast, but arguably sub-optimal, feature selection algorithm for linear or quadratic discriminant models. This submission provides a generally slower, but better optimized, forward selection algorithm. It sequentially selects the predictors/features that improve cross-validated classification accuracy, using a cross-validation method of the user's choosing. The function provides the same cross-validation options as sequentialfs.m (Holdout, KFold, Leaveout), plus an additional customizable option, 'sets' (see the help section within the function).

If two or more candidate features improve the model's classification accuracy to the same degree (i.e., a "tie"), the algorithm proceeds to the next "depth" of candidate features, separately for each of the tied features. It keeps descending until one of the tied features is determined to unambiguously yield the best accuracy in combination with the features added at greater depths. The user can specify a maximum depth for this "tie-breaker" search; by default, the depth is unlimited (in practice, usually not more than 3-4 levels are needed). If the specified maximum depth is reached while candidates are still tied, the algorithm greedily selects the tied feature that comes first in order of feature entry. If, at any point, no additional feature improves the model's classification accuracy, optimization stops. If a tie persists when optimization ends, the tied feature that comes first in order of feature entry is selected.
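The tie-breaking rule described above can be illustrated with a minimal Python sketch. This is not the submission's MATLAB code: the names `select_features`, `lookahead`, `toy_score`, and `TOY` are hypothetical, the `score` callable stands in for "cross-validated accuracy of a discriminant model on this feature subset", and the lookahead is a simplified greedy one that does not itself recurse into nested ties.

```python
from typing import Callable, List, Sequence, Tuple

Score = Callable[[Tuple[int, ...]], float]  # stand-in for cross-validated accuracy
TOL = 1e-12  # accuracies closer than this are treated as tied


def lookahead(selected: Tuple[int, ...], remaining: Sequence[int],
              score: Score, depth: int) -> float:
    """Best accuracy reached by greedily adding up to `depth` more features."""
    acc = score(selected)
    remaining = list(remaining)
    for _ in range(depth):
        if not remaining:
            break
        scored = [(score(selected + (f,)), f) for f in remaining]
        best = max(a for a, _ in scored)
        if best <= acc + TOL:  # no further improvement at this depth
            break
        pick = next(f for a, f in scored if abs(a - best) <= TOL)
        selected, acc = selected + (pick,), best
        remaining.remove(pick)
    return acc


def select_features(n_features: int, score: Score, max_depth: int = 3,
                    baseline: float = 0.0) -> Tuple[List[int], float]:
    """Forward selection with depth-limited tie-breaking (illustrative only)."""
    selected: Tuple[int, ...] = ()
    current = baseline
    remaining = list(range(n_features))
    while remaining:
        scored = [(score(selected + (f,)), f) for f in remaining]
        best = max(a for a, _ in scored)
        if best <= current + TOL:  # nothing improves accuracy: stop
            break
        tied = [f for a, f in scored if abs(a - best) <= TOL]
        depth = 1
        while len(tied) > 1 and depth <= max_depth:
            # Compare tied candidates by what they lead to at greater depth.
            ahead = [lookahead(selected + (f,),
                               [r for r in remaining if r != f], score, depth)
                     for f in tied]
            top = max(ahead)
            tied = [f for f, a in zip(tied, ahead) if abs(a - top) <= TOL]
            depth += 1
        chosen = tied[0]  # still tied: fall back to order of feature entry
        selected += (chosen,)
        remaining.remove(chosen)
        current = best
    return list(selected), current


# Made-up accuracies for three features; features 0 and 1 tie at the first step.
TOY = {
    frozenset({0}): 0.70, frozenset({1}): 0.70, frozenset({2}): 0.60,
    frozenset({0, 1}): 0.75, frozenset({0, 2}): 0.72, frozenset({1, 2}): 0.90,
    frozenset({0, 1, 2}): 0.90,
}


def toy_score(features: Tuple[int, ...]) -> float:
    return TOY[frozenset(features)]
```

With the made-up table, `select_features(3, toy_score)` breaks the first-step tie in favor of feature 1, because pairing it with feature 2 reaches a higher accuracy than anything feature 0 leads to at the next depth. Setting `max_depth=0` disables the lookahead, so the tie falls back to entry order and feature 0 is chosen first.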
--
Bootstrapping is now available to check each selected feature for significance and to generate confidence intervals for feature coefficients. Currently, this option can only be used for two-category classification problems. A specified number of bootstrapped samples (resamples with replacement) are generated, and a discriminant model is fitted to each using the selected features. The 95% confidence interval for each feature is taken as the 2.5th and 97.5th percentiles of the sorted bootstrap distribution of its coefficient. P-values are calculated for the two-tailed test that each feature's bootstrap distribution is significantly different from 0. Note: features are z-scored within each bootstrap sample so that coefficients are more comparable across features.
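As a rough illustration of the percentile interval and two-tailed test described above, here is a minimal Python sketch for a single feature and two classes. It is not the submission's code. The function names are hypothetical, the "coefficient" is a stand-in (the difference in class means of the z-scored feature rather than a fitted discriminant coefficient), and the p-value formula (twice the smaller tail fraction on either side of 0) is an assumption, since the description does not state how the p-value is computed.

```python
import random
import statistics


def zscore(values):
    """Standardize a list of numbers to mean 0 and sample SD 1."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]


def mean_difference(x, y):
    """Stand-in coefficient: class-1 mean minus class-0 mean of z-scored x."""
    z = zscore(x)
    ones = [v for v, c in zip(z, y) if c == 1]
    zeros = [v for v, c in zip(z, y) if c == 0]
    return statistics.fmean(ones) - statistics.fmean(zeros)


def bootstrap_ci(x, y, n_boot=1000, seed=0):
    """Return (lower, upper, p) from a percentile bootstrap of the coefficient."""
    rng = random.Random(seed)
    n = len(x)
    coefs = []
    while len(coefs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(ys)) < 2 or len(set(xs)) < 2:
            continue  # skip degenerate resamples (one class or constant feature)
        coefs.append(mean_difference(xs, ys))  # z-scoring happens per resample
    coefs.sort()
    lower = coefs[int(0.025 * (n_boot - 1))]  # 2.5th percentile
    upper = coefs[int(0.975 * (n_boot - 1))]  # 97.5th percentile
    frac_le = sum(c <= 0 for c in coefs) / n_boot
    frac_ge = sum(c >= 0 for c in coefs) / n_boot
    p = min(1.0, 2 * min(frac_le, frac_ge))  # assumed two-tailed formula
    return lower, upper, p
```

A feature whose interval excludes 0 would be judged significant at the 5% level under this scheme. Fixing `seed` makes the resampling reproducible.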

Cite As

Elliot Layden (2024). selectFeatures (https://www.mathworks.com/matlabcentral/fileexchange/65716-selectfeatures), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2017a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on Dimensionality Reduction and Feature Extraction in Help Center and MATLAB Answers

Version History (version: release notes)

2.2.0.0: Changed title
2.1.0.0: Fixed bootstrapping waitbar issue
2.0.0.0: Added a bootstrapping option to calculate 95% confidence intervals and p-values of selected features. The median or mean coefficient from each feature's bootstrap distribution could be taken as a more robust estimate of effect size.
1.3.0.0: Updated help info
1.2.0.0: Fixed history output
1.1.0.0: Corrected verbose output
1.0.0.0