finding all possible polynomial combinations of n variables
조회 수: 26 (최근 30일)
이전 댓글 표시
I am trying to create additional polynomial features (with polynomial degree up to p where p can be anywhere between 1 to 10) to my current dataset of n variables/features.
For example, I have three variables/features: x1, x2, and x3
If I want to expand the dataset to include polynomial degree up to 2,
then I would have: x1, x1^2, x2, x1x2, x2^2, x3, x1x3, x2x3, x3^2
Once I have all possible combination of exponents the variables can take, I can easily do element-wise exponentiation.
Solution I came up with for the toy example above is the following:
% 3 variables
% 2 deg max polynomial
[var1_exp, var2_exp, var3_exp] = ndgrid(0:2);
% all exponent combinations
exp_comb = [var1_exp(:), var2_exp(:), var3_exp(:)];
% logic to remove combinations that aren't valid
exp_comb = exp_comb(sum(exp_comb,2)<=2 & sum(exp_comb,2)>0, :);
The solution indeed matches what I had above:
>> exp_comb
exp_comb =
1 0 0
2 0 0
0 1 0
1 1 0
0 2 0
0 0 1
1 0 1
0 1 1
0 0 2
However, I am not sure how to generalize this easily to large n cases.
Any help would be greatly appreciated.
UPDATES:
Following solutions have been posted so far with its own pros and cons
Solution 1 - works but doesn't scale well to both n and p in terms of memory (n=30 and p=2 requires 1534009.5GB memory)
n = 3; % Number of variables/features
p = 2; % Maximum polynomial degree
vars = cell(n,1);
[vars{1:n}] = ndgrid(0:p);
exp_comb = reshape(cat(n+1, vars{:}), [], n);
exp_comb = exp_comb(sum(exp_comb,2)<=p & sum(exp_comb,2)>0, :);
Solution 2 - works without demanding tremendous memory, but doesn't generalize to p other than 2
n = 3; % Number of variables/features
p = 2; % Maximum polynomial degree
combs=nchoosek(1:n,p);
m=size(combs,1);
S=sparse([1:m;1:m],combs.',1,m,n);
exp_comb = [S;speye(n);2*speye(n)];
댓글 수: 3
Chukwunomso Agunwamba
2022년 5월 15일
편집: Chukwunomso Agunwamba
2022년 5월 15일
More generally, let D = a vector listing out the maximum power of each variable. Then the output solution matrix will have a size of prod(D+1) by length(D). Concerning the memory issue, [I am refering to "(n=30 and p=2 requires 1534009.5GB memory)"], maybe, a solution algorithm can be restricted to produce a vector at a time. Then, submatrices, that are concatenations of those vectors, can be saved in file(s), or in a database, with the RAM kept free from being clogged up. If you output a column at a time, the column grows quickly with n after fixing p, because of prod(D+1). [In your special case, that column length is (p+1)^n, with D = p*ones(size(1,n))]. So, I think that it is better to output a row at a time, since, I believe you might not be using n>1000, for example. That is, the row length is simply the n specified, which is nicely much smaller than prod(D+1) as n or as p increases. Then, a selection logic can be applied on the data stored in the file(s)/database while saving, and/or while reading, without loading the entire matrix into RAM.
채택된 답변
Matt J
2019년 10월 22일
편집: Matt J
2019년 10월 22일
This should be better:
n=30;
combs=nchoosek(1:n,2);
m=size(combs,1);
S=sparse([1:m;1:m],combs.',1,m,n);
exp_comb = [S;speye(n);2*speye(n)];
댓글 수: 5
Bashar
2020년 2월 9일
편집: Bashar
2020년 2월 9일
Hi Louis and Matt,
I am considering the same problem of expanding my feature vector for classification and this solution seems ideal to my problem. Can you please advise on how to use the resultant exp_comb (from the method that utilises diophantine function) to expand the feature vector with a p order polyinomal expansion, e.g. X which is mxn (n is number of features per oberservation/data-sample; m = number of observations)?
If I use p = 2 for n =3, I get the following 12 entries (hence I am not sure how to use the resultant exp_comb to expand my feature set e.g. X is 10x3, to 10x9):
exp_comb 2
(1,1) 1
(4,1) 2
(5,1) 1
(7,1) 1
(2,2) 1
(5,2) 1
(6,2) 2
(8,2) 1
(3,3) 1
(7,3) 1
(8,3) 1
(9,3) 2
참고 항목
카테고리
Help Center 및 File Exchange에서 Polynomials에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!