How can I make a combination/permutation of all possible values with a given subset of data?

Hello, I'm having trouble putting this into words so I'll give an example and hopefully someone can help.
To make it simple, let's say I have a 200-second time series (a 200x1 array) from each of 3 regions (A, B, C). Each region has different types, so for A there are A1, A2, A3, etc. The same applies to B and C. However, the number of types differs between regions: if A has A1-A5, B might have B1-B9, and so on.
I want to make an array combination of one of each region. So [A1 B1 C1], [A2 B1 C1], [A3 B1 C1], etc. So if I had 3 regions, I want all combinations of a 200 x 3 array possible using one type from each region.
My question is, currently, I have all the types and regions in one array (200 x 164). So A1:A5 B1:B11 C1:C20 D1:D5 etc. In total, I have 54 regions, so I would want to make all possible combinations of a 200 x 54 array.
Is there a way to do this with how my data is currently organized? Thanks for any suggestions.
Comments: 2
Stephen23 on 29 Jul 2024
Edited: Stephen23 on 29 Jul 2024
I doubt that your computer would be able to store all of those combinations in memory at once. Would it be sufficient to generate them one at a time?
The problem may anyway be intractable due to the total number of combinations required.
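To put rough numbers on the memory concern, here is a minimal sketch. It assumes double-precision storage (8 bytes per value) and borrows the combination count of 4.2797e+15 that is quoted further down in this thread; the variable names are illustrative only, and just the four group sizes stated in the question are used for the product.

```matlab
% Rough storage estimate; a sketch under the assumptions stated above.
nTime = 200;                       % samples per time series (from the question)
nRegionClass = 54;                 % regions per combination (from the question)
nCombinations = 4.2797e15;         % count quoted later in this thread

% The count is the product of the group sizes. Only four sizes are given
% in the question (A: 5, B: 11, C: 20, D: 5), so this covers just those.
knownSizes = [5 11 20 5];
nKnown = prod(knownSizes);         % combinations of the first four regions

bytesPerCombination = nTime * nRegionClass * 8;   % one 200 x 54 double array
totalBytes = bytesPerCombination * nCombinations;

fprintf('First four regions alone: %d combinations\n', nKnown);
fprintf('One combination: %d bytes\n', bytesPerCombination);
fprintf('All combinations: %.3g bytes\n', totalBytes);
```

Under these assumptions each combination takes 86400 bytes, so storing every one of them would need on the order of 3.7e20 bytes, far beyond any workstation.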
Umar on 29 Jul 2024

Hi @Andrew You,

To generate all possible combinations of a 200 x 54 array from your current 200 x 164 array, you can extract the regions you need and concatenate them to form the desired array. Here's a sample code snippet to achieve this:

% Sample data (replace this with your actual data)
data = rand(200, 164); % Assuming your data is stored in a variable named 'data'

% Extract regions A1:A5, B1:B11, C1:C20, D1:D5 (adjust the indices accordingly)
regions_A = data(:, 1:5);
regions_B = data(:, 6:16);
regions_C = data(:, 17:36);
regions_D = data(:, 37:41);

% Concatenate the extracted regions to form a 200 x 54 array
combined_array = [regions_A, regions_B, regions_C, regions_D]; % Add more regions as needed

% Display the size of the combined array
size(combined_array)

So, by extracting the regions of interest and concatenating them, you can create the desired 200 x 54 array. Make sure to adjust the indices and add more regions as necessary to cover all 54 regions in your data. Please see attached results of code snippet.

Please let me know if you have any further questions.
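
As an alternative to hard-coding each column range, the split can be derived from a vector of group sizes with mat2cell. This is only a sketch: groupSizes holds the four sizes mentioned in the question, the data is random placeholder data sized to match, and pick is a hypothetical choice of one type per region.

```matlab
% Split the columns into one cell per region, driven by the group sizes.
groupSizes = [5 11 20 5];                   % A, B, C, D (from the question)
data = rand(200, sum(groupSizes));          % placeholder data, 200 x 41

% mat2cell(A, rowDist, colDist): one row block, one column block per region
regions = mat2cell(data, size(data, 1), groupSizes);

% Build a single 200 x 4 combination by taking one type from each region.
pick = [2 7 1 5];                           % hypothetical: A2, B7, C1, D5
combo = zeros(size(data, 1), numel(groupSizes));
for r = 1:numel(groupSizes)
    combo(:, r) = regions{r}(:, pick(r));
end

size(combo)
```

Extending groupSizes to all 54 regions would give one 200 x 54 combination per choice of pick.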


Answers (1)

Tony on 29 Jul 2024
Below is example code for running through all combinations of a simpler problem of just 9 regions (A1:A3, B1:B2, C1, D1:D3). You can update the parameter settings for your full problem. dataCombinations stores all the combinations in a single variable, with the third index iterating over the combinations. But as Stephen23 remarked, storing all the combinations may require too much memory. So it would be more efficient to process each combination as it's generated.
% using smaller values for testing and demonstration
nTime = 1; % 200 in full problem
nRegionClass = 4; % 54 in full problem
nRegionClassSize = [3 2 1 3]; % to be updated for full problem
nRegionTotal = sum(nRegionClassSize);
data = rand(nTime, nRegionTotal); % dummy values for testing
nCombinations = prod(nRegionClassSize);
iRegionStart = cumsum([0 nRegionClassSize(1:end-1)]); % index of region just before each class
dataCombinations = zeros(nTime, nRegionClass, nCombinations);
combCounters = ones(1, nRegionClass);
for i = 1:nCombinations
    regionSubset = combCounters + iRegionStart;
    disp("Combination #" + num2str(i) + ": " + num2str(regionSubset));
    dataCombinations(:, :, i) = data(:, regionSubset); % extracts data for region combinations
    for j = 1:nRegionClass
        if combCounters(j) < nRegionClassSize(j)
            combCounters(j) = combCounters(j) + 1;
            break;
        else
            combCounters(j) = 1;
        end
    end
end
Combination #1: 1 4 6 7
Combination #2: 2 4 6 7
Combination #3: 3 4 6 7
Combination #4: 1 5 6 7
Combination #5: 2 5 6 7
Combination #6: 3 5 6 7
Combination #7: 1 4 6 8
Combination #8: 2 4 6 8
Combination #9: 3 4 6 8
Combination #10: 1 5 6 8
Combination #11: 2 5 6 8
Combination #12: 3 5 6 8
Combination #13: 1 4 6 9
Combination #14: 2 4 6 9
Combination #15: 3 4 6 9
Combination #16: 1 5 6 9
Combination #17: 2 5 6 9
Combination #18: 3 5 6 9
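If the combinations are to be processed one at a time rather than stored, one possible variation (a sketch, not part of the answer above) is to compute the k-th combination directly from its index with ind2sub. It visits the combinations in the same order as the counter loop, since the first subscript varies fastest. The parameter names are reused from the code above; allSubsets exists only so this small demo can be checked and would be dropped for the full problem.

```matlab
% Generate each combination on demand from its linear index.
nRegionClassSize = [3 2 1 3];                       % small test case from above
nRegionClass = numel(nRegionClassSize);
iRegionStart = cumsum([0 nRegionClassSize(1:end-1)]);
nCombinations = prod(nRegionClassSize);
data = rand(5, sum(nRegionClassSize));              % dummy values for testing

subs = cell(1, nRegionClass);                       % receives one subscript per class
allSubsets = zeros(nCombinations, nRegionClass);    % demo only; omit when large
for k = 1:nCombinations
    [subs{:}] = ind2sub(nRegionClassSize, k);       % k-th tuple of per-class counters
    regionSubset = [subs{:}] + iRegionStart;        % column indices into data
    block = data(:, regionSubset);                  % nTime x nRegionClass slice
    % ... process block here instead of storing it ...
    allSubsets(k, :) = regionSubset;
end
```

Because any index k can be evaluated independently, this form also allows processing only a chosen range or sample of indices.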
Comments: 3
Stephen23 on 31 Jul 2024
So you have 4.2797e+15 combinations... let's assume that your code can process them at a rate of one million combinations per second. Then you will only need to wait:
4.2797e+15 / (1e6 * 60*60*24*365)
ans = 135.7084
one hundred and thirty-six years for the results.
You might need to think about your approach a bit more, e.g. perhaps use dynamic programming.
Steven Lord on 31 Jul 2024
FYI you can perform this computation without the "magic numbers" 60, 24, and 365 using some duration functions.
numCombinations = 4.2797e15;
Y = years(seconds(numCombinations/1e6))
Y = 135.6183
This matches the computations with "magic numbers" if you use 365.2425 instead of 365.
4.2797e+15 / (1e6 * 60*60*24*365.2425)
ans = 135.6183
It doesn't make a lot of difference in this case, shaving off a mere 0.1 year, but IMO the intent of the years and seconds calls is a little clearer.
I agree with your last statement; brute-forcing this problem is probably not the best approach. Without knowing the problem the original poster wants to solve, offering specific suggestions for a different approach doesn't seem possible.
