Matrix element counting by rows, histograms, etc.

Michael on 11 May 2012
Commented: Mohsin Shah on 15 May 2019
Hi -- this is kind of tricky. I am using a 2D array (matrix) to store some information. The first n-1 columns hold indexes of variables I'm dealing with -- in other words think of them as names not numbers. The last column contains a count for the number of times I've seen the variables in the preceding n-1 columns. There are many rows with such counts, all with the same number of variables. So if I'm counting single variables it could look like this:
1 3
2 2
3 1
Meaning I've seen variable 1 three times, variable 2 twice, and variable 3 once. If I'm counting two variable combos, it could look like this:
1 2 4
1 3 3
2 4 3
3 4 1
which means I've seen variables 1 and 2 together four times, I've seen variables 1 and 3 together three times, and I've also seen variables 2 and 4 together three times, and finally I've seen variables 3 and 4 together once.
Similar structures can exist for 3, 4, 5 variables, maybe more.
What I need help with is turning these structures into a single vector of variables repeated for every time they've been counted. So for that first example with single variables the vector would contain:
1 1 1 2 2 3
For that second example the vector would contain:
1 1 1 1 2 2 2 2 1 1 1 3 3 3 2 2 2 4 4 4 3 4
These vectors will allow me to do some histogram type analysis, but I'm not sure how to replicate these variables into the new vector based on the counts in that last column. Any help would be appreciated.
PS - for n variables followed by a count column, the ACTUAL data structure I'm using has n extra columns inserted between the variables and the counts (i.e. 2n+1 columns). The information in those columns isn't relevant to the question, but it implies the following. For n=1 variable, the structure has three columns: the first for the variables, the second for the extra information not relevant to this question, and the third for the count. For n=2 variables, the structure has five columns: the first two for the variables, the second two for the extra information not relevant to this question, and the fifth for the count. For n=3 variables it has seven columns -- 3, 3 and 1...
Thanks in advance!
Mike
  1 Comment
Mohsin Shah on 15 May 2019
Quite late, but may I ask how you did the counting in the second example, i.e. counting the occurrences of rows? I need to apply this in my work.
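The thread does not say how the counts were originally built, so the following is only a sketch of one common way to count repeated rows, using unique with the 'rows' option together with accumarray. The matrix obs is hypothetical raw data (one observed combination per row) made up for illustration.

```matlab
% Hypothetical raw observations: each row is one observed variable combo.
obs = [1 2
       1 3
       1 2
       2 4
       1 2];

% u holds the distinct rows (sorted); idx maps each row of obs to a row of u.
[u, ~, idx] = unique(obs, 'rows');

% Add 1 for every occurrence of each distinct row.
counts = accumarray(idx, 1);

% Same layout as in the question: variable columns, then a count column.
x = [u, counts];
```

For this made-up input, x has one row per distinct combination with its count in the last column.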


Accepted Answer

Daniel Shub on 11 May 2012
This is not the fastest approach, and you could preallocate z if you wanted to. It also ignores your PS, but getting rid of the extra columns shouldn't be hard. Here x is your matrix:
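z = [];                                   % output vector, grown one row of x at a time
for ii = 1:size(x, 1)
    % Stack the variable columns of row ii once per count (last column).
    y = repmat(x(ii, 1:(end-1)), x(ii, end), 1);
    y = reshape(y, 1, numel(y));          % flatten column-major into a row vector
    z = [z, y];                           % append to the result
end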
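As a rough sketch of the preallocation mentioned above (not part of the original answer): the final length of z is the number of variable columns times the sum of the counts, so it can be allocated once up front. The names nVars, counts, and pos are introduced here for illustration, and x is set to the second example from the question. The loop still indexes the same x(ii, 1:(end-1)) columns, just through nVars.

```matlab
% Second example from the question: two variable columns, then a count column.
x = [1 2 4; 1 3 3; 2 4 3; 3 4 1];

nVars  = size(x, 2) - 1;                 % same columns as x(ii, 1:(end-1))
counts = x(:, end);                      % last column holds the counts
z      = zeros(1, nVars * sum(counts));  % preallocate the full output

pos = 1;                                 % next free position in z
for ii = 1:size(x, 1)
    y   = repmat(x(ii, 1:nVars), counts(ii), 1);  % counts(ii)-by-nVars block
    len = numel(y);
    z(pos:pos + len - 1) = y(:).';       % column-major flatten, as reshape does
    pos = pos + len;
end
```

This should produce the same vector as the growing-array loop; whether it is noticeably faster depends on the size of the data.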
To deal with your PS, instead of
x(ii, 1:(end-1))
index only up to the last variable column. With the 2n+1 layout you describe, that would be x(ii, 1:n) with n = (size(x, 2) - 1) / 2.
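For what it's worth, here is a loop-free sketch that also handles the 2n+1 layout from the PS. It relies on repelem, which was added to MATLAB in R2015a, so it would not have been available when this answer was written. The zeros in the two middle columns are placeholders for the "extra information" columns, whose real contents are not given in the thread.

```matlab
% Question's second example in the 2n+1 layout from the PS (n = 2).
% The two middle columns are placeholder zeros standing in for the
% "extra information" columns; their real contents are not given.
x = [1 2 0 0 4
     1 3 0 0 3
     2 4 0 0 3
     3 4 0 0 1];

n      = (size(x, 2) - 1) / 2;   % number of variable columns
vars   = x(:, 1:n);              % stop where the variable data stops
counts = x(:, end);              % counts are still in the last column

v    = vars.';                   % transpose so v(:) walks x row by row
reps = repelem(counts, n);       % each row's count, once per variable
z    = repelem(v(:), reps).';    % repeat every variable by its row's count
```

The result is in the same order as the loop version: for each row, the first variable repeated by the count, then the second, and so on.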
  1 Comment
Michael on 11 May 2012
That'll do the trick! Thank you sir.


More Answers (1)

Sean de Wolski on 11 May 2012
Why would you want to do this? You have all of the information you need in a nice, condensed, easy-to-understand package...
  2 Comments
Daniel Shub on 11 May 2012
I am guessing it is for an anovan or some other statistical analysis.
Michael on 11 May 2012
I need to create some text-based reports on these nice, condensed packages :)
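If the end goal is a per-variable tally for a report, one possible shortcut (a sketch only, not something proposed in the thread) is to sum the counts straight from the condensed matrix with accumarray, skipping the expanded vector. This assumes the variable indexes are positive integers, and the report format below is made up for illustration.

```matlab
% Second example from the question: two variable columns, then a count column.
x = [1 2 4; 1 3 3; 2 4 3; 3 4 1];

vars   = x(:, 1:end-1);          % variable indexes (assumed positive integers)
counts = x(:, end);              % how often each row's combo was seen
n      = size(vars, 2);

% vars(:) stacks the variable columns; repeat counts once per column so
% every variable entry is paired with the count of its own row.
totals = accumarray(vars(:), repmat(counts, n, 1));

% Illustrative text report: one line per variable index.
for k = 1:numel(totals)
    fprintf('variable %d: seen %d times\n', k, totals(k));
end
```

For this example the totals match what a histogram of the expanded vector would give: variables 1 and 2 appear 7 times each, variables 3 and 4 appear 4 times each.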
