Cannot use 'histogram' to compute entropy

z8080 on 9 Sep 2021
Commented: Walter Roberson on 10 Sep 2021
I'd like to compute the entropy of various vectors. I was going to use something like:
X = randn(1,100);
h1 = histogram(X, 'Normalization', 'Probability');
probabilities = h1.Values;
entropy = -sum(probabilities .* log2(probabilities ))
The second command however gives the error:
Undefined function 'c:\Program Files\MATLAB\R2019b\toolbox\matlab\specgraph\histogram.m' for input arguments of type 'double'.
But surely that's exactly what the standard Matlab function 'histogram' expects?! Doing a
which histogram
indeed returns
C:\Program Files\MATLAB\R2019b\toolbox\matlab\specgraph\histogram.m
which is the newest file (by modified date) of several with that name that (sadly) exist in my MATLAB folder. I believe this should be the standard MATLAB function 'histogram'.
If on the other hand in the above example I use 'hist' instead of 'histogram', I get the scalar value for entropy that I expect. However, I know 'hist' is not recommended, not least because with it one cannot specify the normalization type.
So, my question is: is using 'hist' for computing probabilities ok, or should I try something else to be able to use 'histogram' instead?
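For the "several files of that name" situation described above, a quick check is to list every copy of histogram that MATLAB can see. This is only a diagnostic sketch: the thread as shown does not confirm the cause of the error, and the path-cache refresh at the end is an assumption about one possible cause, not a known fix.

```matlab
% List every histogram MATLAB can find; entries are expected to appear in
% path precedence order, so extra copies ahead of the toolbox one stand out.
paths = which('histogram', '-all');
disp(paths)

% Assumption: if the toolbox path cache is stale, refreshing it may help.
rehash toolboxcache
```

If an unexpected copy appears ahead of the toolbox\matlab\specgraph one, removing its folder from the path (or renaming the file) is the usual next step.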
13 Comments
z8080 on 10 Sep 2021
Edited: z8080 on 10 Sep 2021
Thanks a lot for this excellent answer and derivation. To answer my own question, then: I guess it is acceptable to manually remove all bins with a count of 0, so that the entropy is computed from the non-zero bins only. This is in fact what you had told me from the very beginning :)
Thanks again!
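A minimal sketch of the zero-bin removal described above, assuming histcounts is available (it returns the same bin values as histogram without drawing a figure). The fixed seed and the variable names p and H are illustrative choices, not taken from the thread.

```matlab
% Shannon entropy (in bits) of a histogram, skipping empty bins.
rng(0);                        % fixed seed, for repeatability only
X = randn(1, 100);

% Bin probabilities; the number of bins is chosen automatically here.
p = histcounts(X, 'Normalization', 'probability');

% An empty bin would contribute 0*log2(0) = NaN, so drop it
% (this follows the usual convention that 0*log(0) counts as 0).
p = p(p > 0);

% Named H rather than entropy to avoid shadowing the
% Image Processing Toolbox function of that name.
H = -sum(p .* log2(p));
```

Because the removed bins have probability 0, the remaining values still sum to 1 and no renormalization is needed.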
Walter Roberson on 10 Sep 2021
Depending on your knowledge of the distribution, it might make sense to ask for the counts and take max(1,counts) to substitute a nominal hit for each empty bin, and then calculate the probabilities from that, as adjusted_counts ./ sum(adjusted_counts) .
The fewer samples you have, the more that distorts the probabilities; the more samples you have, the less likely you are to need it.
But I do recommend figuring out the number of bins yourself somehow, or else you are going to continue to be at the mercy of histogram's undocumented method of selecting the number of bins.
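The two suggestions above (a nominal hit per empty bin, and choosing the bins yourself) might be combined roughly as follows. This is a sketch only: the edge range of -4 to 4 and the 16 bins are hypothetical values picked for standard-normal data, and the right choice depends on what you know about your distribution.

```matlab
rng(0);                                  % fixed seed, for repeatability only
X = randn(1, 100);

% Hypothetical explicit binning: 16 equal-width bins on [-4, 4].
% Samples falling outside the edges are not counted.
edges = linspace(-4, 4, 17);
counts = histcounts(X, edges);           % raw counts per bin

adjusted_counts = max(1, counts);        % one nominal hit for each empty bin
p = adjusted_counts ./ sum(adjusted_counts);

H = -sum(p .* log2(p));                  % entropy in bits; no zero bins remain
```

With fixed edges, entropies of different vectors are computed over the same bins and so are directly comparable, which the automatic bin selection does not guarantee.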

Release

R2019b
