Regarding Mutual information calculation of two binary strings

I am trying to compute the mutual information between two binary strings. I have written the following code for it:
clear all;
S=textread('ecoli_profiles.txt');
data=unique(S,'rows');
for i=1:length(data)
for j=1:length(data)
x=data(i,1:70);
y=data(j,1:70);
P1=0; Q1=0; R1=0; S1=0;
P=sum(~x & ~y);
if P==0;
P1=1;
else
P1=P;
end
Q=sum(~x & y);
if Q==0;
Q1=1;
else
Q1=Q;
end
R=sum(x & ~y);
if R==0;
R1=1;
else
R1=R;
end
S=sum(x & y);
if S==0;
S1=1;
else
S1=S;
end
J = [P1,Q1;R1,S1]/70.0;
MI(i*j,:) = sum(sum(J.*log2(J./(sum(J,2)*sum(J,1)))));
display(i);
end
end
B = reshape(MI,length(data),length(data));
csvwrite('MI3121.csv',B);
The problem is that I set P1, Q1, R1, and S1 to 1 whenever the corresponding count comes out as zero. The code runs fine, but the results are ambiguous. I hope a different treatment of the logarithm would help. Can anyone help me resolve this?

Accepted Answer

Alfonso Nieto-Castanon on 16 Jul 2014
When computing mutual information you may assume that 0*log(0) == 0. In your code you could remove all the "if XXX==0" checks and, when computing MI, use something like this instead:
MI(i*j,:) = sum(sum(J.*log2(max(eps,J./(sum(J,2)*sum(J,1))))));
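As a minimal sketch of how the loop might look with the zero checks removed, the version below runs on a small made-up matrix (the `data` values are hypothetical stand-ins for the rows of ecoli_profiles.txt). It is a variant of the expression above rather than the exact code from the answer: it guards the numerator and the denominator separately, and it stores each result at `MI(i,j)` on the assumption that one entry per pair is what is wanted, since `i*j` sends different pairs such as (2,3) and (3,2) to the same element.

```matlab
% Hypothetical toy data standing in for the rows of ecoli_profiles.txt
data = [0 0 1 1; 0 1 0 1; 1 1 1 1] ~= 0;
n = size(data, 1);                 % number of binary strings (rows)
N = size(data, 2);                 % length of each string
MI = zeros(n, n);
for i = 1:n
    for j = 1:n
        x = data(i, :);
        y = data(j, :);
        % Joint distribution of (x, y); empty cells stay at zero
        J = [sum(~x & ~y), sum(~x & y); sum(x & ~y), sum(x & y)] / N;
        E = sum(J, 2) * sum(J, 1); % product of the marginals
        % Cells with J == 0 contribute 0 * log2(finite) = 0
        T = J .* log2(max(eps, J) ./ max(eps, E));
        MI(i, j) = sum(T(:));
    end
end
```

For these toy rows, `MI(1,1)` is 1 bit (a balanced string compared with itself), `MI(1,2)` is 0 (the two strings are independent), and `MI(3,3)` is 0 (a constant string carries no information).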
2 Comments
Alfonso Nieto-Castanon on 16 Jul 2014
Edited: Alfonso Nieto-Castanon on 16 Jul 2014
While I am at it, you might as well optimize the code a bit and use something like:
S = textread('ecoli_profiles.txt');
data = unique(S,'rows');
data = double(data(:,1:70)); % your binary data
N = size(data,2);
P = (1-data)*(1-data)'/N; % your binary probs (for all data pairs)
Q = (1-data)*data'/N;
R = Q'; % = data*(1-data)'/N;
S = 1-P-Q-R; % = data*data'/N;
H = @(x)x.*log2(max(eps,x)); % entropy Fcn
MI = H(P)+H(Q)+H(R)+H(S) ... % Mutual information matrix
-H(P+Q)-H(R+S)-H(P+R)-H(Q+S);
csvwrite('MI3121.csv',MI);
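To see why this sum of terms gives the mutual information, it may help to write the definition out. In the sketch below, h(t) = t log2(t) is the function called H in the code (with h(0) = 0, as in the 0*log(0) == 0 convention), and P, Q, R, S are the joint probabilities of the pairs (0,0), (0,1), (1,0), (1,1). The symbols h, X, Y, and the dot notation for marginals are introduced here only for illustration.

```latex
\begin{aligned}
I(X;Y) &= \sum_{a,b \in \{0,1\}} p_{ab} \log_2 \frac{p_{ab}}{p_{a\cdot}\, p_{\cdot b}} \\
       &= \sum_{a,b} h(p_{ab}) - \sum_{a} h(p_{a\cdot}) - \sum_{b} h(p_{\cdot b}) \\
       &= h(P) + h(Q) + h(R) + h(S) - h(P+Q) - h(R+S) - h(P+R) - h(Q+S)
\end{aligned}
```

The marginals are P+Q and R+S for the first string and P+R and Q+S for the second, which is where the four subtracted terms come from.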
Alfonso Nieto-Castanon on 16 Jul 2014
Edited: Alfonso Nieto-Castanon on 16 Jul 2014
Yes, the mutual information I(x,x) is equal to the entropy H(x), which can take any value between 0 and 1 depending on the proportion of 1's and 0's in x.
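As a short worked example, write p for the fraction of 1's in x (a symbol introduced here for illustration). Comparing x with itself leaves only the (0,0) and (1,1) cells of the joint table non-empty, so the mutual information reduces to the binary entropy:

```latex
\begin{aligned}
I(x;x) = H(x) &= -p \log_2 p - (1-p) \log_2 (1-p) \\
p = \tfrac{1}{2} &\;\Rightarrow\; H(x) = 1 \\
p = \tfrac{1}{4} &\;\Rightarrow\; H(x) \approx 0.811 \\
p \in \{0, 1\} &\;\Rightarrow\; H(x) = 0
\end{aligned}
```

So the diagonal of the MI matrix is not expected to be constant: it is largest for balanced strings and drops to zero for strings that are all 0's or all 1's.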
