Regarding Mutual information calculation of two binary strings

I am trying to get the Mutual information for two binary strings. I have made a code for it:
clear all;
S=textread('ecoli_profiles.txt'); data=unique(S,'rows'); for i=1:length(data) for j=1:length(data) x=data(i,1:70); y=data(j,1:70);
P1=0; Q1=0; R1=0; S1=0;
P=sum(~x & ~y);
if P==0;
P1=1;
else
P1=P;
end
Q=sum(~x & y);
if Q==0;
Q1=1;
else
Q1=Q;
end
R=sum(x & ~y);
if R==0;
R1=1;
else
R1=R;
end
S=sum(x & y);
if S==0;
S1=1;
else
S1=S;
end
J = [P1,Q1;R1,S1]/70.0;
MI(i*j,:) = sum(sum(J.*log2(J./(sum(J,2)*sum(J,1)))));
display(i);
end
end
B = reshape(MI,length(data),length(data));
csvwrite('MI3121.csv',B);
The problem is I am assuming P1, Q1, R1, S1 as 1 if they are coming out to be zero. Code is running fine. But the result is ambiguous. Can any one resolve the problem? I hope new logarithm would help. Can any one help me?

 채택된 답변

When computing mutual information you may assume that 0*log(0) == 0. In your code you could remove all the "if XXX==0" checks, and when computing Mi use instead something like:
Mi(i*j,:) =sum(sum(J.*log2(max(eps,J./(sum(J,2)*sum(J,1))))));

댓글 수: 2

while I am at it, you might as well optimize the code a bit and use something like:
S = textread('ecoli_profiles.txt');
data = unique(S,'rows');
data = double(data(:,1:70)); % your binary data
N = size(data,2);
P = (1-data)*(1-data)'/N; % your binary probs (for all data pairs)
Q = (1-data)*data'/N;
R = Q'; % = data*(1-data)'/N;
S = 1-P-Q-R; % = data*data'/N;
H = @(x)x.*log2(max(eps,x)); % entropy Fcn
MI = H(P)+H(Q)+H(R)+H(S) ... % Mutual information matrix
-H(P+Q)-H(R+S)-H(P+R)-H(Q+S);
csvwrite('MI3121.csv',MI);
Yes, the mutual information I(x,x) is equal to the entropy H(x) (which can be any value between 0 and 1, depending on the percentage of 1's and 0's in x)

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by