How to calculate the conditional probability of an event?

Myriam Moss on 23 Apr 2021
Answered: William on 25 Apr 2021
I have an array like this: array = [A A B A C A B B B C C A A C]. I want to calculate p(C|A), p(C|B), and p(C|C), that is, the probability that C occurs immediately after a previous event A, B, or C. How can I do this with only this information?

Accepted Answer

William on 23 Apr 2021
Hello Myriam. It is not clear whether A, B, and C here are text characters, or if they have numeric values. If we assume they have numerical values, like A=1, B=2 and C=3, then you could use
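y = diff(array);     % difference between each element and the one before it
P_AC = sum(y==2);    % number of differences equal to 2 (A followed by C)
P_BC = sum(y==1);    % number of differences equal to 1
P_CC = sum(y==0);    % number of differences equal to 0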
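As a minimal sketch of a numeric variant, each element can be paired with its successor directly instead of going through diff. This assumes the encoding A=1, B=2, C=3 suggested above; the names prev, succ, and p_C_given are illustrative and not part of the original answer.

```matlab
% Assumed encoding: A = 1, B = 2, C = 3.
array = [1 1 2 1 3 1 2 2 2 3 3 1 1 3];

prev = array(1:end-1);   % every element that has a successor
succ = array(2:end);     % the element that immediately follows it

% p_C_given(k) estimates p(C | previous event k), k = 1 (A), 2 (B), 3 (C).
p_C_given = zeros(1, 3);
for k = 1:3
    n_k  = sum(prev == k);               % times k occurs with a successor
    n_kC = sum(prev == k & succ == 3);   % times k is immediately followed by C
    p_C_given(k) = n_kC / n_k;           % NaN if k never has a successor
end
```

For the example array this gives estimates of 1/3, 1/4, and 1/3 for p(C|A), p(C|B), and p(C|C). Because prev stops at the second-to-last element, a final event with no successor is never counted in the denominator.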
  Comments: 1
Myriam Moss on 24 Apr 2021
Edited: Myriam Moss on 24 Apr 2021
Hi William, thank you. They are characters.
I'm new to MATLAB, sorry. Could you please explain your logic?


More Answers (2)

William on 25 Apr 2021
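Actually, I believe that p(C|A) would be the number of times 'A' is followed by 'C', divided by the number of times 'A' occurs:
N_AC = length(strfind(array, 'A C')); % number of times 'A' is followed by 'C'
y = strfind(array, 'A');
N_A = length(y); % number of times 'A' occurs
p_CA = N_AC/N_A;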
There is one further thing to consider, though. The very last element of array may be an 'A'. I don't think this one should be counted in N_A, because we don't know whether it would have been followed by a 'C' or not. So, if the last element of array is 'A', we should reduce N_A by 1.
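N_AC = length(strfind(array, 'A C')); % number of times 'A' is followed by 'C'
y = strfind(array, 'A');
N_A = length(y);
% The string might end with an 'A' or an 'A '
if ~isempty(y) && (y(end)==length(array) || y(end)==length(array)-1)
    N_A = N_A - 1; % the final 'A' has no successor, so do not count it
end
p_CA = N_AC/N_A;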
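The same idea can be extended to p(C|B) and p(C|C) with a loop. The following is only a sketch, assuming single-character labels separated by single spaces; the names labels, trimmed, n_pair, n_prev, and p are hypothetical.

```matlab
array   = 'A A B A C A B B B C C A A C';
labels  = 'ABC';             % events to condition on
trimmed = strtrim(array);    % drop any leading/trailing spaces

% p(k) estimates p(C | labels(k))
p = zeros(1, numel(labels));
for k = 1:numel(labels)
    n_pair = numel(strfind(trimmed, [labels(k) ' C']));  % label followed by 'C'
    n_prev = numel(strfind(trimmed, labels(k)));         % all occurrences of label
    if trimmed(end) == labels(k)
        n_prev = n_prev - 1;   % the final symbol has no successor
    end
    p(k) = n_pair / n_prev;    % NaN if the label never has a successor
end
```

With the example string, the final 'C' is excluded from the denominator of p(C|C), so the three estimates come out as 1/3, 1/4, and 1/3.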

William on 25 Apr 2021
Myriam,
If A, B and C were variables with the values 1, 2 and 3, then in your example:
array = [1, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 1, 3]
The diff() function returns the difference between each value and the next value (next minus current), so
diff(array) = [0, 1, -1, 2, -2, 1, 0, 0, 1, 0, -2, 0, 2]
Every time an A is followed by a C, the difference is 2, and every time a B is followed by a C, the difference is 1. So I was suggesting that you count the number of times A is followed by C by counting how many times the value 2 appears in diff(array), with a statement like c = sum(diff(array) == 2). Unfortunately, I see now that this does not work for counting how often B is followed by C: that transition produces a value of 1 in diff(array), but a value of 1 is also produced when an A is followed by a B.
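The ambiguity can be checked directly. As a sketch, again assuming A=1, B=2, C=3, a table of transition counts built with accumarray keeps each (previous, next) pair separate; the names d, counts, n_diff1, n_AB_BC, n_BC, and P are illustrative.

```matlab
array = [1 1 2 1 3 1 2 2 2 3 3 1 1 3];   % assumed encoding: A=1, B=2, C=3
d = diff(array);

% counts(i, j) = number of times event i is immediately followed by event j
counts = accumarray([array(1:end-1)', array(2:end)'], 1, [3 3]);

n_diff1 = sum(d == 1);                    % differences of 1: A->B and B->C mixed
n_AB_BC = counts(1, 2) + counts(2, 3);    % the same total, split by transition
n_BC    = counts(2, 3);                   % B followed by C only

% Normalize each row to estimate p(next = j | previous = i)
P = bsxfun(@rdivide, counts, sum(counts, 2));
```

Here three differences equal 1, but only one of them is a B followed by a C; the other two are an A followed by a B. The same applies to a difference of 0, which any repeated event produces. The third column of P holds the estimates of p(C|A), p(C|B), and p(C|C).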
Since you have said that A, B and C are characters, I assume that you mean that:
array = 'A A B A C A B B B C C A A C';
In this case, maybe a better solution would be:
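y = strfind(array, 'A C'); % start positions of every 'A C'
N_AC = length(y); % number of times 'A' is followed by 'C'
y = strfind(array, 'B C'); % start positions of every 'B C'
N_BC = length(y); % number of times 'B' is followed by 'C'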
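The third quantity in the question, the number of times 'C' follows 'C', can be counted the same way. A short sketch, assuming the space-separated string above (N_CC and n_overlap are illustrative names):

```matlab
array = 'A A B A C A B B B C C A A C';
N_CC = length(strfind(array, 'C C'));   % number of times 'C' is followed by 'C'

% strfind reports overlapping matches, so a run of three C's counts as
% two C->C transitions.
n_overlap = length(strfind('C C C', 'C C'));
```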
  Comments: 1
Myriam Moss on 25 Apr 2021
Thank you William! Now I have the number of times C appears after A and B.
If I define
y = strfind(array, 'C');
N_C = length(y);
If I want p(C|A), for example, I should do:
p_CA = N_AC/N_C. Do you agree? :)
