Fisher's exact test for 2x2 contingency tables permits the calculation of exact probabilities in situations where, because of small cell frequencies, the much faster normal approximation and chi-square calculations are liable to be inaccurate.
Fisher's exact test involves computing several factorials to obtain the probability of the observed table and of each more extreme table. Factorials grow quickly, so it is necessary to use their logarithms. This computation is very easy in MATLAB because x! = gamma(x+1) and log(x!) = gammaln(x+1).
I rewrote this function several times: now full vectorization, preallocation, the use of a recursive relationship for Fisher's exact test on a 2x2 matrix, and the use of logarithms greatly speed up execution.
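As a minimal sketch of the log-factorial idea (not the code of the submitted function), the probability of a single 2x2 table with fixed margins can be written as a sum of gammaln terms. The helper names logfact and tableprob are hypothetical and used only for illustration; gammaln, sum and exp are MATLAB built-ins.

```matlab
% Hypothetical helpers for illustration; not taken from myfisher22.
logfact   = @(x) gammaln(x + 1);                   % log(x!)
tableprob = @(T) exp( sum(logfact(sum(T, 1))) ...  % column totals
                    + sum(logfact(sum(T, 2))) ...  % row totals
                    - logfact(sum(T(:))) ...       % grand total
                    - sum(logfact(T(:))) );        % the four cells

X = [70 30; 29 80];
pObserved = tableprob(X);   % no overflow, although 209! is far beyond realmax

small  = [3 1; 1 3];
pSmall = tableprob(small);  % equals C(4,3)*C(4,1)/C(8,4) = 16/70
```

Working on the log scale and exponentiating only at the end keeps every intermediate value within double-precision range.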
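The description does not spell out the recursive relationship, so the following is only a sketch under one assumption: that it is the usual ratio between neighbouring tables, P(a+1)/P(a) = b*c/((a+1)*(d+1)), where a, b, c, d are the cells of the table with top-left cell a. All variable names are illustrative.

```matlab
X  = [70 30; 29 80];
R1 = sum(X(1, :));  R2 = sum(X(2, :));   % row totals
C1 = sum(X(:, 1));  N  = R1 + R2;        % first column total, grand total

a = (max(0, C1 - R2):min(R1, C1))';      % all admissible top-left cells
b = R1 - a;  c = C1 - a;  d = R2 - c;    % remaining cells of each table

% log-probability of the first table, computed once with gammaln
logP0 = gammaln(R1 + 1) + gammaln(R2 + 1) + gammaln(C1 + 1) + gammaln(N - C1 + 1) ...
      - gammaln(N + 1) - gammaln(a(1) + 1) - gammaln(b(1) + 1) ...
      - gammaln(c(1) + 1) - gammaln(d(1) + 1);

% Assumed recursion: P(a+1)/P(a) = b*c / ((a+1)*(d+1)), accumulated in logs
logRatio = log(b(1:end-1)) + log(c(1:end-1)) ...
         - log(a(1:end-1) + 1) - log(d(1:end-1) + 1);
p = exp(logP0 + [0; cumsum(logRatio)]);  % probabilities of all tables
```

For this matrix the vector p has 100 elements, matching the 100 tables mentioned in the timing test, and it sums to 1 up to rounding error.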
It is faster than the previously submitted Fisherextest. To check this, I compared the core of both scripts (after deleting the input error checks and the code that displays results and computes the power), using X=[70 30; 29 80] (100 tables to evaluate):
times=zeros(1,1000); for I=1:1000, tic; myfisher22(X); times(I)=toc; end, median(times)
ans = 1.3000e-4
The same for Fisherextest: ans = 0.0024
So my function is about 18.5-fold faster.
The function also computes the mid-P correction, which makes the test less conservative.
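To illustrate the mid-P idea on a small table, here is a sketch that assumes the common one-tailed definition, in which the observed table is counted with half weight. It is not taken from the function itself, and the variable names are illustrative.

```matlab
T  = [3 1; 1 3];                      % small illustrative table
R1 = sum(T(1, :)); R2 = sum(T(2, :)); C1 = sum(T(:, 1));
a  = max(0, C1 - R2):min(R1, C1);     % admissible top-left cells

% Hypergeometric probability of each table with the same margins
p = arrayfun(@(k) nchoosek(R1, k) * nchoosek(R2, C1 - k), a) ...
    / nchoosek(R1 + R2, C1);

obs    = (a == T(1, 1));              % logical index of the observed table
pRight = sum(p(a >= T(1, 1)));        % conventional one-tailed p-value (17/70)
midP   = pRight - 0.5 * p(obs);       % mid-P: observed table counted half (9/70)
```

Because half of the observed table's probability is removed, the mid-P value is always smaller than the conventional p-value, which is why the test becomes less conservative.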
Moreover, the routine computes the power and, if necessary, the sample sizes needed to achieve a power of 0.80, using a modified asymptotic normal method with continuity correction as described by Hardeo Sahai and Anwer Khurshid in Statistics in Medicine, 1996, Vol. 15, Issue 1: 1-21.
Giuseppe Cardillo (2020). MyFisher22 (https://github.com/dnafinder/myfisher22), GitHub. Retrieved .
The result is the same but the computation is different. Using binomial coefficients makes it easy to extend the 2x2 matrix case to a 2xC matrix (as in my function MyFisher23) or a 3x3 matrix. I think it is not important to know whether MyFisher is better or worse than Fisherextest; I think it is important to know whether it is good code, whether it could be improved and whether it is useful for developing other code.
To be honest, I really do not see any new contribution over the previously created m-file Fisherextest.
At first, after reading the review, I ran MyFisher and Fisherextest by Antonio Trujillo Ortiz on the matrix given in the review and found the first bug, so I uploaded a new file. Then, running both functions on several matrices, I saw that p-both was always the same but p-left and p-right were sometimes swapped. I was unable to find this bug quickly, so I preferred to remove the file and study the articles again. Finally, I understood where the second bug was, fixed it and submitted the file again (the same bug was in MyFisher23, so I fixed it too).
These were the two bugs:
1) The first one: for a 2x2 matrix there is only one degree of freedom. The vectors p and q represent all the possible tables for the given marginal totals. Binomial coefficients are computed from these two vectors and the column marginal totals, and then the coefficients are cross-multiplied (vector np). Previously the function did not correctly identify the observed table, so the p-value was totally incorrect.
2) The second one: the more extreme tables have a direction! Two orderings are possible. The first: more extreme tables (in the same direction), observed table, less extreme tables, more extreme tables (in the opposite direction). The second: more extreme tables (in the opposite direction), less extreme tables, observed table, more extreme tables (in the same direction). I fixed this bug using a logical array (the p-array computed after the ob variable).
I saw that you removed this file after it received an in-depth review and a low rating, and then submitted it again. Well, what improvement has been made to this one?
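The two points above can be sketched on a small asymmetric table. This is an illustration only, not the code of MyFisher: the variable names are illustrative, and the two-tailed rule used here (sum every table that is no more probable than the observed one) is an assumption about how p-both is defined.

```matlab
T  = [5 1; 1 3];
R1 = sum(T(1, :)); R2 = sum(T(2, :)); C1 = sum(T(:, 1));
a  = max(0, C1 - R2):min(R1, C1);     % one degree of freedom: top-left cell

% Binomial coefficients for each admissible table, then normalisation
p = arrayfun(@(k) nchoosek(R1, k) * nchoosek(R2, C1 - k), a) ...
    / nchoosek(R1 + R2, C1);

ob     = find(a == T(1, 1));          % position of the observed table
pLeft  = sum(p(1:ob));                % observed table and all tables before it
pRight = sum(p(ob:end));              % observed table and all tables after it

% Logical array selecting the tables at least as extreme as the observed
% one, in either direction; the tolerance guards against rounding error
asExtreme = p <= p(ob) * (1 + 1e-7);
pBoth     = sum(p(asExtreme));
```

Here the five tables have probabilities 15/210, 80/210, 90/210, 24/210 and 1/210, and the observed table is the fourth one, so the more extreme tables lie on both sides of it: pLeft = 209/210, pRight = 25/210 and pBoth = 40/210.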
Inspired by: Fisherextest