The differences are quite small, and I don't think there is very much wrong. My own implementation of the Canny edge detector also produces slightly different results from edge().
Possible causes of differences include:
- You use a rounded approximation to the Gaussian kernel for smoothing. It is likely that edge() uses a different approximation.
- You use a symmetrical operator (the Sobel operator) for differencing. I believe edge() uses non-centred nearest-neighbour differences, which introduce a half-pixel shift in the gradient estimate positions. Personally I prefer your approach.
- Your choice of non-maximum suppression is to quantise the gradient direction to the nearest 45 degrees and to compare gradient magnitudes at the pixels in the relevant direction. There are a number of other possible approaches - for example you can interpolate the gradient magnitude and then compare along the gradient direction. Canny did not specify what to do here, and everyone makes their own choice. I am not sure what edge() does, but it may well do something different from your method.
- Your implementation of hysteresis thresholding is not correct. It is not sufficient to test whether any of the nearest neighbours exceeds the high threshold. You need to test whether the current pixel can be connected to a pixel where the gradient exceeds the high threshold.
I suspect the second point above is the most important effect in making a difference between your output and that of edge(). The last point above is the only one which I would identify as a definite error on your part.
Your code is not in a good MATLAB style, by the way - all your loops could be vectorised - but that's another story.