Understanding differences in values between the smooth and smoothdata functions

Question


Hi,

I'm working with rlowess smoothing and I was wondering why the outputs of the smooth and smoothdata functions differ, even though the two functions are supposed to give similar results.

I wrote a small piece of code:

% Sample one-dimensional data

x = 1:100;

data = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);

% Apply rlowess smoothing with smooth

smoothedData2 = smooth(data, 9, 'rlowess');

% Apply rlowess smoothing with smoothdata

smoothedData3 = smoothdata(data, 'rlowess', 9);

smoothedData2

smoothedData2 = 100×1

0.5067 0.3147 0.1235 -0.0630 -0.2807 -0.3956 -0.4401 -0.4361 -0.4861 -0.4326


smoothedData3

smoothedData3 = 1×100

0.5067 0.3147 0.1235 -0.0630 -0.2807 -0.3956 -0.4401 -0.4361 -0.4861 -0.4326 -0.3851 -0.3447 -0.1071 0.1931 0.4699 0.7296 0.9485 1.0593 1.0393 0.8891 0.6163 0.3299 0.0853 -0.1275 -0.3258 -0.4781 -0.6435 -0.7609 -0.7636 -0.6183


isequal(smoothedData2, smoothedData3)

ans = logical

0

% Plot the original and smoothed data

figure;

plot(x, data, 'o');

hold on;

plot(x, smoothedData2, '--');

plot(x, smoothedData3, ':');

legend('Original Data', 'Smoothed Data 2', 'Smoothed Data 3');

title('Comparison of Smoothing Methods');


% Calculate the differences

diff2_3 = smoothedData2(:) - smoothedData3(:);

% Display the differences

disp('Difference between smoothedData2 and smoothedData3:');

Difference between smoothedData2 and smoothedData3:

disp(diff2_3);

1.0e-14 * -0.0333 -0.0111 0.0180 -0.0083 -0.0056 0.0056 -0.0333 -0.0333 -0.0333 -0.0333 -0.0222 -0.0111 -0.0097 0.0083 -0.0111 -0.0222 0.0777 -0.0222 -0.0444 0.0222 -0.0222 0.0278 0.0069 0.0333 0.0222 0.0444 -0.0111 0.0222 -0.0111 -0.0222 -0.0278 -0.0472 -0.0215 -0.0222 -0.0222 0 0.0333 -0.0444 -0.0111 0.0222 0.0444 0 0.0222 -0.0042 -0.0167 0.0111 0.0888 0.0666 0.0222 -0.0444 -0.0444 -0.0111 -0.0111 -0.0222 -0.0250 0.0111 -0.0222 -0.0666 -0.0222 0 -0.0333 0.0056 -0.0097 0.0056 -0.0222 0.0111 0 -0.0222 0 0.0111 -0.0333 -0.0167 -0.0250 0.0064 0.0111 0.0222 0.0333 0.0222 0.0333 0.0333 -0.0111 0.0069 -0.0083 0.0056 0 -0.0111 0 -0.0111 -0.0111 0.0666 -0.0222 0 -0.0080 0.0111 0 0 0.0333 -0.0111 0.0222 -0.1110

As you can see, the graph is similar for both cases, but the values are different.

I also tried changing the method to 'lowess', and there the differences appear only in the first and last entries, so I'm guessing that the moving window isn't computed the same way in both cases (however, I can't find a difference in the window calculation from the documentation here and here).

To sum up, my two questions are:

Is there a difference in window calculation between smooth and smoothdata for the 'lowess' method?
Apart from this difference, could there be another issue, for example with the calculation of the robust weights in the 'rlowess' method?

Thank you!

Comments: 1

Ben on 29 Jan 2025

NB: The differences are sometimes very small (on the order of 1e-15), but for other runs they are far more significant.


Answer 1

Adam Danz on 29 Jan 2025

Edited: Adam Danz on 29 Jan 2025


> Is there a difference in window calculation between smooth and smoothdata for 'lowess' method

These two functions compute the regression using completely different methods, but for 'lowess' they both produce the same result within the limits of floating-point precision. The difference in the results is caused by the limits of floating-point representation.

As an analogy, here are two solutions for creating the same 5-element vector. However, isequal returns false when we compare the results!

method1 = [0.1 : 0.1 : 0.5]
method1 = 1×5
    0.1000    0.2000    0.3000    0.4000    0.5000
method2 = [0.1  0.2  0.3  0.4  0.5]
method2 = 1×5
    0.1000    0.2000    0.3000    0.4000    0.5000
isequal(method1, method2)
ans = logical
   0

When we compare each value, we find that 0.3 in the first method is not represented in the same way as "0.3" in the second method.

method1 == method2
ans = 1x5 logical array
   1   1   0   1   1
method1(3) - 0.3
ans = 5.5511e-17

That's just a simple vector, but it's enough to show how two different solutions that arrive at the same output can differ because of the way floating-point values are represented.
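The point above can be illustrated with a tolerance-based comparison. This is only a minimal sketch: the variable names `tol`, `approxEqual`, and `allClose` and the factor of 4 are illustrative assumptions, not a recommended rule.

```matlab
% Two ways of building the same vector, as in the example above.
method1 = 0.1 : 0.1 : 0.5;
method2 = [0.1 0.2 0.3 0.4 0.5];

% Tolerance: a few floating-point spacings at the scale of each value.
% The factor 4 is an arbitrary illustrative choice (an assumption).
tol = 4 * eps(max(abs(method1), abs(method2)));

% Compare element-wise within the tolerance instead of testing exact equality.
approxEqual = abs(method1 - method2) <= tol;
allClose = all(approxEqual)
```

Because `eps(x)` returns the spacing of doubles near `x`, the tolerance scales with the magnitude of the values being compared rather than being a fixed constant.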

More info

Recognizing and Avoiding Round-Off Errors

Here I use isapprox (R2024b) to compare the two results and I confirm that the max difference is indicative of floating point limitations.

% Sample one-dimensional data
x = 1:100;
data = cos(2*pi*0.05*x+2*pi*rand) + 0.5*randn(1,100);
% Apply lowess smoothing with smooth
smoothedData2 = smooth(data, 9, 'lowess');
% Apply lowess smoothing with smoothdata
smoothedData3 = smoothdata(data, 'lowess', 9);
isequal(smoothedData2(:), smoothedData3(:))
ans = logical
   0
all(isapprox(smoothedData2(:), smoothedData3(:)))  %R2024b
ans = logical
   1
max(abs(smoothedData2(:) - smoothedData3(:)))
ans = 9.9920e-16
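One way to judge whether a difference of this size is plausibly round-off is to express it in multiples of eps at the scale of the data. The sketch below uses the maximum difference reported above and assumes a scale of 1, since the smoothed values in this thread are of order 1; `scale` and `diffInEps` are illustrative names.

```matlab
% Maximum difference reported above for the 'lowess' comparison.
maxDiff = 9.9920e-16;

% Assumed scale of the smoothed values (they are of order 1 in this thread).
scale = 1;

% eps(scale) is the spacing between scale and the next larger double,
% so this ratio expresses the difference in units of that spacing.
diffInEps = maxDiff / eps(scale);
fprintf('Max difference is about %.1f times eps(%g)\n', diffInEps, scale);
```

A ratio of a few units of eps, as here, is consistent with accumulated rounding in two different computation paths; a ratio many orders of magnitude larger would suggest an actual algorithmic difference.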

Comments: 2

Ben on 29 Jan 2025

Edited: Ben on 29 Jan 2025


Hi, thank you very much for this answer. You are correct about the 'lowess' method; I missed the fact that the difference is negligible.

However, regarding the 'rlowess' method, I'm observing more significant differences. I used a specific dataset that I know causes differences, but you can also use the previously defined data generation:

% Sample one-dimensional data
x = 1:100;
data = [-0.7527, -0.3347, -1.1238, -1.0330, -1.8728, -0.9252, -0.9477, -0.1328, 0.6286, 1.3972, ...
        1.7422, 0.9095, 0.8381, 1.3394, 1.0172, 0.9205, 0.6402, -0.8005, -0.0956, -0.6813, ...
        -1.2231, -1.3148, -0.3575, -1.3356, -1.3905, -0.4306, 0.7543, 0.0731, 0.5200, 0.1664, ...
        1.6801, 1.0305, 1.2632, 0.4547, 1.1991, 0.7325, -0.3269, -0.0405, -0.5911, -0.5809, ...
        -0.9683, -1.0629, -1.5146, -1.0923, -0.3888, 0.3344, -0.8313, 1.2490, 1.1289, 0.7198, ...
        0.6924, 0.6186, 1.4146, 0.5834, 0.5411, 0.9797, 0.4691, 0.3881, -0.1135, -0.8360, ...
        -1.0999, -0.5698, -1.3338, -0.3374, -0.3767, -0.3941, -0.2495, 0.2431, 2.1292, 0.5793, ...
        0.0647, 1.9255, 1.3031, 0.6067, 2.0808, 0.8134, 0.3982, -0.4492, -0.8983, -1.5197, ...
        -1.0545, -1.4945, -0.6743, -1.0895, 0.1124, 0.2174, -0.1691, -0.0811, 0.9419, 0.0623, ...
        1.1798, 0.6334, 0.7980, 0.5948, 1.0599, 0.1489, -0.2807, 0.2162, -0.5775, -0.4547];
size(data)
ans = 1×2
     1   100
% Apply rlowess smoothing with smooth
smoothedData2 = smooth(data, 9, 'rlowess');
% Apply rlowess smoothing with smoothdata
smoothedData3 = smoothdata(data, 'rlowess', 9);
isequal(smoothedData2(:), smoothedData3(:))
ans = logical
   0
all(isapprox(smoothedData2(:), smoothedData3(:)))  %R2024b
ans = logical
   0
max(abs(smoothedData2(:) - smoothedData3(:)))
ans = 0.0957
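To see where two outputs disagree, rather than only by how much, the indices exceeding a tolerance can be listed. This is a minimal sketch on small stand-in vectors: `a`, `b`, `tol`, and `mismatch` are hypothetical names and values, and in practice `a` and `b` would be replaced by smoothedData2 and smoothedData3.

```matlab
% Hypothetical stand-ins for two smoothing outputs (not data from this thread).
a = [1 2 3 4];
b = a + [0 0 0.05 0];

% Force column orientation so row and column outputs can be compared directly.
a = a(:);
b = b(:);

% Assumed absolute tolerance, well above round-off for values of order 1.
tol = 1e-12;

% Indices where the two outputs differ by more than the tolerance.
absDiff = abs(a - b);
idx = find(absDiff > tol);

% Summarize the mismatching samples in a table.
mismatch = table(idx, a(idx), b(idx), absDiff(idx), ...
    'VariableNames', {'Index', 'A', 'B', 'AbsDiff'})
```

Listing the indices makes it easier to check whether the large differences cluster near the ends of the data or near particular samples.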

Adam Danz on 29 Jan 2025

Thanks for the demo, Ben. I'll try to find some time this afternoon to investigate.
