# Second derivative from a smoothing spline fit

조회 수: 33(최근 30일)
Angelavtc 5 Apr 2020
댓글: darova 6 Apr 2020
Hi!
I have the following fit curve that I approximate using the Curve Fitting toolbox:
And I want to find the points (Volume, Price) where the curve changes from concave to convex. Is there an easy way to find this?
Here is my code and my data file.
%% Fit: 'Jan 2012'.
[xData, yData] = prepareCurveData( x_jan_12_s, Price_jan_12_s );
% Set up fittype and options.
ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [2 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.24530273269464e-08;
opts.Exclude = excludedPoints;
% Fit model to data.
[fitresult{2}, gof(2)] = fit( xData, yData, ft, opts );
% Plot fit with data.
figure( 'Name', 'Jan 2012 Supply France Peak' );
plot( fitresult{2}, 'k');
% Label axes
xlabel( 'Volume [MWh]', 'Interpreter', 'none' );
ylabel( 'Price', 'Interpreter', 'none' );
%Also this curve fit can be read as:
f = fit(xData, yData,'smoothingspline','SmoothingParam',8.24530273269464e-08)

로그인 to comment.

### 채택된 답변

darova 5 Apr 2020
Here you go
n = 50;
x = linspace(0,10,n);
y = sin(x).*x;
d2y = diff(y,2)./(x(2)-x(1))/2; % second derivative
ix = abs( diff(sign(d2y)) ) > 0.01; % find changes of sign
ind = 1 + find(ix); % appropriate indices in X,Y
fv.vertices = [x(2:end-1); y(2:end-1)]'; % without border nodes
fv.faces = [1:n-3; 2:n-2]'; % connections
fv.faceVertexCData = d2y(:); % color in nodes
fv.edgecolor = 'interp'; % interpolate colors
plot(x(ind),y(ind),'or')
patch(fv)
axis equal
colorbar

#### 댓글 수: 6

표시 이전 댓글 수: 3
darova 6 Apr 2020
Use sign for only two colors
Angelavtc 6 Apr 2020
what do you mean @darova? could you please be more precise? Sorry, I am kind of new using matlab and everything is really useful to learn more. Thanks in advance!
darova 6 Apr 2020
I mean that if you want only two colors: negative and positive, you can use sign function (returns only -1,0 and +1)
fv.faceVertexCData = sign(d2y(:)); % color in nodes (only -1 and 1 two colors)
Maybe you want to turn off color interpolation also
fv.edgecolor = 'flat'; % interpolate colors

로그인 to comment.

### 추가 답변(1개)

John D'Errico 5 Apr 2020
편집: John D'Errico 5 Apr 2020
It is important to recognize that IF you have the curve fitting toolbox, then there are various ways to accomplish that you were looking for, arguably better ways.
As well, be careful in how you use splines. The curve shown in your question seems to be one that classical splines will often fail on, since they will exhibit ringing. Not unlike the Gibbs phenomena I recall from working with Fourier series. Sharp transitions in a curve will cause serious problems when you want to model the data.
plot(t,p,'o-')
UGH. You have ONE data point down at the bottom, then a massive transition to a flat region. This is EXACTLY the kind of data where you do NOT want to use a smoothing spline!!!!!!!!
Note that bump at the top the initial transition in your curve as plotted? That is not part of your curve. It is spurious garbage, introduced by the spline. The spline fit down there is simply wrong, faiing to match any kind of behavior.
Are you kidding? I'm sorry, but you are kidding yourself in this. A smoothing spline is a terribly poor choice to fit that data, IF you include that first data point. It does very little smoothing in the rest of the curve, while introducing garbage at the bottom.
You would be far better off if you just completely dropped the first data point from any analysis. Surprisingly, it seems like you decided to drop the SECOND data point in your data? Given that the second data point is actually consistent with the rest of the data, this seems utterly strange.
I can possibly understand that you decided to drop the next to last data point, as the time stamp listed for that point seems a bit strange. At the same time, the curve is clearly starting to roll over at the top.
ft = fittype( 'smoothingspline' );
excludedPoints = excludedata( xData, yData, 'Indices', [1 276] );
opts = fitoptions( 'Method', 'SmoothingSpline' );
opts.SmoothingParam = 8.e-08;
opts.Exclude = excludedPoints;
[fitresult, gof] = fit( xData,yData, ft, opts );
plot(fitresult)
hold on
plot(xData,yData,'go')
Now, it appears you want to find the point where the curve transitions from concave to convex? I would point out there are actually several transitions in curvature here. Lets look at the second derivative. This is a spline. You don't need to use an approximate differentiation using a finite difference!!!! You have the curve fitting toolbox, and it can do the differentiation for you.
help cfit
cfit Curve Fit Object
A cfit object encapsulates the result of fitting a curve to data.
Use the FIT function or CFTOOL to create cfit objects.
You can treat a cfit object as a function to make predictions or
evaluation the curve at values of X. For example to predict what
the population was in 1905, then use
cf = fit( cdate, pop, 'poly2' );
yhat = cf( 1905 )
cfit methods:
coeffvalues - Get values of coefficients.
confint - Compute prediction intervals for coefficients.
differentiate - Compute values of derivatives.
feval - Evaluate at new observations.
integrate - Compute values of integral.
plot - Plot a curve fit.
predint - Compute prediction intervals for a curve fit or for
new observations.
probvalues - Get values of problem parameters.
Documentation for cfit
>> help cfit/differentiate
differentiate Differentiate a fit result object.
DERIV1 = differentiate(FITOBJ,X) differentiates the model FITOBJ at the
points specified by X and returns the result in DERIV1. FITOBJ is a Fit
object generated by the FIT or CFIT function. X is a vector. DERIV1 is
a vector with the same size as X. Mathematically speaking, DERIV1 =
D(FITOBJ)/D(X).
[DERIV1,DERIV2] = differentiate(FITOBJ, X) computes the first and
second derivatives, DERIV1 and DERIV2 respectively, of the model
FITOBJ.
Documentation for cfit/differentiate
Differentiation is thus quite easy.
xdiff = linspace(xData(2),xData(end),500);
[~,fitdiff2] = differentiate(fitresult,xdiff);
plot(xdiff,fitdiff2,'-')
Ugh. What point are you trying to locate now? That transition point would be where the second derivative changes sign. So next, I'll add a reference line at zero, and then expand the y axis to show the behavior near y==0.
So, exactly where do you think this curve exhibits the behavior you want to see? Remember, a postive second derivative region is one where the curve is concave upwards. Even having dropped that first data point, you will not easily find the point you want to find. I think you are kidding yourself.
Derivative estimation is a noise amplification process, a classically ill-posed problem. Estimation of the second derivative is thus also classicly ill-posed, on steroids.

#### 댓글 수: 3

Angelavtc 6 Apr 2020
Dear @John D'Errico thanks for the time employed to answer my question. In fact, the data that I use here was just a simple example to ask in a fast way what I need, perhaps it was not the best, my apologies. This is not the data that I will finally use for my research. In any case I wil consider the differentiation that you suggest and think about the "ill-posed problem" ;)
John D'Errico 6 Apr 2020
Yes. The data you showed was just noisy enough that it becomes difficult to know cleanly where your curvature changes. Were your data more noisy yet, this becomes more difficult. Then you would need to use more smoothing. As I said, differentiation is a noise amplifier. It really does not matter that much how you differentiate the spline, except that if you use an approximate derivative, then you introduce MORE problems into the problem, because it is now only an approximation.
Anyway, it can become more a problem not of how you differentiate the spline, but of how you build the spline. Too little smoothing can yield a very noisy second derivative estimate, thus making it very difficult to locate what you want. Too much smoothing can blur things too greatly, making it difficult to find where this event happens.
Ill-posed problems are ones that can amplify even the smallest, tiniest noise in the data. Differentiation is exactly that, ill-posed in nature. If you think about it, the definition of a derivative has you divide a tiny change in y, by an also tiny change in x, thus deltay/deltax. That is clearly ill-posed. But now, you want to compute a second derivative, which is most easily viewed as the derivative of the first derivative. So you are now amplifying the noise in a process that already amplifies noise in your data.
This is why I say how you compute the second derivative might be of less importance than how you built the spline. At the same time, since it is so easy to compute the second derivative (splines really are easy to differentiate if you understand how to work with them) there is no reason to use an approximation, which can at best just muddy the waters more than they are already.
And finally, even in the example we have here, you need to see that garbage in the data is important to clean up, if you are trying to find a subtle signal in problem data. You also need to understand how spline modeling works, in the sense I discussed in my answer that splines don't handle rapid transients well. When faced with a rapid change in behavior, a spline will tend to tend to respond with ringing (rapid bell-like oscillations near the transient) in the model. That ringing can be significant, as we saw in the data you gave us.
Angelavtc 6 Apr 2020
Wonderful @John D'Errico, thanks for the explanation, it is really useful for me, specially because I dont have that level of experience using splines :) Have a great day!

로그인 to comment.

이 질문에 답변하려면 로그인을(를) 수행하십시오.

Translated by