Identify rule that governs pattern in series.

I have a series of terms in a sequence and need to identify the rule that produces it. I know it is based on a damped harmonic oscillator but I am not sure which formula governs the interval. Here is the string, it is the interval rates in seconds: (108, 88, 92, 76, 80, 68, 64, 56, 40, 32, 8, 8) the closest I have come is: D_n = Asin(Bn + C)exp(-Dn) + E
A = Amplitude of the wave (could be around 28, as you noted that the sequence never went more than 28 in any direction) B = Frequency of the wave (which affects how “quickly” the wave oscillates - this could be tweaked to match the observed pattern) C = Phase shift of the wave (could potentially be 0 if the wave starts at its peak) D = Damping coefficient (this affects how quickly the oscillations decay - would need to be chosen to match the decay observed in the sequence) E = Equilibrium position (which would be 8, as the sequence oscillates around 8)

댓글 수: 7

That proposed function leads to poor results, at least if exp(-Dn) is understood as exp(-D*n)
You don't have sufficient information to estimate (with any degree of confidence) those 5 parameters from that little data. You have only 12 data points, representing only a small fraction of one period of any oscillatory sequence.
Walter Roberson
Walter Roberson 2023년 7월 9일
편집: Walter Roberson 2023년 7월 9일
Your n is acting as the independent variable, with integer values. Now suppose you had a B0 for the sin(B*n+C) term. Let B=B0+k*2*pi for integer k. n*(B0+k*2*pi) = n*B0 + n*k*2*pi but n and k are integer and sin is periodic so that is the same as sin(n*B0+C). Therefore you should constrain B and C to one period of sin()
Just to add a little more detail to the excellent comments by @Walter Roberson and @John D'Errico, here is what happens when you do try to fit that function to your data:
% Your data
n = (1:12)'; % (I assume that this is implied.)
D_n = [108, 88, 92, 76, 80, 68, 64, 56, 40, 32, 8, 8]';
% Tabulate the data
tbl = table(n,D_n);
% Define function that will be used to fit data
% (F is a vector of fitting parameters)
f = @(F,n) F(1).*sin(F(2).*n + F(3)).*exp(F(4).*n) + F(5);
% Initial parameter guesses
beta0 = [1 1 0 1 0];
% Fit the model
mdl = fitnlm(tbl,f,beta0)
Warning: The Jacobian at the solution is ill-conditioned, and some model parameters may not be estimated well (they are not identifiable). Use caution in making predictions.
mdl =
Nonlinear regression model: D_n ~ F1*sin(F2*n + F3)*exp(F4*n) + F5 Estimated Coefficients: Estimate SE tStat pValue ___________ ______ ___________ ________ F1 45.576 15.483 2.9437 0.018604 F2 -2.3136e-05 462.16 -5.0061e-08 1 F3 11.102 75.416 0.14721 0.88661 F4 0.098244 48.651 0.0020194 0.99844 F5 150.85 387.71 0.38907 0.70737 Number of observations: 12, Error degrees of freedom: 8 Root Mean Squared Error: 6.43 R-Squared: 0.971, Adjusted R-Squared 0.96 F-statistic vs. constant model: 89.8, p-value = 1.68e-06
% Calculate the model values at the empirical x
D_n_predicted = predict(mdl,n);
% Plot the data and fit
figure
plot(n,D_n,'*',n,D_n_predicted,'g');
legend('data','fit')
There is lots going wrong here. Two big issues are
  • The warning that some model parameters are not estimated well
  • The errors ("SE" in the model output table) on the parameter estimates are huge compared to the estimates themselves, and the implications of that for their statistical significance
These both point to this being a terrible model (which John, Walter and I would simply know intuitively from experience), but is also borne out by the model fit.
Note that I might have been able to do a bit better with more precise initial parameter guesses. But that is a fool's errand. When you have a very small amount of data, you cannot (sensibly) fit a model that has many parameters.
You have not mentioned the purpose for this fit. Why are you doing it at all? What did you plan to do with the result? There may be a more sensible alternative.
And to expand on what @the cyclist just said, you have 5 unknowns in that model. But your data only has 12 data points, and over a VERY short region, if you want to estimate those parameters. But look at the data you show.
n = (1:12)'; % (I assume that this is implied.)
D_n = [108, 88, 92, 76, 80, 68, 64, 56, 40, 32, 8, 8]';
plot(n,D_n,'o')
And what I see is a LOT of noise. There is enough information in that data to estimate a straight line, but not really any more than that. Certainly NOT a damped sinusoid.
P1 = fit(n,D_n,'poly1')
P1 =
Linear model Poly1: P1(x) = p1*x + p2 Coefficients (with 95% confidence bounds): p1 = -8.727 (-10.13, -7.328) p2 = 116.7 (106.4, 127)
plot(P1,n,D_n)
Is there a tiny amount of curvature in that data? Barely. So you might decide to fit a quadratic polynomial.
P2 = fit(n,D_n,'poly2')
P2 =
Linear model Poly2: P2(x) = p1*x^2 + p2*x + p3 Coefficients (with 95% confidence bounds): p1 = -0.4076 (-0.7897, -0.02553) p2 = -3.429 (-8.531, 1.673) p3 = 104.4 (89.94, 118.8)
plot(P2,n,D_n)
What I would look at in that fit, is how wide is the uncertainty in the quadratic coefficient. fit thinks it lies somewhere between -0.7897 and -0.02553. That is a pretty large interval. So anything more sophisticated than a simple quadratic is a silly thing to do with this data.
You ABSOLUTELY need better data, even if you belive this data lies on a damped sinusoid. AND you absolutely need MORE data, so over a longer period of time. In order to estimate that period AND a damping rate, you need at least a few periods from that curve.
Degree 9 polynomial looks decent R^2 when you center and scale, but any polynomial struggles with two adjacent values the same.
A 9th degree polynomial would be a literally insane fit. If it does fit well, that would be based merely on R^2. But R^2 does not measure what the curve does between the data points. And with 12 data points, and 10 parameters to estimate? It will be purely chasing noise. R^2 does not measure if the fit makes any sense at all.
n = (1:12)'; % (I assume that this is implied.)
D_n = [108, 88, 92, 76, 80, 68, 64, 56, 40, 32, 8, 8]';
nhat = (n - mean(n))/std(n);
P9 = fit(nhat,D_n,'poly9')
P9 =
Linear model Poly9: P9(x) = p1*x^9 + p2*x^8 + p3*x^7 + p4*x^6 + p5*x^5 + p6*x^4 + p7*x^3 + p8*x^2 + p9*x + p10 Coefficients (with 95% confidence bounds): p1 = -5.27 (-126.4, 115.9) p2 = 16.89 (-49.89, 83.66) p3 = 25.46 (-566.4, 617.3) p4 = -68.78 (-369.7, 232.2) p5 = -37.51 (-980.6, 905.5) p6 = 84.49 (-335.1, 504.1) p7 = 13.44 (-543.7, 570.6) p8 = -41.59 (-235.8, 152.7) p9 = -27.19 (-126.6, 72.25) p10 = 68.75 (47.42, 90.09)
plot(P9,nhat,D_n,'o')
As I said, the degree 9 polynomials is a bad thing to do, even to bad data. Yes. It fits the data points reasonably well, but that is all one can say.

댓글을 달려면 로그인하십시오.

 채택된 답변

Sam Chak
Sam Chak 2023년 7월 10일
If you are 100% sure that the data is the response of a damped harmonic oscillator, then you should fit it using the model of a damped harmonic oscillator, despite the possibility of the data being corrupted by noise. A sensible model of an underdamped harmonic oscillator is given by:
n = (0:11)';
D_n = [108, 88, 92, 76, 80, 68, 64, 56, 40, 32, 8, 8]';
% proposed damped harmonic oscillator model
modelfun = @(F,n) F(1).*exp(-F(2).*n).*(sin(F(2).*n) + cos(F(2).*n)) + F(3);
% initial values
beta0 = [108 0.1 1];
% fit nonlinear regression model
mdl = fitnlm(n, D_n, modelfun, beta0)
mdl =
Nonlinear regression model: y ~ F1*exp( - F2*n)*(sin(F2*n) + cos(F2*n)) + F3 Estimated Coefficients: Estimate SE tStat pValue ________ _______ ________ _______ F1 276.74 246.31 1.1236 0.29028 F2 0.067796 0.04068 1.6666 0.12995 F3 -180.81 248.57 -0.72738 0.48548 Number of observations: 12, Error degrees of freedom: 9 Root Mean Squared Error: 6.91 R-Squared: 0.962, Adjusted R-Squared 0.954 F-statistic vs. constant model: 115, p-value = 3.84e-07
% predict the outputs of the model
D_n_predicted = predict(mdl, n);
% compare between the data and the fitted model
plot(n, D_n, '.', n, D_n_predicted), grid on, xlabel('n'), ylabel('D_n')
legend('data', 'fitted model')
% predict the steady-state of the model
t = linspace(0, 120, 1201);
y = predict(mdl, t');
plot(t, y, 'linewidth', 1.5), grid on, xlabel('n'), ylabel('D_n')
title('Prediction of the steady-state')
Based on the model and the plot, it is predicted that the damped harmonic oscillator will settle at . If this is not the case, then you should obtain more data points, at least until the harmonic oscillator reaches steady-state.

댓글 수: 2

The original model posted was in terms of parameters A B C D E
You have
modelfun = @(F,n) F(1).*exp(-F(2).*n).*(sin(F(2).*n) + cos(F(2).*n)) + F(3);
which is A*exp(-B*n)*(sin(B*n) + cos(B*n)) + C -- only three parameters.
Sam Chak
Sam Chak 2023년 7월 10일
Isn't it true that we should simplify the problem sufficiently so that it becomes solvable and that the solution will provide insight into the original problem?
In other words, even though the two functions and may appear different, mathematically they produce the same outputs. By simplifying the problem, we can reduce the number of parameters to determine from five to just three, without sacrificing the behavior of the solution.
After all, @Jake mentioned the damped harmonic oscillator, and I provided the link that shows the proposed model for his reference. Additionally, I suggested obtaining more data points because it doesn't seem that the oscillator has truly reached the steady state (equilibrium) at 8, considering that the data is corrupted by noise.

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Jake
Jake 2023년 7월 14일

0 개 추천

Thank you all for your insightful answers. I feel embarrassed for having given such terrible data. This is my first time using MATLAB and I have already learned so much from all of you.
I have a more thorough explanation for what the data is being used for, but again, having been so far off in my attempt, I feel embarrassed of my own ignorance.
My primary goal was to understand if there was a MATLAB function that helped to identify a rule or pattern that governed a series of terms. Second, after reading your responses (which were extremely well crafted) I realized I had gone about this all wrong. And in asking the question I realized I had neither provided enough context, nor understood the implications of the data I was representing. I hope no one felt as if I wasted their time.
Thank you again for all of your responses.

카테고리

도움말 센터File Exchange에서 Chemistry에 대해 자세히 알아보기

제품

릴리스

R2023a

질문:

2023년 7월 9일

답변:

2023년 7월 14일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by