How to know the distribution of my data

MAHMOUD ALZIOUD on 24 Sep 2018
Commented: MAHMOUD ALZIOUD on 26 Sep 2018
Dear MATLAB Community, I have attached an Excel file with some data. The data represent the percentage of loads in each load bin, together with their histogram. My question is: how can I find out, using MATLAB, which distribution my data follow? Is it normal, exponential, or something else? And after that, how do I find the parameters of the distribution? Thanks a lot.
  8 Comments
dpb on 25 Sep 2018
Your institution may have access to more; ask your advisor for what is available for your use on university machines.
MAHMOUD ALZIOUD on 25 Sep 2018
Thank you for your help.

Accepted Answer

Image Analyst on 26 Sep 2018
When I fit the data to the sum of three Gaussians, the fit looks pretty reasonable. What do you think? And why do you need analytical equation(s) for the distribution rather than just using the ACTUAL distribution obtained from the histogram?
% Uses fitnlm() to fit a non-linear model (sum of three Gaussians with an offset) through noisy data.
% Requires the Statistics and Machine Learning Toolbox, which is where fitnlm() is contained.
% Initialization steps.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
% % Create 4000 X coordinates evenly spaced from 0 to 40000.
% X = linspace(0, 40000, 4000);
% mu1 = 6000; % Mean, center of Gaussian.
% sigma1 = 2000; % Standard deviation.
% mu2 = 13000; % Mean, center of Gaussian.
% sigma2 = 2500; % Standard deviation.
%
% % Define function that the X values obey.
% a = 0 % Arbitrary sample values I picked.
% b = 3
% c = 18
% Y = a + b * exp(-((X - mu1)/sigma1) .^ 2) + ...
% c * exp(-((X - mu2)/sigma2) .^ 2); % Get a vector. No noise in this Y yet.
% X=X';
% Y=Y';
data = xlsread('matlab.xlsx');
X = data(:, 1);
Y = data(:, 2);
% Now we have noisy training data that we can send to fitnlm().
% Plot the noisy initial data.
plot(X, Y, 'b.', 'LineWidth', 2, 'MarkerSize', 15);
grid on;
drawnow;
% Convert X and Y into a table, which is the form fitnlm() likes the input data to be in.
tbl = table(X, Y);
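% Define the model as a constant offset plus three Gaussians:
% Y = b1 + b2*exp(-((x-b3)/b4)^2) + b5*exp(-((x-b6)/b7)^2) + b8*exp(-((x-b9)/b10)^2)
% The "x" passed to modelfun holds only the predictor: x(:, 1) is X, the first column of the table.
% fitnlm() takes the response Y from the last column of the table.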
modelfun = @(b,x) b(1) + b(2) * exp(-((x(:, 1) - b(3))/b(4)).^2) + ...
b(5) * exp(-((x(:, 1) - b(6))/b(7)).^2) + ...
b(8) * exp(-((x(:, 1) - b(9))/b(10)).^2);
beta0 = [0, 2, 6000, 2000, 18, 13000, 2000, 2, 14000, 9000]; % Guess values to start with. Just make your best guess.
% Now the next line is where the actual model computation is done.
mdl = fitnlm(tbl, modelfun, beta0);
% Now the model creation is done and the coefficients have been determined.
% YAY!!!!
% Extract the coefficient values from the model object.
% The actual coefficients are in the "Estimate" column of the "Coefficients" table that's part of the model.
coefficients = mdl.Coefficients{:, 'Estimate'}
% Let's do a fit, but let's get more points on the fit, beyond just the widely spaced training points,
% so that we'll get a much smoother curve.
X = linspace(min(X), max(X), 1920); % Let's use 1920 points, which will fit across an HDTV screen about one sample per pixel.
% Create smoothed/regressed data using the model:
yFitted = coefficients(1) + coefficients(2) * exp(-((X - coefficients(3))/ coefficients(4)) .^2) + ...
coefficients(5) * exp(-((X - coefficients(6))/ coefficients(7)) .^2) + ...
coefficients(8) * exp(-((X - coefficients(9))/ coefficients(10)) .^2);
% Now we're done and we can plot the smooth model as a red line going through the noisy blue markers.
hold on;
plot(X, yFitted, 'r-', 'LineWidth', 2);
grid on;
title('Sum of Three Gaussians Fit with fitnlm()', 'FontSize', fontSize);
xlabel('X', 'FontSize', fontSize);
ylabel('Y', 'FontSize', fontSize);
legendHandle = legend('Noisy Y', 'Fitted Y', 'Location', 'northeast');
legendHandle.FontSize = 25;
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
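Once fitnlm() has returned, the model object can evaluate the fitted curve and report fit quality itself, which avoids retyping the formula by hand. The following is a minimal sketch of that idea, not part of the original answer. It uses a single Gaussian and synthetic data as a stand-in for matlab.xlsx, so the data values and starting guesses are assumptions made for illustration.

```matlab
% Synthetic stand-in for the spreadsheet data (illustrative values only).
rng(0);                                   % Make the noise reproducible.
X = linspace(0, 40000, 80)';
Y = 1 + 18 * exp(-((X - 13000) / 2500) .^ 2) + 0.2 * randn(size(X));
tbl = table(X, Y);

% Offset plus one Gaussian, in the same form as the model above.
modelfun = @(b, x) b(1) + b(2) * exp(-((x(:, 1) - b(3)) / b(4)) .^ 2);
mdl = fitnlm(tbl, modelfun, [0, 15, 12000, 2000]);

% Evaluate the fitted model on a fine grid with predict().
xFine = linspace(min(X), max(X), 1920)';
yFitted = predict(mdl, table(xFine, 'VariableNames', {'X'}));

% Fit quality reported by the model object.
rSquared = mdl.Rsquared.Ordinary;
rmse = mdl.RMSE;
fprintf('R^2 = %.4f, RMSE = %.4f\n', rSquared, rmse);
```

A low R-squared or a large RMSE relative to the data range would suggest that the chosen number of Gaussians or the starting guesses need another look.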
  1 Comment
MAHMOUD ALZIOUD on 26 Sep 2018
This is genius and beautiful. Thank you very, very much for your amazing help.

More Answers (2)

dpb on 25 Sep 2018
Plotting the data, it definitely is not normal; it has a long right-hand tail and isn't symmetric.
For hypothesis testing, it would be better to go back to the underlying data from which the histogram was made, if you have it.
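As a rough illustration of working from the raw observations rather than the histogram, the sketch below tests for normality with lillietest() and then compares a few candidate distributions fitted with fitdist(). It assumes the Statistics and Machine Learning Toolbox; the variable rawLoads and its synthetic right-skewed values are hypothetical stand-ins for the real raw data, and the candidate list is only an example.

```matlab
% Hypothetical raw observations (synthetic, right-skewed) standing in for the real data.
rng(1);
rawLoads = lognrnd(9, 0.5, 5000, 1);

% Lilliefors test: rejectNormal = 1 means normality is rejected at the 5% level.
[rejectNormal, pValue] = lillietest(rawLoads);

% Fit several candidate distributions and compare them by AIC (lower is better).
candidates = {'Normal', 'Lognormal', 'Gamma', 'Weibull'};
aic = zeros(size(candidates));
for k = 1:numel(candidates)
    pd = fitdist(rawLoads, candidates{k});
    aic(k) = 2 * pd.NumParameters + 2 * negloglik(pd);
end
[~, best] = min(aic);

% Refit the best candidate; its estimated parameters are in ParameterValues.
bestFit = fitdist(rawLoads, candidates{best});
fprintf('Normality rejected: %d (p = %.3g). Lowest AIC: %s\n', ...
    rejectNormal, pValue, candidates{best});
disp(bestFit.ParameterValues);
```

AIC only ranks the candidates that were tried; a histogram or probability plot of the best fit over the data is still worth checking by eye.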
  4 Comments
MAHMOUD ALZIOUD on 25 Sep 2018
Actually, when I went back to the original data (55,000 rows), I found that it is normal.
dpb on 25 Sep 2018
By what measure? As Image Analyst says, it looks bimodal (if not trimodal; that's a rather suspicious hump on the left-hand side of the central lobe), and the right-hand tail is definitely not consistent with a Gaussian.
If the raw data look markedly different, that would be surprising.
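One possible measure for the unimodal-versus-bimodal question is to fit Gaussian mixtures with one, two, and three components to the raw observations and compare their BIC values. This is a sketch under assumptions, not something proposed in the thread: rawLoads is a hypothetical variable filled with synthetic two-peak data, and fitgmdist() requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical raw observations: a synthetic mixture of two normal populations.
rng(2);
rawLoads = [6000 + 2000 * randn(2000, 1); 13000 + 2500 * randn(6000, 1)];

% Fit Gaussian mixtures with 1, 2, and 3 components and record the BIC of each.
bic = zeros(1, 3);
for k = 1:3
    gm = fitgmdist(rawLoads, k, 'Replicates', 5, ...
        'Options', statset('MaxIter', 1000));
    bic(k) = gm.BIC;
end

% The component count with the lowest BIC is the one the data support best.
[~, bestK] = min(bic);
fprintf('BIC for k = 1, 2, 3: %.1f  %.1f  %.1f  (lowest at k = %d)\n', bic, bestK);
```

If a single component gave the lowest BIC, that would support treating the raw data as one Gaussian; a clearly lower BIC for two or three components would point to the multimodal shape seen in the histogram.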


Image Analyst on 25 Sep 2018
Since your data didn't look like one Gaussian to me, I fit it to the sum of two Gaussians with the attached m-file. I got this:
[Figure: the two-Gaussian fit plotted over the data]
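The attached m-file is not shown here. The sketch below shows what a two-Gaussian fit of this kind could look like, modeled on the three-Gaussian code in the accepted answer. It is not the attached file: the synthetic data (used in place of matlab.xlsx) and the starting guesses in beta0 are assumptions.

```matlab
% Synthetic stand-in for the spreadsheet data (illustrative values only).
rng(3);
X = linspace(0, 40000, 80)';
Y = 3 * exp(-((X - 6000) / 2000) .^ 2) + ...
    18 * exp(-((X - 13000) / 2500) .^ 2) + 0.2 * randn(size(X));
tbl = table(X, Y);

% Model: constant offset b(1) plus two Gaussians.
% b(2), b(5) are amplitudes; b(3), b(6) are centers; b(4), b(7) are widths.
modelfun = @(b, x) b(1) + b(2) * exp(-((x(:, 1) - b(3)) / b(4)) .^ 2) + ...
    b(5) * exp(-((x(:, 1) - b(6)) / b(7)) .^ 2);

% Starting guesses read off a plot of the data.
beta0 = [0, 2, 5000, 1500, 15, 12000, 2000];
mdl = fitnlm(tbl, modelfun, beta0);
coefficients = mdl.Coefficients.Estimate;

% Plot the data and the fitted curve.
plot(X, Y, 'b.', 'MarkerSize', 15);
hold on;
plot(X, predict(mdl, tbl), 'r-', 'LineWidth', 2);
grid on;
legend('Data', 'Two-Gaussian fit', 'Location', 'northeast');
```

Because each width enters the model squared, fitnlm() may return a negative width; only its magnitude matters.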
