wrong values in histogram plotting
Hello,
I'm trying to plot a histogram of an array. I have a CSV file with a list of double values, and I want to see how many elements have a value that is less than or equal to 10% of the maximum value, 20%, 30%, and so on. The histogram I get does not match my own count. When I check how many elements are less than or equal to 10% of the maximum element, I find 11173940 such elements. I counted them with the following code:
maxElement = max(array);
elementCount = sum(array < maxElement * 0.1);
When I plot the histogram, however, it shows fewer than 180 elements satisfying this condition. This is the code I used (I have many CSV files that I want to read and analyze in the same way, hence the loop over file names):
clear; clc;
dataDir = 'hist_res_rel';
fileList = dir(strcat(dataDir, '/*.csv'));
plotDir = 'plot_dir_rel';
for i = 1:numel(fileList)
    fileName = fileList(i).name;
    % The two characters before the '.csv' extension hold the epoch number
    epoch = fileName(length(fileName)-5:length(fileName)-4);
    if contains(fileName,'a_rel')
        plot_title = strcat('A Relative Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10));
    end
    if contains(fileName,'b_rel')
        plot_title = strcat('B Relative Value Change Between Epochs: ', epoch, '-', num2str(str2double(epoch)+10));
    end
    % fullfile inserts the path separator between the folder and the file name
    rel_val = readmatrix(fullfile(dataDir, fileName));
    rel_val = abs(rel_val);
    Max = max(rel_val);
    p = 0.1;
    x = zeros(10, 1);
    y = zeros(10, 1);
    % x(index) is 10%, 20%, ..., 100% of Max; y(index) is the manual count per interval
    for index = 1:10
        percentage = Max * p;
        x(index) = percentage;
        if index == 1
            y(index) = sum(rel_val <= x(index));
        else
            y(index) = sum(rel_val <= x(index) & rel_val > x(index-1));
        end
        p = p + 0.1;
    end
    f = histogram(rel_val, x);
    xticks(x);
    title(plot_title);
    xlabel('Percentage of Relative Change');
    ylabel('Amount of Parameters');
    xticklabels({'0', '10','20','30', '40', '50', '60', '70', '80', '90', '100'});
    saveas(f, strcat(plotDir, '/plot_', fileName(1:length(fileName)-3), '.jpg'));
end
This is the histogram that I get:
And this is the CSV file that I'm trying to analyze, just to make sure everything works (sorry, it's so large that I had to use an external site for the upload):
Thank you so much for your time and attention, I appreciate your help.
Accepted Answer
Ganesh
December 27, 2023
I understand that your histogram is inconsistent with your data. You can resolve this by adding 0 at the start of the variable "x".
The histogram counts the number of data points that fall between consecutive edges. Because your variable "x" begins at Max*0.1, the first bin spans Max*0.1 to Max*0.2, the second spans Max*0.2 to Max*0.3, and so on, so every element below Max*0.1 is left out of the plot. Adding 0 at the start makes the first bin span 0 to Max*0.1, which gives you the right result.
x = [0; x]; % Add this line before plotting the histogram
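To check the fix without the original CSV file, here is a minimal sketch that compares the bin counts against a manual count. The sample data, the fixed seed, and the names `edges`, `counts`, and `firstBin` are assumptions made for illustration and are not part of the original code. It uses `histcounts`, which applies the same binning rule as `histogram`, and builds the edges with `linspace` so that the last edge equals the maximum exactly (adding 0.1 repeatedly, as in the loop above, can end slightly below 1 because of floating-point rounding).

```matlab
% Hypothetical sample data standing in for the values read from the CSV file.
rng(0);                           % fixed seed so the run is repeatable
rel_val = abs(randn(1000, 1));

maxVal = max(rel_val);
edges  = linspace(0, maxVal, 11); % 11 edges: 0%, 10%, ..., 100% of the maximum

% Each bin includes its left edge and excludes its right edge,
% except the last bin, which includes both edges.
counts = histcounts(rel_val, edges);

% Manual count of the first bin, following the same rule.
firstBin = sum(rel_val >= edges(1) & rel_val < edges(2));

fprintf('First bin: histcounts = %d, manual = %d\n', counts(1), firstBin);
fprintf('Total counted: %d of %d\n', sum(counts), numel(rel_val));
```

With the leading 0 edge in place, the total over all bins should equal the number of elements. Note that the manual count in the question uses "greater than the lower edge, less than or equal to the upper edge", so values lying exactly on an edge can land in a different bin than they do in the histogram.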
Kindly refer to the following document for more information and examples on using the "histogram()" function:
Hope this helps!