Group data into bins with irregular sizes / Track peaks

Question

Andreas Nagl on 5 Jul 2017

Votes: 0

https://kr.mathworks.com/matlabcentral/answers/347531-group-data-into-bins-with-irregular-sizes-track-peaks

Edited: Andreas Nagl on 28 Jul 2017

Accepted Answer: John BG

Hello,

I need to group data (peaks for many signals, below) into bins with irregular sizes:

[pks,locs,w,p] = findpeaks(data)

I know discretize(), but my bins are not equally sized. The goal is to find peaks in a series of measurements and track how the peak positions shift, so the peak positions +/- a tolerance define my bins. In other words: run findpeaks() on the first dataset, take its peak positions +/- tolerance as bins, run findpeaks() on the next dataset, and sort the new peaks into those bins. Then repeat the whole process, with the latest dataset defining the next set of bins. I am dealing with lots of peaks (~100 per measurement, 60,000 altogether).

So what I want in the end:

Peak 1
   Pos 7.2 Height 5     Width 3
   Pos 7.3 Height 5.3   Width 2.8
   ...     ...          ...
Peak 2
   ...     ...          ...

Coming from the C# world, where this would be solved with lists and loops, I appreciate every hint, including keywords for further research and advice on which data structures to use.

Thanks!

2 Comments

Image Analyst on 6 Jul 2017

Don't make us imagine what your data looks like -- show us. Please plot it and upload a screenshot, and attach the data file with code to read it in.

Andreas Nagl on 11 Jul 2017

Edited: Andreas Nagl on 11 Jul 2017

Please find a graph of my data here:

I cannot attach a data file for various reasons, but I hope this picture helps.

Answer 1

John BG on 6 Jul 2017

Votes: 0

https://kr.mathworks.com/matlabcentral/answers/347531-group-data-into-bins-with-irregular-sizes-track-peaks#answer_273045

Edited: John BG on 6 Jul 2017

Hi Andreas

this is John BG (jgb2012@sky.com)

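While findpeaks typically returns one entry per peak, as it would for the signal in this starter example

x = linspace(0,1,1000);
Pos = [1 2 3 5 7 8]/10;      % peak positions
Hgt = [4 4 4 2 2 3];         % peak heights
Wdt = [2 6 3 3 4 6]/100;     % peak widths
Gauss = zeros(length(Pos),length(x));   % preallocate, one row per peak
for n = 1:length(Pos)
    Gauss(n,:) = Hgt(n)*exp(-((x - Pos(n))/Wdt(n)).^2);
end
PeakSig = sum(Gauss);        % summing the rows gives the combined signal
% plot(x,Gauss,'--',x,PeakSig)

your list has several entries for each peak, so the same peak number repeats within a group, something like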

Pos1=[1 1.2 1.7]/10
Hgt1=[3.9 4 4.1]
Wdt1=[2 2 2.1]/100
Pos2=[2]/10
Hgt2=4
Wdt2=6/100
Pos3=[2.9 3.01 3.02 3.25]/10
Hgt3=[4.1 4.2 3.8 4]
Wdt3=[3 3.1 3.2 2.9]/100
Pos4=[5 5.2]/10
Hgt4=[2 2.01]
Wdt4=[3 3.1]/100
Pos5=[7 6.8]/10
Hgt5=[1.9 2]
Wdt5=[3.9 4]/100
Pos6=[8]/10
Hgt6=3
Wdt6=6/100

According to your question, the data belonging to the same peak number is grouped together, here into cells

locs={Pos1 Pos2 Pos3 Pos4 Pos5}
w={Wdt1 Wdt2 Wdt3 Wdt4 Wdt5}
p={Hgt1 Hgt2 Hgt3 Hgt4 Hgt5}

Your data has sets of variable length for each locs{k}, w{k} and p{k}, whereas findpeaks does not repeat peak numbers: it reports peaks at different locations as separate peaks. (Note that findpeaks itself returns the peak heights in pks and the prominences in p; in this example p holds the heights and pks the peak numbers.)

For this example, your peak numbers would be

pks=[1:1:5]

With this data format, generating the table you ask for requires one peak number per entry, repeated according to the size of each group

L=[]
for k=1:1:size(locs,2)
  L=[L repmat(k,1,numel(locs{k}))];
end
L
L =
     1     1     1     2     3     3     3     3     4     4     5     5

To put the values into a table, the contents of the cells have to be converted to plain numeric vectors

locs2=cell2mat(locs)
w2=cell2mat(w)
p2=cell2mat(p)

Once the data has been prepared, the table is obtained with

T = table(L', locs2', p2', w2','VariableNames',{'PeakNo' 'Positions' 'Heights' 'Widths'})
T =
  12×4 table
    PeakNo    Positions    Heights    Widths
    ______    _________    _______    ______
      1          0.1         3.9        0.02
      1         0.12           4        0.02
      1         0.17         4.1       0.021
      2          0.2           4        0.06
      3         0.29         4.1        0.03
      3        0.301         4.2       0.031
      3        0.302         3.8       0.032
      3        0.325           4       0.029
      4          0.5           2        0.03
      4         0.52        2.01       0.031
      5          0.7         1.9       0.039
      5         0.68           2        0.04
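
On newer MATLAB releases the same bookkeeping can probably be written more compactly. The following is only a sketch, assuming repelem (R2015a or later) and splitapply (R2015b or later) are available. The three groups are copied from the example values above, and T3 and meanPos are illustrative names that are not part of the original answer.

```matlab
% First three peak groups from the example above (positions, heights, widths)
locs = {[1 1.2 1.7]/10, 2/10,  [2.9 3.01 3.02 3.25]/10};
p    = {[3.9 4 4.1],    4,     [4.1 4.2 3.8 4]};
w    = {[2 2 2.1]/100,  6/100, [3 3.1 3.2 2.9]/100};

% Repeat each group index once per entry in that group
PeakNo = repelem(1:numel(locs), cellfun(@numel, locs))';

% Flatten the cells and build the table in one step
T = table(PeakNo, [locs{:}]', [p{:}]', [w{:}]', ...
    'VariableNames', {'PeakNo' 'Positions' 'Heights' 'Widths'});

% All rows that belong to peak 3
T3 = T(T.PeakNo == 3, :);

% Mean position of each peak group
meanPos = splitapply(@mean, T.Positions, T.PeakNo);
```

Keeping everything in one tall table with a PeakNo column makes it easy to filter or summarise per peak, at the cost of having to rebuild the grouping column whenever new measurements are appended.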

if you find this answer useful would you please be so kind to consider marking my answer as Accepted Answer?

To any other reader, if you find this answer useful please consider clicking on the thumbs-up vote link

thanks in advance

John BG

jgb2012@sky.com

2 Comments

Andreas Nagl on 6 Jul 2017

Thank you very much for your answer.

Could you specify what you mean by the term numeral? Especially in connection with the findpeaks() method?

John BG on 23 Jul 2017

Hi Andreas

In your question one can read the format of your data

Peak 1
   Pos 7.2 Height 5     Width 3
   Pos 7.3 Height 5.3   Width 2.8
   ...     ...          ...
Peak 2
   ...     ...          ...

Note that each peak may have a variable number of positions.

However, findpeaks treats each position as a separate peak.

For that reason, in my answer I assign peak numbers ('numerals') that repeat accordingly.

The numerals are the indices of the peak groups, i.e. the index k into locs{k}, w{k} and p{k}.

I have noticed that my answer was initially accepted, probably by you, and now it's not accepted.

If you find that my answer solves your question, would you be so kind as to mark it as the accepted answer again?

Thanks in advance.

John BG (jgb2012@sky.com)


Answer 2

Star Strider on 5 Jul 2017

Votes: 1

https://kr.mathworks.com/matlabcentral/answers/347531-group-data-into-bins-with-irregular-sizes-track-peaks#answer_273028

Edited: Star Strider on 5 Jul 2017

Since you have a fixed number of peaks in your data (I assume they are always in the same order), I would store the peaks and other data in a cell array (or matrix) in a loop, for example:

% Preallocate one cell per data vector
pks = cell(1,N); locs = cell(1,N); w = cell(1,N); p = cell(1,N);
for k1 = 1:N
    [pks{k1},locs{k1},w{k1},p{k1}] = findpeaks(data{k1});
end

where N is the number of data vectors you have and data is a cell array holding them.

If you want to use discretize (or possibly histcounts), note that the bins do not have to be equal sizes. You can specify the edges of the bins in a vector in both functions, so the bins can be different widths.
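
As a small illustration of unequal bin widths, here is a sketch of how peak positions +/- a tolerance could be turned into an edge vector for discretize. The reference positions, the tolerance and the new positions are made-up values, and it assumes the tolerance windows do not overlap, so that the edges are strictly increasing.

```matlab
% Hypothetical reference peak positions and tolerance (assumed values)
refPos = [0.10 0.20 0.30 0.50 0.70];
tol    = 0.03;

% Interleave lower and upper limits: [c1-tol, c1+tol, c2-tol, c2+tol, ...]
% Odd-numbered bins are tolerance windows, even-numbered bins are the gaps
edges = reshape([refPos - tol; refPos + tol], 1, []);

% Peak positions found in the next measurement (made up)
newPos = [0.12 0.29 0.52 0.68 0.90];

% Bin index per position, NaN for values outside the first and last edge
bin = discretize(newPos, edges);

% Translate odd bin indices into peak numbers; everything else stays NaN
inWindow = mod(bin, 2) == 1;
peakNo = nan(size(newPos));
peakNo(inWindow) = (bin(inWindow) + 1) / 2;
% peakNo is [1 3 4 5 NaN]: the last value matches no reference peak
```

Positions that fall into a gap or outside all edges end up as NaN, which makes it easy to spot peaks that appeared, vanished, or moved further than the tolerance.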

EDIT —

The loop I posted will allow you to get all the information you described in your edit that you want.

For information on how to use cell arrays such as I use in my code example, see the documentation on Cell Arrays.

8 Comments (6 older comments not shown)

Andreas Nagl on 11 Jul 2017

I don't mean to be rude, but I ask for tracking peaks for a reason.

And there is a reason why I am not asking about pre-processing, noise and fft. That is all already done. So unless you're an expert in crystallography I'd appreciate it if we could stick to my question/problem instead of redefining my problem.

Star Strider on 11 Jul 2017

I have no idea what your peaks represent. I have no experience with crystallography.

If the peaks simply shift in amplitude but are always at the same positions with respect to your independent variable, then you can concatenate them in a matrix. That would allow you to track their amplitudes between experiments, and plot them.
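
A minimal sketch of that matrix approach, assuming every experiment yields the same number of peaks in the same order and that findpeaks returned row vectors. The heights below are made up, and A and dA are illustrative names.

```matlab
% Made-up peak heights from three experiments, one row vector each
pks = {[5.0 3.0 2.0], [5.3 3.1 1.9], [5.1 2.9 2.2]};

% Stack them: rows are experiments, columns are peaks
A = vertcat(pks{:});

% Change in amplitude of each peak between consecutive experiments
dA = diff(A, 1, 1);

% plot(A, '-o')   % one line per peak, x axis = experiment number
```

If the number of peaks differs between experiments, vertcat will error, which is a quick check on whether this simple layout fits the data at all.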

If the peak positions are never stable, so that the peaks shift both in amplitude and in position along your independent variable, I know of no reliable way to track them between experiments. Perhaps summing them over the same ranges of your independent variable in each experiment would then be appropriate. You could do that with the reshape function, if the size of your vectors and the ranges of the independent variable match the requirements reshape imposes.

Perhaps posting to a crystallography forum would provide you with participants with the necessary expertise and experience to give you the information you need.


Answer 3

Jan on 24 Jul 2017

Votes: 1

https://kr.mathworks.com/matlabcentral/answers/347531-group-data-into-bins-with-irregular-sizes-track-peaks#answer_275392

Edited: Jan on 24 Jul 2017

Omit the details about crystallography, or whether the numbers are peaks, because for MATLAB they are just numbers.

You have 60'000 sets of 100 numbers and want to find the clusters in them, i.e. the numbers that lie close together. The data sets need not contain all numbers. You want the position and width of the clusters. Correct? If so, how is "width" defined: standard deviation or maximum range? Can the clusters overlap? Or, in your terminology: can the positions of the peaks vary so much that a value could belong to either of two final peaks? Then the order of the numbers might matter.

This might be a job for kmeans. Join all numbers (peak positions) to a vector and determine the 100 clusters. But perhaps there are not exactly 100 clusters. Then this answer is not the solution, but perhaps it helps you to get in the right direction. Or I misunderstand your question.
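
A rough sketch of the kmeans idea, assuming the Statistics and Machine Learning Toolbox is installed. The positions and the cluster count k are made up for illustration, and the two "width" definitions at the end are just the two candidates mentioned above, not a recommendation.

```matlab
% Made-up peak positions pooled from several measurements (column vector)
allPos = [0.10 0.12 0.11 0.30 0.31 0.29 0.70 0.68 0.71]';
k = 3;                                   % assumed number of clusters

rng(0);                                  % fixed seed so repeated runs agree
[idx, C] = kmeans(allPos, k, 'Replicates', 5);

% kmeans numbers its clusters arbitrarily; relabel so cluster 1 is leftmost
[~, order] = sort(C);
newLabel = zeros(k, 1);
newLabel(order) = 1:k;
peakNo = newLabel(idx);

% Two possible definitions of "width" per cluster
clusterStd   = accumarray(peakNo, allPos, [k 1], @std);
clusterRange = accumarray(peakNo, allPos, [k 1], @(v) max(v) - min(v));
```

Note that kmeans ignores which measurement a value came from, so two peaks from the same measurement can land in one cluster if k is chosen too small.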

1 Comment

Star Strider on 24 Jul 2017

I would agree, however I was never able to determine if: (1) the particular values of the independent variable are invariant and that the ‘peak’ values at those values changed; (2) if the goal was to determine the total ‘energy’ (or whatever the dependent variable represents) in a given range of the independent variable; (3) the number of ‘peaks’ in a given range; (4) the pattern of the peaks in a range; or (5) something else.


Answer 4

Andreas Nagl on 28 Jul 2017

Votes: 0

https://kr.mathworks.com/matlabcentral/answers/347531-group-data-into-bins-with-irregular-sizes-track-peaks#answer_275913

Edited: Andreas Nagl on 28 Jul 2017

Dear all,

thank you very much for your contributions. What I ended up doing was a cell array of tables, one table for each peak. So by looping through the peaks I sorted them into the respective tables.

For a large number of peaks this can get rather slow (as I always look up the last position of a peak in its table to account for shifts), but for me it's good enough.
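
In case it helps others, the bookkeeping described above could look roughly like the sketch below. It is not the exact code used; the measurement values, the tolerance tol and the field names pos, hgt and wdt are made up for illustration. It keeps the last known position of every peak in a plain vector (lastPos) instead of reading it back from the tables, which may ease the slowdown mentioned above.

```matlab
% Made-up peak data for three consecutive measurements
meas = { ...
    struct('pos', [7.2 12.0], 'hgt', [5.0 3.0], 'wdt', [3.0 1.0]), ...
    struct('pos', [7.3 12.1], 'hgt', [5.3 3.1], 'wdt', [2.8 1.1]), ...
    struct('pos', [7.5 20.0], 'hgt', [5.1 2.0], 'wdt', [2.9 0.5])};

tol     = 0.25;   % assumed tolerance around the last known position
tracks  = {};     % cell array of tables, one table per tracked peak
lastPos = [];     % last known position of each tracked peak

for m = 1:numel(meas)
    for j = 1:numel(meas{m}.pos)
        newRow = table(meas{m}.pos(j), meas{m}.hgt(j), meas{m}.wdt(j), ...
            'VariableNames', {'Pos' 'Height' 'Width'});

        % Distance to the nearest already tracked peak (empty on first pass)
        [d, k] = min(abs(lastPos - meas{m}.pos(j)));

        if isempty(d) || d > tol
            % No tracked peak close enough: start a new track
            tracks{end+1}  = newRow;
            lastPos(end+1) = meas{m}.pos(j);
        else
            % Append to the nearest track and move its bin to the new position
            tracks{k}  = [tracks{k}; newRow];
            lastPos(k) = meas{m}.pos(j);
        end
    end
end
```

With these values the first track follows the peak from 7.2 to 7.3 to 7.5, and the value at 20.0 opens a third track. The sketch does not prevent two peaks of the same measurement from joining the same track, so the tolerance has to be smaller than the spacing between neighbouring peaks.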

So it's quite similar to John BG's solution (Thanks again!). I hope this helps!

However, I'd like to highlight Jan Simon's answer: even though it's not what I wanted initially, I will do that. It's a great idea, thanks!

Best, Andreas
