Attempting to find patterns within my data

Sam Mahdi on 2 Aug 2019
Commented: Sam Mahdi on 7 Aug 2019
Hello everyone,
I have an idea I'd like to implement, but I don't quite know how to.
I have created a script that produces a 154-by-m matrix (the more data points I add, the more columns are created). As it stands, I just have a long list of numbers, and it would be impossible to interpret this data once I add more data points (giving a 154x100 matrix), so I want to write a program that can analyze the data for me.
It might just be easier for me to demonstrate what I want to do:
A =
8 1 2
2 4 4
5 2 1
6 1 3
1 1 1
Assume I have a 5x3 matrix. What I want to do is find a diagonal pattern that runs through my matrix at values below 2. In this example, if we scour each column and its elements, we can easily find the diagonal line that passes through the values below 2 in each column (I have zeroed out all the other values to demonstrate what I mean):
A =
0 0 0
0 0 0
0 0 1
0 1 0
1 0 0
Now I don't actually want to zero out my actual data (since multiple diagonal lines may exist), but I hope it's clear what I'm trying to do. In the example above, I have found the pattern I was looking for. In a 5x3 matrix you can easily see it by eye, but with a 154x100 matrix this becomes impossible to visualize.
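To make the idea concrete, here is a minimal sketch of the thresholding step on the 5x3 example. The names `threshold` and `mask` are hypothetical and not part of the original script. Note that the mask marks every value below 2, not only those on the diagonal line, so it is a starting point for the search rather than the final answer.

```matlab
% Example matrix from the question
A = [8 1 2
     2 4 4
     5 2 1
     6 1 3
     1 1 1];

threshold = 2;          % hypothetical name for the cutoff used in the question
mask = A < threshold;   % logical matrix, true where the value is below the cutoff
disp(mask)

% For a large matrix, an image is easier to scan than printed numbers
imagesc(mask);
colormap(gray);
axis image;
xlabel('data point (column)');
ylabel('prediction (row)');
```

With a 154x100 matrix the same three plotting lines would show candidate diagonal lines as streaks of bright cells.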
If it helps, this is the script I am currently using to obtain my data:
predictions = load('Predictions2.txt');
experimental = load('Experimental.txt');
x = predictions(:,1);
err = predictions(:,2); % named err rather than error, to avoid shadowing the built-in error function
y = experimental(:,1);
z = zeros(1,6);
sizeval = 3; % in this example I am using 3 data points, so I will have 3 columns in my final matrix
b = zeros(sizeval,154);
d = (1:154); % this is simply for plotting purposes and is not used in any calculations
e = zeros(1,6);
for n = 1:154 % there are 154 predictions, so I am determining the RMSD of 1 data point (using 6 different parameters) against each prediction
    for j = 1:sizeval % each data point has 6 parameters, here I am creating the loop to calculate RMSDs for multiple data points
        for i = 1:6 % I am taking the RMSD between the prediction and experimental values
            xindex = i + 6*(n-1);
            yindex = i + 6*(j-1);
            z(i) = (x(xindex) - y(yindex))^2;
            e(i) = z(i) / err(xindex)^2;
            if e(i) > 1000
                e(i) = 0;
            end
        end
        b(j,n) = sqrt((1/5)*sum(e)); % computed once all 6 terms for this pair are filled in
    end
end
b' % this is the output of my data, a 154xm matrix (m being the number of data points)
With an output like this:
ans =
3.9481 5.3775 5.1606
4.4432 3.6738 3.7466
2.7247 6.6981 6.7029
5.4045 4.2693 3.9113
1.3158 10.7013 10.4940
7.9002 6.2291 5.8123
2.2395 10.3191 10.1340
2.6847 9.3292 9.2099
7.5437 7.5024 7.2936
5.8558 8.5550 8.3015
1.6878 11.2286 11.0484
6.7887 8.6833 8.4203
12.6863 1.7771 0.9488
13.4256 4.2317 3.4892
2.3376 8.3851 8.2385
5.0820 5.3472 5.0439
10.3929 1.7875 1.3311
4.1463 3.4607 2.2643
6.0488 5.8100 5.6339
...
5 Comments
Image Analyst on 2 Aug 2019
It's easy to get the 1's in A by doing:
[rows, columns] = find(A);
If you want each separate, contiguous grouping of 1's in A, then you can use bwlabel() and/or regionprops(), depending on exactly what you want. Attach your larger matrix as a text file if you want an example.
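A minimal sketch of that suggestion on the small example from the question, assuming the Image Processing Toolbox is installed. The cutoff of 2 and the variable names are illustrative assumptions. 8-connectivity is chosen so that cells touching only at a corner, as on a diagonal line, count as one group.

```matlab
% Example matrix from the question, thresholded at the cutoff of 2
A = [8 1 2
     2 4 4
     5 2 1
     6 1 3
     1 1 1];
mask = A < 2;

% Label connected groups of true cells; 8-connectivity joins diagonal neighbours
[labels, numGroups] = bwlabel(mask, 8);

% Size and member cells of each group
stats = regionprops(labels, 'Area', 'PixelIdxList');
for k = 1:numGroups
    [r, c] = ind2sub(size(mask), stats(k).PixelIdxList);
    fprintf('Group %d: %d cells, rows %s, columns %s\n', ...
        k, stats(k).Area, mat2str(r'), mat2str(c'));
end
```

On this matrix there are two groups: the isolated 1 in the first row, and a five-cell group that contains the diagonal line together with the rest of the last row. Labeling alone therefore does not isolate a clean diagonal; it only narrows down where to look.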
Image Analyst on 3 Aug 2019
You can certainly threshold
A = b < someValue; % Produces a logical matrix. Or use > someValue.
Then you can skeletonize the lines/regions down to single pixel wide lines with bwmorph()
A = bwmorph(A, 'skel', inf);
imshow(A);


Accepted Answer

the cyclist on 2 Aug 2019
% The original data
A = [
8 1 2
2 4 4
5 2 1
6 1 3
1 1 1];
% Get the dimensions of A
[m,n] = size(A);
% Initialize the pattern matrix as all false. Will fill in valid
% antidiagonals as true.
pattern = false(m,n);
% Find the vector of linear indices that span the first possible
% antidiagonal
dvec = n : m-1 : n + (n-1)*(m-1);
% Work down all antidiagonals, and fill in "true" if the pattern is
% matched, updating the linear indices as we go.
for ni = n : m
    pattern(dvec) = all(A(dvec)<2);
    dvec = dvec + 1;
end
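As a usage sketch, the loop above can be run on the example matrix and the matched cells listed with find(). The names `rowIdx` and `colIdx` are illustrative and not part of the original answer.

```matlab
A = [8 1 2
     2 4 4
     5 2 1
     6 1 3
     1 1 1];
[m, n] = size(A);

% Same antidiagonal scan as in the answer above
pattern = false(m, n);
dvec = n : m-1 : n + (n-1)*(m-1);   % linear indices of the first full antidiagonal
for ni = n : m
    pattern(dvec) = all(A(dvec) < 2);
    dvec = dvec + 1;                % shift the antidiagonal down by one row
end

% List the (row, column) positions that belong to a matched antidiagonal
[rowIdx, colIdx] = find(pattern);
disp([rowIdx, colIdx])
```

For this matrix the only match is the antidiagonal running from row 5, column 1 up to row 3, column 3, which is the line shown in the question.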
19 Comments
the cyclist on 7 Aug 2019
What Guillaume said is all true.
"How do I learn a programming language (or programming in general) really well?" is a huge topic. In the case of MATLAB there are very good beginner-level materials out there, e.g. the MATLAB Onramp.
Things that I think help a person come up to speed more quickly:
  • Having real-world problems that one is trying to solve. In my experience, nothing motivates one to learn more than the need for a solution.
  • Trying to understand the core concepts of the language. For example, understanding the power of vectorization is key to using MATLAB well.
  • Not just blindly copying & pasting code (from here, Stack Overflow, etc), but instead trying to really understand what the algorithms are doing. [You seem to be trying that!] Remembering those techniques, for next time, helps you build up that "bag of tricks" for similar problems.
  • Really really trying hard to solve problems yourself before asking for help. In my experience, I remember better when I figured it out for myself. (There is of course a balance here, between the value of figuring it out, and the frustration of pounding your head against a wall.)
In the end, it really is the experience of doing, over and over again, that builds that expertise.
Sam Mahdi on 7 Aug 2019
To Guillaume:
No, sorry, I was trying to understand the cyclist's code first before I moved on to yours.
But thank you guys for your help and feedback. I'm currently in a Machine learning class that uses Matlab, so sorta learning linear algebra and all the things you can do with matrices and vectors/arrays as I go, as well as trying to apply it to what I'm doing (like my job above).


More Answers (1)

Guillaume on 2 Aug 2019
Edited: Guillaume on 2 Aug 2019
%demo data:
A = logical([
0 1 0
0 0 1
1 0 1
0 1 0
1 0 1
0 1 0
0 1 1])
Finding the start index (in the first column) of diagonals:
indices = hankel(1:size(A, 1)+1-size(A, 2), size(A, 1)+1-size(A, 2):size(A, 1)) + (0:size(A, 2)-1) * size(A, 1);
isdiago = all(A(indices), 2);
diag_idx = indices(isdiago)
Finding the start index in the first column of antidiagonals:
indices = toeplitz(size(A, 2):size(A, 1), size(A, 2):-1:1) + (0:size(A, 2)-1) * size(A, 1)
isantidiag = all(A(indices), 2);
antidiag_idx = indices(isantidiag)
If you want the indices in all the columns, just use repmat to replicate isdiago (or isantidiag) across all columns of the respective indices matrix.
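A short sketch of that last step for the diagonal case, reusing the demo data. The names `allDiagIdx`, `diagRows` and `diagMask` are hypothetical. The index construction relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Demo data copied from the answer above
A = logical([
    0 1 0
    0 0 1
    1 0 1
    0 1 0
    1 0 1
    0 1 0
    0 1 1]);
[m, n] = size(A);

% Linear indices of every full-length diagonal, one diagonal per row
indices = hankel(1:m+1-n, m+1-n:m) + (0:n-1) * m;
isdiago = all(A(indices), 2);

% Option 1: replicate the row mask across all columns, as suggested
allDiagIdx = indices(repmat(isdiago, 1, n));

% Option 2: plain row indexing keeps one matched diagonal per row
diagRows = indices(isdiago, :);

% Mark the matched cells in a logical matrix for inspection
diagMask = false(m, n);
diagMask(allDiagIdx) = true;
disp(diagMask)
```

Option 1 returns the indices as a single column in column-major order, while option 2 keeps them grouped by diagonal, which may be easier to work with when several diagonals match. The antidiagonal case works the same way with the toeplitz-based indices and isantidiag.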
