Problem Solving Issue: Specific data in outermost for loop in nested loops being ignored in function that should track trajectory/existence of objects in coordinate system

Question

I have a set of data output by a program simulation that has point objects in space moving with time, with objects occasionally being created and destroyed. The data is organized as follows: for every pair of rows, the first row gives the time and the number of objects in that time, and the second row gives the (x, y) coordinates of each object. The x and y coordinates can take on values between 1 and 200, and the boundary conditions are periodic (this fact might not be relevant to the problem, but appears in the code). 
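
Since the boundaries are periodic, the distance between two points may be shorter across an edge than inside the box. A minimal sketch of that distance, assuming a period of 200 in both directions (the value of `periodic_data_limits` in the code further down); the helper names `wrap_delta` and `periodic_dist` are hypothetical and not part of the original functions:

```matlab
% Shortest separation along one periodic axis of length L.
% L = 200 is an assumption taken from periodic_data_limits in the post.
wrap_delta = @(a, b, L) min(abs(a - b), L - abs(a - b));

% Euclidean distance between points p and q under periodic boundaries.
periodic_dist = @(p, q, limits) hypot(wrap_delta(p(1), q(1), limits(1)), ...
                                      wrap_delta(p(2), q(2), limits(2)));

limits = [200, 200];                                          % [x, y]
d_wrapped = periodic_dist([199.9, 85.0], [1.1, 85.6], limits); % crosses the x edge
d_plain   = hypot(199.9 - 1.1, 85.0 - 85.6);                   % ignores periodicity
```

With the two points used as an example in the code comments, the wrapped distance is small while the plain distance is close to the full box width.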

  Do note that this will be a long read...

A sample from the original data is as follows:

  time #objects	NaN	NaN	NaN	NaN
  x1	y1	x2	y2	x3	y3
  1	3	NaN	NaN	NaN	NaN
  2.94	166.2	34.6	77.5	45.6	103.3
  2	20	NaN	NaN	NaN	NaN
  2.9	166.2	34.7	77.6	45.7	103.3
  3	3	NaN	NaN	NaN	NaN
  34.8	77.7	45.8	103.4	48.0	197.7
  4	18	NaN	NaN	NaN	NaN
  1.0	175.7	35.1	77.7	45.8	103.4
  5	3	NaN	NaN	NaN	NaN
  1.1	176.1	35.3	77.7	45.9	103.4

However, I reformatted the data into a form that I thought was easier to manipulate: I took the transpose so that I can get all the coordinates per time column:

  t1	t2	t3…
  x1	x1	x1…	 
  y1	y1	y1…
  x2	x2	x2…
  y2	y2	y2 …
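
As a rough illustration of this rearrangement, the sketch below splits a small numeric matrix in the raw layout into times, object counts, and coordinate columns. The matrix values and the variable names `raw`, `times`, `counts`, and `coords` are made up for the example:

```matlab
% Rows alternate between [time, count, NaN, ...] and [x1, y1, x2, y2, ...].
raw = [1 1 NaN NaN;
       1 2 NaN NaN;
       2 1 NaN NaN;
       1 2 NaN NaN;
       3 2 NaN NaN;
       1 2 100 2];

times  = raw(1:2:end, 1).';   % one entry per time step
counts = raw(1:2:end, 2).';   % number of objects at each time step
coords = raw(2:2:end, :).';   % column k holds [x1; y1; x2; y2; ...] at time k

% (x, y) of object j at time step k occupies rows 2j-1 and 2j of column k.
k = 3; j = 2;
xy = coords(2*j-1:2*j, k).';
```

Slots that hold no object at a given time stay NaN, so `isnan` can be used to tell empty slots from real coordinates.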

My goal is to find the objects that are both created AND destroyed within the given timeframe, and to record the distance each one travels before it dies. I decided to do this by tracing the trajectory of each object systematically, starting from the first time column. I loop through each coordinate pair in each time column, and this happens inside a bigger loop through all the time columns (hopefully, the code will clarify what I mean by this). For the (x, y) coordinate under consideration, I calculate the Euclidean distance to every coordinate pair in the next time column. Then I compare these distances and pick the pair that has traveled less than the maximum Euclidean distance I allow (in order to consider it as belonging to the same object). If such a coordinate pair exists, I flag those coordinates in a copy of the data matrix so that they cannot be considered as candidates for other objects.
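
The matching step described above could be sketched as follows for a single object and a single next time column. The threshold of 20 matches the one used in the code further down, the coordinates are taken from the sample at t=1 and t=2, and the logical mask `claimed` is an assumption of this sketch (the actual code flags matched coordinates with NaN in a copy of the data instead):

```matlab
max_step = 20;                                     % maximum accepted distance
current  = [34.6; 77.5];                           % [x; y] of the object being traced
next_col = [2.9; 166.2; 34.7; 77.6; 45.7; 103.3];  % [x1; y1; x2; y2; x3; y3]
claimed  = false(numel(next_col)/2, 1);            % pairs already assigned to an object

% Distance from the current position to every pair in the next column.
d = hypot(next_col(1:2:end) - current(1), next_col(2:2:end) - current(2));
d(claimed) = Inf;                                  % claimed pairs cannot match again
[d_min, k] = min(d);                               % k is a pair index, not a row index

if d_min <= max_step
    claimed(k) = true;
    matched = next_col(2*k-1:2*k);                 % pair k lives in rows 2k-1 and 2k
else
    matched = [];                                  % no candidate close enough
end
```
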
Now, the problem is that the simulation (that produces the data that I work with) also gives erroneous data in some time periods. So, for example, (x, y) for t=3 is clearly erroneous in the following data, and it is the same object that goes from t=1 to t=5:

  time t; coordinates (x, y)
  t=1; (1.1, 1)
  t=2; (1.2, 1)
  t=3; (100, 10)
  t=4; (1.4, 1)
  t=5; (1.4, 2)

So, I have to allow for these errors in the simulation when I track the objects. I do this by systematically comparing with Euclidean distances in the next three time columns instead of just the next one. So, if I cannot find a match for the object in the next time column, I check with the column after that. If that fails too, I check with the next. Failing that, I mark the object as dead and move to the next object.
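
A minimal sketch of this look-ahead, using the five-point example above with a single object per time column. The names `track`, `max_step`, and `max_ahead` are hypothetical, and the sketch always continues from the last accepted position, which is an assumption about the intended behavior rather than a description of the code below:

```matlab
% Column k holds [x; y] at time k; the value at t=3 is the erroneous one.
track = [1.1  1.2  100  1.4  1.4;
         1    1    10   1    2  ];
max_step  = 20;        % assumed threshold
max_ahead = 3;         % how many later columns may be searched

pos = track(:, 1);     % last accepted position
col = 1;               % column of the last accepted position
travelled = 0;

while col < size(track, 2)
    found = false;
    for ahead = 1:max_ahead
        c = col + ahead;
        if c > size(track, 2)
            break;                        % ran out of time columns
        end
        d = hypot(track(1, c) - pos(1), track(2, c) - pos(2));
        if d <= max_step
            travelled = travelled + d;
            pos = track(:, c);            % continue from the matched point
            col = c;
            found = true;
            break;
        end
    end
    if ~found
        break;                            % nothing close enough: treat as dead
    end
end
last_seen = col;
```

Here the point at t=3 is skipped and the object is followed through to t=5.
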
Now, the problem with my function is that it does not add any objects to the final data matrix after the first pass of the bigger loop through the time columns. So, for example, if I have the following testing data:

  
    t  #objects	NaN	NaN	NaN	NaN
    x1	y1	x2	y2	x3	y3
    1	1	NaN	NaN	NaN	NaN
    1	2	NaN	NaN	NaN	NaN
    2	1	NaN	NaN	NaN	NaN
    1	2	NaN	NaN	NaN	NaN
    3	2	NaN	NaN	NaN	NaN
    1	2	100	2	NaN	NaN
    4	1	NaN	NaN	NaN	NaN
    1	2	100.5	2	NaN	NaN
    5	2	NaN	NaN	NaN	NaN
    1	2	102	2	NaN	NaN
    6	2	NaN	NaN	NaN	NaN
    1	2	102.5	2	NaN	NaN
    7	2	NaN	NaN	NaN	NaN
    1	2	103	2	NaN	NaN
    8	1	NaN	NaN	NaN	NaN
    1	2	NaN	NaN	NaN	NaN
    9	1	NaN	NaN	NaN	NaN
    1	2	NaN	NaN	NaN	NaN
    10	1	NaN	NaN	NaN	NaN
    1	2	NaN	NaN	NaN	NaN

My function does flag the coordinates of the obvious object that lives on (it has the x, y coordinates (1, 2) all throughout). Yet, I do not want this object because it was not created after the time began. Nevertheless, my function misses the data that I do want to analyze: the object that was created/destroyed in between the time period given. It is born at t=3 and dies at t=7. I used the diary feature to see how exactly my function progresses. The loops do, for example, reach the coordinates (100, 2), and the function acknowledges that it exists (not isnan/empty) – but still, when it dies at t=7, it is not added to the final data matrix. 
My code (two functions) follows below. The first function is called by the second. The second function, even though rather long, is mostly a repetition of the same code for different cases (especially the else statements in the latter half).

  function [ euclidean_distances1, condition_true_indices1, … ] = euclidean_v2( periodic_data_limits, euclidean_at_least, current_time_column, next_time_column )
  % This function calculates the Euclidean distances travelled by all possible combinations of an object, given the data of the following time column.
      
      % Case 1 - just check the following time column
      euclidean_distances1 = bsxfun(@hypot, ...
          bsxfun(@minus, repmat(current_time_column(1), size(next_time_column(1:2:end, :))), next_time_column(1:2:end, :)), ...
          bsxfun(@minus, repmat(current_time_column(2), size(next_time_column(2:2:end, :))), next_time_column(2:2:end, :))); 
      condition_true_indices1 = find(bsxfun(@le, euclidean_distances1, euclidean_at_least));

      % I excluded the code from the following cases because they might not be relevant to the issue at hand, and because they are very similar to the line above. They serve only to account for the fact that the coordinates are periodic, and so there can be more possible Euclidean distances traveled.
      % Case 2 - x2 periodic limit
      % Case 3 - y2 periodic limit
      % Case 4 - x2, y2 periodic limits
  end

  function [final_data, average_stats] = track_objects(rawData)
   
      % The maximum distance that a coordinate in the 
          % next time column should have traveled in order to be considered to belong to
          % the same point.
      euclidean_at_least = 20; 
      
      % Coordinate data is periodic. So, a point that goes beyond the
          % thresholds below will continue from 0, 0. For example, coordinates that
          % are (199.9, 85.0) at t=2 and (1.1, 85.6) at t=3 should be considered
          % of the same point.
      periodic_data_limits = [200, 200]; % [x, y]
      
      % Preparing the data into a format that is good for this function.
      rawData_transposed = rawData';
      times_and_pairs = rawData_transposed(1:2, 1:2:end);
      number_of_times = times_and_pairs(1, end);
      number_of_pairs = times_and_pairs(2, :);
      rawData_rearranged = rawData_transposed(:, 2:2:end); % Get only columns with coordinates.
      
      % The following variable will be altered on the fly (by having any values 
          % that are identified to belong to a point NaN'ed). This is to ensure
          % that the same coordinate is not considered as belonging to two
          % different points.
      rawData_rearranged_copy = rawData_rearranged; 
      % Pre-setting some variables that will be manipulated in loops.
      final_data = [];
      euclidean_temp = 0;
      birth_time = 0;
      death_time = 0;
      id = 0;
      
      % Loop through the number of times (1 through 10, in this example).
          % This essentially loops through each time column r.
      for r=1:number_of_times
          
          % Loop through the number of pairs in given time column (1 through 2, for r=3, in this example).
              % This essentially loops through each row for the column number r.
          for q=1:number_of_pairs(r) 
              euclidean_temp = 0; % Set the distance traveled by the point so far as = 0.
              death_time = 0; % Set the time of point death as = 0. This is assumed to imply that the point is alive.
              
              % If the x-coordinate under consideration has already been
                  % flagged and is NaN, move on to next row in the same column r.
              if(isnan(rawData_rearranged_copy(2*q-1:2*q, r)))
                  continue;
              else 
                  % Either a new pair has been born, or an existing one is
                      % continuing to move.
                  id = id+1;
                  birth_time = r; 
                  for p=1:number_of_times
                      if(p+1 > number_of_times) % Check to see whether there is no next time column r.
                          break;
                      end
                      % Either there exists a pair that meets the maximum
                          % euclidean distance traveled acceptable directly, or
                          % the point has died. Before assuming that the point
                          % has indeed vanished, tests are done to account for
                          % the periodic limits (via function euclidean_v2).
                      [euclidean_distances1, condition_true_indices1, euclidean_distances2, condition_true_indices2, ...
                          euclidean_distances3, condition_true_indices3, euclidean_distances4, condition_true_indices4] = ...
                          euclidean_v2(periodic_data_limits, euclidean_at_least, rawData_rearranged(2*q-1:2*q, p), ...
                          rawData_rearranged_copy(:, p+1));
                      % If all cases do not have a match, do the same tests
                          % with the next column r against the original coordinates.
                      if(isempty(condition_true_indices1) && isempty(condition_true_indices2) && isempty(condition_true_indices3) && isempty(condition_true_indices4))
                          % Next time column...
                          if(p+2 > number_of_times) % Check to see whether there is no next time column r.
                              break;
                          end
                          % Similar tests...
                          [euclidean_distances1, condition_true_indices1, euclidean_distances2, condition_true_indices2, ...
                          euclidean_distances3, condition_true_indices3, euclidean_distances4, condition_true_indices4] = ...
                          euclidean_v2(periodic_data_limits, euclidean_at_least, rawData_rearranged(2*q-1:2*q, p), ...
                          rawData_rearranged_copy(:, p+2));
                          
                          if(isempty(condition_true_indices1) && isempty(condition_true_indices2) && isempty(condition_true_indices3) && isempty(condition_true_indices4))
                              % Next time column...
                              if(p+3 > number_of_times) 
                                  break;
                              end
                              % Similar tests...
                              [euclidean_distances1, condition_true_indices1, euclidean_distances2, condition_true_indices2, ...
                              euclidean_distances3, condition_true_indices3, euclidean_distances4, condition_true_indices4] = ...
                              euclidean_v2(periodic_data_limits, euclidean_at_least, rawData_rearranged(2*q-1:2*q, p), ...
                              rawData_rearranged_copy(:, p+3));
                              
                              % This is as much allowance as is allowed. The
                                  % point can now be marked dead.
                              if(isempty(condition_true_indices1) && isempty(condition_true_indices2) && isempty(condition_true_indices3) && isempty(condition_true_indices4))
                                  death_time = p+r-1; 
                                  
                                  % Add statistics to output matrix.
                                  % Points that were not created AND destroyed in the given
                                      % time period (and those that did not
                                      % travel) are not required.
                                  if((birth_time > 1) && (euclidean_temp > 0))
                                      final_data = [final_data; birth_time death_time euclidean_temp r q p id]; 
                                  end
                                  break;
                              else
                                  % The point does have a match in a
                                      % following column, and so corresponding
                                      % values in the next column of the copy matrix should be
                                      % flagged (I decided to use NaN to mark
                                      % them), so that another point cannot claim
                                      % this pair of coordinates as its own. 
                                  % The first line below gets the best match
                                      % for the most-likely coordinate, of the
                                      % four possible cases.
                                  [~, case_number] = min([min(euclidean_distances1) min(euclidean_distances2) min(euclidean_distances3) min(euclidean_distances4)]);
                                  euclidean_distances = eval(sprintf('euclidean_distances%d', case_number)); % Or do: if(case_number == 1)... for each case, 1-4.
                                  [minimum_euclidean_value, index_of_minimum_value] = min(euclidean_distances); 
                                  euclidean_temp = euclidean_temp + minimum_euclidean_value;
                                  rawData_rearranged_copy(index_of_minimum_value:index_of_minimum_value+1, (p+1):(p+3)) = NaN;
                                  continue;
                              end
                          else
                              [~, case_number] = min([min(euclidean_distances1) min(euclidean_distances2) min(euclidean_distances3) min(euclidean_distances4)]);
                              euclidean_distances = eval(sprintf('euclidean_distances%d', case_number)); 
                              [minimum_euclidean_value, index_of_minimum_value] = min(euclidean_distances); 
                              euclidean_temp = euclidean_temp + minimum_euclidean_value;
                              rawData_rearranged_copy(index_of_minimum_value:index_of_minimum_value+1, (p+1):(p+2)) = NaN;
                              continue;
                          end
                      else
                          [~, case_number] = min([min(euclidean_distances1) min(euclidean_distances2) min(euclidean_distances3) min(euclidean_distances4)]);
                          euclidean_distances = eval(sprintf('euclidean_distances%d', case_number)); 
                          [minimum_euclidean_value, index_of_minimum_value] = min(euclidean_distances); 
                          euclidean_temp = euclidean_temp + minimum_euclidean_value;
                          rawData_rearranged_copy(index_of_minimum_value:index_of_minimum_value+1, p+1) = NaN;
                          continue;
                      end
                  end
              end
          end
      end
      if(isempty(final_data))
          disp('No points were born and destroyed in the given data set.');
          average_stats = []; % Nothing to average; keeps the second output assigned.
      else
          delta_lifetime = bsxfun(@minus, final_data(:, 2), final_data(:, 1));
          average_stats = [mean(delta_lifetime) mean(final_data(:, 3)) (mean(final_data(:, 3))/mean(delta_lifetime))]; % Format: [mean_lifespan, mean_distance_travelled, mean_speed]
      end
  end



Does anyone see what my error is? Is there a conceptually better/easier approach that I can use to get what I want? I have been jabbing at this object-problem for four weeks now, and this is my fourth approach to trying to solve it. I understand that this is a very long read for a problem that may be very trivial – but I do appreciate your time!
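
One thing that may be worth checking, offered as a possible reading and not a confirmed diagnosis: in `track_objects` the inner loop runs `for p=1:number_of_times` and reads the current position from `rawData_rearranged(2*q-1:2*q, p)`, so for every object the first comparison starts at column 1, while `death_time = p+r-1` suggests that `p` was meant as an offset from the birth column `r`. For the object born at t=3 (row pair q=2), column 1 holds NaN in that row pair, so no distance can satisfy the threshold. The sketch below reproduces that comparison on the first four coordinate columns of the test data; the variable names other than `r`, `q`, and `p` are hypothetical, and the omitted periodic cases are assumed to behave the same way for NaN input:

```matlab
% Coordinate columns of the test data for t = 1..4, laid out like
% rawData_rearranged (rows: x1; y1; x2; y2).
coords = [1   1   1    1;
          2   2   2    2;
          NaN NaN 100  100.5;
          NaN NaN 2    2];
max_step = 20;     % euclidean_at_least in the post
r = 3; q = 2;      % the object born at t = 3 sits in row pair q = 2

% First comparison as the inner loop performs it, at p = 1.
p = 1;
current = coords(2*q-1:2*q, p);                      % NaN at t = 1
d_from_p1 = hypot(coords(1:2:end, p+1) - current(1), ...
                  coords(2:2:end, p+1) - current(2));
matches_from_p1 = find(d_from_p1 <= max_step);       % NaN <= 20 is false

% Same comparison started from the birth column instead.
current = coords(2*q-1:2*q, r);
d_from_r = hypot(coords(1:2:end, r+1) - current(1), ...
                 coords(2:2:end, r+1) - current(2));
matches_from_r = find(d_from_r <= max_step);         % pair 2 is within range
```

If this reading is right, the object would be marked dead on its first pass with `euclidean_temp` still 0, and the `euclidean_temp > 0` test would then filter it out. It may also be worth confirming that the index returned by `min` is converted from a pair index k to rows 2k-1 and 2k before it is used to flag rows in `rawData_rearranged_copy`.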
