How do I compare two data sets of unequal length?

I have two sets of data, taken on different days, from the same sensor. The temperature was swept from 26C to -30C to 80C and back to 26C. The sensor was read periodically during the temperature sweep. The data sets consist of a temperature column, and another column representing the sensor readings. I would like to take a difference between the two sets of sensor readings, generating another data set having a column of temperatures, and a column of differences between the two original sets of sensor readings. If each data set had exactly the same vector of temperatures, I could just subtract one vector of sensor readings from the other. However, the temperature vectors do not contain exactly the same temperatures, and they don't even have the same number of elements. I would like to interpolate one set of temperatures and sensor readings to match the temperatures of the other, so I have two data sets of the same size, at the same temperatures. One complicating factor is that, due to sensor hysteresis with respect to temperature, the sensor readings are different on the downward temperature ramps from those on the upward temperature ramp. Therefore I can't sort the data on temperature, because that would mix the upward and downward ramps. If I could sort the data on temperature, I could use timeseries objects, with temperature in place of time. However, that won't work in this case.

8 Comments

A sample of your data would be helpful to visualize the problem.
DH on 8 Aug 2018
Ok - here are two files. These are greatly reduced from my actual data sets, but they give you the idea.
What is 'Sensory Data'?
Also, how are you going to pair the two temperature vectors? Are you pairing them by time-of-day? If so, where's the time data?
DH on 8 Aug 2018
The column header should be "Sensor Data" in both data sets. That is the column of data that I wish to compare at each temperature point. It doesn't really matter what it represents - it could be voltage, current, length, etc.
Consider the first few data points:
Data set 1:
Temperature, Sensor Data
26, 0.80
24, 0.82
22, 0.84
Data Set 2:
Temperature, Sensor Data
25, 0.78
23, 0.80
The output data set would interpolate data set 1 to the temperatures of data set 2:
Output Data Set:
Temperature, Sensor Data
25, 0.81
23, 0.83
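For a single monotonic stretch like these three points, the interpolation described above can be done with one `interp1` call. This is a minimal sketch using the numbers from the example; the variable names `T1`, `S1`, `T2`, `S2` are illustrative and do not come from the actual files.

```matlab
% Values copied from the small example above
T1 = [26 24 22];   S1 = [0.80 0.82 0.84];   % data set 1
T2 = [25 23];      S2 = [0.78 0.80];        % data set 2

% Linearly interpolate data set 1 onto the temperatures of data set 2
S1at2 = interp1(T1, S1, T2);

% Difference between the two sets of sensor readings at T2
df = S1at2 - S2;
```

Here `S1at2` comes out as 0.81 and 0.83, matching the output data set above.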
Adam Danz on 8 Aug 2018 (edited 8 Aug 2018)
I think I understand your problem now (sensorY was a typo). I'll think about it. In the meantime, here are @DH's data in case anyone else is thinking about this. The red and blue are the two data sets and you can see the lag in temperature and the sensor between the two data sets.
...this is a tough one. You can't use interp1() because the first input is required to be monotonic without duplicates which your data isn't. Even if you sort by temperature and store the sorted index values, you still have duplicates. What is the final goal here? I know you want to measure the difference between the sensors at the same temperature. But you have duplicate measures within (nearly) the same temperature. For example, your temperature data passes through 0 twice. Can you use the average of those 2 sensor measures for the temp=0 data point?
DH on 9 Aug 2018
It would not work to average the sensor data for two identical temperatures, because the identical temperatures may not be on the same ramp - one may be as the temperature is going up, and the other as the temperature is going down. Due to hysteresis, the sensor data on the upward ramp may be different from that on the downward ramp. I need to preserve the hysteresis. However, it would be acceptable to drop one of each pair of exact duplicates - something like this:
[uniquetemps, it, iu] = unique(data.Temperature, 'stable');
datareduced = data(it,:);
This would give me a dataset with no duplicate temperatures, and the order would be preserved. However, the temperatures are still not monotonic.
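A small check of what `unique(..., 'stable')` keeps may be useful here. The temperatures and readings below are made up for illustration. Because the first occurrence of each value is kept, a temperature that is revisited on a later ramp is the one that gets dropped.

```matlab
% Hypothetical sweep: down from 26 to -30, then back up to 26
T = [26; 0; -30; 0; 26];
S = [0.80; 1.06; 1.36; 1.10; 0.85];   % made-up readings with hysteresis

% 'stable' keeps the first occurrence of each temperature, in original order
[uniquetemps, it] = unique(T, 'stable');
Sreduced = S(it);
```

In this sketch `it` is 1, 2, 3, so the readings taken at 0 and 26 on the return ramp are discarded along with their duplicate temperatures.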
I think I see how to do it. Your mention of interp1 clued me in.
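% tstfldr is assumed to hold the path of the folder containing the CSV files
data1 = readtable(fullfile(tstfldr, 'dataset1.csv'));
data2 = readtable(fullfile(tstfldr, 'dataset2.csv'));
% Keep the first occurrence of each temperature, preserving the original order
[~, it] = unique(data1.Temperature, 'stable');
data1reduced = data1(it,:);
% Interpolate data set 1 onto the temperatures of data set 2
interpSnsr1 = interp1(data1reduced.Temperature, ...
    data1reduced.SensorData, data2.Temperature);
% Difference between the sensor readings at the temperatures of data set 2
df = interpSnsr1 - data2.SensorData;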
Thank you for your help.
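One caveat is worth checking before relying on this: `interp1` sees the reduced data as a single curve over temperature, with no notion of ramp direction, so the hysteresis between ramps may not survive. A possible alternative is to split each sweep at its turning points and interpolate ramp by ramp, as sketched below. The data values and the helper `rampBounds` are hypothetical. The sketch also assumes noise-free temperatures that change strictly between consecutive samples, and the same number of ramps in both sweeps.

```matlab
% Made-up sweeps: one downward ramp followed by one upward ramp
T1 = [20; 10; 0; 10; 20];   S1 = [2.0; 1.0; 0.0; 1.5; 3.0];
T2 = [15; 5; 0; 5; 15];     S2 = [1.4; 0.4; -0.1; 0.65; 2.15];

% Hypothetical helper: indices of the first sample, each turning point,
% and the last sample (a turning point is where the sweep direction flips)
rampBounds = @(T) [1; find(diff(sign(diff(T(:)))) ~= 0) + 1; numel(T)];

b1 = rampBounds(T1);
b2 = rampBounds(T2);
assert(numel(b1) == numel(b2), 'Both sweeps must have the same number of ramps');

S1at2 = nan(size(T2));
for k = 1:numel(b1) - 1
    i1 = b1(k):b1(k+1);   % samples of ramp k in set 1, turning points included
    i2 = b2(k):b2(k+1);   % samples of ramp k in set 2
    % Interpolate within ramp k only; queries outside the ramp's range give NaN
    S1at2(i2) = interp1(T1(i1), S1(i1), T2(i2));
end

% Difference of sensor readings, ramp by ramp, at the temperatures of set 2
df = S1at2 - S2;
```

With these made-up numbers `df` is 0.1 at every point of set 2, on both ramps. On real data, noise in the temperature readings could create spurious turning points, so some smoothing before the split may be needed.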


Accepted Answer

DH on 14 Aug 2018

Problem solved - see response from Adam Danz on 8 Aug 2018 at 20:10, and my response - DH on 9 Aug 2018 at 12:15.

More Answers (2)

Yuvaraj Venkataswamy on 8 Aug 2018

2 Comments

DH on 8 Aug 2018 (edited 8 Aug 2018)
Those methods find members of one array that are equal to members of another array, right? My two temperature arrays may not have any members that are exactly equal. I want to interpolate one to the other.
Yeah, this method won't work.
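The objection can be checked with the temperatures from the small example earlier in the thread. None of the temperatures in data set 2 occurs in data set 1, so an exact-match lookup selects nothing. The variable names are illustrative.

```matlab
% Temperatures from the small example earlier in the thread
T1 = [26 24 22];   % data set 1
T2 = [25 23];      % data set 2

% ismember only reports exact matches, and there are none here
tf = ismember(T2, T1);
```

Every element of `tf` is false, which is why interpolation is needed instead.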


Yuvaraj Venkataswamy on 8 Aug 2018

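% Flag the columns of dataset1 that also occur as exact columns of dataset2
id = ismember(dataset1', dataset2', 'rows');
% Indices of those matching columns
X = 1:size(dataset1, 2);
Y = X(id);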

Asked: DH on 8 Aug 2018

Commented: 14 Aug 2018
