Split table from database into sub-tables dependent from timestamp difference

Hi there,
I'm importing measurement data from a PostgreSQL database into one Matlab table, called data:
%% Make connection to database
conn = database('PostgreSQL','NameOfDatabase','Password');
%Set query to execute on the database
query = ['SELECT * ' ...
'FROM postgres.public.data'];
%% Execute query and fetch results
data = fetch(conn,query);
the data table looks like this:
timestamp ID x y z
'2019-07-22 15:46:24.919' 41034 32.5437012000000 6.59714985000000 0
'2019-07-22 15:46:25.019' 41034 32.5466003000000 6.59952021000000 0
'2019-07-22 15:46:25.119' 41034 32.5657997000000 6.60097980000000 0
'2019-07-22 15:46:25.219' 41034 32.5840988000000 6.59719992000000 0
'2019-07-22 15:46:25.319' 41034 32.5975990000000 6.59470987000000 0
'2019-07-22 15:46:25.419' 41034 32.6068001000000 6.59368992000000 0
'2019-07-22 15:46:25.519' 41034 32.6091003000000 6.59696007000000 0
'2019-07-22 15:46:25.619' 41034 32.6016006000000 6.59864998000000 0
'2019-07-22 15:46:25.719' 41034 32.5957985000000 6.59865999000000 0
Now I want to create a new table for each measurement. I can only distinguish the next measurement from a time difference (of a few seconds) in the timestamp column. This is somehow what I'm trying:
format long
data.timestamp = categorical(data.timestamp);
Difference = caldiff(data.timestamp);
Tablename = ('Mess' + 'Counter');
Counter = 0;
for each Difference > 2 seconds
% create new table (Tablename)
% insertrows from start to difference row
Counter = Counter + 1;
end
Is caldiff the right funktion for the very small time differences?
Thanks

댓글 수: 1

"Now I want to create a new table for each measurement"
Why?
MATLAB tables are designed to make it easy to group data and apply operations on those groups:
In contrast, splitting up data like that usually just makes it harder to work with.
What is your actual goal? How do you wish to process your data?

댓글을 달려면 로그인하십시오.

 채택된 답변

Jon
Jon 2019년 7월 24일
편집: Jon 2019년 7월 24일
First I would recommend making a new column in your table with the time expressed as a MATLAB datetime array, for example you could use the following, (see https://www.mathworks.com/help/matlab/matlab_prog/set-display-format-of-date-and-time-arrays.html#buhb8v6-1 for datetime format conversion)
data.time = datetime(data.timestamp,'InputFormat','yyyy-MM-dd HH:mm:ss.SSS')
Then following up on Stephens comment, you really don't want to split this up into lots of small tables, just add the information you want to this one. So for example you could assign a column called measurement which assigns values 1,2,3 ... according to the time range that the measurements occur.
So for example
% calculate elapsed time in seconds from start of test
% (assumes you have already added the time column as a datetime array
et = seconds(data.time - data.time(1));
% make new column with measurement number, with a new measurement starting every 2 seconds
data.measurement = ceil(et/2)
Now if you want to retrieve some specific measurements, for example the x values for measurement number 3 can just use
x = data.x(data.measurement==3)

댓글 수: 14

Hi Jon, thanks for your help.
Stephen and you are right, I just need to add the measurement number and then work with groups.
So I added the time column:
% new time column
data.time = datetime(data.timestamp,'Inputformat','yyyy-MM-dd HH:mm:ss.SSSS');
% to display fractional seconds
data.time.Format = 'MMM dd, yyyy HH:mm:ss.SSS'
The elapsed time always refers to the first entry. What I try is to check the elapsed time between each row. So if I start measuring the next Minute or Hour or Day, the time brake is bigger than 2 seconds (or even bigger than a Minute or Hour or Day). So I better work with a diff funktion:
% elapsed time
et = seconds(diff(data.time));
For this part:
% new column with measurement number, with a new measurement starting AFTER a break of 2 seconds in time
data.measurement = ceil(et > 2);
I get an Error message:
To assign to or create a variable in a table, the number of rows must match the height of the table.
The elapsed time double has one row less than the table, maybe because of the diff function. Do I need to add here one starting row in the beginning?
OK, for what you want to do you need to count the jumps that are greater than your threshold (e.g. 2 sec).
So you can do something like
% determine where jumps (new measurements start)
% note we assume a new measurement at the start of the data set
isJump = [true; diff(et)>2]
% assign measurement numbers by counting jumps
data.measurement = cumsum(isJump)
Note that I use the "trick" here that the elements of logical arrays, like isJump above, are zeros and ones and MATLAB lets you add them up as numbers
I get the same error:
To assign to or create a variable in a table, the number of rows must match the height of the table.
My 'et' and 'isJump' is one row short?
The length of the diff of a vector is one less than the length of the original vector. So if you just use diff(et) you will have a problem.
Are you sure you padded your isJump with an initial value of true as I did?
isJump = [true; diff(et)>2]
Yes, I used exactly your code. Maybe I better not use diff?
Can you please attach your code, maybe I can see what is going on.
clear all;
%% Make connection to database
conn = database('PostgreSQL','name','password');
%Set query to execute on the database
query = ['SELECT * ' ...
'FROM postgres.public.data'];
%% Execute query and fetch results
data = fetch(conn,query);
%% Try
% New time
data.time = datetime(data.timestamp,'Inputformat','yyyy-MM-dd HH:mm:ss.SSSS');
data.time.Format = 'MMM dd, yyyy HH:mm:ss.SSS';
% calculate elapsed time in seconds from start of test
et = seconds(diff(data.time));
% make new column with measurement number, with a new measurement starting every 2 seconds
isJump = [true; diff(et)>2];
data.measurement = cumsum(isJump);
%% Close connection to database
close(conn)
%% Clear variables
clear conn query
Sorry, there was a little confusion on the definition of the variable et as the discussion evolved. Originally I had it as the duration from the very start of the test (first row) It then changed to the duration between rows.
As a result of this confusion you calculated diff twice, so just
change the line
isJump = [true; diff(et)>2];
to
isJump = [true; et>2];
and it should work
Excellent! Glad it worked out.
You may just not have gotten to it, but if this worked for you it would be good to accept the answer so other people who may have the same problem will know to look here.
Hi, I have a ongoing question:
I grouped the data in the column data.measurement by its value (id = 1,2,3,4,5). Now I try to plot only the x,y data (from column 3,4) with id == 5 ?
[group, id] = findgroups(data.measurement);
x= data{:,3};
y= data{:,4};
plot(x,y) with id==5
I don't think it is necessary to use the findgroups function for this purpose. Assuming you now have your data in a table called data, with columns named time, x, y, and measurement. You can simply use
plot(data.x(data.measurement==5),data.y(data.measurement==5))
or perhaps to make the code a little more readable and efficient (avoid finding the matching columns twice) you could use
% make logical vector whose elements are set to true on rows with the desired
% measurement number
idl = data.measurement==5;
% plot the x and y values corresponding to the desired measurement using logical indexing
plot(data.x(idl),data.y(idl))

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

도움말 센터File Exchange에서 Data Type Identification에 대해 자세히 알아보기

제품

릴리스

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by