A very fast way to sort datetime (in "ascend" mode)?
Here is an example:
a = datetime([
'2022-10-27 00:22:50.000'
'2022-10-27 05:29:45.000'
'2022-10-27 05:32:19.400'
'2022-10-27 05:36:44.100'
'2022-10-27 05:39:26.600'
'2022-10-27 05:43:18.200'
'2022-10-27 05:49:17.400'
'2022-10-27 05:55:27.300'
'2022-10-27 05:58:07.000'
'2022-10-27 06:17:13.800'
'2022-10-27 06:41:28.700'
'2022-10-27 07:03:06.000'
'2022-10-27 07:09:49.800'
'2022-10-27 07:17:39.700'
'2022-10-27 07:35:09.000'
'2022-10-27 07:42:33.600'
'2022-10-27 07:46:50.500'
'2022-10-27 08:07:02.700'
'2022-10-27 08:26:29.600'
'2022-10-27 08:45:03.500'
'2022-10-27 08:48:09.700'
'2022-10-27 08:53:57.000'
'2022-10-27 08:59:43.400'
'2022-10-27 09:13:24.100'
'2022-10-27 09:19:25.000'
'2022-10-27 09:26:35.000'
'2022-10-27 09:29:54.000'
'2022-10-27 09:46:45.700'
'2022-10-27 10:10:59.900'
'2022-10-27 10:29:04.600'
'2022-10-27 10:31:50.100'
'2022-10-27 10:37:45.300'
'2022-10-27 10:42:35.800'
'2022-10-27 10:58:42.300'
'2022-10-27 11:03:48.900'
'2022-10-27 11:10:44.700'
'2022-10-27 11:13:11.100'
'2022-10-27 11:31:25.100'
'2022-10-27 11:56:01.200'
'2022-10-27 12:19:25.300'
'2022-10-27 12:25:12.500'
'2022-10-27 12:30:16.900'
'2022-10-27 12:49:19.100'
'2022-10-27 12:55:42.700'
'2022-10-27 12:58:18.100'
'2022-10-27 13:16:06.300'
'2022-10-27 14:00:44.700'
'2022-10-27 14:04:14.200'
'2022-10-27 14:11:54.600'
'2022-10-27 14:17:10.000'
'2022-10-27 14:42:03.300'
'2022-10-27 14:45:22.100'
'2022-10-27 15:02:07.900'
'2022-10-27 15:25:39.600'
'2022-10-27 15:44:58.300'
'2022-10-27 15:48:35.800'
'2022-10-27 15:54:23.100'
'2022-10-27 16:00:17.300'
'2022-10-27 16:19:23.700'
'2022-10-27 16:27:30.800'
'2022-10-27 16:31:30.600'
'2022-10-27 16:52:09.700'
'2022-10-27 17:16:08.800'
'2022-10-27 18:01:39.400'
'2022-10-27 18:08:24.800'
'2022-10-27 18:17:44.500'
'2022-10-27 18:25:02.500'
'2022-10-27 18:27:45.700'
'2022-10-27 18:48:32.000'
'2022-10-27 19:02:01.700'
'2022-10-27 19:24:08.300'
'2022-10-27 19:30:01.400'
'2022-10-27 19:43:02.200'
'2022-10-27 19:48:50.300'
'2022-10-27 19:55:41.900'
'2022-10-27 19:58:23.700'
'2022-10-27 20:17:15.800'
'2022-10-27 20:28:47.600'
'2022-10-27 20:49:36.900'
'2022-10-27 20:53:02.700'
'2022-10-27 21:07:12.100'
'2022-10-27 21:09:47.800'
'2022-10-27 21:50:42.300'
'2022-10-27 22:07:34.100'
'2022-10-27 22:09:18.800'])
% Any way faster than this one ?
tic
[~, idx2] = sort( datenum(a), 1, 'ascend');
toc
Accepted Answer
Stephen23
18 Nov 2022
"Any way faster than this one"
Interestingly, sorting seems to be slightly faster with the superfluous call to DATENUM:
a = datetime(['2022-10-27 00:22:50.000';'2022-10-27 05:29:45.000';'2022-10-27 05:32:19.400';'2022-10-27 05:36:44.100';'2022-10-27 05:39:26.600';'2022-10-27 05:43:18.200';'2022-10-27 05:49:17.400';'2022-10-27 05:55:27.300';'2022-10-27 05:58:07.000';'2022-10-27 06:17:13.800';'2022-10-27 06:41:28.700';'2022-10-27 07:03:06.000';'2022-10-27 07:09:49.800';'2022-10-27 07:17:39.700';'2022-10-27 07:35:09.000';'2022-10-27 07:42:33.600';'2022-10-27 07:46:50.500';'2022-10-27 08:07:02.700';'2022-10-27 08:26:29.600';'2022-10-27 08:45:03.500';'2022-10-27 08:48:09.700';'2022-10-27 08:53:57.000';'2022-10-27 08:59:43.400';'2022-10-27 09:13:24.100';'2022-10-27 09:19:25.000';'2022-10-27 09:26:35.000';'2022-10-27 09:29:54.000';'2022-10-27 09:46:45.700';'2022-10-27 10:10:59.900';'2022-10-27 10:29:04.600';'2022-10-27 10:31:50.100';'2022-10-27 10:37:45.300';'2022-10-27 10:42:35.800';'2022-10-27 10:58:42.300';'2022-10-27 11:03:48.900';'2022-10-27 11:10:44.700';'2022-10-27 11:13:11.100';'2022-10-27 11:31:25.100';'2022-10-27 11:56:01.200';'2022-10-27 12:19:25.300';'2022-10-27 12:25:12.500';'2022-10-27 12:30:16.900';'2022-10-27 12:49:19.100';'2022-10-27 12:55:42.700';'2022-10-27 12:58:18.100';'2022-10-27 13:16:06.300';'2022-10-27 14:00:44.700';'2022-10-27 14:04:14.200';'2022-10-27 14:11:54.600';'2022-10-27 14:17:10.000';'2022-10-27 14:42:03.300';'2022-10-27 14:45:22.100';'2022-10-27 15:02:07.900';'2022-10-27 15:25:39.600';'2022-10-27 15:44:58.300';'2022-10-27 15:48:35.800';'2022-10-27 15:54:23.100';'2022-10-27 16:00:17.300';'2022-10-27 16:19:23.700';'2022-10-27 16:27:30.800';'2022-10-27 16:31:30.600';'2022-10-27 16:52:09.700';'2022-10-27 17:16:08.800';'2022-10-27 18:01:39.400';'2022-10-27 18:08:24.800';'2022-10-27 18:17:44.500';'2022-10-27 18:25:02.500';'2022-10-27 18:27:45.700';'2022-10-27 18:48:32.000';'2022-10-27 19:02:01.700';'2022-10-27 19:24:08.300';'2022-10-27 19:30:01.400';'2022-10-27 19:43:02.200';'2022-10-27 19:48:50.300';'2022-10-27 19:55:41.900';'2022-10-27 19:58:23.700';'2022-10-27 20:17:15.800';'2022-10-27 20:28:47.600';'2022-10-27 20:49:36.900';'2022-10-27 20:53:02.700';'2022-10-27 21:07:12.100';'2022-10-27 21:09:47.800';'2022-10-27 21:50:42.300';'2022-10-27 22:07:34.100';'2022-10-27 22:09:18.800'])
timeit(@()sort(a))
timeit(@()sort(datenum(a)))
Let's try some other numeric formats, which might be more accurate over the dates in question:
timeit(@()sort(juliandate(a)))
timeit(@()sort(exceltime(a)))
timeit(@()sort(posixtime(a)))
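Before comparing timings, it may be worth checking that every numeric representation produces the same sort permutation as sorting the datetime directly. The following is only a sketch: the data are hypothetical random timestamps at 0.1 s resolution (not the vector from the question), and the variable names are made up for illustration.

```matlab
% Hypothetical test data: 85 distinct timestamps at 0.1 s resolution.
rng(0);                                   % fixed seed, only for repeatability
tenths = randperm(864000, 85).';          % distinct tenths of a second within one day
a = datetime(2022,10,27) + seconds(tenths/10);

[~, idxRef] = sort(a);                    % reference order from the datetime itself
[~, idxDN]  = sort(datenum(a));           % serial date number (days)
[~, idxJD]  = sort(juliandate(a));        % Julian date (days)
[~, idxXL]  = sort(exceltime(a));         % Excel serial date (days)
[~, idxPX]  = sort(posixtime(a));         % seconds since 1970-01-01

% True when all five permutations agree.
sameOrder = isequal(idxRef, idxDN, idxJD, idxXL, idxPX);
```

At this coarse resolution the permutations should agree; with much finer time steps a conversion could merge distinct values, so the check is data dependent.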
Comments: 3
Bruno Luong
18 Nov 2022
Edited: Bruno Luong
18 Nov 2022
idx2 should be the same, but I think timeit() results obtained without requesting the second output may not be representative, because the algorithm might skip some branches.
The relative result also might change with the size of your data.
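One way to look at both points is to request two outputs from the timed function and to repeat the measurement for more than one size. This is a minimal sketch under assumptions: the sizes and the random data are hypothetical, and the timings will vary by machine and release.

```matlab
% Hypothetical sizes; 85 matches the length of the example in the question.
sizes = [85 10000];
for n = sizes
    % Random timestamps within one day (illustrative data only).
    a = datetime(2022,10,27) + seconds(86400*rand(n,1));
    % The second argument of timeit requests two outputs, so the index
    % vector is actually computed inside the timed call.
    tDatetime = timeit(@() sort(a), 2);
    tDatenum  = timeit(@() sort(datenum(a)), 2);
    fprintf('n = %5d: datetime %.3g s, datenum %.3g s\n', n, tDatetime, tDatenum);
end
```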
More Answers (2)
Peter Perkins
19 Nov 2022
It seems like the obvious response to
Any way faster than this one ?
[~, idx2] = sort(datenum(a), 1, 'ascend');
is to not pass in parameters that are the defaults. That cuts the time by not quite a factor of two for me. But maybe in your real code you need to pass in non-defaults.
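A quick check that dropping the default arguments does not change the result, assuming the input is a column vector as in the question (for a row vector the default dimension would be 2, not 1). The data below are hypothetical.

```matlab
% Hypothetical column vector of 85 timestamps within one day.
a = datetime(2022,10,27) + seconds(86400*rand(85,1));

[~, idxExplicit] = sort(a, 1, 'ascend');  % defaults spelled out
[~, idxDefault]  = sort(a);               % same call relying on the defaults

% True when both calls return the same permutation.
sameIdx = isequal(idxExplicit, idxDefault);
```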
Three things:
1) On my machine, the difference between these two small sorts
[~, idx2] = sort(a, 1, 'ascend');
[~, idx2] = sort(datenum(a), 1, 'ascend');
is literally a few microseconds. The usual response is, "Sure, but I'm doing this in a loop, so a few microseconds adds up". It would help to hear the details of that context.
2) Sure, for small sorts like your example there is a difference. But by the time you have 10000 elements
a = datetime(2022,1,1,0,0,randi(10000,10000,1));
the difference between the two is gone. And the total time is less than .5ms.
3) ALL of those conversions throw away precision. In your example, you have resolution of at most .1s, so precision is probably not an issue. But in general, converting datetime to those numeric formats is A BAD IDEA.
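To illustrate the precision point in item 3, here is a small sketch with hypothetical data: 100 timestamps spaced one microsecond apart. A double near 738821 (the datenum of dates in 2022) has a spacing of about 1e-5 s, so many of these timestamps are expected to collapse onto the same datenum value, and their relative order is then lost to the sort.

```matlab
% Hypothetical data: 100 timestamps spaced 1 microsecond apart.
t = datetime(2022,10,27,12,0,0) + seconds((0:99).' * 1e-6);

nDatetime = numel(unique(t));             % distinct values as datetime
nDatenum  = numel(unique(datenum(t)));    % distinct values after conversion

fprintf('distinct datetime values: %d, distinct datenum values: %d\n', ...
        nDatetime, nDatenum);
```

The exact number of distinct datenum values depends on rounding, so no specific count is claimed here, only that it should be smaller than the datetime count.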
Comments: 2
Bruno Luong
22 Nov 2022
"But in general, converting datetime to those numeric formats is A BAD IDEA."
That's true.
For decades, datenum with its 53-bit resolution was the only option, whereas other languages such as C use 128-bit resolution to represent time.
Bruno Luong
18 Nov 2022
Edited: Bruno Luong
18 Nov 2022
The first peak is due to the JIT not having kicked in yet.
Here is the result on my PC (code attached)
I still have to track down what creates the spikes every 33 iterations.