Customizable Natural-Order Sort

Version 3.4.5 (80.5 KB) by Stephen23
Alphanumeric sort of a cell/string/categorical array, with customizable number format.
Downloads: 4.8K
Updated: 2023/7/13

Editor's Note: This file was selected as MATLAB Central Pick of the Week

To sort file names or folder names, use NATSORTFILES.
To sort the rows of a string/cell array, use NATSORTROWS.
Summary
Sorts the text in a string/cell/categorical array alphanumerically: the text is sorted by character code, taking into account the values of any number substrings. Compare, for example:
X = {'a2', 'a10', 'a1'};
sort(X)
ans = 'a1' 'a10' 'a2'
natsort(X)
ans = 'a1' 'a2' 'a10'
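The idea behind the comparison above can be sketched with built-in functions only. This is a minimal illustration, not NATSORT's own code, and it assumes that every element consists of non-digit text followed by a single integer (such as 'a10'); the variable names are made up for the example.

```matlab
% Minimal natural-order sketch using built-ins only (not NATSORT's code).
% Assumption: every element is <non-digit text><integer>, e.g. 'a10'.
X = {'a2', 'a10', 'a1'};

% Split each element into its text part and its digit part.
tok = regexp(X, '^(\D*)(\d+)$', 'tokens', 'once');
txt = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
num = cellfun(@(t) str2double(t{2}), tok);

% Rank the text parts by character code, then break ties by numeric value.
[~, ~, grp] = unique(txt);
[~, idx] = sortrows([grp(:), num(:)]);
Y = X(idx);   % {'a1', 'a2', 'a10'}
```

Text containing several numbers, signs, or decimal fractions needs the more general handling that NATSORT provides.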
By default NATSORT interprets all consecutive digits as integer numbers. The number substring recognition can be customized using a regular expression, allowing the number substrings to have:
  • a +/- sign
  • a decimal point and decimal fraction
  • E-notation exponent
  • decimal, octal, hexadecimal or binary notation
  • Inf or NaN values
  • criteria supported by regular expressions: lookarounds, quantifiers, etc.
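To see which substrings a given regular expression would treat as numbers, the pattern can be tried directly with the built-in REGEXP. The sketch below assumes that the default behaviour is equivalent to the pattern '\d+' (consecutive digits), and compares it with the signed-decimal pattern used in the examples on this page.

```matlab
% Which substrings does each pattern recognise as a number?
str = 'test-1.4';
intRgx = '\d+';                        % assumed equivalent to the default: digits only
decRgx = '[-+]?(NaN|Inf|\d+\.?\d*)';   % sign, NaN/Inf, optional decimal fraction

intMatch = regexp(str, intRgx, 'match');   % {'1', '4'}: two separate integers
decMatch = regexp(str, decRgx, 'match');   % {'-1.4'}: one signed decimal number
decValue = str2double(decMatch);           % -1.4
```

With the digits-only pattern the sign and the decimal point are ordinary characters, so 'test-1.4' contains the two integers 1 and 4 rather than the single value -1.4.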
And of course the sorting itself can also be controlled:
  • ascending/descending sort direction
  • character case sensitivity/insensitivity
  • relative order of numbers vs. characters
  • relative order of numbers vs. NaNs
Examples
%% Multiple integers (e.g. release version numbers):
>> A = {'v10.6', 'v9.10', 'v9.5', 'v10.10', 'v9.10.20', 'v9.10.8'};
>> sort(A) % for comparison.
ans = 'v10.10' 'v10.6' 'v9.10' 'v9.10.20' 'v9.10.8' 'v9.5'
>> natsort(A)
ans = 'v9.5' 'v9.10' 'v9.10.8' 'v9.10.20' 'v10.6' 'v10.10'
%% Integer, decimal, NaN, or Inf numbers, possibly with +/- signs:
>> B = {'test+NaN', 'test11.5', 'test-1.4', 'test', 'test-Inf', 'test+0.3'};
>> sort(B) % for comparison.
ans = 'test' 'test+0.3' 'test+NaN' 'test-1.4' 'test-Inf' 'test11.5'
>> natsort(B, '[-+]?(NaN|Inf|\d+\.?\d*)')
ans = 'test' 'test-Inf' 'test-1.4' 'test+0.3' 'test11.5' 'test+NaN'
%% Integer or decimal numbers, possibly with an exponent:
>> C = {'0.56e007', '', '43E-2', '10000', '9.8'};
>> sort(C) % for comparison.
ans = '' '0.56e007' '10000' '43E-2' '9.8'
>> natsort(C, '\d+\.?\d*(E[-+]?\d+)?')
ans = '' '43E-2' '9.8' '10000' '0.56e007'
%% Hexadecimal numbers (with '0X' prefix):
>> D = {'a0X7C4z', 'a0X5z', 'a0X18z', 'a0XFz'};
>> sort(D) % for comparison.
ans = 'a0X18z' 'a0X5z' 'a0X7C4z' 'a0XFz'
>> natsort(D, '0X[0-9A-F]+', '%i')
ans = 'a0X5z' 'a0XFz' 'a0X18z' 'a0X7C4z'
%% Binary numbers:
>> E = {'a11111000100z', 'a101z', 'a000000000011000z', 'a1111z'};
>> sort(E) % for comparison.
ans = 'a000000000011000z' 'a101z' 'a11111000100z' 'a1111z'
>> natsort(E, '[01]+', '%b')
ans = 'a101z' 'a1111z' 'a000000000011000z' 'a11111000100z'
%% Case sensitivity:
>> F = {'a2', 'A20', 'A1', 'a10', 'A2', 'a1'};
>> natsort(F, [], 'ignorecase') % default
ans = 'A1' 'a1' 'a2' 'A2' 'a10' 'A20'
>> natsort(F, [], 'matchcase')
ans = 'A1' 'A2' 'A20' 'a1' 'a2' 'a10'
%% Sort order:
>> G = {'2', 'a', '', '3', 'B', '1'};
>> natsort(G, [], 'ascend') % default
ans = '' '1' '2' '3' 'a' 'B'
>> natsort(G, [], 'descend')
ans = 'B' 'a' '3' '2' '1' ''
>> natsort(G, [], 'num<char') % default
ans = '' '1' '2' '3' 'a' 'B'
>> natsort(G, [], 'char<num')
ans = '' 'a' 'B' '1' '2' '3'
%% UINT64 numbers (with full precision):
>> natsort({'a18446744073709551615z', 'a18446744073709551614z'}, [], '%lu')
ans = 'a18446744073709551614z' 'a18446744073709551615z'
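The hexadecimal, binary and UINT64 examples above depend on how each number substring is converted from text to a value. The following sketch reproduces those conversions with built-in functions only. It is an illustration, not NATSORT's own code: BIN2DEC stands in for the '%b' format, which is not a standard SSCANF specifier.

```matlab
% Text-to-number conversions behind the format examples (built-ins only).
% '%i' picks the base from the prefix, so '0X...' is read as hexadecimal.
hexVals = cellfun(@(s) sscanf(s, '%i'), {'0X5', '0XF', '0X18', '0X7C4'});        % [5 15 24 1988]
% BIN2DEC is used here as a stand-in for the '%b' format.
binVals = cellfun(@bin2dec, {'101', '1111', '000000000011000', '11111000100'});  % [5 15 24 1988]

% Converted to double, the two largest UINT64 values become the same number...
sameAsDouble = str2double('18446744073709551615') == str2double('18446744073709551614');  % true
% ...whereas '%lu' asks SSCANF for unsigned 64-bit integers instead.
big = [sscanf('18446744073709551614', '%lu'), sscanf('18446744073709551615', '%lu')];
```

The collision in double precision is why the last example passes '%lu': without it the two strings would have equal numeric values and could not be ordered by them.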

Cite As

Stephen23 (2024). Customizable Natural-Order Sort (https://www.mathworks.com/matlabcentral/fileexchange/34464-customizable-natural-order-sort), MATLAB Central File Exchange.

MATLAB Release Compatibility
Created with R2010b
Compatible with R2009b and later releases
Platform Compatibility
Windows macOS Linux
Categories
Find more on String Parsing in Help Center and MATLAB Answers

Version History
3.4.5

* Accept decimal comma as well as decimal point.
* HTML examples use string arrays.

3.4.4

* Add testcases.

3.4.3

* Now R2009b compatible.

3.4.2

* Edit description & help.

3.4.1

* Edit description & help.

3.4.0

* Add plenty of testcases.
* Fix bug in descending sort with an empty input array.

3.3.0

* Improve test function, add test cases.

3.2.0

* Update TESTFUN.

3.1.0

* More robust TESTFUN pretty-print code.
* Improve option checking.

3.0.5

* Improve examples.

3.0.4

* Correct summary.

3.0.3

* Improve string handling.

3.0.2

* Simplify numeric class handling.
* Add permutations test examples.

3.0.1

* Handle a single element with no number.

3.0.0

* Accepts and sorts a string array, categorical array, cell array of char, etc.
* Regular expression and optional arguments may be string or char.
* Simplify char<num algorithm.
* Simplify debugging output cell array.

2.1.2

* Consistent alignment tab/spaces.

2.1.1

* Add error IDs.

2.1.0

* Fix handling of char<num.

2.0.0

* Total rewrite: faster and uses less memory.
* Remove 'asdigit' option.
* Rename 'beforechar' and 'afterchar' to 'num<char' and 'char<num'.
* Add options 'num<NaN' and 'NaN<num'.
* Improve HTML documentation.
* Include testcases.

1.11.0.0

* Consistent internal variable names.

1.10.0.0

* Minor help edit.
* Improve input checking.
* Improve blurb and HTML.
* Add HTML documentation.

1.9.0.0

* Improve binary numeric handling.
* Improve handling of skipped fields.
* Add an example of skipped field usage.

1.8.0.0

* Improved binary substring parsing.
* Better examples.

1.7.0.0

- Update documentation only, improve examples.

1.6.0.0

- Add binary numeric parsing.
- Improve input checking.
- Replace multiple debugging output arrays with one cell array.
- Allow lookarounds in regular expression.

1.5.0.0

- Simplify hexadecimal example.
- Correct output summary.

1.4.0.0

- Now parses hexadecimal and octal substrings.
- int64 and uint64 parsed at full precision.
- Allow <options> in any order.
- For debugging: return indices of character and numeric arrays.

1.3.0.0

- Implement more compact sort algorithm.
- "sscanf" numeric format can be controlled by an optional input argument.
- Provide use examples.
- Output debugging arrays now char+numeric.

1.1.0.0

- Add examples showing different numeric tokens.
- Case-insensitive sort is now default.

1.0.0.0