How to use Unicode numeric values in regexprep?

Question

Vlad Atanasiu on 28 Mar 2024

https://kr.mathworks.com/matlabcentral/answers/2100121-how-to-use-unicode-numeric-values-in-regexprep

Answered: Stephen23 on 28 Mar 2024

Accepted Answer: Yash

How can "Häagen-Dasz" be converted to "Haagen-Dasz" using Unicode numeric values? For example,

regexprep('Häagen-Dasz','ä','A')

works fine, but

regexprep('Häagen-Dasz','\x{C4}','a')

does not. Here, the hexadecimal \x{C4} stands for [latin capital letter a] with diaeresis, i.e. [ä].

Comments: 1

VBBV on 28 Mar 2024

I am not sure I understand your question correctly, but see this answer:

https://www.mathworks.com/matlabcentral/answers/2004827-what-unicode-characters-can-be-rendered-in-the-command-window

Answer 1

Yash on 28 Mar 2024

https://kr.mathworks.com/matlabcentral/answers/2100121-how-to-use-unicode-numeric-values-in-regexprep#answer_1432901

Edited: Yash on 28 Mar 2024

Hi Vlad,

'\x{C4}' represents the Unicode character Ä (Latin Capital Letter A with Diaeresis) in hexadecimal notation.

If you want to replace ä (Latin Small Letter A with Diaeresis), you should use \x{E4}, which is its Unicode hexadecimal representation.

In the context of your question, you're looking to replace ä with a. The correct approach would be to use the Unicode numeric value for ä in the regex and replace it with a. Here is the code:

regexprep('Häagen-Dasz','\x{E4}','a')
ans = 'Haagen-Dasz'
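
When the code point is only available as a number, the pattern can also be assembled programmatically. The following is a minimal sketch and not part of the original answer: the variable names are illustrative, and it assumes `regexprep` accepts the `\x{N}` hexadecimal escape exactly as in the example above.

```matlab
% Build the test string without typing a literal ä (char(228) is U+00E4).
str = ['H' char(228) 'agen-Dasz'];

% Code point of the character to replace, given as a hexadecimal string.
codePoint = hex2dec('E4');              % 228

% sprintf turns '\\' into a single backslash, producing the pattern \x{E4}.
pat = sprintf('\\x{%X}', codePoint);

% Replace every occurrence of U+00E4 with a plain 'a'.
out = regexprep(str, pat, 'a');
disp(out)
```

Building the string with `char(228)` avoids any dependence on how the source file encodes the literal ä.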

Hope this helps!

Answer 2

Stephen23 on 28 Mar 2024

https://kr.mathworks.com/matlabcentral/answers/2100121-how-to-use-unicode-numeric-values-in-regexprep#answer_1432931

inp = 'Häagen-Dasz';
baz = @(v)char(v(1)); % only need the first decomposed character.
out = arrayfun(@(c)baz(py.unicodedata.normalize('NFKD',c)),inp) % remove diacritics.
out = 'Haagen-Dasz'
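
The code above calls Python's `unicodedata.normalize` once per character and keeps only the first character of each decomposition. A possible variant, sketched here under the assumption that a Python interpreter is configured for MATLAB (as the `py.` call above already requires), normalizes the whole string once and then deletes the combining marks by their numeric values. The variable names and the choice of the Combining Diacritical Marks block (U+0300 to U+036F) are illustrative assumptions, not part of the original answer.

```matlab
% Test string built without a literal ä (char(228) is U+00E4).
inp = ['H' char(228) 'agen-Dasz'];

% Canonical decomposition: ä becomes 'a' followed by U+0308 (combining diaeresis).
nfd = char(py.unicodedata.normalize('NFD', inp));

% Drop every character in the Combining Diacritical Marks block (768..879).
isMark = nfd >= hex2dec('0300') & nfd <= hex2dec('036F');
out = nfd(~isMark);
disp(out)
```

Marks outside that block would not be removed by this sketch, so the range may need widening for other scripts.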

Answer 3

VBBV on 28 Mar 2024

https://kr.mathworks.com/matlabcentral/answers/2100121-how-to-use-unicode-numeric-values-in-regexprep#answer_1432891

regexprep('Häagen-Dasz','ä','A')
ans = 'HAagen-Dasz'
regexprep('Häagen-Dasz','ä','\x{C4}')
ans = 'HÄagen-Dasz'

Comments: 2

VBBV on 28 Mar 2024

Moved: VBBV on 28 Mar 2024

regexprep('Häagen-Dasz','\x{e4}','a')
ans = 'Haagen-Dasz'

VBBV on 28 Mar 2024

The Unicode code point for the lowercase letter ä is U+00E4, written \x{e4} in the regular expression.
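
To check which hexadecimal value belongs to which character, the code points can be read back from MATLAB itself. This is a small illustrative sketch with hypothetical variable names; it only uses `double`, `dec2hex`, and `regexp`, and it assumes the default case-sensitive matching of `regexp`.

```matlab
% Ä (U+00C4) and ä (U+00E4), built from their decimal code points.
chars = char([196 228]);

% Numeric code points and their hexadecimal form (one row per character).
codes = double(chars);
hexCodes = dec2hex(codes);
disp(hexCodes)

% The lowercase ä matches \x{E4} but not \x{C4}, which is the uppercase Ä.
isMatchLower = ~isempty(regexp(chars(2), '\x{E4}', 'once'));
isMatchUpper = ~isempty(regexp(chars(2), '\x{C4}', 'once'));
```

This is why the pattern `\x{C4}` in the question finds nothing in 'Häagen-Dasz': the string contains only the lowercase letter.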
