Search string for special characters

Matt on 8 Sep 2017
Commented: Milos Matovic on 11 Nov 2020
Hi all,
I have a program that saves and loads data to and from a .mat file. The combination of UIPUTFILE and the save function that I currently use lets the user save files whose names contain special characters, and those files then cannot be loaded again. The most common issue is users saving the file with a period in the name.
How can I search the string for special characters? I want to throw up an error message if the user tries to save the file with a filename that MATLAB will not be able to load with UIGETFILE and the load function.
I have tried regexp but I am struggling to make it do what I want, even using backslashes in front of the characters in places.
if ~isempty(regexp(filename, '[/\*:?"<>|!]'))
    uiwait(msgbox('Filename contains illegal characters.', 'Filename Error', 'error', 'modal'));
end
Milos Matovic on 11 Nov 2020
Matt, your code works for me, and it covers all the characters that Windows treats as invalid in file names.
The only change I made was to add a double backslash: the backslash is the escape character, so the original regular expression did not account for a literal backslash.
if ~isempty(regexp(filename, '[/\\*:?"<>|!]'))
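To illustrate the difference, here is a minimal sketch that runs both character classes over a few made-up sample names (the names and variable names are assumptions for illustration only). With the single backslash, `\*` is read as an escaped asterisk, so the name containing a backslash should only be flagged by the corrected pattern.

```matlab
% Compare the original and the corrected character class on sample names.
% The sample names are hypothetical and only serve as test input.
names = {'report.mat', 'my\file.mat', 'what?.mat'};

original  = '[/\*:?"<>|!]';    % \* is an escaped asterisk; no backslash in the class
corrected = '[/\\*:?"<>|!]';   % \\ adds a literal backslash to the class

% regexp accepts a cell array of names; with 'once' each cell holds the
% index of the first match, or is empty when nothing matched.
flagsOriginal  = ~cellfun(@isempty, regexp(names, original,  'once'));
flagsCorrected = ~cellfun(@isempty, regexp(names, corrected, 'once'));

% One row per pattern, one column per name (1 = illegal character found).
disp([flagsOriginal; flagsCorrected]);
```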

Stephen23 on 8 Sep 2017
Edited: Stephen23 on 8 Sep 2017
It is invariably easier to build a short list of the permitted characters than a long (and most likely incomplete) list of forbidden characters (trust me, there are more characters out there than you would believe).
Adapt this to your list of "permitted characters":
>> rgx = '^[\w-]+\.[\w]+$';
>> regexp('okayname.txt',rgx)
ans = 1
>> regexp('bad()nam!e.txt',rgx)
ans = []
I know that it can be a challenge to create a working regular expression, and so to help with this I wrote a tool iregexp that you can download from MATLAB FEX:
It lets you try different parse and match string combinations, and shows regexp's outputs in real time as you type.
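As a rough sketch of how the whitelist pattern above could be turned into a reusable check, the snippet below wraps it in an anonymous function. The name `isSafeName` and the restriction to a `.mat` extension are assumptions made for this example, not part of the original answer.

```matlab
% Whitelist check: one or more word characters or hyphens, then ".mat".
% Assumption: only .mat files are being saved, so the extension is fixed.
rgx = '^[\w-]+\.mat$';

% Hypothetical helper: true when the whole name matches the whitelist.
isSafeName = @(name) ~isempty(regexp(name, rgx, 'once'));

okPlain  = isSafeName('results_2017-09.mat');   % only permitted characters
okPeriod = isSafeName('results.v2.mat');        % extra period in the name
okSymbol = isSafeName('bad()nam!e.mat');        % parentheses and exclamation mark

fprintf('%d %d %d\n', okPlain, okPeriod, okSymbol);
```

Anchoring the pattern with `^` and `$` is what makes this a whitelist: any character outside the class anywhere in the name causes the match to fail.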

Pal Szabo on 8 Sep 2017
Can't you use strrep? You can replace the special characters with something which works.
Matt on 8 Sep 2017
Hi, thanks, but I need to identify them to provide an error message. I may then remove the illegal characters with strrep to pre-fill the name box on the UIPUTFILE window when the error message is dismissed.
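One possible way to do both steps, sketched under assumptions: the example uses `regexprep` rather than `strrep`, because it can drop every disallowed character in one call, and the input name and permitted set (word characters, space, hyphen) are made up for illustration.

```matlab
% Hypothetical user input, without the file extension.
name = 'te?st:da*ta';

% Anything that is not a word character, a space, or a hyphen is treated
% as illegal here (assumed whitelist; adjust to your own rules).
notAllowed = '[^\w -]';

illegal = regexp(name, notAllowed, 'match');   % cell array, one character per cell
cleaned = regexprep(name, notAllowed, '');     % name with those characters removed

if ~isempty(illegal)
    % Text for the error dialog, listing the offending characters.
    msg = sprintf('Filename contains illegal characters: %s', strjoin(illegal, ' '));
    disp(msg);
    % The cleaned name could then pre-fill the dialog, for example:
    % [f, p] = uiputfile('*.mat', 'Save data', [cleaned '.mat']);
end
```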

Matt on 8 Sep 2017
I think I have answered my own question.
This finds the use of any character other than A-Z, a-z, 0-9, a space, hyphen, or underscore.
file = '?T!e[s$t%f.i _l-e_1.mat'
file_without_extension = file(1:(length(file)-4)) % to prevent removal of period before file extension
illegal_chars = regexp(file_without_extension,'[^\w \s \-]+','match')
Stephen23 on 8 Sep 2017
Edited: Stephen23 on 8 Sep 2017
Some notes:
  • Do not do this, it is an unreliable and obfuscated way to remove a file extension:
file_without_extension = file(1:(length(file)-4))
Instead simply use fileparts: it is simpler and correct for any length extension.
  • \s matches all whitespace characters. Do you really want vertical tab in your filenames?
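Both notes can be combined in a short sketch: `fileparts` splits off the extension whatever its length, and a literal space in the character class replaces `\s`. The variable names are illustrative, and the permitted set is the one from the post above.

```matlab
file = '?T!e[s$t%f.i _l-e_1.mat';

% fileparts returns folder, base name, and extension (including the period).
[~, base, ext] = fileparts(file);

% Permit word characters, a literal space, and a hyphen; \s is left out
% so that tabs and other whitespace are reported as illegal too.
illegal = regexp(base, '[^\w \-]', 'match');

fprintf('base: %s\next: %s\nillegal: %s\n', base, ext, strjoin(illegal, ' '));
```

Because the period before the extension is removed by `fileparts`, any period left in `base` is reported along with the other disallowed characters.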

