Get all used variable names from a script

Robert on 7 May 2021
Commented: Jan on 8 May 2021
Similar to the check "Check usage of restricted variable names", I want to check the names of variables used in a script, but against our own, more explicit naming conventions. However, symvar also returns keywords like "function", "if" or "end" and, what is much worse, any word found in comments and even in "-delimited strings. Is there any function that can return me all variable names used in a script file or string, but nothing else?
Or, to be a bit more precise, since Stephen Cobeldick correctly pointed out the dynamic execution nature of scripting languages: I mean variable names that are explicitly used in a function header as input or output variables (not varargin, varargout), and variable names explicitly used on the left-hand side of assignments like a = <some expression> or [a, b] = <expression>. That would certainly be sufficient, as the execution context here is eml, so apart from local variables the data flow is pretty much under control, with signal I/O and data store memory requiring registration as Stateflow.Data objects.
  1 Comment
Stephen23 on 7 May 2021
Edited: Stephen23 on 7 May 2021
"Is there any function that can return me all variable names used in a script file or string, but nothing else?"
No.
Variables can be created dynamically, even by functions called from your script/function (or functions that they call...). Function scope can also change dynamically, so which functions get called can also change (or even deciding if something is a function or a variable). Only actually running the code can resolve this stack: static code analysis is not sufficient.
It might be possible to provide an "estimate" based on static code analysis, but on the understanding that it can diverge from what variables are "used" when the code is actually run.
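Such a static "estimate" for the narrower target described in the question (names in a function header plus names on the left-hand side of assignments) could be sketched as follows. This is only a rough illustration, not an existing MATLAB function: the name estimateAssignedNames is hypothetical, and the sketch assumes one statement per line, no line continuations, and no '%' or '=' characters inside string literals. It cannot see variables created by eval, assignin, load, or by called code.

```matlab
function names = estimateAssignedNames(code)
% ESTIMATEASSIGNEDNAMES  Rough static estimate of explicitly assigned names.
% Hypothetical helper. CODE is a char vector with MATLAB source text.
% Assumptions: one statement per line, no "..." continuations, and no
% '%' or '=' inside string literals.
lines = regexp(code, '\r?\n', 'split');
names = {};
for k = 1:numel(lines)
    stmt = strtrim(regexprep(lines{k}, '%.*$', ''));   % naive comment removal
    if ~isempty(regexp(stmt, '^function\s', 'once'))
        % Function header: collect outputs (before '=') and inputs (in parentheses)
        hdr = regexprep(stmt, '^function\s+', '');
        eqPos = find(hdr == '=', 1);
        if isempty(eqPos)
            outPart = '';
            rest = hdr;
        else
            outPart = hdr(1:eqPos-1);
            rest = hdr(eqPos+1:end);
        end
        parPos = find(rest == '(', 1);
        if isempty(parPos)
            inPart = '';
        else
            inPart = rest(parPos+1:end);
        end
        found = regexp([outPart ' ' inPart], '[A-Za-z]\w*', 'match');
        found = found(~ismember(found, {'varargin', 'varargout'}));
    else
        % First '=' that is not part of ==, ~=, <= or >=
        eqPos = regexp(stmt, '(?<![=~<>])=(?!=)', 'once');
        if isempty(eqPos)
            continue
        end
        lhs = stmt(1:eqPos-1);
        prev = '';
        while ~strcmp(prev, lhs)          % strip index expressions, innermost first
            prev = lhs;
            lhs = regexprep(lhs, '\([^()]*\)|\{[^{}]*\}', '');
        end
        lhs = regexprep(lhs, '\.\w+', '');             % strip field accesses
        if isempty(regexp(lhs, '^\s*\[?[\w\s,~]+\]?\s*$', 'once'))
            continue                      % does not look like an assignment target
        end
        found = regexp(lhs, '[A-Za-z]\w*', 'match');
        found = found(~cellfun(@iskeyword, found));    % drops e.g. "for"
    end
    names = [names, found]; %#ok<AGROW>
end
if ~isempty(names)
    names = unique(names);
    names = names(:).';                   % always return a row
end
end
```

Called on the text of a function file, for example estimateAssignedNames(fileread('yourFcn.m')), this returns a sorted cell array of candidate names. Loop variables of for statements are included, because they are created by an assignment-like syntax.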


Answers (1)

Jan on 7 May 2021
Edited: Jan on 7 May 2021
It is hard to parse the code exhaustively for names of variables:
  • Mask strings and char vectors. This is not trivial:
'"asd"', '"asd', "'asd'", "'asd", "asd"', 'asd''', ...
  • Recognize and remove comments. This includes block comments between %{ and %} as well as any text following a line continuation "...".
  • Distinguish the creation of indexed variables from function calls:
f(1);
f(1) = 0;
v = f(1);
v = f ...
(1);
  • Cope with eval, evalin, assignin.
  • If you are talking about scripts instead of functions, it is hard to tell whether sum(1:5) means the built-in function or whether another script has redefined sum as a variable before.
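The first two points (masking quoted text and comments) can be approximated with a small character scanner. The sketch below is only an illustration under stated assumptions: maskStringsAndComments and isTranspose are hypothetical names, block comments are not nested, and command syntax (such as disp 'text') is not used. A quote is treated as a transpose operator when it directly follows an identifier character, a closing bracket, a dot, or another quote.

```matlab
function masked = maskStringsAndComments(code)
% MASKSTRINGSANDCOMMENTS  Blank out quoted text and comments (sketch).
% Hypothetical helper. Masked characters are replaced by spaces, so the
% positions of the remaining code are preserved within each line.
lines = regexp(code, '\r?\n', 'split');
inBlock = false;                          % inside a %{ ... %} block comment
for k = 1:numel(lines)
    s = lines{k};
    t = strtrim(s);
    if inBlock || strcmp(t, '%{')
        inBlock = ~strcmp(t, '%}');       % a lone %} closes the block
        lines{k} = blanks(numel(s));
        continue
    end
    m = s;                                % masked copy; s stays untouched
    n = numel(s);
    i = 1;
    while i <= n
        c = s(i);
        if c == '%'
            m(i:n) = ' ';                 % rest of the line is a comment
            break
        elseif i + 2 <= n && strcmp(s(i:i+2), '...')
            m(i+3:n) = ' ';               % text after a continuation is a comment
            break
        elseif c == '"' || (c == '''' && ~isTranspose(s, i))
            j = i + 1;                    % search the closing quote
            while j <= n
                if s(j) == c
                    if j < n && s(j+1) == c
                        j = j + 1;        % doubled quote is an escaped quote
                    else
                        break
                    end
                end
                j = j + 1;
            end
            m(i:min(j, n)) = ' ';
            i = j;
        end
        i = i + 1;
    end
    lines{k} = m;
end
masked = strjoin(lines, newline);
end

function tf = isTranspose(s, i)
% True if the quote at position i directly follows an identifier
% character, a closing bracket, a dot, or another quote.
tf = i > 1 && ~isempty(regexp(s(i-1), '[\w\)\]\}\.'']', 'once'));
end
```

Keeping an untouched copy of each line lets the transpose test look at the original preceding character instead of an already masked one. Cases such as a transposed string literal ("abc"') are still misread, which illustrates why this step is not trivial.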
Maybe the best approach is to run the code and update a list of variables after each line of code:
function Out = TrackVariables(mFile, Data)
% USAGE:
% If you really want a hardcore debugging:
% 1. TrackVariables('D:\MatlabCodes\yourFcn.m')
%    This injects a DBSTOP in each line of the code, which calls the
%    function TrackVariables with the output of WHOS as 2nd input.
%    You can do this for multiple functions at the same time.
% 2. Call yourFcn() or the main routine.
%    After each line the output of WHOS is forwarded to TrackVariables and
%    the names are stored persistently. If you want, you can expand this
%    to store the sizes or types of the variables also.
% 3. Request the collected data by:
%      List = TrackVariables();
% 4. Clean up brutally:
%      dbclear all
%
% This is NOT a recommendation for using this function to control the
% quality of code, but a brute hack only. If you can identify a
% misspelled variable, it was useful.
% Advantage: It tracks even the evil dynamic creation of variables.
% Limitations: The code execution is slowed down. It tracks only branches
% of the code, which actually run, so this might remain invisible:
%   if rand < 0.001; KILLER = 17; end
%
% Use MLINT for a smart code analysis.
%
% (C) 2021, Jan, Heidelberg, License: CC BY-SA 3.0
persistent List
if isempty(List)
   List = struct();
end
switch nargin
   case 1  % Inject a dbstop in each line:
      [~, mName] = fileparts(mFile);
      Cmd = sprintf('TrackVariables(''%s'', whos)', mName);
      Str = strsplit(fileread(mFile), '\n');
      for k = 1:numel(Str)
         if ~isempty(Str{k})
            dbstop('in', mName, 'at', sprintf('%d', k), 'if', Cmd)
         end
      end
      List.(mName) = {};
   case 2  % Called for collecting variables (mFile is the function name here):
      List.(mFile) = unique(cat(2, List.(mFile), {Data.name}));
      Out = false;  % Do not stop the debugger
   case 0  % Flush the list:
      Out = List;
      List = [];
end
end
Call this as:
TrackVariables('YourFunc.m');
YourFunc % Or the main program
List = TrackVariables;
This does not take into account variables created in subfunctions or nested functions.
I do not trust such meta-programming techniques. Exhaustive unit testing is more powerful. Most of all, avoid scripts if you need reliable code.
  6 Comments
Robert on 7 May 2021
Edited: Robert on 7 May 2021
Hi Jan, what code do you mean? The C-mex code of my parser to come? I think I'd publish that. But generally my target is to identify any explicit variable name as described in my reply to Stephen Cobeldick's comment above. The input might be any code that can be typed into an eml function block. My parsing mex should 'mask' (or rather eliminate) all occurrences of comments and strings. Anyway, if there is no API function for identifying explicit variable names as described before, I'll stick to my own implementation and will let you know when I'm at the point of publishing (if you're interested).
Jan on 8 May 2021
I meant a parser that I have written as an M-function.

Release

R2018b
