Description

MATLAB Code Can Easily Run Much Faster Than You Think!

From the series: MathWorks Research Summit

Yair Altman, Undocumented MATLAB

MATLAB® is often viewed, incorrectly, as an inherently slow programming environment. Much of this misconception arises from suboptimal user code, as well as inefficient use of available MATLAB tools and functions. Moreover, many users assume that MATLAB code can only be sped up using vectorization and parallelization, and in cases where these are not possible or applicable for any reason, then nothing significant can be done to improve the code’s run time.

To dispel these misconceptions, Yair presents a small taste of the numerous potential speedup methods that can be applied to MATLAB code in a short whirlwind overview of several diverse speedup techniques. The presentation showcases several common use cases where simple MATLAB code changes and techniques can result in significant run-time speedups.

In his presentation, Yair discusses using the built-in Profiler tool in MATLAB, as well as simple yet effective loop optimizations, data caching, graphics rendering and interaction, and various tradeoffs that should be considered with code optimization. MATLAB users who require their code to run faster can use the presented techniques as a starting point for code optimization, keeping in mind that many other speedup techniques can be applied.

Published: 17 Mar 2023

Full Transcript

So my name is Yair Altman. I'm an independent MATLAB consultant. As Peter mentioned, I'm not a MathWorker. I don't represent MathWorks. And everything I'm going to say is my own personal opinion, so take it with a grain of salt.

I run a blog and a website called Undocumented MATLAB, in which I feature a lot of very useful, interesting stuff, which is undocumented and unsupported, and, as mentioned, might change from one release to another. So although it might work today, it might not work in one release or two releases or maybe 10 years down the road.

But everything I'm going to say today is purely based on documented, fully documented, fully supported features of MATLAB. My aim today is to counter a claim that I sometimes hear from the people I work for, which is that since MATLAB is an interpreted language, it is inherently slow and therefore could only be used for some prototyping, not for really productive work. And if it runs slowly, then so be it because that's the way it is.

And my claim is that this is an incorrect perception, and it is due to a large extent because users sometimes don't take the time to invest in improving the runtime performance of their programs. And so the MATLAB product itself is not inherently slow. It is the program that is slow and could be optimized.

And I'll try and show several examples of how we can do that. There are lots of other examples, but we have a very limited amount of time.

So the first thing we should do when we try to improve the runtime speed, the runtime performance, of a program is to profile it. MATLAB has a built-in Profiler tool, and anyone who has not used it before should use it in order to identify runtime bottlenecks inside the program. You should never try to improve the performance of a program before you understand where the real bottlenecks are.

And more often than not, when you run the profiler, you will detect bottlenecks that are unexpected, that are in locations that you did not envision before. So first thing to do is to run the profiler, see where the bottlenecks are, and then focus on those areas.
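As an illustration of that workflow, here is a minimal sketch of a profiling session. The workload is a hypothetical stand-in for your own program; only the "profile" commands are the point.

```matlab
% Minimal profiling session. The workload is a hypothetical stand-in
% for the program you actually want to speed up.
profile clear                 % discard data from any earlier session
profile on                    % start collecting timing data

data = rand(300);             % stand-in workload
for k = 1:20
    s = svd(data);            % deliberately heavy call so it shows up
end

profile off                   % stop collecting
stats = profile('info');      % struct holding the per-function timing table
profile viewer                % open the interactive report
```

The report lists functions and lines sorted by time spent, which is where the optimization effort should go first.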

Once you identify the locations, you can start handling them and try to improve those specific locations. Here's one typical tip, which can often be used with loops-- and loops are important because they run a specific piece of code many times over, sometimes thousands or millions of times. And so any performance improvement that you make to the contents of a loop is multiplied many times over.

In this particular example, we see that the bolded items are all constant expressions. And there's no reason to recompute them a thousand times. Instead we should move them outside of the loop, a process which is called loop-invariant hoisting. And this leads to a second version of the code, which runs much faster because all the constant expressions are only evaluated once.

This has a secondary benefit in that the loop becomes much simpler. And with this simpler loop, it then becomes clearer to see that it can be vectorized, leading to an even further improved version which runs much faster, without a loop at all.
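The slide code is not reproduced in the transcript, so the following is only a sketch of the three stages on a made-up computation. The variable names and the expression being hoisted are assumptions chosen for illustration.

```matlab
% Hypothetical example: scale a signal by a factor that never changes.
n = 1000;
x = linspace(0, 1, n);
a = 2.5;
b = 0.3;

% Version 1: the constant expressions are recomputed in every iteration
y1 = zeros(1, n);
for k = 1:n
    y1(k) = x(k) * sin(a)^2 / sqrt(b + 1);
end

% Version 2: loop-invariant hoisting, the constant is evaluated once
scale = sin(a)^2 / sqrt(b + 1);
y2 = zeros(1, n);
for k = 1:n
    y2(k) = x(k) * scale;
end

% Version 3: with the loop body reduced to one multiplication,
% the vectorized form is easy to see
y3 = x * scale;
```

All three versions produce the same values up to floating-point rounding; only the amount of repeated work differs.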

Here's another example, using caching. Caching is a general concept used in software programming, not necessarily in MATLAB. And the idea is to store precomputed data in some memory construct, which is kept for later reuse.

So in this particular example, we're trying to convert a bunch of numeric MATLAB date numbers into their string representations, which is a relatively slow process, let's say one millisecond per iteration or something. And we're trying to do that over thousands of these numeric dates.

So the idea here is to cache all the date strings since January 2000, so that when we get to the runtime process of trying to figure out what the timestamps are, what the date strings are, we would already have them cached.

For this, we are using the "persistent" keyword in MATLAB, which stores the data in the function's workspace memory. And it doesn't go out of scope when the function returns to the user. So it remains in scope as long as the MATLAB session is active and we do not modify or edit the function.

So if you take a look here, we declare two variables as persistent. It could be unified into a single one. For simplicity's sake, I use two variables. We then check whether the variable is initialized.

By default, all persistent variables are initialized to a value of an empty array. And then we initialize them to all the date strings since 2000. We then check if the runtime values passed to the function are available in the cache, and if so, we return them.
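The steps just described can be sketched as follows. This is not the code from the slide: the function name "cachedDatestr", the fixed date format, and the fallback for cache misses are assumptions, and the function expects whole-day date numbers.

```matlab
function strs = cachedDatestr(dates)
% cachedDatestr  Convert whole-day datenum values to date strings via a cache.
% Hypothetical helper; returns a cell array of character vectors.
    persistent cachedDates cachedStrs          % kept alive between calls
    fmt = 'dd-mmm-yyyy';                       % assumed output format
    if isempty(cachedDates)                    % persistent variables start as []
        cachedDates = (datenum(2000,1,1) : floor(now))';   % every day since 1 Jan 2000
        cachedStrs  = cellstr(datestr(cachedDates, fmt));  % slow step, done once
    end
    d = dates(:);
    [found, idx] = ismember(d, cachedDates);   % look the inputs up in the cache
    strs = cell(numel(d), 1);
    strs(found) = cachedStrs(idx(found));
    if ~all(found)                             % cache miss: convert directly
        strs(~found) = cellstr(datestr(d(~found), fmt));
    end
end
```

The first call pays for filling the cache; later calls reduce to an "ismember" lookup. Wrapping the call in tic and toc on your own data shows the difference on your setup.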

The end result of using this is that if we take a vector of a thousand dates from three years ago, give or take, until today, then running it in MATLAB on a particular setup takes 50 milliseconds. So, apparently, less than 1 millisecond per date string. But still, 50 milliseconds for the 1,000 dates.

If we use the cached version, we see that the initial setup took about 200 milliseconds because we cached all the data since the year 2000. But then the runtime invocations took 0.3 milliseconds.

This could be further improved by caching on demand, meaning I start off with an empty cache, and I only cache the data which is being fed. And then every time the program will encounter that particular date from now on, it will reuse it. But even without this optimized version, we see that we still get a large runtime benefit.
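One possible shape for caching on demand is sketched below. The function name "cachedDatestrOnDemand", the use of a containers.Map as the cache, and the fixed date format are all assumptions, not the speaker's implementation.

```matlab
function str = cachedDatestrOnDemand(dateNum)
% cachedDatestrOnDemand  Convert one datenum, caching each result lazily.
% Hypothetical helper; the cache starts empty and grows as values are seen.
    persistent cache
    if isempty(cache)                          % first call, or nothing stored yet
        cache = containers.Map('KeyType', 'double', 'ValueType', 'char');
    end
    if isKey(cache, dateNum)
        str = cache(dateNum);                  % seen before: reuse the string
    else
        str = datestr(dateNum, 'dd-mmm-yyyy'); % new value: convert once
        cache(dateNum) = str;                  % and remember it
    end
end
```

Compared with the prefilled cache there is no up-front setup cost, at the price of one slow conversion the first time each date appears.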

Now, I mentioned vectorization, which uses implicit multithreading. A separate type of parallelization is explicit multiprocess parallelization using the Parallel Computing Toolbox.

And we have, in addition to the Parallel Computing Toolbox, which enables you to parallelize on different cores on your computer or using a GPU, we also have the ability to parallelize over clusters and grids using the Distributed Computing Server, or the new name of it, which is the MATLAB Parallel Server.

I would suggest that when you do that, you try and control the number of workers that are assigned in the parallel pool to fit your specific needs. By default, MATLAB launches a pool which has a number of workers, depending on the number of actual cores that you have on your machine, assuming you're using a local cluster.

But in certain cases, you might want to modify that. For example, if you have a heavy I/O intensive process or program, then you might want to launch more workers than cores in order to process the I/O in parallel. Because while the I/O is waiting for the data to be sent or to be received, the CPUs are idle.

On the other hand, using more workers entails additional setup time to create the workers, additional memory that each of the workers use, and broadcast variables, which are being sent to the workers. So there are overheads. So you should only do that in certain cases.
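A sketch of requesting a specific pool size is shown below. It requires Parallel Computing Toolbox, and the choice of twice the default worker count is a hypothetical value for an I/O-bound job, not a recommendation from the talk.

```matlab
% Requires Parallel Computing Toolbox.
cluster = parcluster();                   % default profile, typically the local machine
defaultWorkers = cluster.NumWorkers;      % usually the number of physical cores
wantedWorkers  = 2 * defaultWorkers;      % hypothetical choice for an I/O-bound job

cluster.NumWorkers = wantedWorkers;       % raise the limit on this cluster object
delete(gcp('nocreate'));                  % close any pool that is already open
pool = parpool(cluster, wantedWorkers);   % start a pool of the requested size
```

Measuring total run time with a few different pool sizes is the only reliable way to tell whether the extra workers pay for their startup and memory overhead.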

When you do have such multiple workers, try to reduce the amount of data which is broadcast to the workers. This communication time between the main MATLAB process and the workers takes a lot of time. When you reduce the amount of broadcast data, it significantly improves the overall processing time.
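One common way to cut broadcast data is sketched below, assuming Parallel Computing Toolbox. The struct "config" and its fields are hypothetical. When a parfor body reads a field of a struct, the whole struct is sent to every worker, so copying the needed field into a small variable first keeps the large field off the wire.

```matlab
% Requires Parallel Computing Toolbox. All names are hypothetical.
config.table  = rand(2000);     % large field that the loop never uses
config.weight = 0.5;            % the only value the loop needs
n = 100;
result = zeros(1, n);

% Using config.weight inside the loop would broadcast all of config.
% Extracting the scalar first means only that scalar is broadcast.
weight = config.weight;
parfor k = 1:n
    result(k) = weight * k;
end
```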

Finally, when you try and use parallelization, you should take into account Amdahl's law. If you're not familiar with it, you should look it up. It's available on Wikipedia or whatever resource you'd like to look it up in. And, basically, this explains why, when you have a set of four workers, for example, you should not expect a fourfold speedup of your program. It is something which is simply unrealistic for all sorts of inherent reasons, which have nothing at all to do with MATLAB.

So if you have a zillion cores running a zillion workers, and your program only speeds up by a factor of 2, don't come to MATLAB or to MathWorks and complain, because the problem is inherent. OK. And nothing can be parallelized a zillion times.
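For reference, Amdahl's law gives the best-case speedup on n workers as 1 / ((1 - p) + p/n), where p is the fraction of run time that can be parallelized. The short sketch below evaluates it for a hypothetical p of 0.9.

```matlab
% Amdahl's law: speedup(n) = 1 / ((1 - p) + p/n)
p = 0.9;                            % hypothetical parallelizable fraction
n = [2 4 8 16 1e6];                 % worker counts to compare
speedup = 1 ./ ((1 - p) + p ./ n);  % best-case speedup for each worker count
limit = 1 / (1 - p);                % ceiling as n grows without bound
fprintf('%8g workers: %.2fx\n', [n; speedup]);
```

With these numbers, four workers give roughly a 3.08x speedup, and no number of workers can exceed 10x, because the serial 10% of the program always remains.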

Moving on to graphics, there are a bunch of things that we can do to improve the performance of a graphics program. The first thing that any of you should do when you check your computer is to install the latest graphics driver. It's a very simple operation. And it's surprising how effective it is.

If your computer doesn't use the latest graphics driver, it is possible that MATLAB cannot recognize it. And, therefore, instead of using the hardware acceleration which is available in the card, it would fall back to using a software emulation of OpenGL, which is much slower.

By simply updating the driver of your graphics card to the latest version from the card manufacturer, you enable MATLAB to use hardware acceleration, which is much faster. I'll show examples of the other four bullets in a moment, so let's jump to the example.

This is a real-world example of a program created by a group headed by Professor Brendan Meade at Harvard University's Crustal Dynamics research lab. And they were trying to analyze tectonic movements based on numerous sensors that recorded GPS movements around the world.

And so there was a ton of data that needed to be analyzed in various different slices-- velocity, distance, and that sort of thing. And here we're just focused, in this example, on the area around Australia as you can see. And even in that small portion of the world, there are numerous vectors and arrows and pointers and stuff like that.

And it turned out that because of this, the graphics became so slow that everything came to a standstill. When anyone checked on one of the check boxes, for example, to add a new layer of data, they had to wait a few minutes until things became available. By the way, there is a link there for additional details in case anyone wants to get more information.

So the original implementation in this particular case was simply to have a line of beginning and ending longitudes and latitudes being displayed, which is a typical naive implementation.

And this took about half a minute in this particular case. One of the reasons was that it did it for the entire world, although we're only displaying the area around Australia. So the natural solution to that would be to limit the display to only those lines in the area shown in the axes, which is what is done in this faster code, which did run faster.

Finally, an even better solution would be to modify the code so that instead of having thousands of different line components which are being displayed separately, we only have a single line object which is interspersed with NaN values. And this displays in exactly the same way but much, much faster, as you can see here.
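The two ideas, plotting only what falls inside the visible area and merging all segments into one line object, can be sketched together as follows. The data is randomly generated and every variable name is hypothetical. Segments are filtered by their start point only, which is a simplification.

```matlab
% Hypothetical data: n short segments from (lon1,lat1) to (lon2,lat2).
n = 5000;
lon1 = 100 + 70*rand(n,1);
lat1 = -60 + 60*rand(n,1);
lon2 = lon1 + randn(n,1);
lat2 = lat1 + randn(n,1);

figure;
ax = axes;
xlim(ax, [110 155]);                 % visible region only
ylim(ax, [-45 -10]);

% Keep only the segments that start inside the visible region
xl = xlim(ax);
yl = ylim(ax);
inView = lon1 >= xl(1) & lon1 <= xl(2) & lat1 >= yl(1) & lat1 <= yl(2);
m = nnz(inView);

% Interleave start, end, NaN so one line object draws m separate segments:
% [x1start x1end NaN x2start x2end NaN ...]
x = [lon1(inView), lon2(inView), nan(m,1)]';
y = [lat1(inView), lat2(inView), nan(m,1)]';
hLine = line(ax, x(:), y(:));        % a single graphics object
```

MATLAB does not draw through NaN points, so each NaN acts as a break between consecutive segments while the renderer only has one object to manage.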

When we update the graphics, instead of simply having a loop that plots the data and calls "drawnow" to refresh the display, it is faster to update the x and y data of the displayed line items. So updating the properties of existing data is always faster than clearing the axis and redisplaying the entire data, much faster.

In addition to that, instead of using drawnow within each loop iteration, we should either put the drawnow action at the end of the loop, so that it only refreshes the display after all of the updates have been completed, or if you want to update after each iteration, use the "limitrate" option in order to ensure that drawnow doesn't run too often.
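A minimal sketch of both update tips, using a made-up animated sine wave: the line is created once, only its YData property changes inside the loop, and "drawnow limitrate" caps how often the screen is refreshed.

```matlab
% Hypothetical animation: shift a sine wave over 200 steps.
x = linspace(0, 2*pi, 500);
figure;
hLine = plot(x, sin(x));                 % create the line object once
for k = 1:200
    set(hLine, 'YData', sin(x + k/20));  % update the existing object in place
    drawnow limitrate                    % skips refreshes that come too soon
end
drawnow                                  % make sure the final state is shown
```

With "limitrate", some intermediate frames are simply never drawn, which is usually acceptable for progress displays and animations.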

To conclude, I've just shown a few very simple examples of how, with very small modifications to the code, we can dramatically improve the runtime of MATLAB programs. And so we cannot really say that there is a problem with MATLAB. The problem is very often in our program, which is suboptimal-- not optimized for speed.

We as engineers tend to invest a lot of time in making sure that our program has all the functionality and accuracy that we wish it to have. But sometimes we don't invest even a fraction of that time to ensure that it also runs as quickly as we need it to. And if we just took a small amount of time to invest in that, things could be much different.

So to leave you with a parting thought, if you have a tool and you're misusing it, then the problem is not with the tool. Perhaps it's with the misuse. So with that, I'd like to thank you for your attention.

I'll still be around here for today, in case anyone wants to talk with me. I'll be happy to answer questions. I won't be here tomorrow. So catch me here today. I should be pretty recognizable, I should think. [CHUCKLES] Thank you.
