Efficient data structures for object arrays
I have a vector class which I'll sketch out below
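classdef VectorClass < handle
    properties
        coordinates (1,3) double
    end
    methods
        function this = VectorClass(coords)
            this.coordinates = coords;
        end
        function addVector(this, otherVec)
            % Add another vector's coordinates to this one, in place
            this.coordinates = this.coordinates + otherVec.coordinates;
        end
    end
end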
So far so good... If I have a couple of vectors, I can play with them nicely. If I have a lot of vectors, though, things don't go so well...
for ii = 1:100000
    vec(ii) = VectorClass(rand([1,3])); % Make loads of vectors
end
vecOther = VectorClass([1,1,1]);
vec(:).addVector(vecOther); % Extend all vectors. Takes ages.
% NB: I've done various bits of subsref and subsasgn overloading to make
% this kind of thing possible without loops
If I were just doing procedural programming, I could take full advantage of MATLAB's vectorisation and do something like:
vec = rand(10000,3);
vecOther = [1,1,1];
vecNew = vec + vecOther; % takes about a nanosecond
But doing it with objects is not very efficient, presumably because the data is scattered about all over the memory.
One way I can think of to improve this would be to have a separate manager class that defines a storage array for all vectors, something like this:
classdef VectorManager < handle % Singleton class
    properties
        coordsStorage (:,3) double
        coordsPointers (:,1) VectorClass
    end
    methods
        function rowIndex = assignStorage(this,vecObj)
            rowIndex = % Return the index of the next free row in .coordsStorage
            this.coordsPointers(rowIndex) = vecObj;
        end
        function setCoords(this,index,newCoords) % method name assumed
            this.coordsStorage(index,1:3) = newCoords;
        end
        function coords = getCoords(this,index)
            coords = this.coordsStorage(index,1:3); % Can pull out as many rows as needed, all at once
        end
    end
end
So the coordinates for each vector object are stored in rows in coordsStorage, and the handles associated with each row are stored in coordsPointers. When instantiated, each VectorClass object is assigned a row in the coordsStorage array provided by VectorManager. When you want to do operations on many vectors, you can get a logical mask to the relevant rows in coordsStorage and everything will run much faster.
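To make the masked update concrete, here is a minimal sketch of what I mean. It uses a plain matrix as a stand-in for VectorManager.coordsStorage, and the selection rule (x-coordinate greater than 2) is purely hypothetical; any logical mask over the rows would do.

```matlab
% Stand-in for VectorManager.coordsStorage: one vector per row.
coordsStorage = zeros(5,3);
coordsStorage(:,1) = (1:5)';      % rows are [1 0 0], [2 0 0], ..., [5 0 0]

% Hypothetical selection: rows whose x-coordinate exceeds 2.
mask = coordsStorage(:,1) > 2;

% One vectorised update over all selected rows.
% Relies on implicit expansion (R2016b or later), as in vec + vecOther above.
vecOther = [1,1,1];
coordsStorage(mask,:) = coordsStorage(mask,:) + vecOther;
```

Rows 1 and 2 are left alone, and rows 3 to 5 each get [1,1,1] added in a single operation, with no loop over objects.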
However, the problem now comes that we need to do some kind of memory management for coordsStorage. VectorClass objects are being created and destroyed, and eventually coordsStorage will fill up. To get round this, when VectorClass objects are destroyed, we need to free up that row so that it can be used by other VectorClass objects. Effectively we are running a kind of heap, so we need garbage collection, heap compaction and so on. Although I can see how this could be made to work, it all sounds pretty complicated. There must be an easier way.
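The simplest version of the row recycling I can picture is a free list rather than a full compacting heap. The sketch below is only an assumption about how that might look: RowPool, assignRow, releaseRow and freeRows are all hypothetical names, the doubling growth policy and the minimum of 16 rows are arbitrary, and marking freed rows with NaN is just one way to make stale data obvious.

```matlab
classdef RowPool < handle
    % Hypothetical helper: hands out row indices and recycles freed ones.
    properties
        coordsStorage (:,3) double = zeros(0,3)
        freeRows (1,:) double = zeros(1,0)   % stack of reusable row indices
    end
    methods
        function rowIndex = assignRow(this)
            if isempty(this.freeRows)
                % No free rows left: double the capacity (at least 16 rows).
                oldN = size(this.coordsStorage,1);
                newN = max(16, 2*oldN);
                this.coordsStorage(oldN+1:newN,:) = 0;
                % Push the new rows so the lowest index is on top of the stack.
                this.freeRows = newN:-1:oldN+1;
            end
            rowIndex = this.freeRows(end);   % pop a free row
            this.freeRows(end) = [];
        end
        function releaseRow(this, rowIndex)
            % Mark the row as unused; the next owner is expected to overwrite it.
            this.coordsStorage(rowIndex,:) = NaN;
            this.freeRows(end+1) = rowIndex; % push it back for reuse
        end
    end
end
```

With this, a VectorClass constructor would call assignRow and its delete method (the handle class destructor) would call releaseRow, so a freed row is handed to the next object that asks. The storage never shrinks and never needs compacting, at the cost of gaps that any vectorised operation has to mask out.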
Does anyone out there have any experience with this kind of thing, or has anyone done something similar in the past? Essentially, I want the functionality of the first class (VectorClass), where each object has a unique handle which can be passed about and operated on like it was a single object, but with the speed of MATLAB's vectorisation.