You can never achieve true real time, but you can get down to fairly low per-frame times (around 30 ms or so), depending on what you want to do.
So what exactly do you want to do? You can check the Computer Vision System Toolbox description for some nice examples of how to work with video streams and modify them, e.g. detecting and tracking a face in a live video feed.
The more you want to process, the higher your computing time gets.
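
If you want to see how your own processing eats into that budget, a rough way is to time each frame's work and compare it against a target such as 30 ms. The sketch below is only an illustration under assumptions: it is written in Python rather than MATLAB, it uses a synthetic frame instead of a camera, and the stage functions (`invert`, `box_blur_rows`, `threshold`) and the helper `mean_frame_time_ms` are made-up stand-ins for whatever you actually run per frame. In MATLAB you could do the same kind of measurement by wrapping your per-frame code in `tic`/`toc`.

```python
import time


def make_frame(width=160, height=120):
    """Build a synthetic grayscale frame (list of rows) as a stand-in for a camera image."""
    return [[(x * y) % 256 for x in range(width)] for y in range(height)]


def invert(frame):
    """Hypothetical stage 1: invert every pixel."""
    return [[255 - p for p in row] for row in frame]


def box_blur_rows(frame):
    """Hypothetical stage 2: 3-tap horizontal blur with clamped borders."""
    out = []
    for row in frame:
        n = len(row)
        out.append([
            (row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) // 3
            for i in range(n)
        ])
    return out


def threshold(frame, level=128):
    """Hypothetical stage 3: binarize the frame at the given level."""
    return [[255 if p >= level else 0 for p in row] for row in frame]


def mean_frame_time_ms(stages, frame, repeats=10):
    """Run all stages on the frame `repeats` times; return mean time per frame in ms."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = frame
        for stage in stages:
            result = stage(result)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / repeats


if __name__ == "__main__":
    frame = make_frame()
    budget_ms = 30.0  # assumed target, taken from the "30 ms or so" figure above
    pipelines = [
        ("invert only", [invert]),
        ("invert + blur", [invert, box_blur_rows]),
        ("invert + blur + threshold", [invert, box_blur_rows, threshold]),
    ]
    for name, stages in pipelines:
        ms = mean_frame_time_ms(stages, frame)
        verdict = "within" if ms <= budget_ms else "over"
        print(f"{name}: {ms:.2f} ms per frame ({verdict} the {budget_ms:.0f} ms budget)")
```

The absolute numbers depend entirely on your machine and frame size, so treat them as relative: the point is just that every stage you add pushes the per-frame time up, and you can see which stage costs the most.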