This is a cool piece of engineering.
The prototypical control system looks like this:
Your input, on the left, is called the "set point". In my case that's a target in the visual field. The "plant" is what you're trying to control. In my case that's an eyeball. The "feedback elements" report where the eye currently is, and the "error detector" simply subtracts the current value from the desired value to get an error signal. The controller uses the error signal to drive the plant (eyeball) to the set point (desired location).
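To make the loop concrete, here's a minimal sketch in Python. The proportional controller, the gain of 0.5, and the "eye moves by exactly the commanded amount" plant are all my assumptions for illustration; the diagram doesn't commit to any of them.

```python
def run_feedback_loop(set_point, steps=50, gain=0.5):
    """Drive a toy 'eyeball' toward set_point with a proportional controller."""
    position = 0.0                      # plant state: current eye position
    history = []
    for _ in range(steps):
        measured = position             # feedback elements: report where the eye is
        error = set_point - measured    # error detector: desired minus current
        command = gain * error          # controller: proportional action (assumed)
        position += command             # plant: eye moves by the commanded amount (assumed)
        history.append(position)
    return history


trace = run_feedback_loop(set_point=10.0)
```

With these toy numbers the error halves on every step, so the eye settles on the target without any model at all. That only works because nothing in this loop is delayed.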
Only... Scruffy needs a model of the environment, because that's what's driving the eye movements: retinal imagery. So there's something missing in the diagram, which is the retina. And the retina has a delay associated with it. So we'll put the retina into the diagram. Also, that thing called a controller has to have a model inside it. And we're going to leave it very general, without specifying what it is or how it's formed. So the new diagram looks like this:
The retina is on the right and it has a delay of k time steps. The model is the part in blue; it has replaced the controller. It has a delay that matches the retinal delay. The actual model lives in the part called G-hat, and we're going to let that self-organize, and once that's done we're going to control it from outside. So C is the new controller, and G-hat is a model of the action of C, as determined by two error signals, one delayed from the plant, and another delayed from the model. This control loop will minimize both errors.
Control engineers will recognize this architecture as a Smith predictor, and that's exactly what it is.
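Here's a rough sketch of what the Smith predictor buys you. Everything specific in it is an assumption on my part: the plant is a plain integrator, G-hat is hard-coded as a perfect copy of the plant (no self-organizing here), C is proportional with gain 0.5, and k must be at least 1. The `use_predictor=False` branch is a naive loop that acts directly on the stale retinal signal, included only for comparison.

```python
from collections import deque


def simulate(set_point, k=5, gain=0.5, steps=200, use_predictor=True):
    """Toy loop: integrator 'eye', retina that reports position k steps late."""
    eye = 0.0                                  # plant state
    model = 0.0                                # G-hat state (assumed perfect copy of the plant)
    retina = deque([0.0] * k, maxlen=k)        # plant output delayed by k steps
    model_delay = deque([0.0] * k, maxlen=k)   # matching delay on the model output
    trace = []
    for _ in range(steps):
        seen = retina[0]                       # eye position as it was k steps ago
        if use_predictor:
            # Smith predictor: undelayed model output, corrected by the
            # mismatch between delayed plant and delayed model.
            feedback = model + (seen - model_delay[0])
        else:
            feedback = seen                    # naive loop: act on stale data
        command = gain * (set_point - feedback)  # C: proportional controller (assumed)
        retina.append(eye)                     # push current values into the delay lines
        model_delay.append(model)
        eye += command                         # plant update
        model += command                       # model update (same dynamics by assumption)
        trace.append(eye)
    return trace


smith = simulate(10.0)
naive = simulate(10.0, use_predictor=False)
```

With a perfect model the two delayed signals cancel, so C effectively sees an undelayed plant and `smith` converges on the set point. The naive loop with the same gain and delay overshoots and keeps oscillating with growing amplitude. A real G-hat won't be perfect, which is what the second error signal is for.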
Now for the final trick: we're going to slide the delays (both of them), so the point in time the loop optimizes is not t, but rather t+z, where z starts at k/2 and then gets optimized. In other words, the self-organizing system will find the optimal control point. Since the sign is positive, z represents a prediction, and by sliding back and forth we're optimizing the time of the results against the time of the prediction. When they perfectly match, we have a robust optimal control system.
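One way to read the sliding step, as a toy only: treat z as a look-ahead horizon, score each candidate z by how far the prediction lands from what actually happened, and hill-climb from k/2. The constant-velocity target, the two-sample velocity estimate, the unit step size, and the function names are all hypothetical choices of mine, not part of the architecture above.

```python
def prediction_error(z, k, steps=100, velocity=0.3):
    """Mean squared gap between a z-step-ahead prediction and the actual result."""
    total = 0.0
    for t in range(k + 1, steps):
        seen_now = velocity * (t - k)        # retina: target position k steps ago
        seen_prev = velocity * (t - 1 - k)   # previous retinal sample
        predicted = seen_now + z * (seen_now - seen_prev)  # extrapolate z steps ahead
        actual = velocity * t                # where the target really is at time t
        total += (predicted - actual) ** 2
    return total / (steps - k - 1)


def tune_offset(k, step=1.0):
    """Hill-climb z from k/2 until neither neighbour lowers the error."""
    z = k / 2
    best = prediction_error(z, k)
    while True:
        candidates = [(prediction_error(z + d, k), z + d) for d in (-step, step)]
        err, z_new = min(candidates)
        if err >= best:
            return z
        z, best = z_new, err


z_opt = tune_offset(k=6)
```

In this toy the search starts at z = 3 and settles at z = 6, i.e. the prediction horizon ends up matching the retinal delay. That's a property of the constant-velocity target I picked; where z lands in the real self-organizing system is a separate question.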