Happy New Year everybody. I spent some time over the holidays away from California and not working, which was nice. Upon return though, there was a large box full of robot parts at the front door, and I've been trying to make room for it.
So I'm gonna stop talking about this for a while, and just "do". There's a lot of work in it. But I'll share this one interesting thing with you, check this out -
So a human eye is about an inch across. It weighs about 8 grams. To move it we attach six motors: three pairs in push-pull for horizontal, vertical, and oblique movement.
The first part of the requirement is that it has to cover the field of vision, which looks like this: about 200 degrees horizontally and 130 degrees vertically. The central 120 degrees or so of the horizontal field handles depth perception, but the fovea covers only 6 degrees of that.
The eye has to move very fast! Humans can do 3 saccades per second, with a maximal velocity of 700 degrees/sec or so. So these servo motors have to be very tight!
It turns out, the fastest commercially available servo motor BARELY meets this requirement. The PTK 7308MGD can handle about 900 degrees/sec. It runs on 8.4 volts, which is convenient. It can easily move 8 grams, and we have six of them.
So here's the idea: you're walking down the street and suddenly you see a shadow in the extreme periphery of your visual field. It causes you to turn your eyes and then your head. The initial saccade might cover 100 degrees, and it has to be completed and accurate within 1/3 second.
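We can sanity-check that timing budget with a quick sketch. Assume a symmetric accelerate-cruise-decelerate (trapezoidal) velocity profile; the acceleration figure below is an illustrative assumption, not a datasheet number:

```python
def min_saccade_time(angle_deg, max_vel_deg_s, accel_deg_s2):
    """Minimum time to sweep angle_deg with a symmetric
    accelerate-cruise-decelerate (trapezoidal) velocity profile."""
    # Angle covered while ramping up to max velocity and back down
    ramp_angle = max_vel_deg_s ** 2 / accel_deg_s2
    if angle_deg <= ramp_angle:
        # Triangular profile: never reaches max velocity
        return 2 * (angle_deg / accel_deg_s2) ** 0.5
    cruise_angle = angle_deg - ramp_angle
    return 2 * max_vel_deg_s / accel_deg_s2 + cruise_angle / max_vel_deg_s

# 100-degree saccade, servo capped at 900 deg/s,
# assumed acceleration of 20,000 deg/s^2
t = min_saccade_time(100, 900, 20000)
print(f"{t:.3f} s")  # well inside the 1/3-second budget if the assumption holds
```

If the servo really can accelerate that hard, the 100-degree emergency saccade fits in the 1/3-second window with room to spare; the margin shrinks fast as the acceleration assumption gets weaker.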
Our reflexes are arranged so the eyes move first, and then to foveate a spot in the extreme periphery the head has to turn. WHILE the head is turning the eyes remain focused on the target (this is the vestibulo-ocular reflex).
This is do-able with servo motors and Arduinos, but remember, we're going to train this thing like a baby. There is NO programming involved, only self organization. Humans have stretch receptors and pain receptors in the eyes, so if a baby tries to move its eyes too far or too fast it'll stop; it's self-limiting. We have to put very careful controls on our robot version, to keep it from going off the rails, so to speak.
I tried doing the control systems theory on this stuff. It's not easy. The plant requires a high degree of stability to handle the full range of motion. So we need to introduce some nonlinearity, maybe sigmoid curves so the deviation can't exceed 100 degrees, that kind of thing.
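A minimal sketch of that kind of saturating nonlinearity, using tanh as the sigmoid (the 100-degree limit is the only parameter, and the function name is mine, not from any library):

```python
import math

def limited_command(raw_deg, limit_deg=100.0):
    """Squash a raw position command through a tanh so the output
    can never exceed +/- limit_deg, no matter how large the input.
    Near zero the curve is nearly linear, so small commands pass
    through almost unchanged."""
    return limit_deg * math.tanh(raw_deg / limit_deg)
```

The point is that even if the self-organizing network emits a wild command during training, the plant physically cannot be driven past the mechanical limit.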
Before we can train the gaze reflex, we have to train the opposing oculomotor pairs so they work smoothly with each other. This is done using the spindle (stretch) receptors in a closed feedback loop with both sets of drivers. This part is pretty easy because the feedback is direct - if one motor moves, the other one senses it and reports it. Piece o' cake for a self-organizing neural network.
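Here's a toy version of that closed loop, with everything simplified: proportional corrections stand in for the network, and the "stretch" signal is just the negative of the agonist's position. All names and gains are mine, for illustration only:

```python
def settle_pair(agonist_target, gain=0.5, steps=50):
    """Toy closed loop for one push-pull pair: the agonist chases its
    target while the antagonist is driven by the stretch it senses,
    so the pair settles into equal-and-opposite positions."""
    agonist, antagonist = 0.0, 0.0
    for _ in range(steps):
        # agonist moves toward its commanded position
        agonist += gain * (agonist_target - agonist)
        # antagonist yields by the stretch the agonist imposes on it
        antagonist += gain * (-agonist - antagonist)
    return agonist, antagonist
```

Run it and the pair converges: the agonist lands on its target and the antagonist mirrors it, which is exactly the coordination the real stretch-receptor loop has to learn.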
The harder part is the fixation reflex, which requires feedback from visual cortex. A very VERY crude version of it could be done with the brainstem alone, but it would not be very accurate (the eyes would end up "in the vicinity of" the shadow but not "on" it). Remember, the SAME system that does these 100 degree emergency saccades, also handles precision reading. One could argue it's a different "kind" of saccade because one occurs within the fovea while the other occurs in the periphery, and it is true there are two independent channels (called X cells and Y cells in the retina, X cells handling detail near the fovea and Y cells handling motion in the periphery).
Which leads us to a whole 'nother discussion about how fast the VIDEO has to be. 30 fps is not enough; to calculate motion with any accuracy you want something closer to 200 fps. The motion capture people mostly use 120 fps, and if you've played with motion capture you know the limitations: moving outlines come out "pretty" good but not perfect.
At 120 fps, 1/3 of a second will give us 40 frames, and in that time we have to turn every pixel into a Y cell. The parallel processing in the aggregate of Y cells will give us VERY good resolution. Basically we're doing hard-wired convolutions on every pixel (using CUDA), turning every pixel into a filter. For every pixel, we get all possible information for that point, including the velocity of the changes in luminance and color. If we have this information for neighboring pixels, we can completely determine the motion and direction of motion at every point.
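A minimal sketch of the per-pixel idea: from the spatial and temporal luminance gradients at each point, the brightness-constancy equation (Ix·u + Iy·v + It = 0) recovers the motion component along the gradient direction, the so-called normal flow. NumPy stands in here for the CUDA kernels; the function name and the 120 fps default are assumptions for illustration:

```python
import numpy as np

def normal_flow(frame_prev, frame_next, dt=1 / 120):
    """Per-pixel 'Y cell' sketch: estimate, at every pixel, the
    component of motion along the luminance gradient, from two
    consecutive frames dt seconds apart. Returns (u, v) in
    pixels/second."""
    # Spatial gradients (axis 0 = rows = y, axis 1 = cols = x)
    Iy, Ix = np.gradient(frame_prev.astype(float))
    # Temporal gradient
    It = (frame_next.astype(float) - frame_prev.astype(float)) / dt
    # Squared gradient magnitude; epsilon avoids 0/0 in flat regions
    mag2 = Ix ** 2 + Iy ** 2 + 1e-9
    u = -It * Ix / mag2
    v = -It * Iy / mag2
    return u, v
```

Only the gradient-parallel component is recoverable from a single pixel (the aperture problem); combining neighboring pixels whose gradients point in different directions is what pins down the full motion vector, which is exactly the "neighboring pixels" point above.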
'Nuff said. It's a challenge. I need a name for my robot. What should I call it?