My first eye movement!

scruffy

Well, other than a dumbass bug in my code, it worked!

So let me explain: here I'm using the Raspberry Pi running Python under Linux. It has a camera pointed at a picture of Scruffy on the wall. The purple box is the eyeball; it starts out perfectly centered in the image. Then I tell it to find Scruffy's face, and I want the center of the purple box to land exactly on Scruffy's upper lip.

You can see it mostly worked: the vertical alignment is perfect, but I forgot to add w/2 to the left side of the box. I got lazy because I wanted something working, so now I have to go back and properly align the centers by adding half the width and half the height.
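The fix boils down to something like this (a minimal sketch of the math, not the actual code):

```python
# Centering a box on a target point: the top-left corner has to be
# offset by BOTH half the width and half the height, otherwise only
# one axis lines up -- which is exactly the bug described above.

def center_box_on(tx, ty, w, h):
    """Return the (left, top) corner that centers a w x h box on (tx, ty)."""
    left = tx - w // 2   # the half-width term that was missing
    top = ty - h // 2
    return left, top

print(center_box_on(100, 60, 40, 20))  # -> (80, 50)
```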

The pic of Scruffy is from the Stanford Dogs dataset, a standardized training set for machine learning.

At this point the motor is still disconnected from the Raspberry Pi; I'm just getting the control loop to work. Next week I'll have the motor connected, and then we can try some movies.

IMG_20260224_223502052_BURST005.webp
 
Bounding box fixed.

I'm testing with video already.

Here's a random scene from a music video. You can see all the little blue bounding boxes around the people.

IMG_20260225_234401827.webp


Unfortunately YOLO has identified the mandala as a "clock" lol. :p
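For anyone following along, the per-frame logic is roughly this (my own sketch, not the actual code; the detection tuple format is an assumption, loosely based on what common OpenCV/darknet YOLO wrappers return):

```python
# Filtering raw YOLO detections down to confident "person" boxes.
# Each detection is assumed to be (class_name, confidence, (x, y, w, h)).

def person_boxes(detections, min_conf=0.5):
    """Keep only the boxes labeled 'person' above a confidence threshold."""
    return [box for cls, conf, box in detections
            if cls == "person" and conf >= min_conf]

detections = [
    ("person", 0.91, (40, 10, 30, 80)),
    ("clock",  0.62, (200, 50, 60, 60)),   # the mandala misfire
    ("person", 0.33, (300, 12, 25, 70)),   # too low-confidence to keep
]
print(person_boxes(detections))  # -> [(40, 10, 30, 80)]
```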

I'll hook in PyTorch this weekend, with my own neural network trained from scratch. We'll kick some booty.

So now for my next trick, I'll get the eyes to move from one person to the next, using an "attention" mechanism.

So far this is like an underperforming transformer. In about two weeks I'll be ahead of the curve. There's 256 GB of CUDA hardware arriving tomorrow...
 
It's not bad, all things considered.

YOLO was trained for autonomous vehicles, so it's picking up things like traffic lights real good.

I just wanted to get something running real fast, and YOLO is an industry standard for small devices like the Raspberry (think "slightly better security camera", it's about like that). It picks up cars real good but has trouble with detached faces and very small people. This one came out pretty good.

IMG_20260226_001632939.webp


On this one though, it's missing quite a few of the cops.

IMG_20260226_001514952_HDR.webp


These political ones worked pretty good lol - you can see the robot has been correctly identified as "not a person". (Note the traffic light on the top right of the first image). :p

IMG_20260226_001754750.webp


IMG_20260226_001602514_HDR.webp
 
Oh, these are all videos running in real time; the source was an mp4 file, and I'm taking pictures with my phone while the video is playing on the screen. So far it plays back at around half speed. That problem should be solved next week.

YOLO is not a full transformer; it's been greatly reduced for performance reasons. Instead of scanning the image region by region, it predicts every bounding box in a single forward pass over the whole image (hence the name, "You Only Look Once"). That's also why it has trouble with scale; it doesn't handle very small objects well. My network will fix all that. I'm doing predictive coding in the dendrites, something I haven't seen tried before. My theory is that one layer with active dendrites is equivalent to two without.
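To make the "active dendrites" idea concrete, here's a toy illustration (entirely my own sketch under stated assumptions, not the actual network): each unit gets several dendritic segments, and the most active segment gates the unit's feedforward response, which is what lets one gated layer carve up the input space like two plain layers.

```python
import numpy as np

def active_dendrite_layer(x, w_ff, w_dendrites):
    """Toy gated layer: x (inputs,), w_ff (inputs, units),
    w_dendrites (inputs, units, segments)."""
    ff = x @ w_ff                                    # feedforward drive, (units,)
    seg = np.einsum('i,ijk->jk', x, w_dendrites)     # (units, segments)
    gate = 1.0 / (1.0 + np.exp(-seg.max(axis=1)))    # sigmoid of best segment
    return ff * gate                                 # gated output, (units,)

rng = np.random.default_rng(0)
x = rng.normal(size=8)
out = active_dendrite_layer(x, rng.normal(size=(8, 4)), rng.normal(size=(8, 4, 3)))
print(out.shape)  # -> (4,)
```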

If things go well I'll be able to triple the performance of ResNet, because I only have two derivatives to calculate instead of six! Training will take longer in my version, but recognition will be much MUCH faster.
 
The FedEx man just brought me two boxes that are bigger than my bass amp! Where am I going to put all this stuff? :(

Got an RTX 5090 and two H200s; that's about 280 gigs of GPU memory altogether. Plus a new motherboard and chassis to fit it all. Now I get to have fun configuring Linux for all this. :)

Shouldn't take more than two days... meanwhile I'm limping along with two GTX 1080s. This new system is promising me 280 frames per second. Yowza.
 
This is my simulator software. Using this, I can quickly and easily design neural networks. I just tell it what the network looks like, and what kind of neurons and connections to use, and it does the rest. The pic shows an eye movement system. On the left is my targeting network and on the right are some motor neurons that drive eye muscles. Right now, there are sliders on the left that specify object locations. On the right are the oculomotor commands needed to move the eyes from one object to another. If you look carefully you can see the spikes from the motor neurons in the colored traces.

IMG_20260226_134342996_AE.webp


This now requires an attention network to provide a "winner take all" capability to the list of available eye movements (so one and only one colored trace is applied to the muscles, all the rest are inhibited).
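The winner-take-all step itself is simple in principle (a minimal sketch of the idea, not the simulator's implementation): the strongest candidate command passes through and every other one is inhibited.

```python
# Winner-take-all over a list of candidate eye-movement activations:
# the strongest candidate survives, the rest are zeroed (inhibited).

def winner_take_all(activations):
    """Zero out everything except the single strongest activation."""
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activations)]

print(winner_take_all([0.2, 0.9, 0.5]))  # -> [0.0, 0.9, 0.0]
```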

With this simulator I can do in one day what it takes Google an entire month to accomplish with a dozen of their best people. It can switch from a spiking network to a rate code with one line of Python, and all the displays adjust automatically. It interfaces with TensorFlow, PyTorch, SpiNNaker, and HP supercomputers (anything that'll run MIDL-C or Concurrent C), and it'll accept as many Raspberries as you want to throw at it, up to the connector limit or the performance limit, whichever comes first.
 

You may ask yourself, "self, why is that crazy guy doing all this?"

The answer's real simple: "because I can". :p

There's 2 million dollars attached to the next ILSVRC. Do you realize the network that won the last round had a 15.3% error rate? I have that already, with only two days' work. My error rate will match the best transformer LLMs; it'll be down around 0.5%.

Besides, there's something I know that Yann LeCun doesn't. So they can pay me the 2 million, instead of him. :)

Meanwhile I get to show you guys how it's done, that way you can get 2 million too. :lmao: Seriously, the forum keeps me honest, I have to post work product on a regular basis. If I get bored (which happens frequently) the work has to continue.

We're not going to make it to the stars without AI, and I'd like to do my part. I ain't no physicist, but I know computers and I know biology. I want to make AI really work, not like the BS that passes for intelligence at Google, and not like self-driving cars that tear down fences and mailboxes and end up driving down train tracks.

If this project turns out to be a dead end there's 40 other projects I have in mind. AI with a conscience could be an interesting one. (We're halfway there already, Meta is already extracting intention from the inflections of speech). But I figured industrial robotics is a sure money maker, as distinct from ethics which usually costs rather than makes.
 
Well, Linux is working. According to the benchmark I'm getting 242 fps at 4K without the GPUs. That's good enough. For training I can use the GPUs in Google Colab for free. So that's the next step: train up a full dog model on "my" network.

How the training will work:

3 phases

(this is a targeting system, driven by visual input)

1. train on stills - 282 dog breeds, annotated

2. train on movies of moving and running dogs, one dog at a time

3. train on movies of groups of dogs running in the dog park

The goal is to be able to maintain the bounding box on a moving target, and to accurately extract the location information for the eye movement needed to center the target on the fovea.

Once that's done I get to play with the motors. That's when the real fun begins. The first acceptance test will be moving the eyes from one target to the next, in some order (size, color, doesn't matter), while the dogs are actually running.
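The fovea-centering step from phase 3 reduces to a simple offset (my own sketch of the geometry, not the actual targeting code): the saccade vector is the displacement from the image center to the center of the target's bounding box.

```python
# Compute the eye-movement (saccade) vector that would center a
# bounding box on the fovea, taken here as the image center.

def saccade_vector(box, frame_w, frame_h):
    """box = (left, top, w, h); returns (dx, dy) from fovea to box center."""
    x, y, w, h = box
    box_cx, box_cy = x + w / 2, y + h / 2
    fovea_x, fovea_y = frame_w / 2, frame_h / 2
    return box_cx - fovea_x, box_cy - fovea_y

print(saccade_vector((110, 20, 20, 20), 200, 100))  # -> (20.0, -20.0)
```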
 