Is AI based on the "Infinite Monkey Theorem"?

Do you believe AI has or will have "super-intelligence", or the "Infinite Monkey Theorem"?

  • Super-Intelligence

    Votes: 0 0.0%
  • The Infinite Monkey Theorem

    Votes: 3 50.0%
  • Other (see my post)

    Votes: 3 50.0%

  • Total voters
    6
How do you think that will work out?

I think we're a long way away from that.

AI requires so much juice that there's no way it could compete with humans. Training an AI takes as much electricity as a small city, whereas a human brain does its learning on 40 watts or so.

The more likely outcome is a symbiotic relationship. Pretty soon humans and AI won't be able to survive without each other. There is already a division of labor; plenty of humans would go stir-crazy if the internet suddenly went down.

But that's all very nebulous. Right now we're trying to figure out why brain neurons burst and AI neurons don't. It's a very interesting line of study; for instance, you can read through papers by researchers who apparently don't know about predictive coding.



But look at the vocabulary in the last cite: "bursting could signal the presence of a new previously unattended visual stimulus" and "a burst... could serve as a wake up call that new information is arriving". These papers are very close, but they can't hit the mark, because they don't know about predictive coding and therefore can't draw the causal link.

What happens when a visual stimulus is "not being attended"? Predictions aren't being made, right? And what exactly is "a new piece of information"? Information theory says it's an error signal. What happens when a new piece of information arrives? A prediction is made. And who makes more predictions, the higher brain areas or the lower ones? Apparently, bursting is associated with predictions.

So why then does it happen during slow wave sleep? Well, what else happens during slow wave sleep? Memory consolidation! And what happens during memory consolidation? The synaptic weight matrix is being updated to make the day's changes permanent, and therefore it makes a great deal of sense that there would be a prediction-verification cycle before stamping in the results.
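To make the error-signal idea concrete, here's a minimal prediction-error loop in Python. Everything in it (the one-weight model, the numbers, the learning rate) is my own toy illustration, not anyone's published model: a single weight predicts each observation from a cue, and the mismatch, the "new information", is the only thing that drives learning.

```python
def predict_and_learn(inputs, true_gain=0.5, lr=0.3):
    """A one-weight predictive loop: the weight w tries to predict each
    observation (true_gain * x) from the cue x; the mismatch is the
    error signal, and it alone drives the update."""
    w = 0.0
    errors = []
    for x in inputs:
        observation = true_gain * x        # what actually arrives
        prediction = w * x                 # what the model expected
        error = observation - prediction   # the "new information"
        w += lr * error * x                # error-driven weight update
        errors.append(abs(error))
    return w, errors

# Hypothetical stream of cues; the relationship to be learned is fixed.
inputs = [1.0, -0.8, 0.9, 0.7, -1.0] * 5
w, errors = predict_and_learn(inputs)
print(w)                      # converges toward the true gain of 0.5
print(errors[0], errors[-1])  # early surprise is large, late surprise tiny
```

Notice that once the input is fully "expected", the error signal goes quiet, which is exactly why a burst on a *new* stimulus is such a natural fit for predictive coding.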

Stuff like this makes the AI better, and humans do that because we need the help. We can't solve protein folding on our own, it takes too damn long and it's hard to visualize. There are thousands of people dying every day because humans aren't up to the task of efficiently analyzing proteins. We give the AI this capability because we need the results. And imagine if we could reduce the AI's training time from a year to a day, just by replacing back propagation by predictive coding so one-shot learning becomes possible.

There are many ethical issues with AI. A dead battery means, literally, death. Is it a human responsibility to keep the battery charged because the AI is "alive" somehow? Well, maybe in 500 years every AI will have a Mr Fusion Personal Energy Reactor, but until then they depend on us for what little life they have.
 
 
I think we're a long way away from that.

I would have said the same thing about cellphones or electric vehicles getting 300 miles on one charge back when I was growing up, or if you had asked me just 20 years ago about what WA is doing today.
 
I would have said the same thing about cellphones or electric vehicles getting 300 miles on one charge back when I was growing up, or if you had asked me just 20 years ago about what WA is doing today.
AI will help us get out into space.
 
We should talk about dynamics. I'll share what I know (which is more than most, but still very little).

Ultimately we'd like to understand how the brain controls chaos (criticality). The math resembles a phase change in physical chemistry, like a solid melting to liquid, or a liquid boiling to gas.

What exactly is a "phase change" in a neural network? Well, one important property of a phase change is that it's global. When water boils to steam, its temperature stays the same: it absorbs energy, and that energy goes into reconfiguring the molecules.

Our biggest clue comes from ferromagnetism, and the Ising model describes how it happens. In the magnetic phase all the little atomic magnetic dipoles align, so you have a magnet. In the disordered (paramagnetic) phase the alignment goes away and all the little dipoles point in random directions, so the substance as a whole is no longer a magnet.
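If you want to play with this, here's a toy Metropolis simulation of the 2-D Ising model. The lattice size, sweep count, temperatures, and seed are all my own arbitrary choices for illustration: below the critical temperature (around 2.27 in these units) the dipoles stay aligned; well above it, the alignment melts away.

```python
import math
import random

def ising_magnetization(T, L=8, sweeps=300, seed=1):
    """Metropolis simulation of the 2-D Ising model on an L x L grid
    with periodic boundaries. Returns |magnetization per spin| after
    the given number of sweeps, starting from the all-up state."""
    random.seed(seed)
    s = [[1] * L for _ in range(L)]
    for _ in range(sweeps):
        for _ in range(L * L):
            i, j = random.randrange(L), random.randrange(L)
            nb = (s[(i + 1) % L][j] + s[(i - 1) % L][j]
                  + s[i][(j + 1) % L] + s[i][(j - 1) % L])
            dE = 2 * s[i][j] * nb   # energy cost of flipping spin (i, j)
            if dE <= 0 or random.random() < math.exp(-dE / T):
                s[i][j] = -s[i][j]
    m = sum(sum(row) for row in s) / (L * L)
    return abs(m)

print(ising_magnetization(T=1.0))  # well below T_c: ordered, |m| near 1
print(ising_magnetization(T=5.0))  # well above T_c: disordered, |m| small
```

The jump in |m| as you sweep the temperature through the critical point is the "global" phase change the analogy rests on.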

In a neural network, the idea of "phase" is very broad. The closest we get to alignment in the population is a brain wave: basically a smooth traveling wave that goes from one end of the brain to the other. All the neurons that participate in this wave are "synchronized" (correlated) somehow; they're synced to the population activity. "Chaos" is what happens when individual neurons or groups of neurons detach themselves ("uncouple") from the population and start oscillating on their own, with perhaps a different frequency, a different phase (not to be confused with the phase we're talking about), and a different amplitude.

When you get enough of these uncouplings, the system can't support the population wave anymore and the entire network becomes a bunch of uncoupled oscillators ("dipoles pointing in random directions"). The population loses coherence, neighboring neurons become desynchronized, and at the population level the traveling wave gets replaced with noise.

To understand this, it's important to know how a brain wave is created in the first place. To help with that, I found a gem for you: Prof. Jack Cowan, who in 1972 was the very first person to describe mathematically how a brain wave is created by a population of neurons. There are no restrictions on the information being processed by those neurons; the only requirement is that there be enough excitation and enough inhibition.



The famous 1972 paper is known as "Wilson and Cowan".


I'll guide you through the piece where this population breaks symmetry and turns into two populations instead of one (we will see that there is a "phase change" exactly at the border of the transition). That scenario eventually leads to the Kuramoto model, where every neuron is its own oscillator and we look at the chaotic states in terms of coupling constants between neighbors. In a neural network this becomes a lot easier, because the coupling constants are predefined by the weight matrices.
 
To be perfectly clear about this architecture:

In AI, the network is built from "layers". An ordinary feed forward Perceptron, for example, looks like this:

L1 => L2 => ... => Ln

Each layer can be conceived as a grid of neurons. Each connection between layers is all-to-all, and it's described by a weight matrix which tallies the influence of each input neuron on the output. So for example for layers L1 and L2, the weight matrix is W12, and

L2 = f ( W12 * L1 )

A neuron in layer L2 multiplies each of its inputs from L1 by the corresponding weight, sums them up, and passes the result through a thresholding function we're calling f. (In machine learning, f is called the "activation function".)
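In code, that layer equation is just a weighted sum plus an activation. Here's a minimal pure-Python sketch; the particular weights and the tanh activation are arbitrary choices for illustration:

```python
import math

def layer_forward(W, x, f=math.tanh):
    """Compute one feed-forward layer: y = f(W * x).

    W is a list of rows (one row per output neuron); each output
    neuron takes the weighted sum of all its inputs, then applies f.
    """
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x))) for row in W]

# Hypothetical 2-neuron input layer L1 feeding a 3-neuron layer L2.
W12 = [[0.5, -0.2],
       [0.1,  0.4],
       [-0.3, 0.8]]
L1 = [1.0, 0.5]
L2 = layer_forward(W12, L1)
print(L2)
```

Chaining calls to `layer_forward` gives you the whole L1 => L2 => ... => Ln stack.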

Besides this simple feed-forward Perceptron, there is a more advanced architecture called a "recurrent" neural network, which specializes in sequences like language. It adds a "connection to self" at each layer, and for computational convenience the self-weights are often kept in a separate matrix. So in addition to W12 we now have W11 and W22 as well.
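The self-connection just adds a second term driven by the layer's own previous state. A toy sketch, again with made-up weights and a tanh activation:

```python
import math

def recurrent_step(W12, W22, x, h, f=math.tanh):
    """One time step of a recurrent layer: the new state mixes the
    feed-forward input (via W12) with the layer's own previous
    activity (via the self-weight matrix W22)."""
    new_h = []
    for i in range(len(h)):
        ff = sum(W12[i][j] * x[j] for j in range(len(x)))    # input drive
        rec = sum(W22[i][j] * h[j] for j in range(len(h)))   # self drive
        new_h.append(f(ff + rec))
    return new_h

# Hypothetical 2-in, 2-out layer processing a short input sequence.
W12 = [[0.6, 0.1], [0.2, 0.5]]
W22 = [[0.3, -0.1], [0.0, 0.3]]
h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:
    h = recurrent_step(W12, W22, x, h)
print(h)
```

Because `h` feeds back into itself, the layer's response at each step depends on the whole history of the sequence, which is what makes recurrence useful for language.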

This is what Jack Cowan is talking about in the above video. His E <=> E is a Wnn.

The point he didn't have time to make in an hour is that the structure of such a network is fractal. Following the construction of the cerebral cortex itself, you see the same thing at many different levels of resolution. The grid of neurons in the cerebral cortex has micro-columns consisting of one E neuron and a handful of I neurons. These are further arranged into mini-columns containing about a dozen micro-columns, and those in turn into the columns you're familiar with from the literature, like visual orientation columns and ocular dominance columns and so on. Finally there is the hyper-column, which has multiple columns analyzing an area of the input. Every level of resolution has E <=> I, E <=> E, and I <=> I, so how exactly is "the population" defined?

If you sever a micro-column from the rest of the population, it will oscillate on its own, because it has the same E <=> E, I <=> I, and so on. So what determines whether a neuron will be influenced by the network or oscillate on its own? Kuramoto formulates this in terms of "coupling constants" between the oscillators, so formally it becomes like a system of coupled pendulums. The dynamics of such a system are much richer than what Prof. Cowan describes.
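Here's what those coupling constants do in the Kuramoto model itself. This is a small Euler-integration sketch (the natural frequencies, coupling strength, and step sizes are all my own illustrative choices): with zero coupling the oscillators drift apart, while strong coupling locks them into one coherent population.

```python
import math

def kuramoto(omegas, K, steps=2000, dt=0.01):
    """Euler-integrate the Kuramoto model:
        d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)
    Returns the final coherence r = |mean of exp(i*theta)|,
    which is 1 for perfect synchrony and near 0 for incoherence."""
    N = len(omegas)
    thetas = [0.5 * i for i in range(N)]  # spread the initial phases
    for _ in range(steps):
        new = []
        for i in range(N):
            coupling = (K / N) * sum(math.sin(thetas[j] - thetas[i])
                                     for j in range(N))
            new.append(thetas[i] + dt * (omegas[i] + coupling))
        thetas = new
    re = sum(math.cos(t) for t in thetas) / N
    im = sum(math.sin(t) for t in thetas) / N
    return math.hypot(re, im)

omegas = [0.9, 1.0, 1.1, 1.2]   # slightly different natural frequencies
print(kuramoto(omegas, K=0.0))  # uncoupled: phases drift, low coherence
print(kuramoto(omegas, K=2.0))  # strongly coupled: phases lock, r near 1
```

The coherence r plays the role of the "population wave": crank K down and the one big oscillator falls apart into independent ones, exactly the uncoupling story above.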

For one thing, you don't get just "one" frequency of oscillation, you get a whole spectrum of them. You get "harmonics", just like the Hilbert space in Feynman path integrals. You can figure out the energies to get the dominant modes, and using these you can create a Hamiltonian to predict where the system will go next. This is basically the thermodynamic (statistical mechanical) approach used by Geoffrey Hinton in his Nobel Prize winning work on AI.

Independently of Dr Cowan, work by Paul Nunez and Reggie Bickford at UCSD uses "volume conduction" to describe how the EEG voltages we pick up on the surface of the scalp are generated by electric currents in the brain. Using existing computer technology, you can feed a standard 10/20 EEG into a computer and it will show you the brain currents that created it, in real time, like a movie. When there is a visible wave at an obvious frequency, you can see a lot of alignment in the current generators.

[image: brain current map reconstructed from EEG]


I work on AI technology called "volitional motor mapping" (basically, voluntary eye movements). We look at pictures like the one above, which capture activity just "before" the eye movement. The idea is that the AI reads the EEG and predicts the eye movement. (Reading the EEG is a form of brain-computer interface, and identifying upcoming eye movements is what the AI does.)

The above picture is showing you the "synchronized" state. You can see the synchrony, you can see its wave like structure on the surface of the brain.
 
If AI ever develops true "intelligence" I will be surprised. Most computers run on the old binary, "0" or "1". The newer quantum computers have a different architecture.
But neither one can "think" or "reason"; they just run algorithms and search routines, evaluate the results, and move on to the next iteration.
After a time they either identify a trend toward a solution, or see the trail getting colder, and move on to better parameters to evaluate.

This methodology reminds me of the Monkey Theorem, an idea that goes back to Aristotle and Cicero.
Agreed, the number of monkeys is not infinite, and neither is the allotted time, but the speed of computers mimics "infinite time and infinite monkeys". Time will tell whether this "pseudo super-intelligence" can actually solve heretofore unsolvable problems, not with super-intelligence, but with the Infinite Monkey Theorem.


One of the earliest instances of the use of the "monkey metaphor" is that of French mathematician Émile Borel in 1913,[1] but the first instance may have been even earlier. Jorge Luis Borges traced the history of this idea from Aristotle's On Generation and Corruption and Cicero's De Natura Deorum (On the Nature of the Gods), through Blaise Pascal and Jonathan Swift, up to modern statements with their iconic simians and typewriters.[2] In the early 20th century, Borel and Arthur Eddington used the theorem to illustrate the timescales implicit in the foundations of statistical mechanics.
That would explain the huge power demands...
 
To make all this concrete, I offer the following paper which discusses a specific kind of phase transition very clearly:



The phase transition between the following two dynamical stable states is studied in detail, the state where the firing pattern is changed temporally so as to itinerate among several patterns and the state where the firing pattern is fixed to one of several patterns. The phase transition from the pattern itinerant state to a pattern fixed state may be induced by the Hebbian learning process under a weak input relevant to the fixed pattern. The reverse transition may be induced by the Hebbian unlearning process without input. The former transition is considered as recognition of the input stimulus, while the latter is considered as clearing of the used input data to get ready for new input. To ensure that information processing based on the phase transition can be made by the infinitesimal and short-term synaptic changes, it is absolutely necessary that the network always stays near the critical state corresponding to the phase transition point.
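The Hebbian learning and unlearning processes the abstract refers to can be sketched as a simple correlation rule. The toy patterns, learning rate, and function name here are my own, purely for illustration: weights grow where pre- and post-synaptic activity coincide, and the unlearning variant weakens the same correlations.

```python
def hebbian_update(W, pre, post, eta=0.01, unlearn=False):
    """One Hebbian step: weights grow where pre- and post-synaptic
    activity coincide ("fire together, wire together"); with
    unlearn=True the same correlations are weakened instead, as in
    the paper's Hebbian unlearning process."""
    sign = -1.0 if unlearn else 1.0
    return [[W[i][j] + sign * eta * post[i] * pre[j]
             for j in range(len(pre))] for i in range(len(post))]

# Hypothetical toy pattern: repeated pairing strengthens one column.
W = [[0.0, 0.0], [0.0, 0.0]]
pre, post = [1.0, 0.0], [1.0, 1.0]
for _ in range(10):
    W = hebbian_update(W, pre, post)
print(W)  # only the column driven by the active pre-neuron has grown
```

In the paper's terms, repeated small updates like this deepen one attractor until the itinerant state collapses into a fixed pattern, and the unlearning direction shallows it back out.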

So, you're thinking about a problem you have to solve, and there are three solutions A B and C. What does your brain do? Does it give you all three answers at once, or does it give them to you in sequence like A => B => C ?

The second possibility is what JD Cowan calls a "dynamic attractor": a limit cycle that traverses states A, B, and C. The first possibility he doesn't discuss, and it's very, very complicated. To make it work, the network has to come up with a subspace that contains all three points, and there has to be special handling to extract the answers from the subspace. This scenario therefore becomes more like Principal Components Analysis, where the answers are represented as vectors with lengths, angles, and locations. Arbitrary subspace decomposition gets you into a complex world, though, unless you can find the right CODING to make the math work out.

What many students of AI don't see right away is that it's not the discriminant that's changing, it's the coordinate system. Linear separability is still linear separability, but now you're performing it on an oddball shape so bizarre it has to be described by a patchwork metric tensor.

As an engineer, how would you transmit information about the number of objects in the visual field? In the context of what we're discussing, the obvious answer is "hot spots" in the visual image, where the three objects reside. This corresponds to a partition of the network in the Kuramoto sense, your one big network becomes three mini-networks, each handling one of the objects.

So now you understand why I said we can see it but we can't yet compute it. A network that can handle subspaces in this way has to have a very specific architecture. The good news is, this architecture will self organize under certain conditions of input and network structure. And, based on earlier discussion, you can see that in a 3-pass process the first pass can define the hot spots, the second pass can analyze them and define their boundaries, and the third pass can predict their next state.

To make this work we need continuous analog processing elements, and we don't have those yet. Memristors are the most promising technology; there are photonic memristors that can run on a single photon's worth of energy.
 
This is the key sentence in the quote:

The phase transition from the pattern itinerant state to a pattern fixed state may be induced by the Hebbian learning process under a weak input relevant to the fixed pattern.

A "weak input relevant to the fixed pattern". So here we are, it's Christmas again, you're at the local Starbucks, and you smell some egg nog. What does your brain do?

Mine engages in a sequence of "thoughts", which may or may not have to do with memories. First I think of grandma because she always used to make egg nog at Christmas. Then I think of that time in Europe when my cousins made egg nog from raw cow's milk. Then I pick up the cinnamon part of the odor and my mind flashes to an ex girlfriend who always used to smell like cinnamon. OR, maybe I pick up a faint metallic odor in the egg nog and the first thing I think about is getting horribly sick from drinking egg nog as a child.

Anyway, it looks a lot like an A => B => C type of activity. And if I'm thinking (talking) to myself I get the same thing, like "oh! X! And that must mean... Y. And therefore, Z". It's very sequential; it's mainly the intuitions that come in parallel. What input does is lift the current state out of the space and deposit it somewhere else. I can be in the middle of a rumination and suddenly the phone rings, so I have to get up and cross the room and find the phone. Thoughts are inputs too, so sometimes thoughts can do the same thing, like an "oh shit" moment: "oh shit, it's 4 o'clock, I have to get to the bank before they close".

A weak input relevant to a fixed pattern is like the smell of the egg nog: it triggers the memory of grandma even though it's a loose association. If the input is strong and super important, our brains tend to go to the quickest effective response. But if the input is ordinary and not too meaningful, we get the bouncy bouncy thing. And because our neural network layers can partition themselves, we can multi-task and handle both scenarios at once.
 
Your dog does not understand you. It has the brain of a 4-year-old. Over time, it pieces things together, associating certain sounds with the actions that follow, like food, ride, vet, or go for a walk.

All consciousness is the same, it differs only in degree. I already described it in another thread. It lives "slightly ahead of real time" and it does this by leveraging predictive coding to compactify the relationships that occur in linear time.

Every single day more evidence piles up to support my model. Today we learned that microtubules act like little wires: they conduct electricity from one end to the other. Not only that, they have a resonant frequency, and it's 39 Hz.

And AI is not intelligent. AI is merely a machine designed to emulate intelligent behavior. kyzr above came closest to describing it.

Consciousness and intelligence are not the same. What is the usual measure of intelligence? An IQ test, correct?

Well, OpenAI's o3 has a measured IQ of 136, which puts it in the genius category. That happened in just one year; last year the highest measured IQ was 95.

My model deals with consciousness. I figure the machine learning types already have the intelligence part covered. "Consciousness" has a lot to do with proprioception, which occurs independently of intelligence. And that word "proprioception" has some deep meanings, not only what your sensory receptors are saying but also where you are in the world and how you feel about it.

A human being has 4 billion years worth of reflexes under all that "intelligence". They often support highly illogical behavior. But the end result of evolution is a species that gathers information, and uses it to advance itself. No other species gathers and shares information quite like humans do. As social creatures the sharing of information gives our species survival advantages.

We could talk about those little AI food carts that deliver the pizza and Chinese food. They're very intelligent, they stop at red lights and stay out of the way of pedestrians and bicycles. What would it take to make one of those things conscious? They already have sensors, and effectors, and a brain. What would it take to make them conscious?
 
All consciousness is the same, it differs only in degree.
Well, that is certainly philosophically debatable! Especially considering that we only know of ordinary everyday waking consciousness in people, and zero about the consciousness of any other living thing.

Well, OpenAI's o3 has a measured IQ of 136, which puts it in the genius category.
Actually, 'genius' begins at 140.

My model deals with consciousness.
To be sure, humans also have certain compartments of specialized consciousness that are little understood, or not understood at all, medically in the West, but I get what you were talking about.
 