What a cool idea. How does it work? AFAIK the human brain at least does something like sparse backprop and has SOME neural circuits that feed backward, so how do you manage without anything like that?
Thanks! I have other ideas, following Jeff Hawkins's Thousand Brains Project, but in this one I'm trying to get to cortical columns from the other side, from "standard" deep neural networks.
The short version: each layer trains itself independently using Hinton's Forward-Forward algorithm. Instead of propagating error gradients backward through the whole network, each layer has its own local objective: "real data should produce high activation norms, corrupted data should produce low ones." Gradients never cross layer boundaries. The human brain is massively parallel, and part of what enables that is not relying on backprop, so I'm using that as inspiration.
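If it helps, here's a minimal numpy sketch of what a local FF objective can look like (my own toy version, not the actual project code — layer sizes, threshold, and learning rate are made up): a single layer nudges its weights so the squared-activation "goodness" of real data rises above a threshold and that of corrupted data falls below it, and no gradient ever leaves the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

class FFLayer:
    def __init__(self, d_in, d_out, theta=2.0, lr=0.03):
        self.W = rng.normal(0, 0.1, (d_in, d_out))
        self.theta = theta      # goodness threshold
        self.lr = lr

    def forward(self, x):
        # normalize inputs so goodness info from the previous layer is erased
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return np.maximum(x @ self.W, 0.0)          # ReLU activations

    def goodness(self, x):
        return (self.forward(x) ** 2).sum(axis=1)   # squared activation norm

    def local_step(self, x_pos, x_neg):
        # logistic loss on (goodness - theta), computed entirely inside
        # this layer; nothing propagates to any other layer
        for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
            xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
            h = np.maximum(xn @ self.W, 0.0)
            g = (h ** 2).sum(axis=1)
            p = 1.0 / (1.0 + np.exp(-sign * (g - self.theta)))
            # local chain rule: dg/dW = 2 * xn^T h, weighted by the loss slope
            grad = xn.T @ (2 * h * (sign * (1.0 - p))[:, None])
            self.W += self.lr * grad

layer = FFLayer(16, 32)
x_pos = rng.normal(1.0, 0.5, (8, 16))    # stand-in "real" data
x_neg = rng.normal(0.0, 0.5, (8, 16))    # stand-in "corrupted" data
for _ in range(200):
    layer.local_step(x_pos, x_neg)
# after training, real data scores higher goodness than corrupted data
assert layer.goodness(x_pos).mean() > layer.goodness(x_neg).mean()
```

A full network is just a stack of these, each running its own `local_step` — which is what makes the layers trainable in parallel.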
You're right that the brain has backward-projecting circuits. But those are mostly thought to carry contextual/modulatory signals, not error gradients in the backprop sense. I'm handling cross-layer communication through attention residuals (each layer dynamically selects which prior layers to attend to) and Hopfield memory banks (per-layer associative memory written via Hebbian outer products, no gradients needed).
The part I'm most excited about is "sleep". During chat, user feedback drives reward-modulated Hebbian writes to the memory banks (instant, no gradients, like hippocampal episodic memory). Then a /sleep command consolidates those into weights by generating "dreams" from the bank-colored model and training on them with FF + distillation. No stored text needed, only the Hopfield state. The model literally dreams its memories into its weights.
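To make the "instant, no gradients" write concrete, here's a toy sketch of a reward-modulated Hebbian write to a memory bank (hypothetical names and shapes, not the project's actual API): the bank is just a matrix accumulating reward-scaled outer products of (key, value) activation pairs, and readout is a matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64
M = np.zeros((d, d))                        # the layer's memory bank

def write(M, key, value, reward, decay=0.99):
    """One episodic write: Hebbian outer product, scaled by reward."""
    return decay * M + reward * np.outer(value, key)

def read(M, query):
    """Associative readout: project the query through the bank."""
    return M @ query

key = rng.normal(size=d); key /= np.linalg.norm(key)
value = rng.normal(size=d)

M = write(M, key, value, reward=+1.0)       # positive user feedback
recalled = read(M, key)                     # query with the same key
# the recalled vector points in the stored value's direction
cos = recalled @ value / (np.linalg.norm(recalled) * np.linalg.norm(value))
assert cos > 0.99
```

Negative feedback would just write with a negative reward, weakening that association instead of strengthening it.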
Still early, training a 100M param model on TinyStories right now, loss is coming down but I don't have eval numbers yet.
The idea is that the brain uses what the authors refer to as "feedback alignment" rather than backprop. Even if it turns out not to be literally true of the brain, the idea is interesting for AI.
I also love the idea of grafting on the memory banks. It reminds me of early work on DNCs (Differentiable Neural Computers). I tried to franken-bolt a DNC onto an LLM a few years back and mostly just earned myself headaches. :)
It's fun to see all the wild and wacky stuff other folks like myself are tinkering with in the lab.
Sequent builds cryptographically secure online voting infrastructure used in 200+ real elections across multiple countries. We're a fully remote team working on an open-source platform combining Rust, TypeScript, and modern DevOps. We handle end-to-end encrypted voting, cryptographic mixnets, and tamper-evident logging.
Almost none of the parent’s bullet points are solved by building on the Moon instead of in Earth orbit.
The energy demands of getting to the Moon, 240k miles away, are IMMENSE compared to a 100-mile orbit.
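Rough Tsiolkovsky numbers make the gap concrete (the delta-v figures and Isp below are ballpark assumptions, not mission-design values):

```python
import math

g0 = 9.81                   # m/s^2
isp = 350.0                 # s, generic chemical stage
dv_leo = 9_400.0            # m/s, surface to LEO including losses (approx.)
dv_moon = dv_leo + 5_900.0  # m/s, plus trans-lunar injection + lunar landing

def mass_ratio(dv):
    """Initial/final mass ratio for a given delta-v (rocket equation)."""
    return math.exp(dv / (isp * g0))

r_leo, r_moon = mass_ratio(dv_leo), mass_ratio(dv_moon)
print(f"LEO mass ratio:  {r_leo:.1f}")
print(f"Moon mass ratio: {r_moon:.1f}  ({r_moon / r_leo:.1f}x worse)")
```

Because the mass ratio is exponential in delta-v, every kg landed on the Moon costs several times the propellant of a kg delivered to LEO, before staging tricks.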
Ultimately, when comparing the 3 general locations, Earth is still BY FAR the most hospitable and affordable location until some manufacturing innovations drop costs by orders of magnitude. But those manufacturing improvements have to be made in the same jurisdiction that SpaceXAI is trying to avoid building data centers in.
This whole thing screams "solution in search of a problem." We have to solve the traditional data center issues (power supply, temperature, hazard resilience, etc.) wherever the data centers are, whether on the ground or in space. None of these are solved for the theoretical space data centers, but they are all already solved for terrestrial ones.
But none of those are usable, right? It will take decades of work at least to get a commercial grade mining operation going and even then the iron, titanium, aluminum would need to be fashioned...
Ah, I see the idea now. It is to get people to talk about robotics and how robots will be able to do all this on the moon or wherever.
That's a hard problem to solve. Invest enough in solving that problem and you might get the ability to manufacture a radiator out of it, but you're still going to have to transport the majority of your datacenter to the moon. That probably works out more expensive than launching the whole thing to LEO
Sounds more difficult. Not only is the moon further, you also need to use more fuel to land on it and you also have fine, abrasive dust to deal with. There’s no wind of course, but surely material will be stirred up and resettle based on all the landing activity.
And it’s still a vacuum with many of the same cooling issues. I suppose one upside is you could use the moon itself as a heat sink (maybe).
And 2.5s is best case. Signal strength issues, antenna alignment issues, and all sorts of unknown unknowns conspire to make high-integrity/high-throughput digital signal transmissions from a moon-based compute system have a latency much worse than that on average.
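For reference, that 2.5s figure is just the idealized round-trip light time at the mean Earth-Moon distance, before any of those overheads:

```python
c = 299_792.458           # km/s, speed of light
mean_distance = 384_400   # km, mean Earth-Moon distance

rtt = 2 * mean_distance / c
print(f"idealized round trip: {rtt:.2f} s")   # ~2.56 s before any overhead
```

Real protocols add retransmissions, forward error correction, and queuing on top of that physical floor.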
Yeah, carrying stuff 380k km and still deploying in vacuum (and super dusty ground) doesn't solve anything but adds cost and overhead. One day maybe, but not these next decades nor probably this century.
Still a vacuum, so the same heat-dissipation issues, plus the lunar dust makes solar panels less usable and the sunlit side of the surface gets really hot.