Social Learning and Biases
In our pursuit of perfecting neural networks, we often look to how humans learn for reference, with varying degrees of success....
Ethan Smith
Jun 4 · 5 min read


Recurrent Parameterless Attention is a Consensus Algorithm
In another post, I wrote about parameterless (boneless) attention as a means of mixing information across datapoints weighted by their...
Ethan Smith
May 24 · 2 min read


The mean preference is a bad estimate of preferences.
I felt compelled to make this post after seeing yet another reinforcement learning paper for diffusion models that does spectacularly in...
Ethan Smith
May 18 · 6 min read


How do we tackle noisy recognition?
Something I've been thinking about a lot lately is how humans handle noisy recognition. Maybe you recognize the image above; if not, you...
Ethan Smith
Apr 9 · 13 min read


Boneless Attention and Low Rank Attention Layers
I’ve seen a lot of convoluted tutorials on attention, but nothing made it click for me more than understanding it as mixing a projected...
Ethan Smith
Mar 23 · 8 min read


The Need for Relative Optimizers | Hypothesis on Muon
Presently, most optimizers used in deep learning do not explicitly accommodate their updates with respect to the expected range of...
Ethan Smith
Mar 18 · 11 min read


Softmax Attention is a Fluke
Calibrated Attention, Calibrated Attention NanoGPT. Attention is the magic ingredient of modern neural networks. It is the core of what has...
Ethan Smith
Mar 13 · 10 min read


Discrete Diffusion Sudoku and Diffusion Lore
A short attempt at a small portion of the diffusion Family Tree https://www.canva.com/design/DAGgnVB3x2s/b52Y3Kg-frWdRlPzI3_5pA/edit?utm_...
Ethan Smith
Mar 3 · 5 min read


How I like to think about diffusion
It's a bit hard to see in the diagram, but in addition to being convolved with a Gaussian, these points are also drifting towards zero....
Ethan Smith
Jan 26 · 4 min read


Classifier free guidance and reinforcement learning
https://sweet-hall-e72.notion.site/Classifier-Free-Guidance-to-Approximate-RL-9f78c02801c6434da61f37c8d843c5bf
Ethan Smith
Jan 26 · 1 min read


Why are Modern Neural Nets the way they are? And Hidden Hypernetworks.
https://sweet-hall-e72.notion.site/Why-are-Modern-Neural-Nets-the-way-they-are-And-Hidden-Hypernetworks-6c7195709e7b4abbada921875a951c54
Ethan Smith
Oct 6, 2024 · 1 min read


Do Diffusion Transformers Deserve The Hype?
https://sweet-hall-e72.notion.site/Do-Diffusion-Transformers-Deserve-The-Hype-9b9ca7bead374b47aac96558714c203b
Ethan Smith
Jul 28, 2024 · 1 min read


Automated LoRA Discovery and Teaching Neural Networks to make Neural Networks
https://sweet-hall-e72.notion.site/Automated-LoRA-Discovery-and-Teaching-Neural-Networks-to-make-Neural-Networks-22aa3b5ad66e4bc985ff2c93...
Ethan Smith
May 26, 2024 · 1 min read


Diffusion and Autoregressive Models for Learning to Solve Mazes
https://sweet-hall-e72.notion.site/Diffusion-and-Autoregressive-Models-for-Learning-to-Solve-Mazes-c3bc4bcdfa304ecd9531ee5445a4da66
Ethan Smith
May 21, 2024 · 1 min read


Traversing through CLIP Space, PCA and Latent Directions
https://sweet-hall-e72.notion.site/Traversing-through-CLIP-Space-PCA-and-Latent-Directions-b898932e13684d58957405b4a2747a79
Ethan Smith
May 6, 2024 · 1 min read


Learning Space Filling Curves with Autoencoders
https://sweet-hall-e72.notion.site/Learning-Space-Filling-Curves-with-Autoencoders-e39e41ce75894c3a8fecfee0f3bbfb23?pvs=4
Ethan Smith
Apr 14, 2024 · 1 min read


Mimicking Diffusion Models by Sequencing Frequency Coefficients
https://sweet-hall-e72.notion.site/Mimicking-Diffusion-Models-by-Sequencing-Frequency-Coefficients-8e5a60e876d640c390369627d55330b1
Ethan Smith
Mar 13, 2024 · 1 min read


ContrastiveDPO for Diffusion, Generalizing DPO to multiple items
https://sweet-hall-e72.notion.site/ContrastiveDPO-for-Diffusion-Generalizing-DPO-to-multiple-items-PART1-226b3746aa4d4ff9995d1e26b38a9674
Ethan Smith
Mar 8, 2024 · 1 min read


Dipole Attention: Opposites May Be Deep Connections
Image from: https://twitter.com/toshi2fly/status/911306344376012800 Post: https://sweet-hall-e72.notion.site/Dipole-Attention-Opposites-M...
Ethan Smith
Mar 5, 2024 · 1 min read


Speeding up Diffusion: Reviewing DiffusionGANs, Consistency Models, and Flow Models
https://sweet-hall-e72.notion.site/Speeding-up-Diffusion-Reviewing-DiffusionGANs-Consistency-Models-and-Flow-Models-80b985120b8f472094cdc...
Ethan Smith
Mar 4, 2024 · 1 min read