Franz A. Heinsen

Abstract
We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size. Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors. We describe output vectors as geometric objects, as latent variables that assign credit, as query states in a model of associative memory, and as agents in a model of a Society of Mind. We implement the algorithm with optimizations that reduce parameter count, computation, and memory use by orders of magnitude, enabling us to route sequences of greater length than previously possible. We evaluate our implementation on natural language and visual classification tasks, obtaining competitive or state-of-the-art accuracy and end-to-end credit assignments that are interpretable.
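The input/output contract described in the abstract (a sequence of vectors in, a new sequence with a chosen length and vector size out, plus per-input credit) can be illustrated with a small sketch. This is not the paper's routing algorithm: it substitutes a generic softmax-weighted mixing step, and every name here (`route`, `queries`, `proj`, `credit`) is a hypothetical choice made for illustration, with random values standing in for learned parameters.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(x_inp, n_out, d_out, seed=0):
    """Map n_inp vectors of size d_inp to n_out vectors of size d_out.

    Generic attention-style stand-in, NOT the algorithm from the paper.
    Returns the output vectors and an n_out x n_inp credit matrix.
    """
    rng = random.Random(seed)
    n_inp, d_inp = len(x_inp), len(x_inp[0])

    # Hypothetical parameters: random here, learned in a real model.
    queries = [[rng.gauss(0, 1) for _ in range(d_inp)] for _ in range(n_out)]
    proj = [[rng.gauss(0, 1) / math.sqrt(d_inp) for _ in range(d_out)]
            for _ in range(d_inp)]

    # Project every input vector from size d_inp to size d_out.
    values = [[sum(x[k] * proj[k][h] for k in range(d_inp))
               for h in range(d_out)] for x in x_inp]

    x_out, credit = [], []
    for q in queries:
        # Score each input against this output's query, then normalize.
        scores = [sum(qk * xk for qk, xk in zip(q, x)) / math.sqrt(d_inp)
                  for x in x_inp]
        w = softmax(scores)
        credit.append(w)
        # Each output vector is a credit-weighted mix of projected inputs.
        x_out.append([sum(w[i] * values[i][h] for i in range(n_inp))
                      for h in range(d_out)])
    return x_out, credit


# Route 5 input vectors of size 4 into 2 output vectors of size 3.
data_rng = random.Random(1)
x_inp = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
x_out, credit = route(x_inp, n_out=2, d_out=3)
```

Each row of `credit` sums to 1 and shows how much each input vector contributed to one output vector, which is the kind of per-input attribution the abstract calls an interpretable credit assignment. The paper's own method for computing those assignments differs from this stand-in.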
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-classification-on-cifar-10 | Heinsen Routing + BEiT-large 16 224 | Percentage correct: 99.2 |
| image-classification-on-cifar-100 | Heinsen Routing + BEiT-large 16 224 | Parameters: 309.8M; Percentage correct: 93.8 |
| image-classification-on-imagenet | Heinsen Routing + BEiT-large 16 224 | Parameters: 312.8M; Top-1 accuracy: 86.7% |
| sentiment-analysis-on-imdb | Heinsen Routing + RoBERTa Large | Accuracy: 96.2 |
| sentiment-analysis-on-sst-2-binary | Heinsen Routing + RoBERTa-large | Accuracy: 96.0 |
| sentiment-analysis-on-sst-5-fine-grained | Heinsen Routing + RoBERTa Large | Accuracy: 59.8 |