Chinese AI firm DeepSeek has published a paper on arXiv describing Manifold-Constrained Hyper-Connections (mHC), a training architecture the company says could let engineers build and scale large language models without the huge computational costs normally required.
The mHC approach builds on hyper-connections (HCs), introduced in 2024, which give neural network