[R] I am looking for good research papers on compute optimization during model training: ways to reduce FLOPs, memory usage, and training time without hurting convergence.
[R] The Post-Transformer Era: State Space Models, Mamba, and What Comes After Attention
[D] Ph.D. from a top European university, 10 papers at NeurIPS/ICML/ECML, yet 0 interviews at big tech
[D] Am I wrong to think that most contemporary machine learning research is just noise?
[R] I probed 6 open-weight LLMs (7B-9B) for "personality" using hidden states — instruct fine-tuning is associated with measurable behavioral constraints