[R] Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings
[D] What are the must-have books for graduate students and researchers in Machine Learning, especially for Dynamical Systems, Neural ODEs/PDEs/SDEs, and PINNs?
[R] Paper on Evaluative Fingerprints: Stable and Systematic Differences in LLM Evaluator Behavior
[R] Why was the doubly stochastic matrix idea (using the Sinkhorn-Knopp algorithm) only made popular by DeepSeek's mHC paper, and not by earlier RNN papers?