2024 Machine Learning Paper List
Below are some interesting papers I read in 2024. Feel free to leave a comment, or email me at yuchaohuang [at] g [dot] ntu [dot] edu to share or suggest more exciting papers!
Theoretical Works
- How Transformers Learn Causal Structure with Gradient Descent
  Eshaan Nichani, Alex Damian, Jason D. Lee (2024). arXiv preprint, arXiv:2402.14735. Link to Paper
- Provably Learning a Multi-Head Attention Layer
  Sitan Chen, Yuanzhi Li (2024). arXiv preprint, arXiv:2402.04084. Link to Paper
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
  Tri Dao, Albert Gu (2024). arXiv preprint, arXiv:2405.21060. Link to Paper
- A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
  Yingqian Cui, Pengfei He, Xianfeng Tang, Qi He, Chen Luo, Jiliang Tang, Yue Xing (2024). arXiv preprint, arXiv:2410.16540. Link to Paper
Diffusion Model
- Slight Corruption in Pre-training Data Makes Better Diffusion Models
  Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj (2024). arXiv preprint, arXiv:2405.20494. Link to Paper
- Learning Diffusion at Lightspeed
  Antonio Terpin, Nicolas Lanzetti, Martín Gadea, Florian Dörfler (2024). arXiv preprint, arXiv:2406.12616. Link to Paper
- Generalized Schrödinger Bridge Matching
  Guan-Horng Liu, Yaron Lipman, Maximilian Nickel, Brian Karrer, Evangelos A. Theodorou, Ricky TQ Chen (2023). arXiv preprint, arXiv:2310.02233. Link to Paper
- A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training
  Kai Wang, Mingjia Shi, Yukun Zhou, Zekai Li, Zhihang Yuan, Yuzhang Shang, Xiaojiang Peng, Hanwang Zhang, Yang You (2024). arXiv preprint, arXiv:2405.17403. Link to Paper
- Diffusion Forcing: Next-Token Prediction Meets Full-Sequence Diffusion
  Boyuan Chen, Diego Marti Monso, Yilun Du, Max Simchowitz, Russ Tedrake, Vincent Sitzmann (2024). arXiv preprint, arXiv:2407.01392. Link to Paper
Foundation Model
- Evaluating Quantized Large Language Models
  Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang (2024). arXiv preprint, arXiv:2402.18158. Link to Paper
- scGPT: Toward Building a Foundation Model for Single-Cell Multi-Omics Using Generative AI
  Haotian Cui, Chloe Wang, Hassaan Maan, Kuan Pang, Fengning Luo, Nan Duan, Bo Wang (2024). Nature Methods, 1–11. Link to Paper
- Thinking LLMs: General Instruction Following with Thought Generation
  Tianhao Wu, Janice Lan, Weizhe Yuan, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar (2024). arXiv preprint, arXiv:2410.10630. Link to Paper
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang (2024). arXiv preprint, arXiv:2402.13753. Link to Paper
- Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
  Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi (2024). arXiv preprint, arXiv:2408.16737. Link to Paper
- Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
  Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou (2024). arXiv preprint, arXiv:2408.12528. Link to Paper
- Unified Training of Universal Time Series Forecasting Transformers
  Gerald Woo, Chenghao Liu, Akshat Kumar, Caiming Xiong, Silvio Savarese, Doyen Sahoo (2024). arXiv preprint, arXiv:2402.02592. Link to Paper
- Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts
  Xu Liu, Juncheng Liu, Gerald Woo, Taha Aksu, Yuxuan Liang, Roger Zimmermann, Chenghao Liu, Silvio Savarese, Caiming Xiong, Doyen Sahoo (2024). arXiv preprint, arXiv:2410.10469. Link to Paper
- A Decoder-Only Foundation Model for Time-Series Forecasting
  Abhimanyu Das, Weihao Kong, Rajat Sen, Yichen Zhou (2023). arXiv preprint, arXiv:2310.10688. Link to Paper
- Chronos: Learning the Language of Time Series
  Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang (2024). arXiv preprint, arXiv:2403.07815. Link to Paper
- Cell2Sentence: Teaching Large Language Models the Language of Biology
  Daniel Levine, Syed Asad Rizvi, Sacha Lévy, Nazreen Pallikkavaliyaveetil, David Zhang, Xingyu Chen, Sina Ghadermarzi, Ruiming Wu, Zihe Zheng, Ivan Vrkic, et al. (2023). bioRxiv preprint, Cold Spring Harbor Laboratory. Link to Paper
Transformer
- nGPT: Normalized Transformer with Representation Learning on the Hypersphere
  Ilya Loshchilov, Cheng-Ping Hsieh, Simeng Sun, Boris Ginsburg (2024). arXiv preprint, arXiv:2410.01131. Link to Paper
- Differential Transformer
  Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu Wei (2024). arXiv preprint, arXiv:2410.05258. Link to Paper
Misc
- Theory, Analysis, and Best Practices for Sigmoid Self-Attention
  Jason Ramapuram, Federico Danieli, Eeshan Dhekane, Floris Weers, Dan Busbridge, Pierre Ablin, Tatiana Likhomanenko, Jagrit Digani, Zijin Gu, Amitis Shidani, et al. (2024). arXiv preprint, arXiv:2409.04431. Link to Paper
- De Novo Design of High-Affinity Protein Binders with AlphaProteo
  Vinicius Zambaldi, David La, Alexander E. Chu, Harshnira Patani, Amy E. Danson, Tristan O.C. Kwan, Thomas Frerix, Rosalia G. Schneider, David Saxton, Ashok Thillaisundaram, et al. (2024). arXiv preprint, arXiv:2409.08022. Link to Paper
- Learning to (Learn at Test Time): RNNs with Expressive Hidden States
  Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, et al. (2024). arXiv preprint, arXiv:2407.04620. Link to Paper
- The Unbearable Slowness of Being
  Jieyu Zheng, Markus Meister (2024). arXiv preprint, arXiv:2408.10234. Link to Paper
- Discrete Flow Matching
  Itai Gat, Tal Remez, Neta Shaul, Felix Kreuk, Ricky TQ Chen, Gabriel Synnaeve, Yossi Adi, Yaron Lipman (2024). arXiv preprint, arXiv:2407.15595. Link to Paper