We introduce the Bi-Directional Sparse Hopfield Network (BiSHop), a novel end-to-end framework for deep tabular learning. BiSHop handles the two major challenges of deep tabular learning: non-rotationally invariant data structure and feature sparsity in tabular data. Our key motivation comes from the recently established connection between associative memory and attention mechanisms. Consequently, BiSHop uses a dual-component approach, sequentially processing data both column-wise and row-wise through two interconnected directional learning modules. Computationally, these modules house stacks of generalized sparse modern Hopfield layers, a sparse extension of the modern Hopfield model with adaptable sparsity. Methodologically, BiSHop facilitates multi-scale representation learning, capturing both intra-feature and inter-feature interactions, with adaptive sparsity at each scale. Empirically, through experiments on diverse real-world datasets, we demonstrate that BiSHop surpasses current SOTA methods with significantly fewer HPO runs, making it a robust solution for deep tabular learning.
arXiv
Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization
Yu-Min Tseng*, Yu-Chao Huang*, Teng-Yun Hsiao*, and 4 more authors
Recently, methods investigating how to adapt large language models (LLMs) for specific scenarios have gained great attention. Particularly, the concept of persona, originally adopted in dialogue literature, has re-surged as a promising avenue. However, the growing research on persona is relatively disorganized, lacking a systematic overview. To close the gap, we present a comprehensive survey to categorize the current state of the field. We identify two lines of research, namely (1) LLM Role-Playing, where personas are assigned to LLMs, and (2) LLM Personalization, where LLMs take care of user personas. To the best of our knowledge, we present the first survey tailored for LLM role-playing and LLM personalization under the unified view of persona, including taxonomy, current challenges, and potential directions. To foster future endeavors, we actively maintain a paper collection available to the community.
arXiv
L2O-g†: Learning to Optimize Parameterized Quantum Circuits with Fubini-Study Metric Tensor
Before the advent of fault-tolerant quantum computers, variational quantum algorithms (VQAs) play a crucial role in noisy intermediate-scale quantum (NISQ) machines. Conventionally, the optimization of VQAs predominantly relies on manually designed optimizers. However, learning to optimize (L2O) demonstrates impressive performance by training small neural networks to replace handcrafted optimizers. In our work, we propose L2O-g†, a quantum-aware learned optimizer that leverages the Fubini-Study metric tensor (g†) and long short-term memory networks. We theoretically derive the update equation inspired by the lookahead optimizer and incorporate the quantum geometry of the optimization landscape in the learned optimizer to balance fast convergence and generalization. Empirically, we conduct comprehensive experiments across a range of VQA problems. Our results demonstrate that L2O-g† not only outperforms the current SOTA hand-designed optimizer without any hyperparameter tuning but also shows strong out-of-distribution generalization compared to previous L2O optimizers. We achieve this by training L2O-g† on just a single generic PQC instance. Our novel quantum-aware learned optimizer, L2O-g†, presents an advancement in addressing the challenges of VQAs, making it a valuable tool in the NISQ era.