Publications and Preprints

(2025). SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From. In Lock-LLM NeurIPS Workshop 2025.
(2025). Decomposing Extrapolative Problem Solving: Spatial Transfer and Length Scaling with Map Worlds.
(2025). When Transformers Can (or Can’t) Generalize Compositionally? A Data-Distribution Perspective. In NeurIPS WCTD Workshop 2025.
(2025). Cut the Deadwood Out: Training-Free Backdoor Purification via Guided Module Substitution. In Findings of the Association for Computational Linguistics: EMNLP 2025.
(2024). How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning. In ICLR 2025 [Oral Presentation (Top ∼1.5% among submissions)].
(2024). The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline. In NeurIPS Workshop on Backdoors in Deep Learning, 2023. [Oral Presentation]; In ICML 2024 [Oral Presentation (Top ∼2% among submissions)].
(2024). Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities. In CACM, 2024.