The International Conference on Learning Representations (ICLR) 2025 is being hosted in Singapore from April 24th to April 28th. We're excited to share all the work from SAIL that's being presented, and you'll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that's happening at Stanford!
List of Accepted Papers
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
Contact: yhxu@stanford.edu
Links: Paper | Website
Keywords: 3d scene editing; gaussian splatting;
Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
Authors: Yuejiang Liu, Jubayer Ibn Hamid, Annie Xie, Yoonho Lee, Max Du, Chelsea Finn
Contact: yuejiang.liu@cs.stanford.edu
Links: Paper | Website
Keywords: robot learning, action chunking, action decoding, test-time compute
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Authors: Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang
Contact: yhxu@stanford.edu
Links: Paper | Website
Keywords: video generative models; 3d control for video generation
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Authors: Chenglei Si, Diyi Yang, Tatsunori Hashimoto
Contact: clsi@stanford.edu
Links: Paper
Keywords: large language models, automating research
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Authors: Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez, Ethan Steinberg, Jason Alan Fries, Christopher Ré, Sanmi Koyejo, Nigam Shah
Contact: mwornow@stanford.edu
Links: Paper
Keywords: healthcare, foundation models, long context
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Authors: Andy K Zhang, Neil Perry, Riya Dulepet, Joey Ji, Celeste Menders, Justin W Lin, Eliot Jones, Gashon Hussein, Samantha Liu, Donovan Julian Jasper, Pura Peetathawatchai, Ari Glenn, Vikram Sivashankar, Daniel Zamoshchin, Leo Glikbarg, Derek Askaryar, Haoxiang Yang, Aolin Zhang, Rishi Alluri, Nathan Tran, Rinnara Sangpisit, Kenny O Oseleononmen, Dan Boneh, Daniel E. Ho, Percy Liang
Contact: andyzh@stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: language model agents, benchmark, cybersecurity, risk
Dr.
Authors: Christopher Fifty, Ronald Guenther Junkins, Dennis Duan, Aniketh Iyengar, Jerry Weihong Liu, Ehsan Amid, Sebastian Thrun, Christopher Ré
Contact: fifty@cs.stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: generative modeling, computer vision
Energy-Based Diffusion Language Models for Text Generation
Authors: Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat
Contact: minkai@cs.stanford.edu
Links: Paper | Website
Keywords: language models, discrete diffusion models, energy-based models
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
Authors: Rylan Schaeffer, Dan Valentine, Luke Bailey, James Chua, Cristóbal Eyzaguirre, Zane Durante, Joe Benton, Brando Miranda, Henry Sleight, John Hughes, Rajashree Agrawal, Mrinank Sharma, Scott Emmons, Sanmi Koyejo, Ethan Perez
Contact: rschaef@cs.stanford.edu
Links: Paper | Website
Keywords: adversarial robustness, jailbreaking, language model, vision language model
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Authors: Jeffrey Gu, Serena Yeung-Levy
Contact: jeffgu@stanford.edu
Links: Paper | Website
Keywords: hypernetworks, neural fields, implicit neural representations, generalizable neural fields, foundation models
Generative Representational Instruction Tuning
Authors: Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela
Contact: niklasm@stanford.edu
Links: Paper | Website
Keywords: large language models, instruction tuning, text embedding
Aligning Language Models with Demonstrated Feedback
Authors: Omar Shaikh, Michelle S. Lam, Joey Hejna, Yijia Shao, Hyundong Justin Cho, Michael S. Bernstein, Diyi Yang
Contact: oshaikh@stanford.edu
Keywords: personalization, few-shot learning, human computer interaction, alignment
Learning Efficient Positional Encodings with Graph Neural Networks
Authors: Charilaos Kanatsoulis, Evelyn Choi, Stefanie Jegelka, Jure Leskovec, Alejandro Ribeiro
Contact: charilaos@cs.stanford.edu
Links: Paper
Keywords: graph transformers, positional encodings, graph neural networks
LoLCATs: On Low-Rank Linearizing of Large Language Models
Authors: Michael Zhang, Simran Arora, Rahul Chalamala, Benjamin Frederick Spector, Alan Wu, Krithik Ramesh, Aaryan Singhal, Christopher Ré
Contact: mzhang@cs.stanford.edu
Links: Paper | Blog Post
Keywords: llms, efficient architectures, attention
Model Equality Testing: Which Model is this API Serving?
Authors: Irena Gao, Percy Liang, Carlos Guestrin
Contact: irena@cs.stanford.edu
Links: Paper | Website
Keywords: api monitoring, model shift, two-sample testing
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Authors: Julie Kallini, Shikhar Murty, Christopher D. Manning, Christopher Potts, Róbert Csordás
Contact: kallini@stanford.edu
Links: Paper | Website
Keywords: nlp, byt5, t5, tokenization, byte-level language models, character-level language models
OLMoE: Open Mixture-of-Experts Language Models
Authors: Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Evan Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi
Contact: niklasm@stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: large language models, mixture-of-experts, open-source
Predicate Hierarchies Improve Few-Shot State Classification
Authors: Emily Jin*, Joy Hsu*, Jiajun Wu
Contact: emilyjin@stanford.edu
Links: Paper | Website
Keywords: few-shot state classification, predicate hierarchies
Real2Code: Reconstruct Articulated Objects via Code Generation
Authors: Zhao Mandi, Yijia Weng, Dominik Bauer, Shuran Song
Contact: mandi@stanford.edu
Links: Paper | Blog Post | Website
Keywords: code llms; articulated objects; digital twins; foundation models
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Authors: Sheng Liu, Haotian Ye, James Zou
Contact: shengl@stanford.edu
Award nominations: Spotlight
Links: Paper | Website
Keywords: hallucination, multimodal language model, large language model
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
Authors: John Yang, Carlos E. Jimenez, Alex L. Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff, Gabriel Synnaeve, Karthik R. Narasimhan, Diyi Yang, Sida I. Wang, Ofir Press
Contact: johnby@stanford.edu
Links: Paper | Website
Keywords: language models, natural language processing, software engineering
Synthetic Continued Pretraining
Authors: Zitong Yang*, Neil Band*, Shuangping Li, Emmanuel Candès, Tatsunori Hashimoto
Contact: zitong@stanford.edu
Links: Paper | Website
Keywords: synthetic data, continued pretraining
TEOChat: Large Language and Vision Assistant for Temporal Earth Observation Data
Authors: Jeremy Andrew Irvin, Emily Ruoyu Liu, Joyce Chuyi Chen, Ines Dormoy, Jinyoung Kim, Samar Khanna, Zhuo Zheng, Stefano Ermon
Contact: jirvin16@cs.stanford.edu
Links: Paper | Website
Keywords: vision-language model, large multimodal model, satellite imagery, earth observation, change detection
TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation
Authors: Juntong Shi, Minkai Xu, Harper Hua, Hengrui Zhang, Stefano Ermon, Jure Leskovec
Contact: minkai@cs.stanford.edu
Links: Paper | Website
Keywords: tabular representation learning, generative models, diffusion models
The Utility and Complexity of in- and out-of-Distribution Machine Unlearning
Authors: Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo
Contact: youssef.allouah@epfl.ch
Links: Paper
Keywords: machine unlearning, differential privacy, optimization, theory, right to be forgotten
TopoLM: brain-like spatio-functional organization in a topographic language model
Authors: Neil Rathi, Johannes Mehrer, Badr AlKhamissi, Taha Osama A Binhuraib, Nicholas Blauch, Martin Schrimpf
Contact: rathi@stanford.edu
Award nominations: Oral
Links: Paper | Website
Keywords: language modeling, topography, fmri, neuroscience
Video Action Differencing
Authors: James Burgess, Xiaohan Wang, Yuhui Zhang, Anita Rau, Alejandro Lozano, Lisa Dunlap, Trevor Darrell, Serena Yeung-Levy
Contact: jmhb@stanford.edu
Links: Paper | Blog Post | Website
Keywords: video, action, comparison, lvm, lmm, benchmark
What Makes a Maze Look Like a Maze?
Authors: Joy Hsu, Jiayuan Mao, Joshua B. Tenenbaum, Noah D. Goodman, Jiajun Wu
Contact: joycj@stanford.edu
Links: Paper | Website
Keywords: visual reasoning, abstract concepts, schemas
What's the Move? Hybrid Imitation Learning via Salient Points
Authors: Priya Sundaresan*, Hengyuan Hu*, Quan Vuong, Jeannette Bohg, Dorsa Sadigh
Contact: priyasun@stanford.edu
Links: Paper | Website
Keywords: imitation learning, robot learning, robot manipulation, robotics
We look forward to seeing you at ICLR 2025!