Peiyang Song

I am an undergraduate student studying Computer Science at the California Institute of Technology (Caltech), advised by Prof. Steven Low, with a minor in Robotics advised by Prof. Günter Niemeyer. I am a researcher in the Stanford AI Lab (SAIL), working with Prof. Noah Goodman and Gabriel Poesia in the Computation & Cognition Lab (CoCoLab). During my undergraduate studies, I have also been fortunate to work with Prof. Anima Anandkumar (Caltech), Dr. Kaiyu Yang (Meta), Prof. Tim Sherwood (UC Santa Barbara), and Dr. Jeremy Lau (Google).

宋沛洋  /  Email  /  CV  /  Bio  /  Google Scholar  /  GitHub  /  LinkedIn  /  Twitter

News

[Sep. 2024] Our paper Creative and Context-Aware Translation of East Asian Idioms with GPT-4 has been accepted to EMNLP 2024 Findings.
[Sep. 2024] Our paper on LLM inhibitory control & A-not-B cognitive errors has been accepted to EMNLP 2024 Findings.
[Sep. 2024] I am giving an invited tutorial at NSSS 2024 on Neuro-Symbolic Theorem Proving with Lean: slides.
[June 2024] I am joining Stanford AI Lab (SAIL) and CoCoLab, working on mathematical reasoning with LLMs.
[May 2024] Attending NeuS at Berkeley, CA.

Research

My current research interest is mainly in machine reasoning, especially AI for mathematics and code generation. In the past, I also worked on energy-efficient machine learning systems and machine translation.

Creative and Context-Aware Translation of East Asian Idioms with GPT-4
Kenan Tang*, Peiyang Song*, Yao Qin, and Xifeng Yan (* Equal Contribution)
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024
arXiv / code / proceeding

Compiling a dictionary of East Asian idiom translations demands much time and creativity, even from expert translators. To alleviate this burden, we automate high-quality data generation with GPT-4 and discover prompting strategies that are Pareto-optimal in both faithfulness and creativity, outperforming existing translation engines and a human baseline.

In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han*, Peiyang Song*, Haofei Yu, and Jiaxuan You (* Equal Contribution)
Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024
arXiv / code / proceeding

Motivated by the crucial cognitive phenomenon of A-not-B errors, we present the first systematic evaluation of the surprisingly vulnerable inhibitory control abilities of LLMs. We reveal that this weakness undermines LLMs' trustworthy reasoning capabilities across diverse domains, and introduce various mitigations.

Towards Large Language Models as Copilots for Theorem Proving in Lean
Peiyang Song, Kaiyu Yang, and Anima Anandkumar
NeurIPS Mathematical Reasoning and AI (MATH-AI) Workshop, 2023
arXiv / code / poster / demo / slides / media

We introduce a framework for running neural network inference directly in Lean. It enables various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users, including tools for suggesting proof steps (tactics), selecting premises, and searching for complete proofs using LLMs.

Energy Efficient Convolutions with Temporal Arithmetic
Rhys Gretsch, Peiyang Song, Advait Madhavan, Jeremy Lau, and Tim Sherwood
ACM Int'l Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
paper

By developing a new temporal arithmetic with a negative log transformation, we introduce energy-efficient convolution that improves the energy per pixel of each convolution frame by more than 2× compared to the state of the art, while improving the energy-delay product by four orders of magnitude.

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, and Anima Anandkumar
Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2023, Oral presentation
arXiv / project / code / poster / slides / proceeding / media

Can LLMs generate mathematical proofs that can be rigorously checked? We release LeanDojo: an open-source playground consisting of toolkits, benchmarks, and models for LLMs to prove formal theorems in the Lean proof assistant.

Media

My work has been covered by many media outlets. Some representative examples include:


Last updated: Oct. 2024. Website template credit: Jon Barron.