Research
My current research interest is in machine reasoning, especially AI for mathematics and code generation. In the past, I have also worked on energy-efficient machine learning systems.
|
|
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
Pengrui Han*, Peiyang Song*, Haofei Yu, and Jiaxuan You (* Equal Contribution)
ICML Workshop on LLMs and Cognition, 2024
code
Motivated by the crucial cognitive phenomenon of A-not-B errors, we present the first systematic evaluation on the surprisingly vulnerable inhibitory control abilities of LLMs. We reveal that this weakness undermines LLMs' trustworthy reasoning capabilities across diverse domains, and introduce various mitigations.
|
|
Towards Large Language Models as Copilots for Theorem Proving in Lean
Peiyang Song, Kaiyu Yang, and Anima Anandkumar
NeurIPS Mathematical Reasoning and AI (MATH-AI) Workshop, 2023
arXiv
/
code
/
poster
/
demo
/
slides
/
media
We introduce a framework for running neural network inference directly in Lean. It enables programmers to build various LLM-based proof automation tools that integrate seamlessly into the workflow of Lean users, including tools for suggesting proof steps and completing intermediate proof goals using LLMs.
|
|
Energy Efficient Convolutions with Temporal Arithmetic
Rhys Gretsch, Peiyang Song, Advait Madhavan, Jeremy Lau, and Tim Sherwood
ACM Int'l Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024
paper
We introduce energy-efficient convolution that improve the energy per pixel of each convolution frame by more than 2× compared to a state-of-the-art while improving the energy delay product by four orders of magnitude, by developing a new temporal arithmetic with a negative log transformation.
|
|
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, and Anima Anandkumar
Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2023, Oral presentation
arXiv
/
project
/
code
/
poster
/
slides
/
media
Can LLMs generate mathematical proofs that can be rigorously checked? We release LeanDojo: an open-source playground consisting of toolkits, benchmarks, and models for LLMs to prove formal theorems in the Lean proof assistant.
|
Awards
- Early Research Scholarship (2023)
- Caltech SURF Award (2023)
- UCSB Creative Studies Honors (2022)
|
|