June 23–24     Chicago, IL

MMLS 2025

Midwest Machine Learning Symposium

Machine Learning Research in the Midwest



About the Event

The Midwest ML Symposium aims to convene regional machine learning researchers for stimulating discussions and debates, to foster cross-institutional collaboration, and to showcase the collective talent of ML researchers at all career stages. [past events]

When: June 23–24, 2025

Where: University of Chicago, Logan Center for the Arts [Google Map]

  • Directions, transportation, and parking near the Logan Center: here
  • General visitor parking at UChicago: here
  • Accommodations on or near campus: here
  • More information about dormitory housing for student symposium attendees is forthcoming.

    Sponsor Opportunities

    The Midwest ML Symposium invites sponsors for exposure to and connection with our community. In addition to supporting the regional machine learning community, sponsors are gratefully recognized in symposium media and materials and have the opportunity to engage closely with participants.

    Information: Learn about sponsorship levels, benefits, and opportunities here! Prospective sponsors are encouraged to contact the Midwest ML Symposium local organizing committee; to discuss special requirements or ask general questions about sponsoring the Symposium, please contact Anne Brown at annebrown@uchicago.edu.

    2025 Organizers

    Haifeng Xu (Co-chair, UChicago) | Chenhao Tan (Co-chair, UChicago) | Ce Zhang (UChicago) | Zhiyuan Li (TTIC) | Ruqi Zhang (Purdue) | Ren Wang (IIT) | Emma Alexander (Northwestern) | Chaowei Xiao (UW–Madison)

    Local Organizing Committee (UChicago)

    Haifeng Xu (Co-chair) | Chenhao Tan (Co-chair) | Rebecca Willett (Stats/CS) | David Uminsky (DSI) | Maria Fernandez (DSI) | Mark Schulze (DSI)

    Advisory Board

    Rob Nowak (Chair, UW Madison) | Maxim Raginsky (UIUC) | Laura Balzano (UMich) | Avrim Blum (TTIC) | Rebecca Willett (UChicago) | Nati Srebro (TTIC) | Po-Ling Loh (Cambridge) | Matus Telgarsky (NYU) | Mike Franklin (UChicago)

    Plenary Speakers



    Sanjeev Arora

    Charles C. Fitzmorris Professor of Computer Science at Princeton

    Director of Princeton Language and Intelligence

    Heng Ji

    Professor of Computer Science at UIUC

    Founding Director of the Amazon-Illinois Center on AI

    Tuomas Sandholm

    Angel Jordan University Professor of Computer Science at CMU

    Co-director of CMU AI and a serial entrepreneur

    Ben Zhao

    Neubauer Professor of Computer Science at UChicago

    Time Magazine's "The 100 Most Influential People in AI" (2024)

    Invited Speakers



    Ari Holtzman

    University of Chicago

    Chaowei Xiao

    Assistant Professor of Computer Sciences at University of Wisconsin–Madison

    Frederic Koehler

    University of Chicago

    Han Zhao

    University of Illinois Urbana-Champaign

    Haifeng Xu

    Assistant Professor of Computer Science at University of Chicago

    Huan Zhang

    Assistant Professor of Electrical and Computer Engineering at UIUC

    Mengxue Hou

    University of Notre Dame

    Sijia Liu

    Michigan State University

    Tianhao Wang

    Toyota Technological Institute at Chicago

    Wei Hu

    University of Michigan

    Yexiang Xue

    Purdue University

    Yiping Lu

    Northwestern University

    Zahra Ghodsi

    Purdue University

    Zirui (Ray) Liu

    University of Minnesota

    Ruqi Zhang

    Purdue University

    Schedule

    Ben Zhao – Societal impact and the ivory tower: an adversarial ML perspective

    Abstract: It is undeniable that computing research has the power to rapidly reshape the world we live in, and ML is literally proving this point in real time. But it is also true that we are often not cognizant of the positive and negative impacts of our work. In this talk, I argue that we as researchers need to be more accountable not just for our research results, but for how they may be used in downstream applications. Recognizing such impacts is arguably a very challenging task in itself. Drawing on my own experience in recent adversarial ML projects, I describe the duality of ML's impact today, both in the real harms it has produced via misuse and in the protective benefits it can provide. I share some of the ethical questions we faced when considering the design and deployment of our tools Glaze and Nightshade, and our experiences through this process. Finally, I suggest some takeaways, including possible perspectives for evaluating new research directions, as well as some concrete research questions that offer potential for positive technical and societal impact.

    • Manling Li (Northwestern University) – RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
    • Chenxiao Yang (Toyota Technological Institute at Chicago) – PENCIL: Long Thoughts with Short Memory
    • Justin Wang (University of Chicago) – ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
    • Zhiqi Gao (University of Wisconsin–Madison) – Theoretical Physics Benchmark (TPBench) – A Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
    • Athanasios Glentis (University of Minnesota) – Memory-Efficient LLM Pretraining via Minimalist Optimizer Design
    • Shuo Xie (Toyota Technological Institute at Chicago) – Adam Exploits ℓ∞-geometry of Loss Landscape via Coordinate-wise Adaptivity

    Sijia Liu – Robust Unlearning for LLMs
    As generative AI systems continue to evolve, the ability to selectively remove information from trained models, known as machine unlearning, has become increasingly essential for ensuring regulatory compliance, enforcing ethical constraints, and mitigating the retention of harmful or sensitive content. This talk focuses on a pressing challenge in this space: the robustness of unlearning in large language models (LLMs). We examine how current unlearning methods remain vulnerable to relearning attacks and post-unlearning fine-tuning, where previously removed knowledge can be partially recovered from a small subset of forgotten or auxiliary data. From an optimization perspective, we introduce a novel connection between robust unlearning and sharpness-aware minimization (SAM), showing that promoting flatter loss landscapes through smoothness-based optimization enhances a model’s resistance to relearning. This draws a natural parallel to principles from adversarial robustness. The talk concludes with a discussion of open challenges and future directions for embedding unlearning into the AI lifecycle, ensuring long-term safety, compliance, and trustworthiness across the data, model, and optimization stack.
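
    To make the SAM connection concrete, here is a minimal sketch of a sharpness-aware update, the flatness-promoting step the talk links to relearning resistance. The names (model, loss_fn, rho, lr) are illustrative stand-ins, not the talk's implementation.

        import torch

        def sam_step(model, loss_fn, batch, rho=0.05, lr=1e-3):
            # 1) Gradient at the current weights w.
            loss_fn(model, batch).backward()
            params = [p for p in model.parameters() if p.grad is not None]
            grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params))
            # 2) Ascend to the nearby worst case: w + rho * g / ||g||.
            with torch.no_grad():
                eps = [rho * p.grad / (grad_norm + 1e-12) for p in params]
                for p, e in zip(params, eps):
                    p.add_(e)
            model.zero_grad()
            # 3) Gradient at the perturbed weights, then undo the ascent and
            #    descend; minimizing this worst-case loss favors flatter minima.
            loss_fn(model, batch).backward()
            with torch.no_grad():
                for p, e in zip(params, eps):
                    p.sub_(e)
                    p.sub_(lr * p.grad)
            model.zero_grad()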

    Zahra Ghodsi – Collaborating with Confidence: Securing Federated Learning Systems
    Artificial Intelligence (AI) is increasingly deployed in distributed settings thanks to its ability to process large amounts of data and to enable a wide range of applications. Networks of intelligent devices can therefore work collaboratively to open new directions in domains such as distributed healthcare and transportation. Deploying AI successfully in the distributed or federated setting requires the collaboration of a large number of devices belonging to different parties. This collaboration, however, raises security concerns about the privacy of assets and about robustness in the presence of accidental or intentional errors. In this talk, I outline the challenges in developing secure and privacy-preserving federated learning frameworks where the data, or even the identity, of participants can be sensitive. I highlight the need for new holistic designs in which requirements such as privacy and robustness are guaranteed simultaneously. I conclude by briefly discussing lessons learned and future research directions.
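
    As one concrete illustration of the privacy side, below is a toy sketch of pairwise-masked secure aggregation, a standard building block in this space and not necessarily the construction from the talk; real protocols add key agreement, dropout handling, and finite-field arithmetic.

        import random

        def mask_updates(updates):
            # updates: client_id -> scalar model update (a stand-in for a weight vector).
            ids = sorted(updates)
            pair_masks = {(i, j): random.uniform(-1.0, 1.0)
                          for i in ids for j in ids if i < j}
            masked = {}
            for c in ids:
                mask = (sum(pair_masks[(c, j)] for j in ids if j > c)
                        - sum(pair_masks[(i, c)] for i in ids if i < c))
                masked[c] = updates[c] + mask  # each individual update stays hidden
            return masked

        updates = {1: 0.5, 2: -0.2, 3: 0.1}
        masked = mask_updates(updates)
        # The pairwise masks cancel in the sum, so the server learns only the total:
        assert abs(sum(masked.values()) - sum(updates.values())) < 1e-9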

    Han Zhao – Revisiting Scalarization in Multi-Task Learning
    Linear scalarization, i.e., combining all loss functions into a weighted sum, has been the default choice in the multi-task learning (MTL) literature since its inception. In recent years, there has been a surge of interest in developing Specialized Multi-Task Optimizers (SMTOs) that treat MTL as a multi-objective optimization problem. However, it remains open whether SMTOs hold a fundamental advantage over scalarization. In this talk, I revisit scalarization from a theoretical perspective, focusing on linear MTL models and studying whether scalarization can fully explore the Pareto front. Our findings reveal that, in contrast to recent works claiming empirical advantages for scalarization, when the model is under-parametrized scalarization is inherently incapable of full exploration, especially for Pareto optimal solutions that strike balanced trade-offs between multiple tasks. I will conclude by briefly discussing the extension of our results to general nonlinear neural networks and our recent work on using online Chebyshev scalarization to controllably steer the search for Pareto optimal solutions.
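
    For readers less familiar with the terminology, here is a toy sketch contrasting the two scalarizations mentioned above; the losses, weights, and reference point are illustrative, not from the talk.

        import torch

        def linear_scalarization(losses, w):
            # Weighted sum: minimize over theta the sum of w_k * L_k(theta).
            return sum(wk * lk for wk, lk in zip(w, losses))

        def chebyshev_scalarization(losses, w, ref=0.0):
            # Weighted worst case: minimize the max of w_k * (L_k(theta) - z_k).
            # Sweeping w here can reach balanced Pareto-optimal trade-offs that the
            # weighted sum misses when the model is under-parametrized.
            return torch.stack([wk * (lk - ref) for wk, lk in zip(w, losses)]).max()

        # Example with two toy task losses of a shared parameter theta:
        theta = torch.tensor([1.0, -0.5], requires_grad=True)
        losses = [(theta ** 2).sum(), ((theta - 1.0) ** 2).sum()]
        objective = chebyshev_scalarization(losses, w=[0.5, 0.5])
        objective.backward()  # gradient flows through the active (max) task only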

    Wei Hu – Abrupt Learning in Transformers
    Training Transformers on algorithmic tasks frequently exhibits an intriguing "abrupt learning" phenomenon in the training dynamics: an extended performance plateau followed by a sudden, sharp improvement. In this talk, I will present several empirical observations that aim to uncover universal characteristics and underlying mechanisms behind such dynamics.

    Frederic Koehler – On Inductive Bias in Generative Modeling
    There has been a lot of work on understanding the inductive bias of learning via gradient descent and related algorithms. For example, many fascinating phenomena have been discovered in supervised settings such as linearized neural networks, matrix factorization, and logistic regression. Relatively speaking, fewer such examples have been worked out for generative modeling and density estimation. I will discuss one such example that we were able to rigorously analyze, variational autoencoders, and the role that the data distribution plays in this setting.

    Tianhao Wang – Structured Preconditioners in Adaptive Optimization: A Unified Analysis
    We present a novel unified analysis for a broad class of adaptive optimization algorithms with structured (e.g., layerwise, diagonal, and Kronecker-factored) preconditioners, covering both online regret minimization and offline convex optimization. Our analysis not only provides matching rates for several important structured preconditioned algorithms, including diagonal AdaGrad, full-matrix AdaGrad, and AdaGrad-Norm, but also gives an improved convergence rate for a one-sided variant of Shampoo over that of the original Shampoo. Interestingly, more structured preconditioners (e.g., diagonal AdaGrad and AdaGrad-Norm, which use less space and compute) are often presented as computationally efficient approximations to full-matrix AdaGrad, with the implicit assumption that better approximations yield better optimization. Our unified analysis challenges this prevailing view and reveals, perhaps surprisingly, that more structured preconditioners, despite using less space and computation per step, can outperform their less structured counterparts. To demonstrate this, we show that one-sided Shampoo, which is much cheaper than full-matrix AdaGrad, can outperform it both theoretically and experimentally.
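
    To make the structure spectrum concrete, here is a minimal sketch of two of the preconditioners named above for a single parameter vector; the step size and epsilon are illustrative defaults.

        import numpy as np

        def adagrad_diagonal_step(x, g, state, lr=0.1, eps=1e-8):
            # Diagonal preconditioner: one squared-gradient accumulator per coordinate.
            state["G"] = state.get("G", np.zeros_like(x)) + g * g
            return x - lr * g / (np.sqrt(state["G"]) + eps)

        def adagrad_norm_step(x, g, state, lr=0.1, eps=1e-8):
            # Scalar ("norm") preconditioner: a single accumulator for the whole
            # vector, the most structured and cheapest end of the spectrum.
            state["b"] = state.get("b", 0.0) + float(g @ g)
            return x - lr * g / (np.sqrt(state["b"]) + eps)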

    Tuomas Sandholm – General search techniques without common knowledge for imperfect-information games, and application to superhuman Fog of War chess

    Abstract: Since the advent of AI, games have served as progress benchmarks, and most real-world settings are imperfect-information games. Imperfect-information variants of chess have existed for over a century, present extreme challenges, and have been the focus of significant AI research. Beyond the calculation needed in regular chess, they require reasoning about information gathering, the opponent's knowledge, signaling, bluffing, etc. The most popular variant, Fog of War (FoW) chess (also known as dark chess), became a recognized challenge problem in AI after superhuman performance was reached in no-limit Texas hold'em poker. We present Obscuro, the first superhuman AI for FoW chess. It introduces advances to search in imperfect-information games, enabling strong, scalable reasoning. Most prior search techniques, such as those used to achieve superhuman play in no-limit Texas hold'em, require the construction of the "common knowledge set" as a first step, making them unusable for games with this much imperfect information. Experiments against the prior state-of-the-art AI and human players, including the world's best, show that Obscuro is significantly stronger. FoW chess is now the largest (by amount of imperfect information) turn-based game in which superhuman performance has been achieved and the largest game in which imperfect-information search has been successfully applied. This is joint work with my PhD student Brian Hu Zhang.

    Heng Ji – ThemeLLM: A Retrieval and Structuring Approach for Theme-Focused, LLM-Guided Scientific Exploration

    Abstract: Large Language Models (LLMs) may bring unprecedented power to scientific discovery. However, current LLMs still face major challenges in effective scientific exploration due to their lack of in-depth, theme-focused data and knowledge. Retrieval-augmented generation (RAG) has recently become a promising approach for augmenting LLMs with grounded, theme-specific datasets. We discuss the challenges of RAG and propose a retrieval and structuring (RAS) approach, which enhances RAG by improving retrieval quality and mining structures (e.g., extracting entities and relations and building knowledge graphs) to ensure effective integration of theme-specific data with the LLM. We show the promise of the retrieval and structuring approach for augmenting LLMs and discuss its potential power for future LLM-enabled scientific exploration.
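
    A toy sketch of the retrieval-and-structuring flow described above might look as follows; the retriever and triple extractor are deliberately naive stand-ins for the real components.

        def retrieve(query, corpus, k=3):
            # Stand-in retriever: rank documents by word overlap with the query.
            score = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
            return sorted(corpus, key=score, reverse=True)[:k]

        def extract_triples(passages):
            # Stand-in for entity/relation extraction; a real RAS system would run
            # an information-extraction model here to build a knowledge graph.
            return [(p.split()[0], "mentioned_in", f"passage_{i}")
                    for i, p in enumerate(passages) if p]

        def build_prompt(query, corpus):
            passages = retrieve(query, corpus)
            kg = "\n".join(f"{h} --{r}--> {t}" for h, r, t in extract_triples(passages))
            context = "\n".join(passages)
            return f"Context:\n{context}\n\nKnowledge graph:\n{kg}\n\nQuestion: {query}"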

    • Sepehr Dehdashtian (Michigan State University) – OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes
    • Raphael Rossellini (University of Chicago) – Can a calibration metric be both testable and actionable?
    • Pascal Jutras (Purdue University) – Consistent Controlled Diffusion Samplers Achieve Single-Step Sampling
    • Zhenghao Zhao (University of Illinois Chicago) – Distilling long-tailed datasets
    • Anthony Goeckner (Northwestern University) – Graph Neural Network-based Multi-agent Reinforcement Learning for Resilient Distributed Coordination of Multi-Robot Systems
    • Feiran Wang (Illinois Institute of Technology) – X-Field: A Physically Grounded Representation for 3D X-ray Reconstruction

    Yiping Lu – Two Tales, One Resolution: Physics-Informed Inference Time Scaling and Preconditioning
    In this talk, I will introduce a novel framework for physics-informed debiasing of machine learning estimators, which we call Simulation-Calibrated Scientific Machine Learning (SCaSML). This approach leverages the structure of physical models to achieve three key objectives: (1) unbiased predictions: it produces unbiased predictions even when the underlying machine learning predictor is biased; (2) overcoming dimensionality challenges: it mitigates the curse of dimensionality that often affects high-dimensional estimators; and (3) inference-time scaling: it improves the machine learning estimate by allocating inference-time computation.

    The SCaSML paradigm integrates a (potentially) biased machine learning algorithm with a debiasing procedure that is rigorously designed using numerical analysis and stochastic simulation. We dynamically refine and debias the SciML predictions during inference by enforcing the physical laws. Our methodology aligns with recent advances in inference-time computation, similar to those seen in the large language model literature, demonstrating that additional computation can enhance ML estimates.

    Furthermore, we establish a surprising equivalence between our framework and another research direction that uses approximate (linearized) solvers to precondition iterative methods. This connection not only bridges two distinct areas of study but also offers new insights and algorithms for improving estimation accuracy in physics-informed machine learning settings.
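
    As a generic illustration of the debiasing idea (not SCaSML itself), a simulation-calibrated estimator can correct a biased surrogate with a handful of expensive unbiased simulations; every name below is a hypothetical stand-in.

        import numpy as np

        rng = np.random.default_rng(0)

        def surrogate(x):
            # Stand-in for a cheap but biased ML predictor of a physical quantity.
            return np.sin(x) + 0.3          # constant bias, for illustration

        def simulate(x):
            # Stand-in for an expensive, noisy, but unbiased solver/simulator.
            return np.sin(x) + rng.normal(0.0, 0.05)

        def calibrated_estimate(x, n_sim=32):
            # Surrogate prediction plus a Monte Carlo estimate of its own bias;
            # spending more inference-time compute (n_sim) buys a better estimate.
            correction = np.mean([simulate(x) - surrogate(x) for _ in range(n_sim)])
            return surrogate(x) + correction

        print(calibrated_estimate(1.0))     # close to sin(1.0); bias removed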

    Mengxue Hou – Assured Neural-symbolic Abstraction for Hierarchical Robotic Planning
    To enable a smart and autonomous system to be cognizant, taskable, and adaptive in exploring an unknown and unstructured environment, robotic decision-making relies on learning a parameterized knowledge representation. However, one fundamental challenge in deriving the parameterized representation is the undesirable trade-off between computational efficiency and model fidelity. This talk addresses this challenge in the context of underwater vehicle navigation in unknown marine environments. To improve the fidelity of the reduced-order model, we develop a learning method that generates a non-Markovian reduced-order representation of the environmental dynamics; this abstraction is guaranteed to improve modeling accuracy. Further, taking advantage of the abstracted model, we develop a Large-Language-Model-guided hierarchical planner that translates human-specified missions directly into a set of executable actions at low computational cost.

    Yexiang Xue – Embedding Automated Reasoning into Neural Generation
    Automated reasoning and machine learning are two fundamental pillars of artificial intelligence. Many real-world applications are beyond reach when reasoning or learning is applied in isolation. Reasoning without learning leads to rigid and brittle formulations, while learning without reasoning produces suboptimal models that violate critical constraints, hallucinate, and behave unexpectedly in unseen situations. This talk introduces the Spatial Reasoning Integrated Generator (SPRING) for design generation. SPRING embeds a neural and symbolic integrated spatial reasoning module inside a deep generative network. The spatial reasoning module samples the locations of objects to be generated from a backtrack-free distribution, guaranteed to satisfy user specifications while capturing subtle utility and aesthetics.

    SPRING offers interpretability, allowing users to visualize and diagnose the generation process through the predictions of its neural networks. SPRING is also adept at handling novel user specifications, thanks to its proficiency in zero-shot constraint transfer. SPRING is supported by our recently defined Contextual Analog Logic with Multimodality (CALM), in which predicates have analog truth values to capture subtle human preferences. CALM is grounded in multimodal environments (texts and images) with the aid of neural networks, whereas classic logic requires explicit definitions of symbolic representations and their groundings, which can be ad hoc, brittle, and unscalable.

    Sanjeev Arora – LLM Skills and Metacognition: Scaffolding for new forms of learning?

    Abstract: LLMs, especially their recent "reasoning" incarnations, are capable of impressive problem solving. This talk will argue that a key role in this success is played by their "metacognition" capabilities ("thinking about thinking"), which we find arise spontaneously in LLMs. We'll give diverse examples of such metacognition and argue that it gives insight into how LLM training gives rise to complex capabilities, as well as how these capabilities may be enhanced in the future. We will also introduce "Concept-enhanced learning", a simple setting that gives a hint about how LLM metacognition itself may emerge.

    Ruqi Zhang – Toward Capable and Reliable LLMs via Probabilistic Modeling
    As large language models (LLMs) are increasingly deployed in complex and high-stakes applications, advancing their capabilities and reliability is essential. In this talk, I will explore how probabilistic modeling provides principled and effective approaches for moving toward more capable and reliable LLMs, with a focus on reasoning, alignment, and safety.

    First, I will explore how self-correction, viewed as modeling the probabilistic relationship between initial and revised reasoning paths, can serve as a powerful strategy for improving LLM reasoning, even with limited annotated data. Next, I will introduce a framework that casts LLM alignment as a problem of probabilistic inference and present two discrete sampling techniques for efficient inference. Finally, I will show how variational inference can be used to automatically uncover diverse adversarial inputs, providing a comprehensive, distributional characterization of model vulnerabilities.
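
    For concreteness, one standard way to cast alignment as probabilistic inference (a common formulation in the alignment literature, not necessarily the exact one in this talk) is to sample responses y to a prompt x from a reward-tilted target

        π*(y | x) ∝ π_ref(y | x) · exp(r(x, y) / β),

    where π_ref is the base LLM, r is a reward model, and β trades reward against staying close to the base model; sampling from such unnormalized discrete distributions is exactly where efficient discrete sampling techniques come in.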

    Ari Holtzman – Articulating the Ineffable: What we can’t yet (define/express) about (LLMs/ourselves)
    One of the most frustrating parts of trying to work with deep generative models is that we are often unable to satisfactorily define what they are doing and how they do it. What do models consistently miss? What do they consistently believe? How do they store new information? In addition to current concrete studies, I will make the case that LLM systems can and should be used to future-proof humans against the influence of increasingly persuasive LLMs. By helping us articulate ideas that express our deeply held individual intuitions, machine-assisted expression can make humans less manipulable and help us know ourselves better.

    Zirui Liu – Massive Outlier Values in LLMs: Engineering and Science
    Deploying LLMs for long-context processing and long-generation scenarios is a major challenge in LLM serving. A variety of compression techniques have been proposed, such as quantization, token eviction, and linear-attention models. However, our understanding of how LLMs internally process information is still limited. In this talk, I will highlight one widespread but under-discussed observation: the abnormal distribution of massive outlier values in the Key and Value token embeddings within self-attention modules. We show how these extreme values are closely tied to context processing and demonstrate ways to leverage them for more efficient computation.

    On the engineering side, I’ll introduce our work on 2-bit KV cache quantization, which significantly improves both memory usage and inference throughput. On the scientific side, I’ll discuss our new findings on the role these extreme values play in shaping model behavior.
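
    For intuition, here is a minimal sketch of per-group 2-bit (four-level) uniform quantization; the group size and rounding scheme are generic illustrations rather than the specific method from this work.

        import torch

        def quantize_2bit(x, group_size=64):
            # Per-group asymmetric uniform quantization to 4 levels (codes 0..3).
            xg = x.reshape(-1, group_size)
            lo = xg.min(dim=1, keepdim=True).values
            hi = xg.max(dim=1, keepdim=True).values
            scale = (hi - lo).clamp_min(1e-8) / 3.0
            codes = torch.round((xg - lo) / scale).clamp(0, 3).to(torch.uint8)
            return codes, scale, lo

        def dequantize_2bit(codes, scale, lo, shape):
            return (codes.float() * scale + lo).reshape(shape)

        kv = torch.randn(2, 8, 64)                     # toy KV-cache slice
        codes, scale, lo = quantize_2bit(kv)
        kv_hat = dequantize_2bit(codes, scale, lo, kv.shape)
        # A single massive outlier in a group stretches (hi - lo) and wrecks the
        # resolution for the rest of the group, which is why the extreme values
        # described above require careful handling in low-bit KV cache schemes.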

    Posters



    Poster Sessions and Presenters

    Registration


    Registration has closed. To be added to the waitlist, please sign up here.

    Sponsors

    View the details of sponsorship opportunities here!