June 23–24     Chicago, IL

MMLS 2025

Midwest Machine Learning Symposium

Machine Learning Research in the Midwest



About the Event

The Midwest ML Symposium aims to convene regional machine learning researchers for stimulating discussions and debates, to foster cross-institutional collaboration, and to showcase the collective talent of ML researchers at all career stages. [past events]

When: June 23–24, 2025

Where: University of Chicago, Logan Center for the Arts [Google Map]

  • Directions, transportation, and parking near the Logan Center: here
  • General visitor parking at UChicago: here
  • Accommodations on or near campus: here
  • More information about dormitory housing for student symposium attendees is forthcoming.

    Sponsor Opportunities

    The Midwest ML Symposium invites sponsors for exposure to and connection with our community. In addition to supporting the regional machine learning community, sponsors are gratefully recognized in symposium media and materials and have the opportunity to engage closely with participants.

    Information: Learn about sponsorship levels, benefits, and opportunities here! Prospective sponsors are encouraged to contact the Midwest ML Symposium local organizing committee; to discuss special requirements or ask general questions about sponsoring the Symposium, please contact Anne Brown at annebrown@uchicago.edu.

    2025 Organizers

    Haifeng Xu (Co-chair, UChicago) | Chenhao Tan (Co-chair, UChicago) | Ce Zhang (UChicago) | Zhiyuan Li (TTIC) | Ruqi Zhang (Purdue) | Ren Wang (IIT) | Emma Alexander (Northwestern) | Chaowei Xiao (UW–Madison)

    Local Organizing Committee (UChicago)

    Haifeng Xu (Co-chair) | Chenhao Tan (Co-chair) | Rebecca Willett (Stats/CS) | David Uminsky (DSI) | Maria Fernandez (DSI) | Mark Schulze (DSI)

    Advisory Board

    Rob Nowak (Chair, UW Madison) | Maxim Raginsky (UIUC) | Laura Balzano (UMich) | Avrim Blum (TTIC) | Rebecca Willett (UChicago) | Nati Srebro (TTIC) | Po-Ling Loh (Cambridge) | Matus Telgarsky (NYU) | Mike Franklin (UChicago)

    Plenary Speakers



    Sanjeev Arora

    Charles C. Fitzmorris Professor of Computer Science at Princeton

    Director of Princeton Language and Intelligence

    Heng Ji

    Professor of Computer Science at UIUC

    Founding Director of the Amazon-Illinois Center on AI

    Tuomas Sandholm

    Angel Jordan University Professor of Computer Science at CMU

    Co-director of CMU AI and a serial entrepreneur

    Ben Zhao

    Neubauer Professor of Computer Science at UChicago

    Time Magazine's "The 100 Most Influential People in AI" (2024)

    Invited Speakers



    Ari Holtzman

    University of Chicago

    Chaowei Xiao

    Assistant Professor of Computer Sciences at University of Wisconsin–Madison

    Frederic Koehler

    University of Chicago

    Han Zhao

    University of Illinois Urbana-Champaign

    Haifeng Xu

    Assistant Professor of Computer Science at University of Chicago

    Huan Zhang

    Assistant Professor of Electrical and Computer Engineering at UIUC

    Mengxue Hou

    University of Notre Dame

    Sijia Liu

    Michigan State University

    Tianhao Wang

    Toyota Technological Institute at Chicago

    Wei Hu

    University of Michigan

    Yexiang Xue

    Purdue University

    Yiping Lu

    Northwestern University

    Zahra Ghodsi

    Purdue University

    Zirui (Ray) Liu

    University of Minnesota

    Ruqi Zhang

    Purdue University

    Schedule

    Ben Zhao – Societal impact and the ivory tower: an adversarial ML perspective

    Abstract: It is undeniable that computing research has the power to rapidly reshape the world we live in, and ML is literally proving this point in real time. But it is also true that we are often not cognizant of the positive and negative impacts of our work. In this talk, I argue that we as researchers need to be more accountable not just for our research results, but for how they may be used in downstream applications. Recognizing such impacts is arguably a very challenging task in itself. Drawing on my own experience in recent adversarial ML projects, I describe the duality of ML's impact today, both in the real harms it has produced via misuse and in the protective benefits it can provide. I share some of the ethical questions we faced when considering the design and deployment of our tools Glaze and Nightshade, and our experiences through this process. Finally, I suggest some takeaways, including possible perspectives for evaluating new research directions, as well as some concrete research questions that offer potential for positive technical and societal impact.

    • Manling Li (Northwestern University) – RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
    • Chenxiao Yang (Toyota Technological Institute at Chicago) – PENCIL: Long Thoughts with Short Memory
    • Justin Wang (University of Chicago) – ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
    • Zhiqi Gao (University of Wisconsin–Madison) – Theoretical Physics Benchmark (TPBench) – A Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
    • Athanasios Glentis (University of Minnesota) – Memory-Efficient LLM Pretraining via Minimalist Optimizer Design
    • Shuo Xie (Toyota Technological Institute at Chicago) – Adam Exploits ℓ∞-geometry of Loss Landscape via Coordinate-wise Adaptivity

    Sijia Liu – Robust Unlearning for LLMs
    As generative AI systems continue to evolve, the ability to selectively remove information from trained models, known as machine unlearning, has become increasingly essential for ensuring regulatory compliance, enforcing ethical constraints, and mitigating the retention of harmful or sensitive content. This talk focuses on a pressing challenge in this space: the robustness of unlearning in large language models (LLMs). We examine how current unlearning methods remain vulnerable to relearning attacks and post-unlearning fine-tuning, where previously removed knowledge can be partially recovered from a small subset of forgotten or auxiliary data. From an optimization perspective, we introduce a novel connection between robust unlearning and sharpness-aware minimization (SAM), showing that promoting flatter loss landscapes through smoothness-based optimization enhances a model’s resistance to relearning. This draws a natural parallel to principles from adversarial robustness. The talk concludes with a discussion of open challenges and future directions for embedding unlearning into the AI lifecycle, ensuring long-term safety, compliance, and trustworthiness across the data, model, and optimization stack.
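
    To make the SAM connection concrete, here is a minimal sketch of a sharpness-aware update, the flatness-promoting step the talk links to relearning resistance. The names (model, loss_fn, rho, lr) are illustrative stand-ins, not the talk's implementation.

        import torch

        def sam_step(model, loss_fn, batch, rho=0.05, lr=1e-3):
            # 1) Gradient at the current weights w.
            loss_fn(model, batch).backward()
            params = [p for p in model.parameters() if p.grad is not None]
            grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params))
            # 2) Ascend to the nearby worst case: w + rho * g / ||g||.
            with torch.no_grad():
                eps = [rho * p.grad / (grad_norm + 1e-12) for p in params]
                for p, e in zip(params, eps):
                    p.add_(e)
            model.zero_grad()
            # 3) Gradient at the perturbed weights, then undo the ascent and
            #    descend; minimizing this worst-case loss favors flatter minima.
            loss_fn(model, batch).backward()
            with torch.no_grad():
                for p, e in zip(params, eps):
                    p.sub_(e)
                    p.sub_(lr * p.grad)
            model.zero_grad()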

    Zahra Ghodsi – Collaborating with Confidence: Securing Federated Learning Systems
    Artificial Intelligence (AI) is increasingly deployed in distributed settings thanks to its ability to process large amounts of data and to enable a wide range of applications. Networks of intelligent devices can therefore work collaboratively to open new directions in domains such as distributed healthcare and transportation. Deploying AI successfully in the distributed or federated setting requires the collaboration of a large number of devices belonging to different parties. This collaboration, however, raises security concerns about the privacy of assets and about robustness in the presence of accidental or intentional errors. In this talk, I outline the challenges in developing secure and privacy-preserving federated learning frameworks where the data, or even the identity, of participants can be sensitive. I highlight the need for new holistic designs in which requirements such as privacy and robustness are guaranteed simultaneously. I conclude by briefly discussing lessons learned and future research directions.
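
    As one concrete illustration of the privacy side, below is a toy sketch of pairwise-masked secure aggregation, a standard building block in this space and not necessarily the construction from the talk; real protocols add key agreement, dropout handling, and finite-field arithmetic.

        import random

        def mask_updates(updates):
            # updates: client_id -> scalar model update (a stand-in for a weight vector).
            ids = sorted(updates)
            pair_masks = {(i, j): random.uniform(-1.0, 1.0)
                          for i in ids for j in ids if i < j}
            masked = {}
            for c in ids:
                mask = (sum(pair_masks[(c, j)] for j in ids if j > c)
                        - sum(pair_masks[(i, c)] for i in ids if i < c))
                masked[c] = updates[c] + mask  # each individual update stays hidden
            return masked

        updates = {1: 0.5, 2: -0.2, 3: 0.1}
        masked = mask_updates(updates)
        # The pairwise masks cancel in the sum, so the server learns only the total:
        assert abs(sum(masked.values()) - sum(updates.values())) < 1e-9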

    Han Zhao – Revisiting Scalarization in Multi-Task Learning
    Linear scalarization, i.e., combining all loss functions into a weighted sum, has been the default choice in the multi-task learning (MTL) literature since its inception. In recent years, there has been a surge of interest in developing Specialized Multi-Task Optimizers (SMTOs) that treat MTL as a multi-objective optimization problem. However, it remains open whether SMTOs hold a fundamental advantage over scalarization. In this talk, I revisit scalarization from a theoretical perspective, focusing on linear MTL models and studying whether scalarization can fully explore the Pareto front. Our findings reveal that, in contrast to recent works claiming empirical advantages for scalarization, when the model is under-parametrized scalarization is inherently incapable of full exploration, especially for Pareto optimal solutions that strike balanced trade-offs between multiple tasks. I will conclude by briefly discussing the extension of our results to general nonlinear neural networks and our recent work on using online Chebyshev scalarization to controllably steer the search for Pareto optimal solutions.
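
    For readers less familiar with the terminology, here is a toy sketch contrasting the two scalarizations mentioned above; the losses, weights, and reference point are illustrative, not from the talk.

        import torch

        def linear_scalarization(losses, w):
            # Weighted sum: minimize over theta the sum of w_k * L_k(theta).
            return sum(wk * lk for wk, lk in zip(w, losses))

        def chebyshev_scalarization(losses, w, ref=0.0):
            # Weighted worst case: minimize the max of w_k * (L_k(theta) - z_k).
            # Sweeping w here can reach balanced Pareto-optimal trade-offs that the
            # weighted sum misses when the model is under-parametrized.
            return torch.stack([wk * (lk - ref) for wk, lk in zip(w, losses)]).max()

        # Example with two toy task losses of a shared parameter theta:
        theta = torch.tensor([1.0, -0.5], requires_grad=True)
        losses = [(theta ** 2).sum(), ((theta - 1.0) ** 2).sum()]
        objective = chebyshev_scalarization(losses, w=[0.5, 0.5])
        objective.backward()  # gradient flows through the active (max) task only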

    Wei Hu – Abrupt Learning in Transformers
    Training Transformers on algorithmic tasks frequently exhibits an intriguing "abrupt learning" phenomenon in the training dynamics: an extended performance plateau followed by a sudden, sharp improvement. In this talk, I will present several empirical observations that aim to uncover universal characteristics and underlying mechanisms behind such dynamics.

    Frederic Koehler – On Inductive Bias in Generative Modeling
    There has been a lot of work on understanding the inductive bias of learning via gradient descent and related algorithms. For example, many fascinating phenomena have been discovered in supervised settings such as linearized neural networks, matrix factorization, and logistic regression. Relatively speaking, fewer such examples have been worked out for generative modeling and density estimation. I will discuss one such example that we were able to rigorously analyze, variational autoencoders, and the role that the data distribution plays in this setting.

    Tianhao Wang – Structured Preconditioners in Adaptive Optimization: A Unified Analysis
    We present a novel unified analysis for a broad class of adaptive optimization algorithms with structured (e.g., layerwise, diagonal, and Kronecker-factored) preconditioners, covering both online regret minimization and offline convex optimization. Our analysis not only provides matching rates for several important structured preconditioned algorithms, including diagonal AdaGrad, full-matrix AdaGrad, and AdaGrad-Norm, but also gives an improved convergence rate for a one-sided variant of Shampoo over that of the original Shampoo. Interestingly, more structured preconditioners (e.g., diagonal AdaGrad and AdaGrad-Norm, which use less space and compute) are often presented as computationally efficient approximations to full-matrix AdaGrad, with the implicit assumption that better approximations yield better optimization. Our unified analysis challenges this prevailing view and reveals, perhaps surprisingly, that more structured preconditioners, despite using less space and computation per step, can outperform their less structured counterparts. To demonstrate this, we show that one-sided Shampoo, which is much cheaper than full-matrix AdaGrad, can outperform it both theoretically and experimentally.
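
    To make the structure spectrum concrete, here is a minimal sketch of two of the preconditioners named above for a single parameter vector; the step size and epsilon are illustrative defaults.

        import numpy as np

        def adagrad_diagonal_step(x, g, state, lr=0.1, eps=1e-8):
            # Diagonal preconditioner: one squared-gradient accumulator per coordinate.
            state["G"] = state.get("G", np.zeros_like(x)) + g * g
            return x - lr * g / (np.sqrt(state["G"]) + eps)

        def adagrad_norm_step(x, g, state, lr=0.1, eps=1e-8):
            # Scalar ("norm") preconditioner: a single accumulator for the whole
            # vector, the most structured and cheapest end of the spectrum.
            state["b"] = state.get("b", 0.0) + float(g @ g)
            return x - lr * g / (np.sqrt(state["b"]) + eps)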

    Tuomas Sandholm – General search techniques without common knowledge for imperfect-information games, and application to superhuman Fog of War chess

    Abstract: Since the advent of AI, games have served as progress benchmarks, and most real-world settings are imperfect-information games. Imperfect-information variants of chess have existed for over a century, present extreme challenges, and have been the focus of significant AI research. Beyond the calculation needed in regular chess, they require reasoning about information gathering, the opponent's knowledge, signaling, bluffing, etc. The most popular variant, Fog of War (FoW) chess (also known as dark chess), became a recognized challenge problem in AI after superhuman performance was reached in no-limit Texas hold'em poker. We present Obscuro, the first superhuman AI for FoW chess. It introduces advances to search in imperfect-information games, enabling strong, scalable reasoning. Most prior search techniques, such as those used to achieve superhuman play in no-limit Texas hold'em, require the construction of the "common knowledge set" as a first step, making them unusable for games with this much imperfect information. Experiments against the prior state-of-the-art AI and human players, including the world's best, show that Obscuro is significantly stronger. FoW chess is now the largest (by amount of imperfect information) turn-based game in which superhuman performance has been achieved and the largest game in which imperfect-information search has been successfully applied. This is joint work with my PhD student Brian Hu Zhang.

    Heng Ji – ThemeLLM: A Retrieval and Structuring Approach for Theme-Focused, LLM-Guided Scientific Exploration

    Abstract: Large Language Models (LLMs) may bring unprecedented power to scientific discovery. However, current LLMs still face major challenges in effective scientific exploration due to their lack of in-depth, theme-focused data and knowledge. Retrieval-augmented generation (RAG) has recently become a promising approach for augmenting LLMs with grounded, theme-specific datasets. We discuss the challenges of RAG and propose a retrieval and structuring (RAS) approach, which enhances RAG by improving retrieval quality and mining structures (e.g., extracting entities and relations and building knowledge graphs) to ensure effective integration of theme-specific data with the LLM. We show the promise of the retrieval and structuring approach for augmenting LLMs and discuss its potential power for future LLM-enabled scientific exploration.
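
    A toy sketch of the retrieval-and-structuring flow described above might look as follows; the retriever and triple extractor are deliberately naive stand-ins for the real components.

        def retrieve(query, corpus, k=3):
            # Stand-in retriever: rank documents by word overlap with the query.
            score = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
            return sorted(corpus, key=score, reverse=True)[:k]

        def extract_triples(passages):
            # Stand-in for entity/relation extraction; a real RAS system would run
            # an information-extraction model here to build a knowledge graph.
            return [(p.split()[0], "mentioned_in", f"passage_{i}")
                    for i, p in enumerate(passages) if p]

        def build_prompt(query, corpus):
            passages = retrieve(query, corpus)
            kg = "\n".join(f"{h} --{r}--> {t}" for h, r, t in extract_triples(passages))
            context = "\n".join(passages)
            return f"Context:\n{context}\n\nKnowledge graph:\n{kg}\n\nQuestion: {query}"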

    • Sepehr Dehdashtian (Michigan State University) – OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes
    • Raphael Rossellini (University of Chicago) – Can a calibration metric be both testable and actionable?
    • Pascal Jutras (Purdue University) – Consistent Controlled Diffusion Samplers Achieve Single-Step Sampling
    • Zhenghao Zhao (University of Illinois Chicago) – Distilling long-tailed datasets
    • Anthony Goeckner (Northwestern University) – Graph Neural Network-based Multi-agent Reinforcement Learning for Resilient Distributed Coordination of Multi-Robot Systems
    • Feiran Wang (Illinois Institute of Technology) – X-Field: A Physically Grounded Representation for 3D X-ray Reconstruction

    Yiping Lu – Two Tales, One Resolution: Physics-Informed Inference Time Scaling and Preconditioning
    In this talk, I will introduce a novel framework for physics-informed debiasing of machine learning estimators, which we call Simulation-Calibrated Scientific Machine Learning (SCaSML). This approach leverages the structure of physical models to achieve three key objectives: (1) unbiased predictions: it produces unbiased predictions even when the underlying machine learning predictor is biased; (2) overcoming dimensionality challenges: it mitigates the curse of dimensionality that often affects high-dimensional estimators; and (3) inference-time scaling: it improves the machine learning estimate by allocating inference-time computation.

    The SCaSML paradigm integrates a (potentially) biased machine learning algorithm with a debiasing procedure that is rigorously designed using numerical analysis and stochastic simulation. We dynamically refine and debias the SciML predictions during inference by enforcing the physical laws. Our methodology aligns with recent advances in inference-time computation, similar to those seen in the large language model literature, demonstrating that additional computation can enhance ML estimates.

    Furthermore, we establish a surprising equivalence between our framework and another research direction that uses approximate (linearized) solvers to precondition iterative methods. This connection not only bridges two distinct areas of study but also offers new insights and algorithms for improving estimation accuracy in physics-informed machine learning settings.
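
    As a generic illustration of the debiasing idea (not SCaSML itself), a simulation-calibrated estimator can correct a biased surrogate with a handful of expensive unbiased simulations; every name below is a hypothetical stand-in.

        import numpy as np

        rng = np.random.default_rng(0)

        def surrogate(x):
            # Stand-in for a cheap but biased ML predictor of a physical quantity.
            return np.sin(x) + 0.3          # constant bias, for illustration

        def simulate(x):
            # Stand-in for an expensive, noisy, but unbiased solver/simulator.
            return np.sin(x) + rng.normal(0.0, 0.05)

        def calibrated_estimate(x, n_sim=32):
            # Surrogate prediction plus a Monte Carlo estimate of its own bias;
            # spending more inference-time compute (n_sim) buys a better estimate.
            correction = np.mean([simulate(x) - surrogate(x) for _ in range(n_sim)])
            return surrogate(x) + correction

        print(calibrated_estimate(1.0))     # close to sin(1.0); bias removed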

    Mengxue Hou – Assured Neural-symbolic Abstraction for Hierarchical Robotic Planning
    To enable a smart and autonomous system to be cognizant, taskable, and adaptive in exploring an unknown and unstructured environment, robotic decision-making relies on learning a parameterized knowledge representation. However, one fundamental challenge in deriving the parameterized representation is the undesirable trade-off between computational efficiency and model fidelity. This talk addresses this challenge in the context of underwater vehicle navigation in unknown marine environments. To improve the fidelity of the reduced-order model, we develop a learning method that generates a non-Markovian reduced-order representation of the environmental dynamics; this abstraction is guaranteed to improve modeling accuracy. Further, taking advantage of the abstracted model, we develop a Large-Language-Model-guided hierarchical planner that translates human-specified missions directly into a set of executable actions at low computational cost.

    Yexiang Xue – Embedding Automated Reasoning into Neural Generation
    Automated reasoning and machine learning are two fundamental pillars of artificial intelligence. Many real-world applications are beyond reach when reasoning or learning is applied in isolation. Reasoning without learning leads to rigid and brittle formulations, while learning without reasoning produces suboptimal models that violate critical constraints, hallucinate, and behave unexpectedly in unseen situations. This talk introduces the Spatial Reasoning Integrated Generator (SPRING) for design generation. SPRING embeds a neural and symbolic integrated spatial reasoning module inside a deep generative network. The spatial reasoning module samples the locations of objects to be generated from a backtrack-free distribution, guaranteed to satisfy user specifications while capturing subtle utility and aesthetics.

    SPRING offers interpretability, allowing users to visualize and diagnose the generation process through the predictions of its neural networks. SPRING is also adept at handling novel user specifications, thanks to its proficiency in zero-shot constraint transfer. SPRING is supported by our recently defined Contextual Analog Logic with Multimodality (CALM), in which predicates have analog truth values to capture subtle human preferences. CALM is grounded in multimodal environments (texts and images) with the aid of neural networks, whereas classic logic requires explicit definitions of symbolic representations and their groundings, which can be ad hoc, brittle, and unscalable.

    Sanjeev Arora – LLM Skills and Metacognition: Scaffolding for new forms of learning?

    Abstract: LLMs, especially their recent "reasoning" incarnations, are capable of impressive problem solving. This talk will argue that a key role in this success is played by their "metacognition" capabilities ("thinking about thinking"), which we find arise spontaneously in LLMs. We'll give diverse examples of such metacognition and argue that it gives insight into how LLM training gives rise to complex capabilities, as well as how these capabilities may be enhanced in the future. We will also introduce "Concept-enhanced learning", a simple setting that gives a hint about how LLM metacognition itself may emerge.

    Ruqi Zhang – Toward Capable and Reliable LLMs via Probabilistic Modeling
    As large language models (LLMs) are increasingly deployed in complex and high-stakes applications, advancing their capabilities and reliability is essential. In this talk, I will explore how probabilistic modeling provides principled and effective approaches for moving toward more capable and reliable LLMs, with a focus on reasoning, alignment, and safety.

    First, I will explore how self-correction, viewed as modeling the probabilistic relationship between initial and revised reasoning paths, can serve as a powerful strategy for improving LLM reasoning, even with limited annotated data. Next, I will introduce a framework that casts LLM alignment as a problem of probabilistic inference and present two discrete sampling techniques for efficient inference. Finally, I will show how variational inference can be used to automatically uncover diverse adversarial inputs, providing a comprehensive, distributional characterization of model vulnerabilities.
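
    For concreteness, one standard way to cast alignment as probabilistic inference (a common formulation in the alignment literature, not necessarily the exact one in this talk) is to sample responses y to a prompt x from a reward-tilted target

        π*(y | x) ∝ π_ref(y | x) · exp(r(x, y) / β),

    where π_ref is the base LLM, r is a reward model, and β trades reward against staying close to the base model; sampling from such unnormalized discrete distributions is exactly where efficient discrete sampling techniques come in.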

    Ari Holtzman – Articulating the Ineffable: What we can’t yet (define/express) about (LLMs/ourselves)
    One of the most frustrating parts of trying to work with deep generative models is that we are often unable to satisfactorily define what they are doing and how they do it. What do models consistently miss? What do they consistently believe? How do they store new information? In addition to current concrete studies, I will make the case that LLM systems can and should be used to future-proof humans against the influence of increasingly persuasive LLMs. By helping us articulate ideas that express our deeply held individual intuitions, machine-assisted expression can make humans less manipulable and help us know ourselves better.

    Zirui Liu – Massive Outlier Values in LLMs: Engineering and Science
    Deploying LLMs for long-context processing and long-generation scenarios is a major challenge in LLM serving. A variety of compression techniques have been proposed, such as quantization, token eviction, and linear-attention models. However, our understanding of how LLMs internally process information is still limited. In this talk, I will highlight one widespread but under-discussed observation: the abnormal distribution of massive outlier values in the Key and Value token embeddings within self-attention modules. We show how these extreme values are closely tied to context processing and demonstrate ways to leverage them for more efficient computation.

    On the engineering side, I’ll introduce our work on 2-bit KV cache quantization, which significantly improves both memory usage and inference throughput. On the scientific side, I’ll discuss our new findings on the role these extreme values play in shaping model behavior.
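
    For intuition, here is a minimal sketch of per-group 2-bit (four-level) uniform quantization; the group size and rounding scheme are generic illustrations rather than the specific method from this work.

        import torch

        def quantize_2bit(x, group_size=64):
            # Per-group asymmetric uniform quantization to 4 levels (codes 0..3).
            xg = x.reshape(-1, group_size)
            lo = xg.min(dim=1, keepdim=True).values
            hi = xg.max(dim=1, keepdim=True).values
            scale = (hi - lo).clamp_min(1e-8) / 3.0
            codes = torch.round((xg - lo) / scale).clamp(0, 3).to(torch.uint8)
            return codes, scale, lo

        def dequantize_2bit(codes, scale, lo, shape):
            return (codes.float() * scale + lo).reshape(shape)

        kv = torch.randn(2, 8, 64)                     # toy KV-cache slice
        codes, scale, lo = quantize_2bit(kv)
        kv_hat = dequantize_2bit(codes, scale, lo, kv.shape)
        # A single massive outlier in a group stretches (hi - lo) and wrecks the
        # resolution for the rest of the group, which is why the extreme values
        # described above require careful handling in low-bit KV cache schemes.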

    Posters



    Poster Sessions and Presenters

    Registration


    Registration has closed. To be added to the waitlist, please sign up here.

    Sponsors

    View the details of sponsorship opportunities here!