[Paper Review] Constrained optimization under uncertainty for decision-making problems: Application to Real-Time Strategy games
This paper proposes a novel method to solve constrained optimization problems under uncertainty within the standard Constrained Optimization Problem (COP) formalism by integrating Rank Dependent Utility (RDU) from decision theory. It enables regular constraint solvers to handle uncertainty without new formalisms or solvers, demonstrated by a winning bot in the 2018 µRTS AI competition using RDU-based decision-making for unit production under partial observability.
Decision-making problems can be modeled as combinatorial optimization problems with Constraint Programming formalisms such as Constrained Optimization Problems. However, few Constraint Programming formalisms can deal with both optimization and uncertainty at the same time, and none of them is convenient for modeling the problems we tackle in this paper. Here, we propose a way to deal with combinatorial optimization problems under uncertainty within the classical Constrained Optimization Problems formalism by injecting the Rank Dependent Utility from decision theory. We also propose a proof of concept of our method to show that it is implementable and can solve concrete decision-making problems using a regular constraint solver, and we propose a bot that won the partially observable track of the 2018 µRTS AI competition. Our results show that it is possible to handle uncertainty with regular Constraint Programming solvers, without having to define a new formalism or develop dedicated solvers. This opens a new perspective for tackling uncertainty in Constraint Programming.
Motivation & Objective
- To address the lack of constraint programming formalisms that simultaneously handle optimization and uncertainty in combinatorial decision problems.
- To model single-stage decision-making problems where uncertainty affects only the objective function, not the constraints.
- To enable standard COP solvers to handle uncertainty by embedding decision-theoretic utility models like RDU.
- To demonstrate practical applicability through a competitive AI bot in the partially observable µRTS game environment.
- To show that RDU-based optimization outperforms Expected Utility and random strategies in real-time strategy decision-making under uncertainty.
Proposed method
- Adapts the Rank Dependent Utility (RDU) framework from decision theory to rank solutions in COPs under uncertainty.
- Uses the objective function as a utility score and applies RDU’s weighting of cumulative probabilities to rank decision outcomes.
- Models the decision problem as a standard COP with deterministic constraints and an RDU-transformed objective function.
- Applies the RDU model with both optimistic and pessimistic weighting functions (φ) to reflect risk preferences.
- Implements the model in a constraint solver (GHOST) to generate unit production strategies in µRTS under fog-of-war.
- Employs a non-adaptive, single-stage decision model where decisions are made before stochastic outcomes (enemy strategies) are revealed.
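The ranking step described in the list above can be sketched in a few lines of Python. This is a minimal illustration of the textbook Rank Dependent Utility formula, not the paper's GHOST implementation: the function names `rdu` and `best_decision`, the toy candidates, and the specific weighting functions (φ(p) = p² as a pessimistic choice, φ(p) = √p as an optimistic one) are all assumptions made for the example. With outcomes sorted so that u₁ ≤ … ≤ uₙ, the sketch computes RDU = u₁ + Σᵢ₌₂ⁿ (uᵢ − uᵢ₋₁) · φ(P(utility ≥ uᵢ)), which reduces to Expected Utility when φ is the identity.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence, Tuple


def rdu(utilities: Sequence[float],
        probabilities: Sequence[float],
        phi: Callable[[float], float]) -> float:
    """Rank Dependent Utility of one decision (standard textbook form).

    phi is assumed non-decreasing on [0, 1] with phi(0) = 0 and phi(1) = 1.
    """
    if not utilities or len(utilities) != len(probabilities):
        raise ValueError("need matching, non-empty utilities and probabilities")
    # Sort outcomes from worst to best utility.
    outcomes = sorted(zip(utilities, probabilities))
    value = outcomes[0][0]  # the worst outcome is obtained for sure
    for i in range(1, len(outcomes)):
        # Probability of getting at least the i-th best-ranked utility.
        tail = sum(p for _, p in outcomes[i:])
        value += (outcomes[i][0] - outcomes[i - 1][0]) * phi(tail)
    return value


def best_decision(candidates: Dict[str, Tuple[List[float], List[float]]],
                  phi: Callable[[float], float]) -> str:
    """Return the name of the candidate with the highest RDU score."""
    return max(candidates, key=lambda name: rdu(*candidates[name], phi))


# Hypothetical weighting functions, chosen only for illustration.
def identity(p: float) -> float:      # recovers Expected Utility
    return p


def pessimistic(p: float) -> float:   # convex: underweights good outcomes
    return p * p


def optimistic(p: float) -> float:    # concave: overweights good outcomes
    return sqrt(p)


# Toy candidates: (utilities, probabilities) over two possible enemy strategies.
candidates = {
    "risky": ([0.0, 10.0], [0.5, 0.5]),
    "safe": ([4.0, 5.0], [0.5, 0.5]),
}

if __name__ == "__main__":
    for phi in (identity, pessimistic, optimistic):
        print(phi.__name__, best_decision(candidates, phi))
```

In this toy setting the convex φ favors the candidate with the better worst case, while the concave φ favors the candidate with the better best case. In a COP, a score of this kind would play the role of the objective function over feasible assignments, with the constraints left deterministic.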
Experimental results
Research questions
- RQ1: Can standard COP formalisms be extended to handle uncertainty in the objective function without introducing new formalisms or solvers?
- RQ2: How does RDU-based optimization compare to Expected Utility and random decision-making in partially observable RTS games?
- RQ3: Can a COP-based approach with RDU outperform existing methods in real-time strategy AI under uncertainty?
- RQ4: What impact do risk preferences (optimistic vs. pessimistic φ) have on performance in short-horizon RTS decision-making?
- RQ5: Is it feasible to implement uncertainty-aware optimization using only standard constraint solvers and decision-theoretic utility models?
Key findings
- The RDU-based approach outperformed both Expected Utility and random unit production strategies in the 2018 µRTS AI competition, winning the partially observable track.
- On small maps (8x8, 12x12, 16x16), the RDU method with pessimistic φ achieved the highest normalized score (59.5), surpassing Expected Utility (56.5) and the baseline (52.5).
- On large maps (24x24, 32x32, 64x64), the RDU method with optimistic φ achieved the best score (81.5), ahead of Expected Utility (78.5) and the baseline (76.0).
- The pessimistic RDU variant performed better on small maps, likely due to the need for immediate reaction to unfavorable enemy compositions in confined spaces.
- The method successfully enabled a standard COP solver to handle uncertainty without modifying the solver or formalism, proving feasibility and practicality.
- The results confirm that RDU-based utility modeling allows constraint solvers to effectively rank and select decisions under uncertainty, even in complex, partially observable environments.
This review was created by AI and reviewed by human editors.