04/18/25: ECS170-L9
Games and Adversarial Search Review
Vocab
Unit and Larger Context
Review of Games and Adversarial Search
Minimax vs. MCTS: which is best depends on
- the accuracy of the evaluation function (for minimax or alpha-beta), versus
- the selection policy of MCTS
If a single move can change the course of the game, MCTS might not find it.
- Algorithms can combine minimax and MCTS
Non Classical Search
Local Search as Optimization
- Introduction: local search and optimization problems
- Hill Climbing
- Simulated Annealing
- Beam Search
AND-OR Search in Non-Deterministic Environments
- Belief States and AND-OR Search
Scratch Notes
All problems can be formulated as search problems with varying levels of complexity. In this class we will cover the most straightforward ones.
Local Search
- When the path to the goal doesn't matter, and you only care about the goal state
- Iterate
- Remember only current state
- Move to best neighboring state, forget the past
Advantages:
- Low memory usage
- Often finds reasonable solutions in large/infinite state spaces
- Useful for solving optimization problems
Optimization
Find or approximate best state according to some objective function
A local search can usually find an optimal solution when the search space, or the objective function, is convex.
- Optimization: finds a global maximum or minimum of the objective function
- Local search: finds the best state it can reach (the lowest valley or the highest peak) in the state space, which may be only a local optimum
Hill Climbing
Given the current state n:
- If some neighbor of n has a higher value than n, then move to the best such neighbor
- Else, halt at n

Terminates when a peak is reached
Has no look-ahead beyond the immediate neighbors of the current state
Chooses a random best successor if there is more than one
Cannot backtrack, since it doesn't remember where it's been
How to avoid local maxima?
- Randomly generate an initial state and conduct the search from there; repeat (random restart)
- Works very well when there are few local maxima
| Pros | Cons |
|---|---|
| Convex problems are solved fast | Real problems are rarely convex |
| Random restarts make the search complete | If downhill moves are not allowed, cannot escape local maxima |
| | Random restart is very inefficient |
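Hill climbing with random restarts can be sketched in a few lines of Python. The objective `f` and the `neighbors` function below are toy examples made up for illustration (a 1-D integer landscape with many local peaks), not something from the lecture:

```python
import random

def hill_climb(start, f, neighbors):
    """Greedy ascent: repeatedly move to the best neighbor until none improves."""
    current = start
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current
        best = max(candidates, key=f)
        if f(best) <= f(current):
            return current  # local maximum: no neighbor is strictly better
        current = best

def random_restart(f, neighbors, random_state, restarts=20):
    """Restart hill climbing from random states; keep the best peak found."""
    peaks = [hill_climb(random_state(), f, neighbors) for _ in range(restarts)]
    return max(peaks, key=f)

# Toy 1-D objective with many local maxima on the integers 0..100.
f = lambda x: -(x - 70) ** 2 + 15 * (x % 10)
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 100]
peak = random_restart(f, neighbors, lambda: random.randint(0, 100))
```

Note how `hill_climb` alone can only return whichever local peak is uphill from its start; the restart wrapper is what gives the search a chance at the global maximum.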
Simulated Annealing
- Combines hill climbing with a random walk in a way that yields both efficiency and completeness
- Basic idea: allow picking an action that results in a worse state, but gradually diminish the probability of doing so
Adopt a control parameter T (also called βtemperatureβ)
- T controls the probability p of picking a worse successor
- T starts out high to promote escape from local maxima early on
- T gradually decreases toward 0 over time to avoid getting thrown off track late in the search (based on a βscheduleβ function)
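A minimal sketch of this idea in Python, with a made-up geometric cooling schedule and the same kind of toy 1-D objective (neither is from the lecture). Worse moves are accepted with probability exp(delta / T), which shrinks as T cools:

```python
import math
import random

def simulated_annealing(start, f, neighbors, schedule, steps=5000):
    """Hill climbing + random walk: worse moves accepted with prob exp(delta/T)."""
    current = start
    for t in range(steps):
        T = schedule(t)
        if T <= 1e-9:  # temperature ~0: stop, behaving like pure hill climbing
            break
        nxt = random.choice(neighbors(current))
        delta = f(nxt) - f(current)
        # Always take improving moves; take worsening moves with probability
        # exp(delta / T), which goes to 0 as T cools (delta is negative there).
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
    return current

# Geometric cooling schedule: T starts high, decays toward 0 over time.
schedule = lambda t: 20.0 * (0.995 ** t)
f = lambda x: -(x - 70) ** 2 + 15 * (x % 10)
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 100]
state = simulated_annealing(random.randint(0, 100), f, neighbors, schedule)
```

Early on (high T) the walk jumps out of shallow local maxima; late in the search (low T) it behaves like plain hill climbing.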
Beam Search
Hill climbing only keeps track of one best state
Beam search keeps track of k states in memory
- Starts with k randomly selected initial states
- Find the successors of all k states
- Take the top k across all successors and put them in memory
- Repeat
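The steps above can be sketched as follows. One small liberty taken here: the current beam is kept in the candidate pool alongside the successors, so the best state found so far is never discarded. The toy objective and neighbor function are illustrative assumptions, not from the lecture:

```python
import heapq
import random

def beam_search(f, neighbors, random_state, k=3, iterations=200):
    """Local beam search: keep the k best states, expand all, keep the top k."""
    beam = [random_state() for _ in range(k)]      # k random initial states
    for _ in range(iterations):
        # Pool the successors of every beam state (plus the beam itself, so
        # the best state found so far is never thrown away).
        pool = set(beam)
        for s in beam:
            pool.update(neighbors(s))
        beam = heapq.nlargest(k, pool, key=f)      # top k across the pool
    return max(beam, key=f)

# Toy 1-D objective with many local maxima on the integers 0..100.
f = lambda x: -(x - 70) ** 2 + 15 * (x % 10)
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 100]
best = beam_search(f, neighbors, lambda: random.randint(0, 100))
```

With k = 1 this degenerates to hill climbing; larger k lets information flow between parallel climbs, since all successors compete for the same k slots.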
Refresh the Info
Did you generally find the overall content understandable, compelling, or relevant, and why? Which aspects of the reading were most novel or challenging for you, and which were most familiar or straightforward?
Did a specific aspect of the reading raise questions for you, or relate to other ideas and findings you've encountered? Are there other related issues you wish had been covered?
Links
Resources
- Put useful links here
Connections
- Link all related words