If you're here, chances are you've read about A* search. Just a quick review:

A* is a state-space search algorithm: it examines nodes in a space of states, one at a time. At any moment of your search, you've got a "frontier" of nodes that you could examine next:

[Figure: Search_fig1]

Associated with each node s is the cost of reaching it, g(s). You proceed with your search by examining your frontier of nodes and picking the one that minimizes a criterion, f(s), which I will define in a moment. You then "expand" s, meaning you find s's immediate neighbors in the space, remove s from the frontier, and stick those neighbors on the frontier. Thus, the frontier contains every node you have generated but not yet expanded. A* is just a way of telling you which one to expand next.

So, right at this moment in the search depicted in the figure, the frontier is:

s1 G' n s2 s3

Of course, the whole point of walking through this space is to find a node with the particular qualities you're looking for, i.e. a goal node. Multiple nodes in the space may actually satisfy the goal criteria; the point of the search is to find the path that gets you to a goal-containing node optimally.

Back to selecting nodes on the frontier. What is special about A* is the function f(s) used to rank the nodes within the frontier. You select the next frontier node to examine by best-first search on f, like so:

For each node s on the frontier, f(s) = g(s) + h(s), where g(s) is the measured (actual) cost to get from the start node to here (node s), and h(s) is the estimated cost of getting from here (node s) to the goal.

Now, if h(s) (the estimated cost of getting from here to the goal) is guaranteed to be less than or equal to the actual cost (i.e., h(s) is admissible), then you can make all sorts of guarantees about the outcome of the search process. Specifically, that A* is complete (if there is a solution, A* will find it) and optimal (the solution it finds costs the least to get to starting from the initial node).
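To make the bookkeeping concrete, here is a minimal Python sketch of that loop. Take it as an illustration rather than a reference implementation; the names (a_star, neighbors, is_goal, h) are placeholders I made up, not from any particular library.

```python
import heapq
from itertools import count

def a_star(start, is_goal, neighbors, h):
    """Minimal A* sketch.

    neighbors(s) yields (next_state, step_cost) pairs,
    h(s) estimates the cost remaining from s to a goal.
    Returns (cost, path) for the first goal popped off the frontier.
    """
    tie = count()  # breaks ties so the heap never compares states directly
    frontier = [(h(start), next(tie), 0, start, [start])]  # (f, tie, g, s, path)
    best_g = {start: 0}  # cheapest cost found so far to each generated state

    while frontier:
        f, _, g, s, path = heapq.heappop(frontier)
        if is_goal(s):
            return g, path          # expanded a goal node: done
        for t, step in neighbors(s):
            g_t = g + step
            if g_t < best_g.get(t, float("inf")):
                best_g[t] = g_t     # better route to t than any seen so far
                heapq.heappush(frontier, (g_t + h(t), next(tie), g_t, t, path + [t]))
    return None                     # frontier exhausted, no goal reachable
```

The only design decision in the loop is the ordering of the heap, and that ordering is exactly the f(s) = g(s) + h(s) above.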

What happens when h(s) is no longer admissible?

Well, let's see. Suppose you've got a frontier of nodes at some point in the search:

One of them is G', which has a goal in it, although suppose we know that G' is not the optimal goal. This means:

f(G') = g(G') + h(G')

Suppose that the optimal way of getting to a goal costs you C, i.e. the optimal goal G is such that:

f(G) = g(G) + h(G) = g(G) + 0 (because h(s) = 0 at a goal node) = C

So, regarding that suboptimal goal G':

f(G') = g(G') + h(G') = g(G') > C

If there is actually a path from the initial node to the optimal goal, then there must be some node, n, in your current frontier that leads you to the optimal goal, i.e., if you expanded n, you would find the optimal route to the goal in that subtree. For that node, n:

f(n) = g(n) + h(n) <= C

if h is admissible. Thus, n has a lower f value than G' does, so we can see that A* always chooses to expand the path through n first (leading to the optimal solution).

If h is not admissible, then G' is still not the optimal solution:

f(G') = h(G') + g(G') > C

just as before. However, regarding that other node n that actually leads to the optimal solution:

f(n) = h(n) + g(n)

f(n) is not necessarily less than or equal to C. In particular, if h(n) overestimates the true remaining cost, then f(n) exceeds C. So now you have both G' and n sitting in the frontier, and both can have f values bigger than C. Thus, in general, an inadmissible heuristic means that G' can be selected before n.
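Here's a tiny worked example of exactly that, reusing the a_star sketch from above. The graph and the heuristic values are made up for illustration: there are two goal nodes, G (reachable for a total cost of 2 via A) and G2 (reachable directly for 5), with G2 playing the role of the suboptimal goal G'.

```python
graph = {"S": [("A", 1), ("G2", 5)], "A": [("G", 1)], "G": [], "G2": []}
neighbors = lambda s: graph[s]
is_goal = lambda s: s in ("G", "G2")

h_admissible   = {"S": 2, "A": 1, "G": 0, "G2": 0}.get   # never overestimates
h_inadmissible = {"S": 2, "A": 10, "G": 0, "G2": 0}.get  # wildly overestimates at A

print(a_star("S", is_goal, neighbors, h_admissible))     # (2, ['S', 'A', 'G'])
print(a_star("S", is_goal, neighbors, h_inadmissible))   # (5, ['S', 'G2'])
```

With the admissible h, the node on the optimal path (A) always looks cheaper than the suboptimal goal, so the optimal goal is found. With the inflated h, A looks expensive, G2 is popped first, and the search returns the suboptimal (but quickly found) answer.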

This is not necessarily bad, actually. If your heuristic is such that G' pops out of your search first, then although G' is not optimal, the search reaches it in fewer expansions than it takes to reach the optimal solution G. In cases where a plan that works now is preferred to a cheaper plan that takes longer to find, this is just what we're looking for. In other words, if your heuristic is inadmissible and leads to G' before G, you compute a solution faster than you would with an admissible heuristic, which has to churn a little longer (by going through n) in order to discover G.

The moral is: if you're just looking for a path to the goal quickly, rather than spending more time finding the optimal one, then perhaps it's worth crafting an inadmissible heuristic.

In fact, inadmissible heuristics have already been used to exactly this effect in state-space planners. The HSP planner uses an automatically generated inadmissible heuristic to gain speed. The ideas behind HSP were later advanced in the FF (Fast Forward) planner, where they reached their apex of coolness.


In case you missed it, the studio that brought you Okami and Viewtiful Joe is gone. The reason seems to be sales, basically. The article implies that a creative game is a game that does not sell well, but I'm not convinced that is true in general. After all, there are independent filmmakers, right? Aren't they able to make a living? There are TV shows such as Battlestar Galactica or Deadwood which are good, and yet remain on the air.

So I don't buy the explanation that something creative and good necessarily does not sell.


Have you guys heard of Pandora? Here's how it works. You tell it some band you like. It analyzes the actual sound patterns within the music and produces a playlist of recommended similar music. Basically an instant, custom-tailored internet radio station. I'm simultaneously impressed by their success, in that I actually like what they recommend (yay!), and dismayed that Don is not an unknowable snowflake.


There are unique AI problems to solve regarding autonomous agents in virtual environments. I write about them in a paper here, but I would like to elaborate on some of them in this post. In general, AI for an autonomous agent can be built to automate basically any task: a land rover, an intelligent display, or even simulating an entire city population. Within virtual environments, however, and video games in particular, AI is very often (currently most often) used to control a virtual human being.

This has several consequences for designing your AI, but only one of them concerns planning. As noted in the paper, the virtual human-ness of your agent means that just any old plan that achieves your goal is no longer acceptable. Flying over a building will get you from here to there, but humans can't fly over buildings, so that plan is not acceptable. Simulating a human restricts the space of valid plans.
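As a toy illustration of that restriction (the action names and the check are entirely made up, not from the paper), you can think of it as filtering candidate plans by what a human body can actually do:

```python
# Actions a virtual human is allowed to use; anything else invalidates a plan.
HUMAN_ACTIONS = {"walk", "open_door", "climb_stairs", "take_elevator"}

def valid_for_human(plan):
    """A plan here is just a list of action names."""
    return all(action in HUMAN_ACTIONS for action in plan)

candidate_plans = [
    ["walk", "fly_over_building"],           # gets you there, but not human
    ["walk", "open_door", "climb_stairs"],   # acceptable
]
print([p for p in candidate_plans if valid_for_human(p)])  # only the second survives
```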

Something that occurred to me recently, though, is that AI for virtual humans does not simply mean building a system that makes P true given a world in which P isn't true (i.e. a planner). To see what I mean, take a look at this Bioshock demo. (Check out the later parts of the demo too.)

There's this cool part where the player approaches one of the resident creepies (a Big Daddy) who protects one of the Little Sisters. Normally, the Big Daddies go about their own business, and actually leave you alone if you do nothing threatening. If you approach one of the Big Daddies while he's with a Little Sister, he sort of grumbles and frightens you, and tries to continue about his business. He doesn't immediately launch into an attack, he doesn't run away; he instead does something that exhibits personality.

In general, it's clear that there's a problem space between making a planning agent, i.e. one that can form a plan so as to optimize some criterion, and a virtual creature, i.e. one that actually appears to be a resident of a virtual world.


Sleep driving is a very, very bad idea. For some reason, while I plow forward in my dream machine on the road, in a sleep-deprivation-induced stupor, the prospect of dying in a horrible crash is not enough to immediately snap me out of it.

But you know what is? Improving my Brain Age score.

I'm currently preparing a great entry on search + planning.


As promised, following is some very insightful advice from the one and only Jeff Orkin on choosing to go for a PhD, for anyone wanting to do AI in the game industry. I have certainly been struggling mightily with this decision recently.

hi Don,

I'd be glad to offer you any advice that I can. Feel free to ask me questions about AI in the game industry.

I can give you a brain dump on my thoughts about working as a "gun for hire", or for a game company, or pursuing a PhD...

The good news is there's lots of opportunity for AI specialists in the game industry right now. Pretty much every game company is looking for help in this area, and finding skilled/experienced people is hard. And none of the existing middleware fully does the job. Graphics guys are a dime a dozen now, and AI is differentiating one game from another, but AI guys are hard to find.

Deciding on the best career route to take depends on your priorities. Are you more interested in innovating, or making money? Do you want to travel much? Do you want to work alone, or be part of a team? Are you more interested in games, or technology?

Most game companies will require that the AI engineer works on-site, to work closely with the animators and level designers. I've only heard of one example of someone doing contract game AI from home -- after Brian Long worked for Monolith on the first No One Lives Forever, he left the company and worked from home on the navigational mesh pathfinding system for "Savage". I think it may have been hard to find additional contracts, and possibly lonely, so now he works for a dot com. It is definitely easier to find contract work with credit on recognized titles under your belt -- so you might consider starting by working directly for a game company.

In general, the drawbacks of working as an external "gun for hire" are that you will not share in the profits of the game, and you might work on more isolated, less innovative stuff. I was able to innovate on the FEAR planning system because I had earned the managers' trust by working with the same team on No One Lives Forever 2.

Most games aren't very profitable for the developers anyway though, so you may not be sacrificing much in that department. It's a tough business, and publishers often rope developers into unfavorable multi-game deals before they have a hit. For example, FEAR was the end of a multi-game deal penned before the first No One Lives Forever, so Monolith saw relatively little profit on a game that sold close to a million copies for PC.

When you are "in the trenches" working for a developer, the demands of keeping up with bugs, designer requests, and the competition can consume so much of your time that it's hard to think about long-term innovation. I decided to do the PhD to get some new perspectives, explore some more far-fetched ideas, plus do some traveling and networking at conferences.

Eventually I hope to work with the game industry again, and apply ideas that I develop through research.

A PhD is definitely not needed to work in game AI, but there are a growing number of PhDs in the field now. John Hancock at LucasArts did a PhD at CMU. Ian Davis founded MadDoc after a PhD at CMU and then worked as CTO at Activision. Dr. Paul Krezewski founded AI.Implant. Former Media Lab Prof Bruce Blumberg now works at Blue Fang. Etc, etc...

Then there are also lots of academic opportunities for research with games, like obviously ICT, plus similar projects at Stottler-Henke's AI consulting firm.

I think my experience in industry has given me valuable perspectives to apply to my current academic research, ... but I also feel like a pretty old grad student. So, if you are going to do a PhD, you don't want to wait too many years to start -- but at least you've finished the masters.

Wow, this has gotten pretty long. Hope it's helpful!

jeff

Thanks Jeff!


As I talked about in an earlier post, wandering behavior is almost always a necessary component of AI for any persistently populated virtual environment.

Why Wandering is Important

To briefly summarize from the earlier post, the resource control (RC) model is a very popular scheme for implementing control of a virtual population. The gist of RC is: agents need resources, and their need for them increases over time. When a need crosses a threshold, the agent seeks out something in the environment that provides that resource. In this way, RC provides computationally cheap, reasonably realistic-looking behavior. The essential problem, then, is: how do agents find out about the things in the world that satisfy their resource needs?
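Here's a toy sketch of that loop, just to pin down the moving parts (all names and numbers are made up, not from any particular engine). Note that it simply assumes the agent already knows where each need can be satisfied; supplying that knowledge is exactly the problem below.

```python
class Agent:
    """Toy resource-control (RC) loop. Everything here is illustrative."""

    THRESHOLD = 1.0   # a need at or above this triggers seeking
    GROWTH = 0.1      # how much each need grows per tick

    def __init__(self, needs):
        self.needs = dict(needs)   # e.g. {"food": 0.9, "rest": 0.0}

    def tick(self, providers):
        # providers maps a need to something that satisfies it,
        # e.g. {"food": "burger joint"}.
        for k in self.needs:
            self.needs[k] += self.GROWTH
        urgent = [k for k, v in self.needs.items() if v >= self.THRESHOLD]
        if not urgent:
            return "wander"                       # nothing pressing: explore
        need = max(urgent, key=self.needs.get)    # most pressing need wins
        self.needs[need] = 0.0                    # reset once satisfied
        return f"go to {providers[need]} for {need}"


agent = Agent({"food": 0.9, "rest": 0.0})
for _ in range(3):
    print(agent.tick({"food": "burger joint", "rest": "hotel"}))
```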

Seeking: You could pre-load all the information about the world into a universally accessible dictionary structure of some sort. This way, when an agent needs some food, he queries the world for the nearest burger joint and goes there.

Exploring/Wandering: Alternatively, you could tell the agents nothing, and then have them randomly walk around the virtual world finding things. Eventually, they find out where everything is, and the agents act just as they do in the first option.

For those of you who are into reinforcement learning, the second option should sound a lot like model-free Q-learning. In Q-learning, the agent starts off knowing little to nothing about the world and, by randomly sampling the environment, figures out what and where the “rewarding” things are. At some point, the agent decides to stop exploring and starts exploiting its knowledge about the environment. If new information is constantly appearing in the world, however, you can’t ever turn off that exploring component. Thus, if the resource-bearing environment is always gaining/losing/altering resource values, wandering is always necessary. Wandering, in other words, is the solution to the constant appearance of new information (if not relying entirely upon Seeking, see this earlier post).
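That never-turn-off-exploring point fits in one function. This is just the standard epsilon-greedy compromise sketched with made-up names (choose_destination, known_spots), not anything from a particular RL library:

```python
import random

def choose_destination(known_spots, epsilon=0.1):
    """known_spots maps a place to how rewarding it has been so far.
    With probability epsilon, wander (explore); otherwise exploit the
    best-known spot. Because new places keep appearing in the world,
    epsilon never decays to zero."""
    if not known_spots or random.random() < epsilon:
        return "wander"
    return max(known_spots, key=known_spots.get)

print(choose_destination({"burger joint": 0.7, "cafe": 0.4}))
```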

The Wandering Problem

The constant appearance of new information introduces two problems:

  1. Finding out about the new information
  2. Planning and acting based on changing data

When I say an agent performs “wandering”, it is equivalent to saying he has a (reasonably efficient) solution to these two problems.

The Wandering Solution

I’ll consider the first problem here, and the second in a later post. To consider how an agent should find out about a bunch of new information, let’s consider an example. You’re at a hotel, α, and you want to find a restaurant in the area. You don’t know where anything is, so you decide to take a stroll, and you’ll stop when you find something interesting. Let’s represent this scenario like so:

[Figure: wandering_fig1.png]

To “find out about new information”, one has to visit all the nodes, or a subset of them (until you’ve found a nice enough place to eat or something sufficiently distracting). The question is, is there an “optimal” way to visit them, and if so what is it?

First, the above picture can be resolved into the following graph to indicate which nodes (restaurants) are adjacent to each other:

[Figure: wandering_fig2.png]

In this graph, a solid line means the nodes are adjacent to each other (i.e. it makes sense for someone to visit F after visiting E). The dotted lines mean the nodes are technically adjacent, but whether or not someone would visit one after the other is up for debate. For example, L is adjacent to E, but it’s across the street, so a person might not really go from E straight to L.

The first thing off the top of your head when designing the algorithm might be to visit them in Depth First Search order, the outcome of which is:

α D E L K H J C I M B F G A

Is that the order you would visit the buildings in? I didn’t think so.
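For reference, that strawman is nothing more than plain depth-first traversal; here's a quick sketch (the adjacency map would be whatever you read off the figure):

```python
def dfs_order(adj, start):
    """Visit nodes in plain depth-first order.
    adj maps a node to its neighbours, in some fixed order."""
    seen, order = set(), []

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        order.append(node)
        for nxt in adj.get(node, []):
            visit(nxt)

    visit(start)
    return order
```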

Clearly, not just any path of visitation is satisfactory. One criterion to capture is the “naturalness” of the path of visitation. I suppose there are many possible definitions of a “natural” path, but for this post I will hypothesize that a natural path of visitation is one that minimizes deviation from the initial heading. Slightly more formally, a path becomes less natural the more zig-zagging you do. More precisely:

A path of visitation for a graph G = (V, E) is a string of nodes P = v1 v2 ... vn such that every vi ∈ V, two nodes appear next to each other in P only if they are adjacent to each other in G, and vn is a node at the periphery of G.
The deviation of a path of visitation P (in the depicted graph) is the number of horizontal edges traversed.

Thus, the deviation of a path is how “zigzaggy” it is. Just count the number of left arrows and right arrows in a given graph traversal.

Given this definition, a visitation path of minimum deviation (and maximum naturalness) is:

P0 ≡ α D E L K

which has a deviation of 0.

P0 is definitely natural, although it does not let you visit very many nodes. You might have a certain threshold number of buildings that you insist on visiting before making a decision. So there is a second criterion to keep in mind: the number of nodes visited.

Note that, for a path P, the number of nodes visited by P and the deviation of P tend to rise together: increasing the zig-zagging of P means that you will hit more nodes.

The agent designer must therefore solve the following problem. Given a certain minimum number of nodes the agent must visit, what are the paths of minimum deviation?

For the graph above, these paths are:

Nodes agent must visit    Path(s) of minimum deviation (deviation)
4                         αDEF (1)
5                         αDELK (0)
6                         αDEBHI, αDELHI (1)
7                         αDELHBC, αDEBHLK (2)
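If you want to generate rows like these mechanically, here is one way to sketch it: enumerate simple paths from the start to the periphery and keep those of minimum deviation among paths visiting at least the required number of nodes. The adjacency map and the set of "horizontal" edges in the usage example are a made-up fragment, not a faithful transcription of the figure.

```python
def min_deviation_paths(adj, horizontal, start, periphery, min_nodes):
    """Enumerate simple paths from start to a periphery node, keeping
    those of minimum deviation among paths with at least min_nodes nodes.
    adj maps node -> iterable of adjacent nodes; horizontal is the set of
    edges (frozensets of two nodes) that count toward deviation."""
    best, best_dev = [], float("inf")

    def walk(node, path, dev):
        nonlocal best, best_dev
        if node in periphery and len(path) >= min_nodes:
            if dev < best_dev:
                best, best_dev = [list(path)], dev
            elif dev == best_dev:
                best.append(list(path))
        for nxt in adj.get(node, ()):
            if nxt not in path:
                walk(nxt, path + [nxt],
                     dev + (frozenset((node, nxt)) in horizontal))

    walk(start, [start], 0)
    return best_dev, best

# Made-up fragment of the map; 'a' stands in for the hotel α.
adj = {"a": ["D"], "D": ["a", "E"], "E": ["D", "F", "L", "B"],
       "F": ["E"], "L": ["E", "K"], "K": ["L"], "B": ["E"]}
horizontal = {frozenset(e) for e in [("E", "F"), ("E", "B")]}
print(min_deviation_paths(adj, horizontal, "a", {"F", "K", "B"}, min_nodes=5))
# -> (0, [['a', 'D', 'E', 'L', 'K']])
```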

More on the second problem regarding planning and acting based on changing data later.


I recently had an extremely valuable email conversation with the one and only Jeff Orkin (of FEAR acclaim) on the topic of the compatibility of a PhD with a career in creating AI for commercial games. Extremely valuable advice to anyone interested in this matter. Coming soon!
