Published paper in SCPE

Hi everyone,

As promised, I am attaching my first published paper:

FAIRNESS OF TASK SCHEDULING IN HIGH PERFORMANCE COMPUTING ENVIRONMENTS

You can download it here: Fairness-sedighi

 

Abstract:

We claim that the current scheduling systems for high performance computing environments are unable to fairly distribute resources among the users, and as such, are unable to maximize the overall user satisfaction. We demonstrate that a user can game the system to cause temporal starvation for the other users of the system, even though all users will eventually finish their jobs in the shared-computing environment. Undesired and unfair delays in current fair-share schedulers impede these schedulers from wide deployment to users with diverse usage profiles.

 

Source:

http://www.scpe.org/index.php/scpe/article/view/1020

 

Art Sedighi

Approach

Game theory can be used to explain behaviour in shared environments such as a Grid or a Cloud.

One could argue that a Cloud environment is much like a non-cooperative game setting in which one player is unaware of the other players.  An internal Grid, on the other hand, *may* still be non-cooperative, but the other players are known.

Reaching the socially optimal solution requires coordination between and amongst the parties; a Nash Equilibrium, by contrast, is what selfish, uncoordinated play settles into.  The ratio of the cost of the worst Nash Equilibrium to the cost of the coordinated optimum is the Price of Anarchy.  The goal is to reduce the PoA for a shared environment.
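To make the PoA concrete, here is a minimal sketch – my own illustration, not something from the paper – that brute-forces the pure Nash equilibria of a tiny load-balancing game (four jobs of sizes 2, 2, 1, 1 on two identical machines, where a job's cost is the load of the machine it sits on) and compares the worst equilibrium makespan with the coordinated optimum.

```python
from itertools import product

jobs = [2, 2, 1, 1]          # hypothetical job sizes
machines = 2                 # two identical machines

def loads(assignment):
    """Total load on each machine for a given job->machine assignment."""
    load = [0] * machines
    for size, m in zip(jobs, assignment):
        load[m] += size
    return load

def is_pure_nash(assignment):
    """No job can strictly reduce the load it experiences by moving."""
    load = loads(assignment)
    for i, m in enumerate(assignment):
        for other in range(machines):
            if other != m and load[other] + jobs[i] < load[m]:
                return False
    return True

all_assignments = list(product(range(machines), repeat=len(jobs)))
optimum = min(max(loads(a)) for a in all_assignments)
worst_nash = max(max(loads(a)) for a in all_assignments if is_pure_nash(a))

print("optimal makespan   :", optimum)                 # 3  (2+1 on each machine)
print("worst Nash makespan:", worst_nash)              # 4  (2+2 on one machine)
print("Price of Anarchy   :", worst_nash / optimum)    # ~1.33
```

The worst equilibrium parks both big jobs on one machine (makespan 4) while the coordinated optimum mixes them (makespan 3), giving a PoA of 4/3 for this toy instance.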

 

Game Theory Building Blocks

Just a recap today…

A game is composed of players (the interacting parties), rules (“who can do what”), and results or outcomes.  Game theory is built atop other theories, with each representing one of the three components of a game:

– decision theory

– representation theory

– solution theory

 

Decision Theory – an extension of the von Neumann-Morgenstern decision theory that relates to making decisions under uncertainty.  Decision theory relates to game theory in that it provides a way to represent preferences.  Games are all about preferences and, as such, the utility of the chosen preference.

Representation Theory – provides a formal way of representing the rules of a given game.  Two representations pertain to our discussion: normal form and extensive form.

Normal form: a complete plan that considers all contingencies is laid out at the start of the game.  It gives a static view of how the game is played.

Extensive form: like a decision tree or a flow chart, in that each level of the tree is built only when the corresponding node has been reached.  Players have choices, and those choices are made in real time.

Solution Theory – deals with how to assign solutions to games.  It works on the premise that each player is looking out for itself.  One of the main solution concepts we will talk about is Nash Equilibrium, which basically states that no single player can unilaterally change its strategy and improve its utility.

Game theory puts these three theories together to try to explain a social interaction.  A solution prescribes how rational players should behave, not how they would actually behave in the real world.

 

Congestion Games

The concept of Congestion Games is a very interesting one as it pertains to scheduling.  Rosenthal (1973) coined the term and defined a congestion game as a scenario in which the payoff of a given player depends on the resource it ends up on and on how busy that resource is, i.e., on how many other players are also on that resource.  Rosenthal, however, focused on identical machines and same-size jobs.

A scheduler may place a number of tasks on a given machine if the number of pending tasks is greater than the total number of resources.  As the number of tasks scheduled on a given machine increases, they all compete for the same resources (CPU, memory, etc.), and therefore the utility perceived by a given client changes based on how “congested” the node gets.
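As a toy illustration of that last point (my own sketch, not Rosenthal's model), assume a node splits its CPU evenly among the tasks placed on it; a task's completion time – and with it the client's perceived utility – then degrades with the node's congestion level.

```python
def completion_time(task_seconds, tasks_on_node):
    """Assume fair CPU sharing: each task gets 1/n of the node's CPU."""
    return task_seconds * tasks_on_node

# a 60-second task, scheduled on increasingly congested nodes
for n in (1, 2, 4, 8):
    print(f"{n} task(s) on the node -> this task finishes in {completion_time(60, n)}s")
```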

The fact of the matter is that most schedulers schedule one task per CPU, and so avoid congestion.  I do agree that there could be other processes running on that machine as part of the operating system (for example, a time-sync daemon runs on a box every few minutes), but in a dedicated-node scenario these processes consume minimal resources and can be ignored.

For unrelated machines, where the infrastructure is heterogeneous, Nir Andelman argues in Strong Price of Anarchy (2006) that congestion does not take place, since the load of a given task is different on different machines.  I strongly disagree with this sentiment.  Based on one of my previous postings, Utility of Master-worker Systems, we must start thinking about the utility of the node itself.  A node (CPU) desires a high utility as well, and that utility is to be idle.  Based on this, unrelated machines cannot be treated any differently than related machines.  A system of nodes breaks down into its individual nodes, with each node desiring a level of utility that matches the utility of all the other nodes.

In short, there is an intrinsic commonality between related and unrelated machines.  This commonality is the fact that each CPU can achieve the same level of utility by doing the same thing (taking on the same strategy, in GT lingo), and that is to finish the task it was given as soon as possible.

Furthermore, we need to look at scheduling a job on a CPU as two separate and sometimes conflicting games:

– Macro-level: a game played by the job submitters, each trying to minimize its runtime (makespan)

– Micro-level: a game played by the job and the CPU, where the CPU wishes to be idle and the job wishes to be completed.

 

Reference:

N. Andelman, M. Feldman, Y. Mansour, “Strong Price of Anarchy,” 2006

 

The Matching Pennies Problem

So what is the Matching Pennies Problem anyway (MPP)?

Matching Pennies is a simple game but has very interesting properties that can be applied to many other types of complex games.

There are two players, and each selects either Head or Tail.  Here are the rules of the game:

– if the choices differ (one player chooses Head while the other Tail), Player 1 pays Player 2 $1

– if the choices are the same (both players select Head or both select Tail), Player 2 pays Player 1 $1

 

          Head      Tail
Head      1, -1    -1,  1
Tail     -1,  1     1, -1

(Player 1 chooses the row, Player 2 the column; each cell lists Player 1’s payoff first, then Player 2’s.)

This is a zero-sum game, in that one player’s win means a loss for the other player.  In other words, the choices are diametrically opposed to one another.  This game is strictly competitive, and no amount of collaboration will help the situation.

Recap: to be in Nash’s Equilibrium, no player has an action yielding an outcome that he prefers to that of his current action, given that every other player chooses his equilibrium action.  In other words, no player can profitably deviate, given the actions of the other players.  In the MPP, if Player 1 and Player 2 have both chosen Head, Player 2 ends up paying Player 1.  Player 2 would therefore rather switch to Tail; but once he does, Player 1 would rather switch as well, and the cycle continues.  You can see the same thing from Player 1’s perspective if the players’ selections were not the same to start.

So no single solution “stands out”: in every cell of the matrix, one of the players has a deviation with a better outcome than his current action, which means Matching Pennies has no pure-strategy Nash Equilibrium (only a mixed one, in which both players randomize 50/50).
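As a sanity check of that claim, here is a small sketch (mine, not from the post) that scans every cell of the Matching Pennies matrix and tests whether either player could profitably deviate; no cell survives, confirming there is no pure-strategy equilibrium.

```python
# payoffs[(p1_choice, p2_choice)] = (payoff to Player 1, payoff to Player 2)
H, T = "Head", "Tail"
payoffs = {
    (H, H): (1, -1), (H, T): (-1, 1),
    (T, H): (-1, 1), (T, T): (1, -1),
}

def is_pure_nash(c1, c2):
    """Neither player can improve his own payoff by unilaterally deviating."""
    u1, u2 = payoffs[(c1, c2)]
    best1 = max(payoffs[(d, c2)][0] for d in (H, T))   # Player 1 deviates, Player 2 stays
    best2 = max(payoffs[(c1, d)][1] for d in (H, T))   # Player 2 deviates, Player 1 stays
    return u1 >= best1 and u2 >= best2

equilibria = [cell for cell in payoffs if is_pure_nash(*cell)]
print(equilibria)   # [] -> Matching Pennies has no pure-strategy Nash equilibrium
```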

 

Art Sedighi

 

The Grid Problem

Determining whether a game has a Nash Equilibrium – or, more precisely, computing one – is not an easy problem.  A mixed-strategy equilibrium always exists, but computing it is PPAD-complete, a class of problems believed to be intractable (its exact relationship to the classical NP-complete problems is subtle).  In this posting, I will state and classify the Grid Problem as follows:

A Grid is a shared infrastructure with a limited number of resources.  The limit could be high, but it must be less than what two players (users of the Grid) are able to consume together.  We want to look at the Grid Problem and see whether there exists a configuration where all the users/players are satisfied.  In short, we want to see whether there exists a Nash Equilibrium point for the Grid Problem.

Let’s take a step back and look at the classes of games that exist.  I want to do this to level-set and clarify my position about where various infrastructure types fit into the different classes of games.

We have the following:

  1. Strategic Games (a.k.a Games in Normal Form)
  2. Extensive Games with perfect information
  3. Extensive Games without perfect information
  4. Coalitional Games

Strategic games are rather interesting in that both players (or all N players) make their decisions in advance and stick to them for the duration of the game.  In extensive games, however, players can modify their strategy between moves.

Let’s look at a Grid infrastructure:

A. Resources are limited
B. Jobs are submitted to a queue to be processed
C. Queues are most likely FIFO
D. A player wants to get results as soon as possible, without regard to the other players

Based on the above properties, we can deduce the following:

  • Property A makes this game a zero-sum game.
  • Property B makes this game an imperfect-information game.
  • Properties C + D make this game a non-cooperative, strictly competitive game.

In a strategic game, each player chooses his plan of action once and for all, independently of the other players.  Each plan is made based on a set of preferences, but those preferences *must* be selfish, as we are dealing with a zero-sum game.

If this problem can be reduced to the Matching Pennies problem, we can deduce that the Grid Problem, like Matching Pennies, has no pure-strategy Nash Equilibrium.

 

Art Sedighi

 

Not Fair

We can assume, as mentioned before, that the users of a Grid are playing a zero-sum game against each other, in that one user’s gain is another’s loss.  Or, more explicitly, if one user gets more CPUs, the other(s) get fewer as a result.

This is fairly easy to prove, as we have a set of CPUs that does not change.  In a zero-sum game there always exists a Nash Equilibrium (in mixed strategies).  Finding the Nash Equilibrium is not easy – in fact, it falls under the class of problems called PPAD – but we know that one does exist.

Two comments can therefore be made:

1. If we assume that the Nash Equilibrium is in fact the standard fairness point, then a scheduler that deviates from the NE is not a fair-share scheduler – which proves that fair-share is, in fact, not fair.

2. If a user, by changing its strategy, is able to increase its utility (i.e., get a larger portion of the available resources), then the system is not in Nash Equilibrium and the scheduler is not dividing up resources in a fair manner (see the sketch below).

2.1. The user is therefore inclined to change its strategy so it can get a larger portion of the system.  This is where we see the “Price of Anarchy” developing.
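As a sketch of point 2 above – under my own simplifying assumption that the scheduler hands out CPUs in proportion to pending-queue size – a user that merely repackages the same work into more tasks grabs a larger share, so the original split could not have been an equilibrium:

```python
def fairshare(queue_sizes, total_cpus=100):
    """Naive fair-share: allocate CPUs in proportion to pending-queue size."""
    total = sum(queue_sizes.values())
    return {user: total_cpus * q / total for user, q in queue_sizes.items()}

# Both users submit the same total amount of work, packaged as 10 tasks each.
before = fairshare({"A": 10, "B": 10})        # A: 50.0 CPUs, B: 50.0 CPUs

# User A changes strategy: same work, chopped into 10x as many smaller tasks.
after = fairshare({"A": 100, "B": 10})        # A: ~90.9 CPUs, B: ~9.1 CPUs

print(before)
print(after)
```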

Assumption:

One assumption we have made throughout is that the Nash Equilibrium point is the only fair point – the point at which all participants are treated fairly.

 

Art Sedighi

 

 

Zero-sum game

It is conceivable to think of a Grid scheduler as the mediator of a zero-sum game.  The number of resources does not change – setting aside the hype that is cloud these days, with its elastic computing.  If the number of resources available to the scheduler, and in turn to the clients/players of the system, is constant, then the number of CPUs assigned to one client directly reduces the pool of resources available to the second client.

As the total number of resources does not change, and a player’s actions under fair-share scheduling affect the number of resources it gets assigned, one can conclude:

Based on its task-submission strategy, a client realizes a utility that is directly a result of the number of resources it received – or was assigned by the scheduler.

Fairness in scheduling

What does it mean to be fair when scheduling tasks across a Grid?

Depending on the perspective of the affected entity, fairness could mean different things.  For a heavy user of the system, “fair” could mean:

“I need (should read “neeeed”) more, so it is fair for me to get more”

From a casual user’s perspective:

“As long as I get to do my work, it is fair”

From a light-user:

“I don’t use the system that often, so when I do, I should have higher priority”

A case can be made for each of these scenarios.  The first scenario, the heavy user, is the one most schedulers tend to please.  There is an implied favouritism: in order to drain the pending queues the fastest, the scheduler schedules more tasks from the heavy user, since that user holds a higher percentage of the pending tasks.

What is fair-share?  How can schedulers divide up resources in a genuinely shared manner?

 

Art Sedighi

 

 

Games we play

In the previous post, we wrote about the utility of a Client vs. the Worker (CPU).  Let us now take that one step further and see what the utility of these two participants is throughout the course of a given job.

A simple ternary scale (-1, 0, 1) can be used to depict all the possibilities – a truth table, but with three states.

Client  CPU   Comment
  0      0    A very special state, which we will get to later.
  0      1    A state we want to stay away from: the work has been submitted but has not yet been “delivered” to the CPU for processing.  It is caused by delay, and we will assume it does not exist.
  0     -1    A valid state and one of our primary states: the client has submitted a job, and the CPU is working on it.
  1      0    Not a valid state.
  1      1    The final and most desired state – the finish line.
  1     -1    Not a valid state.
 -1      0    Not a valid state.
 -1      1    The state that starts the process: the client has work to be completed, and the CPU is idle.
 -1     -1    Not a valid state.

What we need to do at this point is draw these states on a two-dimensional plane, depicting the various states of the system.
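For reference, here is a small sketch (my own encoding of the table above, nothing more) that enumerates all nine (client, CPU) pairs and labels the ones the table treats as meaningful:

```python
from itertools import product

# States from the table above: (client, cpu) -> description
meaningful_states = {
    (0, 0):  "special state (covered later)",
    (0, 1):  "delay state: submitted but not yet delivered (assumed away)",
    (0, -1): "job submitted, CPU working on it",
    (1, 1):  "finish line: work done, CPU idle",
    (-1, 1): "start: client has pending work, CPU idle",
}

for client, cpu in product((-1, 0, 1), repeat=2):
    label = meaningful_states.get((client, cpu), "not a valid state")
    print(f"client={client:+d}, cpu={cpu:+d}: {label}")
```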

 

 

We will cover the dashed line in a later posting; what I want to focus on here is the area between the two solid lines.  The “perfect” scenario is when we move from (-1,1) –> (0,0) –> (1,1).  This line represents the maximum utility that these two players can achieve during the game.  The goal is to minimize the area between the two lines – doing so moves the system toward an equilibrium point, better known as the Nash Equilibrium.

 

Art Sedighi

Utility of master-worker systems

For a small — very small — master-worker system, there are two players: one master and one worker.  The master is the user or client that requires the worker to get something done, and the worker is the actual processor that does the work.

Let’s break each of these into their various state:

Master:

1. Work pending

2. Work submitted for completion

3. Work completed and results returned

Worker:

1. Working

2. Work completed

3. Idle and results returned

Based on our earlier discussion around utility, we can further assign an arbitrary utility to each of these states.  We can then model the utility as a function of workload for the two participants.

Each of the states has been assigned a utility.  When the processor is working, it has the lowest utility of -1, and when it is idle, it has the highest utility of +1.  For the client, when there is work pending, the client is at its lowest utility, and when all the work has been completed, it is at its highest.

There is an intrinsic notion of time built into this model.  Imagine, if you will, that a given client goes from its lowest utility to its highest utility as more work is submitted and completed.  This is obviously time-dependent.

The same is true for the worker: a given CPU is at its lowest when it is working, but as it completes the work given to it, it moves towards a higher utility.
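A minimal sketch of that time dependence, assuming – and this is purely my assumption for illustration – a linear mapping from progress to utility: the client climbs from -1 to +1 as its work completes, while the worker sits at -1 whenever it is busy and returns to +1 once it goes idle.

```python
def client_utility(done, total):
    """-1 with everything still pending, +1 once everything is completed."""
    return -1.0 + 2.0 * done / total

def worker_utility(busy):
    """-1 while working, +1 when idle."""
    return -1.0 if busy else +1.0

total = 4
for done in range(total + 1):
    busy = done < total                  # worker stays busy until the last task returns
    print(f"{done}/{total} tasks done: "
          f"client={client_utility(done, total):+.2f}, worker={worker_utility(busy):+.1f}")
```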

 

Art Sedighi

Perfect Information

In a “perfect game”, there is perfect information.  What this means is that all the players are aware of the current state of the game and are fully aware of their options.  Chess is an example of such a game.

There are very few real-life scenarios that follow this pattern.  More commonly, neither all the information nor the full state of the system is available to every player/user.

Under the most basic scenario – known as the normal game – there are ‘n’ players, each of whom has perfect information, is aware of the pay-off function, and strives to win.  The pay-off, however, depends on how the other participants play the game and on their strategies.

 

Back to utility

Wikipedia defines utility as:

“utility is a measure of satisfaction, referring to the total satisfaction received by a consumer from consuming a good or service” (REF: http://en.wikipedia.org/wiki/Utility)

Kuhn (1953) further explains that for each player there is a linear utility function over all the possible outcomes of a given game.  So if a game is depicted as a tree, and each leaf node is one of the possible outcomes of that game, there exists a utility function defined over those outcomes.  It is important to mention here that the utility function defines and explains all the possible outcomes, not just one.

This will be very useful to our research, as we aim to maximize a utility function – which is itself a function of the current state of the system (functional analysis).

 

game theory vs. graph theory in scheduling

Most scheduling systems, resource managers, HPC schedulers, etc., use some sort of fair-share algorithm, where the portion of the shared resource being scheduled (divided up) depends on a very simple ratio analysis of “the person who ‘deserves’ more, gets more”.  The amount one deserves usually depends on the number of tasks pending completion.  Essentially, the scheduler takes a snapshot of the queue sizes and determines that, in order to drain the queues faster, the larger queue must get a larger portion of the available resources.

Without loss of generality, assume that we have two users, A and B.  ‘A’ has a queue size of 10 and ‘B’ has a queue size of 90.  ‘A’ ends up with 10% of the resources and ‘B’ ends up with 90%.  If you have only 8 machines available at a time, there is a good chance that ‘A’ is starved.
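Here is that example as a quick sketch, assuming (as the text does) a naive proportional split of the 8 available machines; A's 10% share rounds down to zero machines, which is exactly the starvation problem:

```python
def proportional_slots(queues, machines):
    """Naive fair-share snapshot: split machines in proportion to queue size."""
    total = sum(queues.values())
    return {user: int(machines * q / total) for user, q in queues.items()}

print(proportional_slots({"A": 10, "B": 90}, machines=8))
# {'A': 0, 'B': 7} -> A gets nothing this scheduling cycle, B dominates the pool
```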

Scheduling systems follow *a* graph algorithm to make this determination.  Whether implicitly or explicitly, a graph algorithm is used somewhere.  Graphs here are essentially decision trees, and depending on the number of levels and/or the fan-out, they can be complicated or as simple as a binary decision of “if not this way, then it must be the other way”.  That is why a two-node scheduling system is deterministic, and anything above a two-node system falls into the nondeterministic realm (the exception being a 3-node system).

Graph algorithms do not consider history — how did I get here?  That is the main reason why most fair-share algorithms are not fair.

Games, in contrast, depend on the history of events and strategies.  There is an implicit history built into each strategy that allows one to reconstruct all the previous steps.  As such, one’s current state is not the only input required to decide the next state.

If, instead of using a fair-share algorithm, schedulers treated each transaction as a game, and each event as a strategy of that game, schedulers would actually be fair.
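One hedged sketch of what folding history into the share computation could look like – this is my own toy interpretation, not a design proposed in the post: discount each user's queue by how much that user has already consumed, so a user that dominated previous rounds is down-weighted in the next one.

```python
def history_aware_share(queues, usage_history, machines):
    """Toy allocation: queue size discounted by past consumption (+1 avoids division by zero)."""
    weights = {u: q / (1 + usage_history.get(u, 0)) for u, q in queues.items()}
    total = sum(weights.values())
    return {u: machines * w / total for u, w in weights.items()}

queues  = {"A": 10, "B": 90}
history = {"A": 5, "B": 500}      # B has already consumed far more CPU-hours

print(history_aware_share(queues, history, machines=8))
# A's share is no longer crushed by B's momentarily larger queue
```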

Art Sedighi

 

Games — all about Utilities

It has been a while…

 

A game’s outcome is measured in its utility, i.e., user 1’s utility vs. user 2’s utility.  Utility is the basis for classifying decision making when it comes to games.  This goes back to the early days of game theory, and the term was put forth (coined, really) by von Neumann and Morgenstern.  “Utility” is a very overused term, but it has a specific meaning in this context.  Others have also contributed to the concept; one can refer to Savage (1954) for a great history of the term.

As with anything else, one must consider the risk vs. reward of a given action.  For a game, this notion is taken one step further, as a given strategy or action may fall into one of the following classifications:

– Certainty, or certain outcome: each action is known to invariably lead to a specific, set outcome.  “You know exactly what you will receive if you employ strategy S1.”  This class of decision making turns out to be very popular: much of the formal theory in economics, psychology, and management can be classified here.

– Uncertainty, or uncertain outcome: a player’s strategy could lead to one of many outcomes, but the distribution over these outcomes is not known beforehand.  “Strategy S2 could lead to outcome x, y, or z – but we are not sure whether the probability of each is the same.”

– Risk: each action leads to one of many outcomes, *and* the probability of occurrence of each outcome is known.  A coin toss could lead to a reward of $10 if it comes up heads, and $5 if it comes up tails.

 

Utility relates to the classes of games that fall under uncertainty and risk.  von Neumann and Morgenstern claimed that a person’s affinity towards risk, and his behavior or actions in a given game, relate to the utility of the expected outcome.  More specifically, if one is able to have (and continue to have) a preference between two outcomes, then one is guided entirely by the utility of the expected outcome.  In other words, utility can be measured when one is “acting in accord with his true taste”.

If the payouts for a game are $5 and $10, then depending on the level of risk of each action, one person might be risk-averse and continue to “bet” on the $5, whereas another might be willing to risk it for the $10 outcome.  Person 1’s action leads us to believe that his utility for $5 is higher than Person 2’s utility for $10.  In other words, Person 1 might not be financially capable of risking $5, whereas Person 2 is more willing to give up $5 for the chance of winning $10.
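A small sketch of that argument, with utility functions and a lottery I made up purely for illustration: a concave utility (risk-averse) prefers the certain $5, while a convex utility (risk-seeking) prefers a 50/50 gamble on the $10.

```python
import math

def expected_utility(utility, lottery):
    """lottery: list of (probability, payout) pairs."""
    return sum(p * utility(x) for p, x in lottery)

sure_5   = [(1.0, 5)]
gamble10 = [(0.5, 10), (0.5, 0)]    # risk it: half the time you walk away with nothing

risk_averse  = math.sqrt            # concave utility
risk_seeking = lambda x: x ** 2     # convex utility

for name, u in (("risk-averse ", risk_averse), ("risk-seeking", risk_seeking)):
    print(name,
          "sure $5 ->", round(expected_utility(u, sure_5), 2),
          "| gamble ->", round(expected_utility(u, gamble10), 2))
# risk-averse : 2.24 vs 1.58 -> takes the certain $5
# risk-seeking: 25   vs 50   -> takes the gamble
```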

We will come back to this term over and over again throughout.

Art Sedighi

 

 

 

Game Theory with Graphs

As you can imagine, you can represent pretty much anything that has any type of flow as a graph.  It might not be a nice DAG, but it is a graph that allows different states.

A game can obviously be represented as a graph as well.  There are states, and based on a decision or strategy, one moves from one state to the next.  One thing that is different about a game is that two graphs are NOT the same if the way one arrived at a given node differs between the two graphs.

For example, if it took one game 20 moves to get to a node ‘z’ and another game 3 moves, then the two games are said to be unequal, even though every move from ‘z’ onward is the same in both.

This is a very important concept of game theory: history matters.  How can a fair-share methodology be accurate if history does not play a role?  How can I make a decision solely based on my current queue size without taking history into account?

 

Art Sedighi

 

Selfish Computing

There are many papers written on this topic, and there is even a book on a similar topic called Selfish Routing and the Price of Anarchy.

I am not going to cover what the research community is working on, but rather a simple scenario that pertains to systems.

Imagine a system that does not provide any feedback to its users or requesters.  In other words, there is no congestion control built in.  What this basically means is that the system is used on a first-come, first-served (FCFS) basis.  The first user to get on the system can use up all the resources and not allow anyone else to use the system.

Oddly enough, this scenario is not too far-fetched; consider a computer virus, for example.  If your computer is infected with a virus, it takes over and you are unable to do anything else.

Consider an HPC environment.  A user gets up early in the morning and submits a job that takes over the environment.  If there are no controls in place, no one other than that first user can use the system.  The system only becomes “available” again when the first user logs off or is finished.

Selfish computing is everywhere.

Art Sedighi

The Prisoner’s Dilemma

This is the typical scenario used to explain, in simple terms, the concept of Game Theory, and it goes something like this:

Two prisoners are on trial for a crime, and each one faces a jail sentence (or not) based on the options given to them:

– confess

– do not confess

Simple, but to the point.

If neither says a word, there is not enough evidence to convict either of the main criminal act, and they each get a sentence of, say, two years.  If only one of them confesses, that prisoner gets a reduced sentence of 6 months, while the other gets 5 years.  If they both confess, they each get some credit for confessing and a sentence of 4 years.

The jointly optimal strategy is obviously for both to stay silent.  The selfish strategy, however, is to confess: no matter what the other prisoner does, confessing yields a shorter sentence, so mutual confession is the equilibrium outcome even though both would be better off keeping quiet (see the sketch below).
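The same pure-equilibrium check as in the Matching Pennies post, applied to the sentences above (my encoding, in years, where lower is better): confessing is the better reply to either choice, so mutual confession is the only equilibrium even though mutual silence is jointly better.

```python
# years[(p1_action, p2_action)] = (sentence for p1, sentence for p2); lower is better
S, C = "silent", "confess"
years = {
    (S, S): (2.0, 2.0),
    (S, C): (5.0, 0.5),
    (C, S): (0.5, 5.0),
    (C, C): (4.0, 4.0),
}

def is_pure_nash(a1, a2):
    """Neither prisoner can reduce his own sentence by unilaterally deviating."""
    y1, y2 = years[(a1, a2)]
    best1 = min(years[(d, a2)][0] for d in (S, C))
    best2 = min(years[(a1, d)][1] for d in (S, C))
    return y1 <= best1 and y2 <= best2

print([cell for cell in years if is_pure_nash(*cell)])
# [('confess', 'confess')] -> the selfish equilibrium, not the jointly optimal outcome
```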

This might seem like a very simple example, but it demonstrates the complexity of dealing with a system of multiple users, each aiming for a selfish strategy.

Art Sedighi