
Kolkata Paise Restaurant Problem

Mathematical game of resource allocation


The Kolkata Paise Restaurant Problem (KPR Problem) is a mathematical game for competitive resource allocation without any coordination. Its name is drawn from the once-common "Paise Restaurants" in the Indian city of Kolkata. These were affordable eateries, from the early 1900s to the 1970s, that offered fixed-price meals at extremely low cost (see[1] for references to the few that still exist today; the paisa is the smallest denomination of the Indian rupee). The KPR problem is an anti-coordination game that models how a large number of individuals (players) compete for limited resources without direct communication or coordination.

The problem becomes trivial, yet optimally efficient, if a non-playing coordinator or dictator intervenes. The coordinator simply instructs all players to form a queue, visit the restaurant matching their position in the line on the first day, and then rotate to the next restaurant on each subsequent day (with periodic boundary conditions); full resource utilization is then achieved immediately. This ensures food for all customers and maximum revenue for all restaurants, and requires no learning or convergence time.

However, the true complexity of the problem arises when individuals act independently, each making decisions based on personal experiences of past success or failure, or available information about previous crowd sizes at the restaurants. In this decentralized setting, players aim to maximize their own payoff, which incidentally also drives optimal utilization and revenue at the system level—but only through emergent, self-organized behavior.

The KPR model generalizes the El Farol Bar problem (see [2] for the initial formulation), extending it from a binary choice (go or stay home) to multiple options. For foundational work on KPR, see [3][4][5][6] and for some early reviews see.[7][8][9][10] When reduced to two players, the game aligns with classic anti-coordination models like the Chicken Game or Hawk–Dove Game.[11] Tamir[10] argued, following Anderson's "More is different",[12] that this extension to a large number of choices for all the players makes the KPR game much more complex, and more appropriate for decentralized optimization problems, than finite-option games. For a study on the emergence of distributed coordination in the KPR problem with finite information, see.[13] Algorithmically, KPR shares traits with the Gale–Shapley algorithm in decentralized matching contexts.[14] Broader connections to the "Kolkata Game" or "Kolkata Algorithm" appear in studies such as Refs.[15][16]


Problem Definition

  • There are N restaurants and λN players (prospective customers); typically λ=1. N can be arbitrarily large.
  • Each day, customers independently choose a restaurant based on their past experience of success or failure. Players do not know each other’s choices but have access to historical data of past selections.
  • At each restaurant, only one of the players (prospective customers) arriving there is randomly chosen and served (payoff = 1). The rest of those arriving at that restaurant leave without food for that day (payoff = 0; no time/money left for another search).
  • The ideal outcome is perfect coordination (like that achieved in the presence of a dictator), where each player picks a different restaurant, with zero or short convergence time. However, this is difficult to achieve in the KPR game in the absence of parallel communication exchanges or of a dictator, leading to inefficiencies (some restaurants can serve only one customer out of a crowd while others remain empty) and partial utilization. The KPR objective is to evolve collective 'parallel learning' algorithms that maximize the utilization fraction in minimum time (preferably of order less than ln N).

Strategies & Optimization


In the KPR problem, strategies are evaluated on the basis of their aggregate payoff or the utilization fraction. For $\lambda = 1$, this is given by the average fraction of attended restaurants, or equivalently by the average fraction of agents getting food on any day.

In the random choice case of the KPR problem, each of the $\lambda N$ agents randomly selects one of the $N$ restaurants. The probability $\Delta(m)$ that exactly $m$ agents choose the same restaurant is:

$\Delta(m) = \binom{\lambda N}{m}\left(\frac{1}{N}\right)^{m}\left(1-\frac{1}{N}\right)^{\lambda N - m},$

giving a Poisson distribution in the limit $N \to \infty$ (with $\lambda$ fixed):

$\Delta(m) = \frac{\lambda^{m}}{m!}\,e^{-\lambda}.$

The fraction of restaurants not chosen by any agent is then $\Delta(0) = e^{-\lambda}$, giving a utilization fraction equal to[3] $1 - e^{-\lambda}$. This becomes $1 - e^{-1} \approx 0.63$ for $\lambda = 1$, meaning about 63% of restaurants are utilized, or about 63% of agents get food on any day, in this case. The convergence time is zero.
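This random-choice result is easy to check numerically. The following Monte Carlo sketch (Python; the parameter values and function names are illustrative, not from the literature) reproduces a utilization fraction close to $1 - e^{-1} \approx 0.63$ for $\lambda = 1$:

```python
import random

def kpr_random_choice_day(num_restaurants, num_agents):
    """Simulate one day of KPR with purely random, memoryless choices.

    Each agent picks a restaurant uniformly at random; each visited
    restaurant serves exactly one of its visitors.  Returns the day's
    utilization fraction (attended restaurants / total restaurants).
    """
    choices = [random.randrange(num_restaurants) for _ in range(num_agents)]
    attended = set(choices)          # restaurants with at least one visitor
    return len(attended) / num_restaurants

def average_utilization(num_restaurants=1000, num_agents=1000, days=200):
    """Average utilization over many independent days (lambda = agents/restaurants)."""
    return sum(kpr_random_choice_day(num_restaurants, num_agents)
               for _ in range(days)) / days

if __name__ == "__main__":
    print(average_utilization())     # expected to be near 1 - exp(-1) ~ 0.632
```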

For the random choice case discussed above, the agents have no memory and employ no learning strategy. A minimal learning stochastic strategy, with utilization fraction of about 0.79,[4] gives each customer a probability of choosing the same restaurant as yesterday that varies inversely with yesterday's crowd size there (returning with probability $1/n$ if $n$ players chose that restaurant yesterday), while choosing among the other restaurants with uniform probability. This is a better result than the simple random choice (or noise trader) case mentioned above, though the convergence time seems to grow very weakly, perhaps logarithmically or even more weakly, with $N$[17][18] (see also[19]).
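This crowd-avoiding rule can be sketched in the same way (Python; the $1/n$ return probability follows the description above, and the run length and system size are illustrative). The long-run utilization should settle near 0.8:

```python
import random
from collections import Counter

def crowd_avoiding_kpr(num_restaurants=1000, days=200, seed=0):
    """Stochastic crowd-avoiding strategy for KPR with lambda = 1.

    Each agent returns to yesterday's restaurant with probability
    1 / (yesterday's crowd size there) and otherwise chooses uniformly
    among the other restaurants.  Returns the utilization fraction
    averaged over the second half of the run.
    """
    rng = random.Random(seed)
    n = num_restaurants
    choices = [rng.randrange(n) for _ in range(n)]    # day 0: random choices
    utilizations = []
    for _ in range(days):
        crowds = Counter(choices)                     # yesterday's crowd sizes
        utilizations.append(len(crowds) / n)
        new_choices = []
        for restaurant in choices:
            if rng.random() < 1.0 / crowds[restaurant]:
                new_choices.append(restaurant)        # stay with probability 1/crowd
            else:
                other = rng.randrange(n - 1)          # uniform over the other n-1
                new_choices.append(other if other < restaurant else other + 1)
        choices = new_choices
    return sum(utilizations[days // 2:]) / (days - days // 2)

if __name__ == "__main__":
    print(crowd_avoiding_kpr())                       # typically close to 0.8
```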

An increased utilization fraction for customers, each having a fixed small budget allowance for local search using a Traveling Salesman Problem (TSP) type algorithm, has also been studied by Kastampolidou, Papalitsas and Andronikos.[20] Employing a locally clustered structure of the restaurant distribution (of a size determined by the small travel budget allowed to each agent, while the distribution remains globally uniform over the entire city), each agent can visit multiple restaurants to search for a vacant one and get food there. This approach is effectively equivalent to each agent solving a local TSP over the restaurants in their own neighborhood. Of course, some agents will still fail to find a vacant restaurant within the local neighborhood allowed by the small budget, and full utilization is not possible for any finite budget. The approach leads to a probabilistic formula for the increased utilization fraction, confirming its clear advantage.

Strategies without an external dictator (such as caste and turn-taking strategies) are available to achieve full utilization of resources if objective orderings of players and resources are available to all players.[21] In some cases, a social norm (e.g., alphabetical sort) could provide that ordering; in others, history of play would eventually provide it (e.g., sorting resources by popularity and sorting players by winnings, skill-estimate, or debt-ranking).

Extensions of KPR to on-call car-hire problems (where the restaurants effectively also have the option of moving to their chosen places) have been explored by Martin et al.,[22][23] and see[24] for a mean-field solution of a generalized KPR problem with the same kind of resource competition in the spatial setting of the vehicle-for-hire market. For an application of KPR to optimized job-allocation problems in the Internet of Things, see Park and Saad.[25] The stability of KPR, induced by the introduction of dining clubs, has also been studied.[26] One can see[27] for a study of the impact of expert opinions (or even of faiths) on such evolving, mutually competing search strategies for resources. See also[28] for an application of the KPR model to anthropological and sociological analyses of models of polytheism, and for an algorithmic application to cancer therapy, see.[29]

Extensions to quantum games for three-player KPR have been studied, where, dimensionality permitting, each player can achieve a much higher mean payoff.[30][31] For some recent studies, in the context of the three-player KPR game, of the advantage of higher payoffs in quantum games when the initial state is maximally entangled, see.[32][33][34] See[10] for a general introduction to econophysics, sociophysics, classical KPR, quantum games and quantum KPR, and see[35] for a later review of classical and quantum KPR games.

One may see[36] for an early attempt to exploit KPR-type strategies for developing Artificial Intelligence (AI) models. For a study of KPR-like approaches related to the graph partitioning problem, see.[37] Recently the KPR game has been extended (to Musical Chairs-type games) and exploited[21] for creating a new benchmark for evaluating AI algorithms.


Applications

The KPR model can describe various real-life resource allocation problems, such as:

  • Ride-hailing services: Passengers and taxi drivers compete for on-call hires, sometimes overwhelming some areas while others remain under-served, unless matched dynamically (based on anticipated crowd distribution or past crowd-size records) by the on-call service managers.[22][23]
  • Online resource access: Users compete for limited computing resources (e.g., booking slots, bandwidth allocation, etc.). Here the computer management needs to allocate jobs dynamically, anticipating or extrapolating from the sizes of the incoming jobs and of the queues, in the context of the Internet of Things.[25]
  • Dining Clubs in KPR: Clubs can act as proxies (anticipating the choices) on behalf of players who are unwilling to take the risk of choosing an overcrowded restaurant and consequently failing to get food.[26][11]
  • AI benchmarking: Benchmarks for algorithms optimizing decentralized resource utilization, generated using KPR-inspired strategies.[14][15][16][21]

Quantum Games and Quantum Strategies


We start with a simple example of a quantum game demonstrating the use of quantum entanglement in the definition of the game strategy. The setup was introduced by Clauser, Horne, Shimony and Holt[38] in the context of hidden variable theories, and later became a prototype example in quantum game theory.[39] The 'game' was also used as a test for quantumness in the context of quantum computers.[40]

Consider a game where Alice and Bob are each asked a yes-no question. There are 4 possible pairs of questions, (C,C), (C,D), (D,C) or (D,D), and these are chosen randomly. For the (C,C) question pair they get a reward of 1 for uncorrelated answers, (1,0) or (0,1), and in the other cases they get a reward of 1 for correlated answers, (1,1) or (0,0). The game is repeated. Alice and Bob are not allowed to communicate during the game, but they can decide on a strategy in advance. They could agree to always answer (1,1); such a strategy yields a 75% success rate. Alternatively, they could each toss a coin, and it is easy to show that such a strategy yields a 50% success rate. Suppose now they share a maximally entangled two-qubit state:

$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right).$

Define the local unitary (rotation) operator:

$U(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$

Alice and Bob can each apply such a unitary operator to their own qubit of $|\psi\rangle$ and then measure the entangled state in the computational basis. Alice uses $U(\pi/4)$ for question C and $U(0)$ for question D; Bob uses $U(-\pi/8)$ for question C and $U(\pi/8)$ for question D. It is easy to show that with this choice of angles Alice and Bob will win in $\cos^{2}(\pi/8) \approx 85\%$ of the cases. The unitary operators acting on the entangled state produce an uncorrelated pair of "coins" for the (C,C) questions and a correlated pair of "coins" for the other questions.

Therefore, using entangled states, unitary operators, and quantum measurement can improve the success of such games.
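These success rates can be checked directly. The following sketch (Python with NumPy; the angle assignment is one valid choice for the convention used above) computes the quantum winning probability $\cos^{2}(\pi/8) \approx 0.85$:

```python
import numpy as np

def rotation(theta):
    """Local rotation U(theta) acting on a single qubit."""
    return np.array([[np.cos(theta),  np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

def win_probability(alice_angles, bob_angles):
    """Average winning probability of the CHSH-type game in the text.

    The four question pairs are equally likely; the players win on
    different answers for (C,C) and on equal answers otherwise.  Both
    apply local rotations to the shared state (|00> + |11>)/sqrt(2)
    and measure in the computational basis.
    """
    psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)    # |00> + |11>
    total = 0.0
    for qa in ("C", "D"):
        for qb in ("C", "D"):
            U = np.kron(rotation(alice_angles[qa]), rotation(bob_angles[qb]))
            probs = np.abs(U @ psi) ** 2                  # outcomes 00, 01, 10, 11
            p_equal = probs[0] + probs[3]
            total += (1 - p_equal) if (qa, qb) == ("C", "C") else p_equal
    return total / 4

if __name__ == "__main__":
    alice = {"C": np.pi / 4, "D": 0.0}
    bob = {"C": -np.pi / 8, "D": np.pi / 8}
    print(win_probability(alice, bob))    # ~0.854, i.e. cos^2(pi/8)
```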

We can use the above example to define a quantum strategy of a game. The players are sharing an entangled state, and they are allowed to use local unitary operators. Following the local operations, a quantum measurement of the state will yield a probability distribution of the payoffs defined by the game. The entanglement is the "strategic" part of quantum strategy, and the local unitary operators are the "tactical" part of the quantum strategy.[41]

Quantum strategies can change the Nash equilibrium landscape: they can erase classical Nash equilibrium strategies and build new quantum Nash equilibrium strategies. Moreover, one can obtain quantum Nash equilibria with payoffs that differ from the classical ones.

For example, in Eisert's quantum Prisoner's Dilemma,[42] the old Nash equilibrium is no longer valid; instead there exists a new Nash equilibrium strategy in the form of two identical local unitary matrices. Moreover, in the new Nash equilibrium both players have a better payoff than in the classical case: the payoff equals the classical payoff for mutual cooperation, and therefore the players escape the dilemma.

In the above setup of Eisert's quantum Prisoner's Dilemma, there is the possibility of "stirring", where one player can undo the other player's actions by playing a strategy that "commutes through" J, the entangling operator. This looks as if the player is operating on the space before it was entangled, destroying the temporal order of the game (entangle --> strategy --> unentangle --> measure). However, if we restrict the local unitary operations (such as using a subset of SU(2) matrices), we can prevent stirring.[43]

Player A's local operations always act on the joint state, but by themselves they cannot change the reduced density matrix held by B. Clearly, if we allow non-local operations, then stirring is possible.

A mixed quantum strategy (defined as in the analogous classical case) is a strategy where each of the players has a probability distribution over local unitary operations, which are applied to a density matrix shared among the players. The mixed quantum state should not be separable; otherwise the whole game resembles a classical one. In general, a mixed quantum strategy will yield Nash equilibria.

Flitney and Abbott[44] introduced a quantum version of the Monty Hall game. It was shown that introducing a quantum strategy erases the classical Nash equilibrium, while a mixed quantum strategy over maximally entangled states brings back the classical Nash equilibrium.

In a repeated game, introducing quantum strategies can affect the stability of the classical Nash equilibrium in the evolutionarily stable strategy (ESS) sense.[45] A stable Nash equilibrium can turn into an unstable one through an invasion of 'quantum mutants', and vice versa, an evolutionarily unstable Nash equilibrium can become stable under quantum 'mutations'.[46]

In a repeated CHSH game, Reichardt, Unger and Vazirani showed that the above quantum strategy is robust: if Alice and Bob play the CHSH game many times and win in about 85 percent of them, then we can conclude that they are using a strategy close to the above quantum strategy. Moreover, the above quantum strategy is generating: if Alice and Bob use strategies that may depend on the success of previous games and they win in about 85 percent of them, then we can conclude that enough independent games were played with the above quantum strategy.

Quantum Minority and KPR Games

KPR is a minority game, and for minority games we can use quantum theory to improve the success probability. The secret lies in writing the winning conditions into an entangled state.

For example, suppose there are 4 players and 2 roads, A and B, to go to work. We let the players use the following entanglement:

$|\psi\rangle = \frac{1}{\sqrt{8}}\left(|ABBB\rangle + |BABB\rangle + |BBAB\rangle + |BBBA\rangle + |BAAA\rangle + |ABAA\rangle + |AABA\rangle + |AAAB\rangle\right).$

The players measure the state in the $\{|A\rangle, |B\rangle\}$ basis and interpret their outcomes as using road A or B. In two of the 8 equally likely outcomes the first player is in the minority, and the same is true for each of the other players, so each player wins with probability 2/8 = 1/4. Classically, each player wins with probability only 1/8. Therefore, writing the rules of the game into the entanglement improves the probability of winning.
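A short enumeration sketch (Python; names are illustrative) confirms these numbers by comparing the uniform classical distribution with the support of the shared entangled state:

```python
from itertools import product

def is_minority(outcome, player):
    """True if `player`'s road is chosen by strictly fewer players."""
    road = outcome[player]
    return outcome.count(road) < len(outcome) / 2

def win_probability(outcomes, player=0):
    """Win probability of `player` when all `outcomes` are equally likely."""
    return sum(is_minority(o, player) for o in outcomes) / len(outcomes)

if __name__ == "__main__":
    n = 4
    all_outcomes = list(product("AB", repeat=n))           # all 16 joint choices
    # Support of the shared entangled state: the eight 3-to-1 splits.
    entangled_support = [o for o in all_outcomes if o.count("A") in (1, n - 1)]
    print(win_probability(all_outcomes))        # classical: 2/16 = 0.125
    print(win_probability(entangled_support))   # entangled: 2/8  = 0.25
```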

Similarly, Sharif and Heydari[47] used qutrits to formulate the entanglement in a 3-player, 3-road (3-choice) KPR game; see also Ramzan.[31] Sharif and Heydari also generalized their results to multi-player multi-choice quantum games[48] (see also Tamir[10]).

A difficulty lies in the complexity of building the required entanglement, and in scaling it up to a large KPR game. Moreover, the quantum minority game implicitly assumes that all players cooperate in using the prescribed measurement, which becomes hard to achieve as the group grows large.


Dining Clubs and Evolutionary Cooperation


Harlalka, Belmonte, and Griffin[26] and Harlalka and Griffin[11] introduced the concept of dining clubs into the KPR problem. A dining club consists of agents who agree to manage their restaurant choice centrally, but who have no control over, or interaction with, individuals outside the dining club. Thus, no two club members can possibly collide with one another, but they may collide with non-club members. When one club is available and $N_D$ agents are part of it while $N_F$ agents are non-club (or free) members, we have $N = N_D + N_F$ total agents (with $N$ restaurants, i.e., $\lambda = 1$). Since the number of free agents arriving at a club member's restaurant is binomially distributed, and the member is served with probability $1/(k+1)$ when $k$ free agents arrive, the probability that a club member eats is

$P_D = \sum_{k=0}^{N_F} \binom{N_F}{k}\left(\frac{1}{N}\right)^{k}\left(1-\frac{1}{N}\right)^{N_F-k}\frac{1}{k+1} = \frac{N}{N_F+1}\left[1-\left(1-\frac{1}{N}\right)^{N_F+1}\right],$

while the probability that a non-club member eats (conditioning on whether its randomly chosen restaurant also hosts a club member, which happens with probability $N_D/N$) is

$P_F = \sum_{k=0}^{N_F-1} \binom{N_F-1}{k}\left(\frac{1}{N}\right)^{k}\left(1-\frac{1}{N}\right)^{N_F-1-k}\left[\frac{N-N_D}{N}\,\frac{1}{k+1} + \frac{N_D}{N}\,\frac{1}{k+2}\right].$

Letting $y = N_D/N$ and assuming $N \to \infty$, that is, assuming an infinitely large agent population, yields the asymptotic probabilities

$P_D(y) = \frac{1-e^{-(1-y)}}{1-y}$ and $P_F(y) = 1-e^{-(1-y)} + \frac{y\left[e^{-(1-y)}-y\right]}{(1-y)^{2}}$.

When $y = 0$ and there are no dining club members, the probability that any agent eats returns to the expected $1 - e^{-1} \approx 0.63$. Generally speaking, for $y > 0$ we have $P_D(y) > P_F(y)$. If the agents are aware of this and have a choice, this creates an incentive to form a type of coalition in the sense of cooperative game theory.
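These probabilities can be checked by simulation. The following Monte Carlo sketch (Python; the model is as described above, with the club occupying distinct restaurants and free agents choosing uniformly at random, and all parameter values are illustrative) compares the simulated probabilities with the asymptotic expressions at $y = 0.5$:

```python
import math
import random

def simulate_dining_club(num_restaurants=2000, club_fraction=0.5,
                         trials=200, seed=0):
    """Monte Carlo estimate of P_D (club member eats) and P_F (free agent eats).

    Club members are assigned distinct restaurants, free agents choose
    uniformly at random, and each attended restaurant serves one visitor
    chosen uniformly among its arrivals.
    """
    rng = random.Random(seed)
    n = num_restaurants
    n_club = int(club_fraction * n)
    n_free = n - n_club
    club_served = free_served = 0
    for _ in range(trials):
        club_restaurants = rng.sample(range(n), n_club)   # distinct assignments
        arrivals = {r: ["club"] for r in club_restaurants}
        for _ in range(n_free):
            arrivals.setdefault(rng.randrange(n), []).append("free")
        for visitors in arrivals.values():
            if rng.choice(visitors) == "club":
                club_served += 1
            else:
                free_served += 1
    return club_served / (trials * n_club), free_served / (trials * n_free)

def asymptotic(y):
    """Asymptotic P_D(y) and P_F(y) quoted in the text."""
    lam = 1.0 - y
    p_d = (1 - math.exp(-lam)) / lam
    p_f = 1 - math.exp(-lam) + y * (math.exp(-lam) - y) / lam ** 2
    return p_d, p_f

if __name__ == "__main__":
    print(simulate_dining_club())   # roughly (0.79, 0.61) for y = 0.5
    print(asymptotic(0.5))          # (0.787..., 0.606...)
```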

Harlalka, Belmonte and Griffin[26] rephrase the problem of coalition formation as an evolutionary game by letting

$y = \frac{N_D}{N}$

be the proportion of agents currently playing the dining club strategy, and assuming that this proportion follows the replicator equation

$\dot{y} = y\,(1-y)\left[P_D(y) - P_F(y)\right],$

where $P_D(y)$ is the probability that an agent in the dining club eats, written in terms of $y$ as

$P_D(y) = \frac{1-e^{-(1-y)}}{1-y},$

and $P_F(y)$ is the corresponding probability that a free agent eats,

$P_F(y) = 1-e^{-(1-y)} + \frac{y\left[e^{-(1-y)}-y\right]}{(1-y)^{2}}.$

[Figure: A solution curve for the replicator equation in the Kolkata Paise dining club formulation.]

The solution curve for the replicator dynamics is shown in the figure above. Like a logistic equation, the dynamics have an unstable fixed point at $y = 0$ and a stable fixed point at $y = 1$. Under these dynamics, the population will evolve naturally toward a centrally managed dining club. Harlalka and Griffin show a similar result for imitation dynamics.[11]
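The replicator dynamics can be integrated numerically. The following sketch (Python, forward Euler; the initial condition and step size are illustrative) shows the club fraction $y$ drifting from a small initial value toward the stable fixed point at $y = 1$:

```python
import math

def p_club(y):
    """Asymptotic probability that a dining-club member eats."""
    lam = 1.0 - y
    return (1 - math.exp(-lam)) / lam

def p_free(y):
    """Asymptotic probability that a free (non-club) agent eats."""
    lam = 1.0 - y
    return 1 - math.exp(-lam) + y * (math.exp(-lam) - y) / lam ** 2

def replicator_trajectory(y0=0.1, dt=0.02, steps=3000):
    """Forward-Euler integration of dy/dt = y (1 - y) [P_D(y) - P_F(y)]."""
    y = y0
    trajectory = [y]
    for _ in range(steps):
        y += dt * y * (1 - y) * (p_club(y) - p_free(y))
        trajectory.append(y)
    return trajectory

if __name__ == "__main__":
    traj = replicator_trajectory()
    # The club fraction grows monotonically toward y = 1.
    print(traj[0], traj[len(traj) // 2], traj[-1])
```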

Meal Sharing and Freeloading

If the dining club imposes a food tax of a fixed proportion on members that successfully eat (i.e., those that do not collide with a non-club member), the collected food can be placed into a common pot and shared equally. An optimal tax rate can be chosen so that no member eats less than the expected amount for the club size (given by $P_D(y)$).[26] When non-club members can freeload on this communal pot of food, it has the potential to destabilize the club, causing members to leave the club in favor of the non-club group. As a consequence, the dining club collapses.[26] Conditions for this collapse are given by Harlalka, Belmonte and Griffin.[26]

Multiple Dining Clubs

The problem can be extended to the case of multiple dining clubs, with the expressions for the eating probabilities becoming more complex as the number of dining clubs grows. In the case of two dining clubs of sizes $N_1$ and $N_2$ and a free (non-club) population of size $N_F$, Harlalka and Griffin[11] give a closed-form expression for the probability that a member of dining club 1 eats; exchanging $N_1$ and $N_2$ produces the corresponding probability that a member of dining club 2 eats, and an analogous (more involved) expression gives the probability that a non-club member eats. Despite the complexity, it is known that systems with multiple dining clubs evolve to a single dining club in the absence of perfectly symmetric initial conditions[11] or exogenous forces like freeloading.


References
