ISSN: 2229-371X


GAME DECISION MAKING IN MULTI AGENT SYSTEMS

Temani Moncef*1, Rekik Ali2, and Gabsi Mounir3
  1. Department of Computer Sciences, Faculty of Sciences, Tunis, Tunisia
  2. Department of Computer Sciences, Higher Institute of Technological Studies, Sfax, Tunisia
  3. Department of Computer Sciences, Higher Institute of Technological Studies, Nabeul, Tunisia
Corresponding Author: Temani Moncef, E-mail: monc.tem@gmail.com

Visit for more related articles at Journal of Global Research in Computer Sciences

Abstract

A multi-agent system (MAS) is a compound system consisting of a set of agents situated in a common environment and interacting through various relations. Game theory is a formalism for studying interactions between agents, interactions that can range from cooperation to conflict. Game decision making in a multi-agent system requires the players to manage a large number of units placed in a sophisticated environment, where complex real-world problems arise and state-of-the-art artificial intelligence techniques are often ineffective. This article describes the problem of game decision making in multi-agent systems. We then present the game model, propose a formalization of the game problem, and finally survey methods for solving it.

INTRODUCTION

Effective decision making is one of the central problems of control systems in artificial intelligence [1]. The decision-making process is formulated according to goals set by an intelligent agent's interaction with its problem environment [2]. In discrete systems, decision making is understood as an agent's choice among alternative actions, based on the observed environmental effects and on the agent's a priori knowledge of the problem domain.
In distributed systems, decision making is a game among agents [3]. Choosing effective solutions is complicated because the number of joint options grows exponentially with the number of agents. In this respect, constructing models and methods based on elements of artificial intelligence theory reduces the space to be searched for effective solutions, both for scientific problems and for practical real-world cases.
Using multi-agent systems provides several advantages, including the following:
· Faster operation through parallelism, in systems that allow the global problem to be decomposed into subtasks performed by individual agents.
· Increased robustness of the system in case of failure of individual agents, through their replacement by other agents.
· Scalability of the system by introducing new agents, possibly with new capabilities.
· Better technical and economic performance compared with systems built from full-blown general-purpose agents, because specialized agents are usually cheaper owing to their narrower functions.
In a multi-agent system, decisions are made by agents individually and possibly independently, while the current reaction of the environment is determined by the collective actions of many agents. The full set of joint options is given by the Cartesian product of the agents' individual action sets. In the process of interacting with the environment, agents can revise their own variants over many decisions.
In practice, options are chosen in an a priori uncertain environment as a free play of individual agents' actions [4]. Under such circumstances, the multi-agent system lacks the data needed to compute and implement optimal solutions in advance. Through coherent interaction with the environment, an agent eventually settles on a protocol based on a probabilistic decision-making mechanism. An effective method of agent decision making under uncertainty is the use of Markov rules to process current information [5, 6]. In Markov systems, agents ignore the prehistory and compute the solution from the data available at the current moment, realizing those effects on the environment that ultimately optimize its reaction, e.g. maximizing the average payoff while minimizing secondary losses.
The global purpose of a multi-agent system is typically the coordination of the agents [7]. Coordination is the process that aligns the global objective of the multi-agent system with the objective functions optimized locally by the agents. In multi-agent optimization the global solution is, as a rule, a compromise among the best local solutions. Coordination amounts to imposing constraints on the agents' possible actions. Known ways to solve such coordination problems are communication, collective agreements, and agent training. Communication allows every agent to inform the other agents of its possible actions, thereby limiting their choices. Collective agreements determine the rules of common behavior among agents: when agents reveal their intentions for the next action, in every situation a priority rule selects some actions over others.
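The priority-rule coordination described above can be sketched in a few lines: agents announce their intended actions, and a fixed priority order decides who acts when intentions conflict. This is a minimal illustration; the agent names, the `coordinate` function, and the resource-style actions are hypothetical, not taken from the paper.

```python
def coordinate(intentions, priority):
    """Resolve conflicting intentions with a fixed priority order.

    intentions: dict agent -> intended action (e.g. a resource id)
    priority:   list of agents, highest priority first
    Returns dict agent -> granted action, or None if pre-empted.
    """
    granted = {}
    taken = set()
    for agent in priority:        # higher-priority agents choose first
        action = intentions[agent]
        if action in taken:       # conflict: the lower-priority agent yields
            granted[agent] = None
        else:
            granted[agent] = action
            taken.add(action)
    return granted

# Two agents both want resource "r1"; agent "a" outranks "b".
result = coordinate({"a": "r1", "b": "r1"}, priority=["a", "b"])
print(result)  # → {'a': 'r1', 'b': None}
```

The priority rule limits the agents' choices exactly as described: revealing intentions lets the rule exclude conflicting actions before they are executed.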
The coordination problem is compounded by the exponential growth of the joint option space of a multi-agent system and by collective decision making under parametric or structural-functional uncertainty. In practice, agents typically have only locally defined connections, so there is no need to optimize their objective functions over the whole solution space. Uncertainty in the decision-making system is compensated by adaptive self-learning methods.
Under a priori uncertainty, the effectiveness and time cost of decision making depend to a large extent on the agents' ability to learn and adapt [8-10]. Agent training establishes restrictions on ineffective actions and can be implemented through the computation of dynamic weights (or priorities) of the options. Using the environment's reaction to completed actions, the agents recursively compute new weights for the discrete options, taking into account the current value of the prizes. At each moment of time, the option with the maximum weight is implemented. Alternatively, options may be implemented at random, based on choice probabilities computed from the weight estimates of the agents' possible actions. Agent training builds on models, methods, and algorithms from control theory and artificial intelligence: automata with fixed and variable structure [11], networks of M-automata [12], models of adaptive random search [13], Markov decision processes [6], variants of Q-learning [14], artificial neural networks [15], selective and genetic algorithms [16], Bayesian networks [17], heuristics [18], models of stochastic games [19, 20], etc.
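The weight-recursion scheme described above can be illustrated with a linear reward-inaction learning automaton, one of the classic automaton models of the kind cited in [11]: a probability vector over options is reinforced toward the chosen option whenever the environment returns a prize, and left unchanged otherwise. The Bernoulli environment, step size, and function name below are illustrative assumptions, not the paper's construction.

```python
import random

def learning_automaton(rewards, steps=5000, lr=0.05, seed=0):
    """Linear reward-inaction automaton over discrete options.

    rewards: list of success probabilities, one per option (the environment).
    Maintains a probability vector p; on a rewarded action a, moves p
    toward the unit vector e_a by step lr; on failure, leaves p unchanged.
    """
    rng = random.Random(seed)
    k = len(rewards)
    p = [1.0 / k] * k                            # start from the uniform mixture
    for _ in range(steps):
        a = rng.choices(range(k), weights=p)[0]  # probabilistic option selection
        if rng.random() < rewards[a]:            # binary prize from the environment
            p = [(1 - lr) * pj for pj in p]
            p[a] += lr                           # reinforce the chosen option
    return p

p = learning_automaton([0.2, 0.8])
print(p)  # probability mass concentrates on the better option
```

The recursion matches the scheme in the text: the environment's reaction to a completed action updates the option weights, and the next choice is drawn with probabilities proportional to those weights.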
Studying the coordination of agents in an uncertain environment is conveniently performed with stochastic game models, which make it possible to investigate the processes of competition and cooperation in a team of agents and to identify compromise solutions in a decentralized system. The aim of this work is to construct stochastic game models of decision making in multi-agent systems, to develop methods for solving stochastic games, and to determine conditions for their convergence to collective equilibrium states.

GAME MODEL OF MULTI-AGENT SYSTEM

In the game interpretation, agents are players characterized by vectors of pure and mixed strategies. The elements of a mixed strategy determine the probabilities of selecting the corresponding pure strategies at discrete moments of time. Players choose pure strategies independently of each other, according to a probabilistic mechanism built on the mixed strategies. After the joint strategy is implemented, each player receives a random realization of its payoff, with an a priori unknown distribution law [21]. Each player's payoff is determined by the local subset of its neighboring players. In the game without information exchange, players do not notify each other of the realized strategies or of the obtained payoff values. In the game with information exchange, each player reports the value of its current payoff to the players in its local subset. Players may also skip (fail) the current move with a certain probability. As a result, each player receives information from the set of players on whose strategies its payoff depends. A convolution of the current payoffs, weighted with positive coefficients, serves as the assessment of the strategy chosen by the player at the current moment. The sequences of selected options are evaluated by average payoff functions, averaged over the prehistory. The aim of each player is to maximize its average payoff function asymptotically in time. Solutions of this vector optimization problem are sought in the set of Nash equilibrium points (for the game without information exchange) or Pareto-optimal points (for the game with information exchange) [22].
The asymptotic goals are achieved through self-learning Markov recursive methods for forming the vectors of mixed strategies [23], which can be obtained by stochastic approximation [24] from a deterministic formulation of the adequate matrix game problem [22]. Sufficient conditions for convergence of such a game method are based on upper bounds for the conditional expectation of the error of the current solution, on averaging the estimates over realizations of the events, and on theorems about recursive numerical inequalities [21, 25].
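A minimal sketch of such a recursion for one player is given below, under illustrative assumptions: a Bernoulli payoff environment, a decreasing step gamma_n = n**-0.6, and a payoff-weighted move of the mixed strategy toward the realized pure strategy, with a projection onto the probability simplex as a safeguard. The exact update rules of [23, 24] are not reproduced here.

```python
import random

def project_simplex(v):
    """Euclidean projection of the vector v onto the probability simplex."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for i, ui in enumerate(u):
        css += ui
        t = (css - 1.0) / (i + 1)
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def play(payoff_prob, steps=4000, seed=1):
    """One player's recursive mixed-strategy update (illustrative rule).

    payoff_prob[a] is the chance that pure strategy a yields a unit prize.
    Each step: sample a pure strategy from the mixed strategy p, observe
    the random prize, and move p toward the indicator of the realized
    strategy by a decreasing step gamma_n = n**-0.6, scaled by the prize.
    """
    rng = random.Random(seed)
    k = len(payoff_prob)
    p = [1.0 / k] * k
    for n in range(1, steps + 1):
        a = rng.choices(range(k), weights=p)[0]
        prize = 1.0 if rng.random() < payoff_prob[a] else 0.0
        gamma = n ** -0.6
        p = project_simplex([pj + gamma * prize * ((1.0 if j == a else 0.0) - pj)
                             for j, pj in enumerate(p)])
    return p

p = play([0.1, 0.9])
print(p)  # mass shifts toward the strategy with the higher expected prize
```

With a prize in {0, 1} this particular update already stays inside the simplex, so the projection is a no-op; it is kept because general payoff realizations can push the iterate outside.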

FORMALIZATION OF THE GAME PROBLEM

Assume there is a non-empty set of players D; each i ∈ D performs, at discrete times n = 1, 2, ..., an independent choice of one of its own pure strategies.
We consider the following game models: with and without information exchange, and with and without player failures. In the game without information exchange, the players do not notify each other of the realized strategies or of the received payoffs x_n^i.
In the game with failures, each player i is characterized by a failure probability η_i ∈ [0, 1). Let y_i ∈ {0, 1} be the sign of participation of player i in the game. If y_i = 0, then the player refuses the current move.
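The participation sign and failure probability can be simulated directly: each round, player i refuses the move with probability η_i, and only participating players draw a pure strategy from their mixed strategy. The player names and parameter values below are hypothetical.

```python
import random

def collective_move(strategies, eta, rng):
    """One joint move of the game with random player failures.

    strategies: dict player -> mixed strategy (probabilities over pure strategies)
    eta:        dict player -> failure probability eta_i in [0, 1)
    Returns dict player -> (y_i, chosen pure strategy index or None),
    where y_i in {0, 1} is the participation sign.
    """
    move = {}
    for i, p in strategies.items():
        y = 0 if rng.random() < eta[i] else 1   # player i refuses with prob eta_i
        a = rng.choices(range(len(p)), weights=p)[0] if y else None
        move[i] = (y, a)
    return move

rng = random.Random(42)
move = collective_move({"p1": [0.5, 0.5], "p2": [0.9, 0.1]},
                       {"p1": 0.0, "p2": 0.3}, rng)
print(move)  # p1 always participates; p2 sits out roughly 30% of rounds
```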

METHODS OF SOLVING THE GAME PROBLEM

The asymptotic goals (3) or (4) are achieved through self-learning recursive methods for forming the vectors of mixed strategies p_n^i, whose elements are the conditional probabilities of selecting the corresponding pure strategies.
To construct a method with the desired properties, determined by conditions (3) or (4), we consider the matrix formulation of the asymptotically adequate coalition-free game problem with the players' average payoff functions.
Asymptotic solutions of the vector optimization problem (2) must be sought in the set of Nash equilibrium points, Pareto-optimal points, etc. [21, 22]. Reaching a specific solution is determined by the type of method (5) and by the way its adjustable parameters are changed. Based on the game problem under uncertainty and the deterministic matrix game problem, and using stochastic approximation of the complementary slackness conditions (5), the recurrent Markov game methods (6) and (7) are constructed.
Sufficient conditions are determined for the convergence of the game methods (6) and (7) to asymptotically optimal solutions, both in sign-definite environments and in environments of general form. Conditions of convergence with probability 1 are obtained from upper estimates of the conditional mathematical expectation of the current error of the complementary slackness conditions for a fixed prehistory of events, together with consequences of theorems on recurrent numerical inequalities [21]. Conditions of convergence in quadratic mean are obtained from estimates derived by averaging over realizations of events, and an estimate of the asymptotic order of convergence is derived as well.
Among methods of the general form (6), the asymptotic order of the convergence rate is maximized at n^(-1/3), achieved at α = 2/3, β = 1/3. For method (7) the maximum order equals n^(-1/2), attained for α = 1/2, β ≥ 1/2.
From these results it follows that under uncertainty method (7) provides a higher order of convergence rate to the set of optimal solutions than method (6), both in sign-definite environments and in environments of general form. Additional studies show that player failures lead to ε-optimal solutions of the game problem and slow the rate of convergence. As the game threshold decreases, the convergence rate of the game methods grows. Methods with and without information exchange have the same asymptotic order of convergence rate in affine-equivalent environments. Exchange of information between players increases the speed of convergence. These methods provide stable solutions of the game problem in mixed strategies.
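The step exponents behind these rates can be illustrated with a scalar Robbins-Monro recursion, the basic stochastic-approximation scheme underlying such methods. This is a generic textbook sketch, not the paper's methods (6) or (7); the target, noise model, and exponent below are arbitrary choices.

```python
import random

def robbins_monro(target, alpha, steps=20000, seed=7):
    """Track a noisy target with decreasing steps gamma_n = n**(-alpha).

    x_{n+1} = x_n + gamma_n * (observation - x_n), observation = target + noise.
    The classical conditions sum(gamma_n) = inf, sum(gamma_n**2) < inf
    hold for 1/2 < alpha <= 1; a larger alpha damps noise faster but
    tracks more slowly, which is the trade-off behind the different
    asymptotic rates quoted above.
    """
    rng = random.Random(seed)
    x = 0.0
    for n in range(1, steps + 1):
        obs = target + rng.gauss(0.0, 1.0)   # noisy realization of the payoff
        x += n ** (-alpha) * (obs - x)
    return x

print(abs(robbins_monro(3.0, 0.7) - 3.0))  # small residual error
```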

CONCLUSION

The developed game models make it possible to investigate the coordination of collective interaction in multi-agent decision making. In the game formulation, coordination is the process of harmonizing the agents' actions so as to reach compromise collective decisions. Coordination is achieved by imposing restrictions on the actions of rational agents in order to implement effective solutions. Compromise solutions obtained by coordinating the agents' actions are constructed by methods of learning and adaptation to the uncertainty of decision making. The game models of adaptive option selection under a priori uncertainty, with information exchange and player failures, extend the applications of the theory of stochastic games to the class of problems of distributed control of random processes with local interaction. For the models built on the basis of complementary slackness conditions and the properties of equalizing strategies, stochastic approximation was used to synthesize a new class of recurrent game methods, which made it possible to examine the conditions of their operability from a uniform system position. It was established that the exchange of information between players increases the rate of convergence of the game methods and is a prerequisite for finding Pareto-optimal solutions of games with a priori uncertain payoff functions. Sufficient conditions for convergence are determined by the structure of this exchange, the type of the recurrent method, and restrictions on its parameters.
It was found that player failures lead to a slower rate of convergence of the game methods and to a deviation of the asymptotic solutions of the game from the optimal values obtained in the absence of failures. The overall analysis of game option selection under uncertainty indicates that further theoretical and empirical research in this area should focus on constructing game learning methods with elements of cognitive information processing and on establishing conditions for their efficiency.

ACKNOWLEDGMENT

The authors gratefully acknowledge that the research in Tunisia was conducted under scientific agreements with the Research Group SOIE (ISG Tunis). We thank our colleagues Khaled Ghedira and Dmytro Peleshko for their input.

References

  1. Fudenberg D., Levine D. K. The Theory of Learning in Games. – Cambridge, MA: MIT Press, 1998.
  2. M. L. Puterman. Markov Decision Processes. John Wiley & Sons, New York, NY, 1994.
  3. S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, 1995.
  4. W. C. Stirling, M. A. Goodrich, and D. J. Packard. Satisficing equilibria: a non-classical theory of games and decisions. International Journal of Autonomous Agents and Multi- Agent Systems, (this issue), 2000.
  5. R. Vane and P. Lehner. Using hypergames to increase planned payoff and reduce risk. International Journal of Autonomous Agents and Multi-Agent Systems, (this issue), 2000.
  6. J. O. Kephart and A. R. Greenwald. Shopbot economics. International Journal of Autonomous Agents and Multi- Agent Systems, (this issue), 2000.
  7. Gmytrasiewicz, P. J., & Doshi, P. (2005). A framework for sequential planning in multi-agent settings. Journal of Artificial Intelligence Research, 24:49-79 (ref. on IPOMDP)
  8. Berger, T., Schreinemachers, P., and Arnold, T. Mathematical programming-based multi-agent systems to simulate sustainable resource use in agriculture and forestry. A software manual 2007b Available from http://www.uni-hohenheim.de/mas/software.
  9. T., Schreinemachers, P., and Woelcke, J. 2006. Multiagent simulation for the targeting of development policies in less-favored areas. Agricultural Systems 88:28- 43.
  10. D'Aquino, P., Le Page, C., Bousquet, F., and Bah, A. 2003. Using self-designed role-playing games and a multi-agent system to empower a local decision-making process for land use management: the selfcormas experiment in Senegal. Journal of Artificial Societies and Social Simulation 6 (3). Available from http://jasss.soc.surrey.ac.uk/6/3/5.html.
  11. Happe, K., Kellermann, K., and Balmann, A. 2006. Agent-based analysis of agricultural policies: anIllustration of the Agricultural Policy Simulator AgriPoliS, its adaptation and behavior. Ecology and Society 11 (1):49.
  12. Huigen, M.G.A. 2004. First principles of the MameLuke multi-actor modelling framework for land use change, illustrated with a Philippine case study. Journal of Environmental Management 72 (1-2):5-21.
  13. Manson, S.M. 2005. Agent-based modeling and genetic programming for modeling land change in the Southern Yucatan Peninsular Region of Mexico. Agriculture, Ecosystems and Environment 111 (1):47-62.
  14. Railsback, S., Lytinen, S., and Jackson, S. 2006. Agentbased simulation platforms: review and development recommendations. Simulation 82 (9):609-623.
  15. Schreinemachers, P., and Berger, T. 2006. Land-use decisions in developing countries and their representation in multi-agent systems. Journal of Land Use Science 1 (1):29-44.
  16. Selten, R. 2001. What is bounded rationality? In Bounded Rationality, The Adaptive Toolbox, ed. G. G. a. R. Selten, 13–36. Cambridge, MA: The MIT Press.
  17. Aune, J.B. and Lal, R. 1997. Agricultural productivity in the tropics and critical limits of properties of oxisols, ultisols and alfisols. Tropical Agriculture 74:96-103.
  18. G. Brewka, I. Niemela, and T. Syrjanen. Implementing ordered disjunction using answer set solvers for normal programs. In Proceedings of the 8th European Conference on Logics in Artificial Intelligence (JELIA 2002), volume 2424 of LNAI, pages 444–455. Springer, 2002.
  19. F. Buccafurri and G. Caminiti. A social semantics for multiagent systems. In C. Baral, G. Greco, N. Leone, and G. Terracina, editors, LPNMR, volume 3662 of Lecture Notes in Computer Science, pages 317–329. Springer, 2005.
  20. S. Maalal, M. Addou, “A practical application of a method of designing multi-agent systems based on the AUML language and the MDA approach”, Proceedings of the Fourth Workshop on Information Technologies and Communication WOTIC11, Casablanca, Morocco, p.104, 2011.
  21. S. Maalal, M. Addou, “A practical application of a method of designing multi-agent systems based on the AUML language and the MDA approach”, Proceedings of the Fourth Workshop on Information Technologies and Communication WOTIC11, Casablanca, Morocco, p.104, 2011.
  22. S. Maalal, M. Addou, “A practical application of a method of designing multi-agent systems based on the AUML language and the MDA approach”, Proceedings of the Fourth Workshop on Information Technologies and Communication WOTIC11, Casablanca, Morocco, p.104, 2011.