*SSR/ETSI Telecom-UPM, Ciudad Universitaria, 28040 Madrid-Spain

**DTC/EPS-Univ. Carlos III, Butarque 15, 28911 Leganés (Madrid) –Spain

Extended Summary


Genetic and evolutionary algorithms have been successfully applied as global optimization tools to many different analytical problems and practical applications; on the other hand, they have also been studied (genetic algorithms in particular) as an experimental field in which to discuss and validate theoretical conceptions about biological aspects of evolution.

Since Genetic Algorithms (GA) are founded on the Neo-Darwinian theory of evolution, there has not been much room to consider learning aspects in this framework: the lack of scientific basis for Lamarckism has discouraged this kind of exploration. However, there is a possible role for learning in optimization-evolution that does not go against scientific evidence: the Baldwin effect [1], which argues that learning influences evolution through the competition phase (i.e., the fitness of the individuals) across the series of generations. This solid argument is attracting more and more attention [2], and it has even shown better adaptation capabilities than standard Neo-Darwinian formulations in some selected "difficult" [3] and non-stationary [4] situations.
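The distinction can be sketched in a few lines: under the Baldwin effect, selection sees the fitness an individual reaches after lifetime learning, but only the original, unmodified genotype is inherited (unlike Lamarckism, where the learned result would be written back into the genes). The numeric trait, scoring function and hill-climbing "learning" step below are purely illustrative assumptions, not the mechanisms used in the paper:

```python
def baldwinian_eval(genotype, raw_fitness, learn):
    """Baldwin effect: selection sees the fitness reached *after* learning,
    but the unmodified genotype is what gets inherited (no Lamarckism)."""
    phenotype = learn(genotype)               # lifetime (local) learning
    return raw_fitness(phenotype), genotype   # learned score, original genes

# Illustration with a hypothetical numeric trait: optimum at 3.0, and a
# single greedy "learning" step that moves the phenotype uphill if it helps.
score = lambda x: -(x - 3.0) ** 2
learn = lambda x: x + 0.5 if score(x + 0.5) > score(x) else x

print(baldwinian_eval(2.0, score, learn))  # (-0.25, 2.0)
```

Note that the returned genotype is still 2.0: learning improved the evaluated score from -1.0 to -0.25 without touching what is passed on to the next generation.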

On the other hand, iterated non-zero-sum games are reasonable models for many situations, the best-known example being the Iterated Prisoner's Dilemma (IPD), which has been considered representative both of the Cold War period and of pricing policies in a duopolistic market. GA designs for the IPD have been developed, studied and discussed by Axelrod in different papers; [5] is a classical synthesis of these results, remarkably interesting from many points of view. Selecting this kind of game to test Baldwinian procedures against standard Neo-Darwinian processes therefore seems a good way to obtain representative conclusions.
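As a reminder of the game itself, the following sketch iterates the Prisoner's Dilemma with the standard payoffs used by Axelrod [5] (T=5, R=3, P=1, S=0); the two memory-one strategies shown (tit-for-tat, always-defect) are classical examples, not the strategies evolved in this paper:

```python
# Standard IPD payoffs (Axelrod): keys are (my_move, opponent_move),
# with 'C' = cooperate and 'D' = defect.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def play_ipd(strategy_a, strategy_b, rounds=10):
    """Iterate the game; each strategy sees only the opponent's last move."""
    score_a = score_b = 0
    prev_a = prev_b = None
    for _ in range(rounds):
        a, b = strategy_a(prev_b), strategy_b(prev_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        prev_a, prev_b = a, b
    return score_a, score_b

# Two classical memory-one strategies.
tit_for_tat = lambda opp_prev: 'C' if opp_prev in (None, 'C') else 'D'
always_defect = lambda opp_prev: 'D'

print(play_ipd(tit_for_tat, always_defect))  # (9, 14)
```

Tit-for-tat is exploited only in the first round (0 vs. 5) and then locks into mutual defection, which is the kind of dynamics that makes the iterated game a rich test bed for evolved strategies.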

In this paper, we present the design and execution of a first series of experiments along this line: two optimization strategies for playing the IPD, one based on Neo-Darwinian search, the other including several simple (local) learning mechanisms (and thus constituting a Baldwin-effect-based design), are applied to different non-stationary IPD environments (the parameters of the games changing in different ways). The results are analyzed and discussed, showing, in general, an advantage of the Baldwin-effect-based procedures in both convergence speed and performance, an advantage that grows as the dynamics of the situations become faster.
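The experimental scheme just described can be outlined as a generational GA whose fitness function drifts over time and which optionally applies a local learning phase before evaluation. Everything below is a simplified stand-in under stated assumptions: bit-string genotypes, a target pattern that flips periodically (standing in for the changing IPD parameters), greedy bit-flip learning, and truncation selection; none of these are claimed to be the paper's actual operators:

```python
import random

def drifting_target(gen, length=12):
    """Hypothetical non-stationary environment: the optimal bit pattern
    flips every 15 generations, standing in for changing game parameters."""
    bit = (gen // 15) % 2
    return [bit] * length

def fitness(genotype, gen):
    """Score = number of bits matching the current (drifting) target."""
    target = drifting_target(gen, len(genotype))
    return sum(int(a == b) for a, b in zip(genotype, target))

def local_learning(genotype, gen, steps=5):
    """Baldwinian phase: a few greedy bit-flips improve the *evaluated*
    phenotype; the inherited genotype is left untouched by the caller."""
    phenotype = list(genotype)
    for _ in range(steps):
        i = random.randrange(len(phenotype))
        trial = phenotype[:i] + [1 - phenotype[i]] + phenotype[i + 1:]
        if fitness(trial, gen) > fitness(phenotype, gen):
            phenotype = trial
    return phenotype

def evolve(learn, pop_size=20, length=12, generations=30, mut=0.05):
    """Minimal generational GA with truncation selection; with learn=True,
    selection sees learned fitness (Baldwin), else raw fitness (Darwin)."""
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for gen in range(generations):
        pop.sort(key=lambda g: fitness(local_learning(g, gen) if learn
                                       else g, gen), reverse=True)
        parents = pop[: pop_size // 2]
        pop = [[1 - b if random.random() < mut else b
                for b in random.choice(parents)] for _ in range(pop_size)]
    return pop
```

Running `evolve(True)` and `evolve(False)` side by side on such a drifting landscape is the shape of the comparison carried out in the paper, with the bit-matching task replaced by actual non-stationary IPD tournaments.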

Some conclusions about the relevance of learning as a complementary element for global search and optimization, the effects of the relative rates of learning and evolution, constructive/destructive interferences, etc., are emphasized; finally, suggestions for further research along this direction are presented.


[1] J. M. Baldwin: A New Factor in Evolution; American Naturalist, Vol. 30, pp. 441-451, 1896.

[2] Special Issue on the Baldwin Effect; Evolutionary Computation, Vol. 4, No. 3, 1996.

[3] P. M. Todd, G. F. Miller: Exploring Adaptive Agency: II. Simulating the Evolution of Associative Learning. In J-A. Meyer, S. W. Wilson (Eds.): From Animals to Animats. Proc. 1st Intl. Conf. on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press; 1991.

[4] S. Nolfi, D. Parisi: Learning to Adapt to Changing Environments in Evolving Neural Networks. Adaptive Behavior, Vol. 5, No. 1, pp. 75-98; 1996.

[5] R. Axelrod: The Evolution of Cooperation. New York: Basic Books; 1984.