


Evolutionary-type algorithms are an essential part of the Artificial Intelligence, Computational Intelligence and Machine Learning toolkits7,13,29. The first recorded proposal of mutation/selection learning machines is probably that of Alan Turing, in his article on Computing Machinery and Intelligence published in 1950. Since then, Genetic Algorithms (abbreviated GA) and their variants have been used with success in a variety of fields; a detailed list of real-world applications of genetic algorithms is provided in Wikipedia. Since 2004, the Genetic and Evolutionary Computation Conference has hosted the "Humies" competition, awarding prizes to human-competitive results produced by any form of genetic and evolutionary computation. Despite their numerical ability to generate high-quality solutions to mathematical optimization and search problems, evolutionary algorithms are very often presented on specific problem instances with limited regard to their mathematical foundations and without any rigorous analysis of the underlying random evolutionary processes. Their performance often relies on intuition and empirical evidence driven by numerical observations, so that evolutionary computing research sometimes falls into obscure, inspiration-driven paradigms, and a crucial question arises:

To what extent do we accept numerical evidence as a proof when dealing with high-risk complex systems such as nuclear plants, health care management, security or defense systems?

In the context of optimization problems this question does not seem so critical. Population-based algorithms are used as hole punchers for drilling and perforating complex solution state spaces. If you can move mountains you can move molehills; the central idea here is to "mimic" natural-evolution-type mechanisms "inspired by Darwin's theory of evolution in Nature" to improve the adaptation of the population gradually, step by step, without any derivative information on the fitness function. For convex optimization problems, the performance and convergence of the algorithm follow from standard analysis. For more complex optimization problems with local minima, we expect large population explorations to escape from suboptimal or inadequate solutions.

When the population size is sufficiently large but finite, Genetic Algorithms equipped with an inverse temperature pressure and vanishing mutations behave as a generalized simulated annealing algorithm on product spaces4; see also sections 4 and 5 in the review article on genetic algorithms7. This approach makes it possible to develop a global optimization theory based on generalized simulated annealing on product spaces; see for instance the articles4,7 and the pioneering article by Cerf published in 1995. Leaving aside computational cost considerations, the population size can be seen as a precision parameter: a larger population increases the chance of finding the optimal solution and prevents the algorithm from getting stuck in local optima. Increasing the computational power along with the population size should therefore significantly improve the optimization performance. The concentration properties of the infinite population GA are discussed in the article on annealed Feynman-Kac models26.

Most genetic-type evolutionary computing algorithms can be seen as sophisticated Monte Carlo methods that converge towards particular probability distributions (a.k.a. the infinite population model) as the number of individuals/samples tends to $\infty$. For instance, as shown in the first rigorous article on this subject1 published in 1996, when the number of samples tends to $\infty$, the occupation measures of the simple Genetic Algorithm, the most popular type of evolutionary algorithm, converge towards a conditional probability measure w.r.t. a series of conditioning events that depend on the fitness functions. The simple GA with mutation/selection transitions can be interpreted as a mean-field particle methodology to sample Feynman-Kac probability measures. These probability measures represent the distribution of the paths of a reference Markov process, weighted by a collection of potential/fitness functions5,12,16. These functional models are natural mathematical extensions of the traditional change of probability measures commonly used in importance sampling. For regulation problems and open-loop optimal control problems, these two viewpoints can also be encapsulated in a single mathematical framework8.

This GA Monte Carlo sampling methodology and its limiting Feynman-Kac probability measures (in discrete time as well as in continuous time5,16,17,21,23) encapsulate in a single mathematical framework a variety of algorithms known by many other names and often used as natural heuristics since the beginning of the 1950s in different scientific disciplines; to name a few: particle filters, bootstrap and Monte Carlo filters, sequential Monte Carlo samplers, Moran and Fleming-Viot particle models, killing/cloning schemes, subset simulation and multi-level splitting methods.

Most of the terminology encountered in the literature arises from the application domains as well as from the branching/genetic evolution of the Monte Carlo methodology. As underlined in the articles25,30, from a mathematical viewpoint, only the terminologies "Fleming-Viot" and "Moran" are misleading. The reasons are two-fold: firstly, the Moran particle model is a finite population model that converges, as the number of particles tends to infinity, towards a stochastic Fleming-Viot super-process; secondly, the genetic noise arising in the limit requires a Moran finite population process with symmetric-selection jump rates. In the context of GA, the selection/killing jump rates are clearly far from symmetric, the empirical measures of the finite population model are biased, and the limiting Feynman-Kac semigroup, as the number of particles tends to infinity, is purely deterministic.

Contrary to much of what is written in the literature on evolutionary algorithms, the GA methodologies discussed above belong to the class of (mean-field type) Monte Carlo methods. They involve generating random samples/particles from a probability distribution. Like the law of large numbers or the ergodic theorem describing the behavior of sample averages, these GA Monte Carlo methodologies cannot be classified as metaheuristics.

These GA Monte Carlo techniques have been successfully applied in many different fields, including biology, risk analysis, computer vision, signal processing, tracking, optimal control, econometrics, finance, robotics, and statistical inference. For a more detailed discussion we refer to the books16,17,18; see also the applications to statistical learning and rare event simulation discussed in the MCQMC-2012 slides and the ENS-Lyon-2012 slides.

This webpage presents a brief pedagogical introduction to some of the probabilistic concepts and convergence results currently popular in discrete generation GA Monte Carlo particle-type samplers. It is by no means exhaustive and is mainly based on our own work with close collaborators. More detailed references can be found on the complementary sites on Feynman-Kac probability measures and their genetic-type particle interpretations. The material covered includes unnormalized and unbiased measures, infinite population models, limiting quasi-invariant measures, parallel island models, genealogical tree models as well as backward ancestral measures. We also present some illustrations related to Markov chain conditioning, filtering and Boltzmann-Gibbs measures. The mathematical concepts described below provide a very natural and unifying mathematical basis for a large class of genetic-type Monte Carlo algorithms. We hope that this unifying point of view will help to develop this field further.




A Simple Genetic Algorithm with Mutation & Selection

Suppose we are given some Markov process $X_{n}$ evolving at every time step $n\in \mathbb {N} $ in some state/solution space $S_n$ which may vary with the time parameter $ n\in \mathbb {N}$. This first basic ingredient represents the mutation transitions of the genetic algorithm. Without selection, the GA with $N$ individuals behaves as $N$ independent copies of the Markov chain $X_{n}$.

Let $ g_{n}~:~x\in S_n\mapsto g_{n}(x)\in [0,\infty [$ be some collection of bounded fitness functions on the solution space $S_n$ indexed by $ n\in \mathbb {N} $. This second ingredient drives the selection transitions of the genetic algorithm.

Consider a population of $N$ individuals represented by $N$ independent copies $ \xi _{0}:=\left(\xi _{0}^{i}\right)_{1\leq i\leq N}\in S_0^N$ of some random variable $ X_{0}$ on $S_0$.

The simple genetic algorithm associated with the Markov chain $X_n$ and the fitness functions $g_n$ is defined by the selection-mutation transitions given by the synthetic diagram


$$\displaystyle \xi _{n}:=\left(\xi _{n}^{\,i}\right)_{1\leq i\leq N}\in S_n^{N}~~~~{\stackrel {\mbox{selection $~~~~~~~~~~~~$}}{-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!\longrightarrow }}~~~~{\widehat {\xi }}_{n}:=\left({\widehat {\xi }}_{n}^{\,i}\right)_{1\leq i\leq N}\in S_n^{N}~~~~~{\stackrel {\mbox{mutation $~~~~~~~~~~~~$}}{-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!\longrightarrow }}~~~\xi _{n+1}:=\left(\xi _{n+1}^{\,i}\right)_{1\leq i\leq N}\in S_{n+1}^{N}$$

The upper index $(.)^i$ represents the label of the $i$-th individual, the lower index $(.)_n$ represents the time parameter $n\in \mathbb {N}$, and $S_n^N=S_n\times\ldots\times S_n$ stands for the $N$-fold Cartesian product space of the set $S_n$.



During the selection transition:

Given the random variables $\xi _{n}:=\left(\xi _{n}^{\,i}\right)_{1\leq i\leq N}\in S_n^{N}$ at time $n\in \mathbb {N} $, sample $N$ (conditionally) independent random variables $\widehat{\xi}_{n}:=\left(\widehat {\xi}_{n}^{\,i}\right)_{1\leq i\leq N}$ with common (conditional) distribution

$$ Proba\left({\widehat {\xi }}_{n}^{\,i}=\xi _{n}^{\,j}~|~\xi _{n}\right)={\frac {g_{n}(\xi _{n}^{\,j})}{\sum _{1\leq k\leq N}g_{n}(\xi _{n}^{\,k})}},~~~~{\mbox{for each}}~j\in \{1,\ldots ,N\}$$

In other words, if $ g_{n}(\xi _{n}^{\,j})$ is the fitness of the $j$-th individual $ \xi _{n}^{\,j}$ in the population $ \xi _{n}$, its probability of being selected in the next generation $\widehat {\xi }_{n}$ is equal to $\displaystyle\frac {g_{n}(\xi _{n}^{\,j})}{\sum _{1\leq k\leq N}g_{n}(\xi _{n}^{\,k})}$.

This selection transition is also called fitness-proportionate selection, a.k.a. roulette-wheel selection. A proportion of the wheel is assigned to each of the $N$ possible selected individuals $\xi _{n}^{\,j}$ based on their fitness values $g_{n}(\xi _{n}^{\,j})$, with $j\in\{1,\ldots,N\}$: as shown above, the proportion of the wheel assigned to an individual is obtained by dividing its fitness by the total fitness of the population. Each random selection is made by spinning the wheel. Several branching variants are developed in the articles7,24; see also chapter 11 in the book on Feynman-Kac formulae18.

For $[0,1]$-valued fitness functions, an alternative acceptance-rejection type selection transition can be used:

For each $ 1\leq i\leq N$, with a probability $g_{n}(\xi _{n}^{\,i})$ we accept the individual; that is, we set $\widehat {\xi }_{n}^{\,i}=\xi _{n}^{\,i}$. With a probability $(1-g_{n}(\xi _{n}^{\,i}))$ we sample a new individual ${\widetilde {\xi }}_{n}^{\,i}$ in the current population with the probability measure $$ Proba\left({\widetilde {\xi }}_{n}^{\,i}=\xi _{n}^{\,j}~|~\xi _{n}\right)={\frac {g_{n}(\xi _{n}^{\,j})}{\sum _{1\leq k\leq N}g_{n}(\xi _{n}^{\,k})}},~~~~{\mbox{for each}}~j\in \{1,\ldots ,N\}\qquad\mbox{and we set}\qquad \widehat {\xi }_{n}^{\, i}=\widetilde {\xi }_{n}^{\, i}.$$ This selection scheme can be interpreted as an acceptance-rejection sampler equipped with a recycling mechanism. It can also be seen as a killing/regeneration transition with killing rate $(1-g_n(x))$. In this interpretation, with probability $(1-g_{n}(\xi _{n}^{\,i}))$ the $i$-th individual is killed and instantly replaced by a new individual ${\widetilde {\xi }}_{n}^{\,i}$ chosen in the pool, with a probability proportional to the fitness of the individuals.
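To fix ideas, here is a minimal Python sketch of the two selection schemes described above; the function names and the use of NumPy arrays to store the population are our own illustrative choices:

```python
import numpy as np

def roulette_wheel_selection(population, fitness, rng):
    """Fitness-proportionate selection: sample N individuals (with
    replacement) with probabilities proportional to their fitness."""
    N = len(population)
    idx = rng.choice(N, size=N, p=fitness / fitness.sum())
    return population[idx]

def acceptance_rejection_selection(population, fitness, rng):
    """For [0,1]-valued fitness: keep individual i with probability
    g(x_i); otherwise replace it by a fitness-proportionate draw from
    the current population (killing/regeneration viewpoint)."""
    N = len(population)
    accepted = rng.random(N) < fitness
    replacements = rng.choice(N, size=N, p=fitness / fitness.sum())
    idx = np.where(accepted, np.arange(N), replacements)
    return population[idx]
```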


During the mutation transition:

From each selected particle $\widehat {\xi }_{n}^{\, i}\in S_n$ we sample independently a transition $\widehat {\xi }_{n}^{\, i}\in S_n~\longrightarrow \xi _{n+1}^{\, i}\in S_{n+1}$ of the Markov chain $ X_{n}\in S_n~\rightarrow ~X_{n+1}\in S_{n+1}$ starting at $X_{n}=\widehat {\xi }_{n}^{\, i}$, for $i\in \{1,\ldots ,N\}$.
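Combining the two transitions gives the full evolution step of the simple GA; the toy fitness and mutation kernel below are illustrative choices, not part of the theory:

```python
import numpy as np

def simple_ga_step(population, g_n, mutate, rng):
    """One selection/mutation step of the simple GA: roulette-wheel
    selection followed by independent mutations of the selected states."""
    w = np.array([g_n(x) for x in population])
    idx = rng.choice(len(population), size=len(population), p=w / w.sum())
    return np.array([mutate(x, rng) for x in population[idx]])

# Toy run: Gaussian random-walk mutations, fitness concentrating near 0.
rng = np.random.default_rng(0)
population = rng.standard_normal(100)
for _ in range(20):
    population = simple_ga_step(population,
                                g_n=lambda x: np.exp(-x**2),
                                mutate=lambda x, r: x + 0.5 * r.standard_normal(),
                                rng=rng)
```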







Some Convergence Results

Under some regularity conditions, when the number of individuals (and the computational power) $ N\uparrow \infty $, for any time index $n\in \mathbb{N}$ and any function $ f_n~:~x\in S_n\mapsto f_n(x)\in \mathbb {R}$, we have the almost sure convergence results $$ {\frac {1}{N}}\sum _{1\leq i\leq N}~f_n\left(\xi _{n}^{\,i}\right)~\approx _{N\uparrow\infty }~{\frac {\mathbb{E}\left(f_n(X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right)}}\qquad {\mbox{as well as}}\qquad {\frac {1}{N}}\sum _{1\leq i\leq N}~f_n\left({\widehat {\xi }}_{n}^{\,i}\right)~\approx _{N\uparrow \infty }~{\frac {\mathbb{E}\left(f_n(X_{n})~\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}}$$ In addition, we have the unbiased estimate $$ \frac {1}{N} \sum _{1\leq i\leq N}f_n(\xi _{n}^{\,i}) ~~~\times~~~ \left( \prod _{0\leq k<n}{\frac {1}{N}}\sum _{1\leq j\leq N}g_{k}(\xi _{k}^{\,j})\right)~\approx _{N\uparrow \infty }~\mathbb{E}\left(f_n(X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right) $$ The unbiasedness property refers to the fact that the expected value of the estimator coincides with the limiting value; that is, we have the so-called unnormalized formula $$ \mathbb{E}\left( \frac {1}{N} \sum _{1\leq i\leq N}f_n(\xi _{n}^{\,i}) ~~~\times~~~ \left( \prod _{0\leq k<n}{\frac {1}{N}}\sum _{1\leq j\leq N}g_{k}(\xi _{k}^{\,j})\right)\right)~=~\mathbb{E}\left(f_n(X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right) $$ In statistics and probability theory, the limiting expectations in the right-hand side displays are called Feynman-Kac formulae. The first rigorous proof of the above convergence results, including the unbiasedness property and the design of adaptive selection rules, can be found in the article1 on nonlinear filtering published in 1996; see also the series of articles2,3,4,5,6,12. Under some stability conditions, the above convergence results can be turned into uniform-in-time estimates3,5,6; see also the time-uniform exponential concentration inequalities developed in the books12,17,18.
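The unbiased unnormalized estimator is straightforward to implement; the sketch below estimates $\mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right)$ for a toy Gaussian random walk with fitness $g(x)=e^{-x^2/2}$ (an illustrative model, chosen only because it is easy to simulate):

```python
import numpy as np

def unnormalized_estimate(N, n, rng):
    """Unbiased GA estimate of E[ prod_{0<=k<n} g(X_k) ] for the random
    walk X_{k+1} = X_k + W_k, W_k ~ N(0,1), and g(x) = exp(-x^2/2)."""
    g = lambda x: np.exp(-x**2 / 2)
    xi = np.zeros(N)                    # N copies of X_0 = 0
    log_prod = 0.0
    for _ in range(n):
        w = g(xi)
        log_prod += np.log(w.mean())    # running product of mean fitnesses
        xi = xi[rng.choice(N, size=N, p=w / w.sum())]   # selection
        xi = xi + rng.standard_normal(N)                # mutation
    return np.exp(log_prod)

rng = np.random.default_rng(1)
print(unnormalized_estimate(N=10_000, n=10, rng=rng))
```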


The infinite population GA model

The limiting Feynman-Kac probability measures discussed above are closely related to the so-called infinite population model arising in the evolutionary computation literature. In this connection, we underline that these probability measures and the filtering equations discussed in the articles on the convergence properties of particle filters1,2,3,5,6, as well as the ones in the article by Gordon, Salmond and Smith on the Bootstrap filter and by Kitagawa on the Monte Carlo filter, coincide with the infinite population GA evolution equations (without crossover) discussed by Ermakov and Zhiglyavskii in an article on GA published in 1984 (see equation (6), as well as equations (2.5) and (2.13) in the earlier article by Karlin published in 1979), in the article by Qi and Palmeri (see equation (4)), in the one by Vose (see section 6), and more recently in the article by Song and Li. The latter article also discusses flawed convergence proofs in widely cited studies on genetic algorithms in continuous spaces (see equation (1)). The commonly made mistake is to apply de Finetti's theorem for finite sequences without ensuring the asymptotic independence of the samples (a.k.a. propagation of chaos properties). As expected and as mentioned earlier, particle filters, bootstrap filters as well as Monte Carlo filters coincide with the GA with roulette-wheel selection discussed above and in the series of articles cited earlier.

As the population size $N$ goes to infinity, the occupation measures of the individuals before and after selection converge to deterministic probability distributions denoted respectively by $\eta_n(dx)$ and $\widehat{\eta}_n(dx)$; that is, for any function $ f_n~:~x\in S_n\mapsto f_n(x)\in \mathbb {R}$, we have the almost sure convergence results $$ {\frac {1}{N}}\sum _{1\leq i\leq N}~f_n\left(\xi _{n}^{\,i}\right)~\approx _{N\uparrow\infty }~ \int~f_n(x)~\eta_n(dx) \qquad {\mbox{as well as}}\qquad {\frac {1}{N}}\sum _{1\leq i\leq N}~f_n\left({\widehat {\xi }}_{n}^{\,i}\right)~\approx _{N\uparrow \infty }~\int~f_n(x)~\widehat{\eta}_n(dx) $$ For time-homogeneous models on a finite state space $S$ of types, these probability measures are represented by vectors $x\in S\mapsto \eta_n(x)\in [0,1]$ and $x\in S\mapsto \widehat{\eta}_n(x)\in [0,1]$ such that $\sum_{x\in S} \eta_n(x)=1=\sum_{x\in S} \widehat{\eta}_n(x)$. In this context, if we choose the indicator function $f_n(x)=1_{A}(x)$ of some subset $A$ of the state space $S$, then $$ \eta_n(A):=\sum_{x\in S}~1_{A}(x)~\eta_n(x)\quad\mbox{can be seen as the limiting proportion in the population consisting of individuals corresponding to types in $A$} $$ Obviously, the probability distributions $\eta_n(dx)$ and $\widehat{\eta}_n(dx)$ coincide with the Feynman-Kac measures discussed above. They are described sequentially by the so-called infinite population evolution equations $$ \eta_n(dx)~~~~{\stackrel {\mbox{selection $~~~~~~~~~~~~$}}{-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!\longrightarrow }}~~~~ \widehat{\eta}_n(dx)~~~~~{\stackrel {\mbox{mutation $~~~~~~~~~~~~$}}{-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!\longrightarrow }}~~~ \eta_{n+1}(dx)$$ The probability distributions in the $n$-th generation described above are calculated sequentially by the integral formulae $$ \widehat{\eta}_n(dx)=\frac{g_n(x)~\eta_n(dx)}{\int~g_n(z)~\eta_n(dz)} \qquad {\mbox{and}}\qquad \eta_{n+1}(dx)=\int~\widehat{\eta}_n(dz)~M_{n+1}(z,dx) $$ with the transition probability of the mutation process $$ M_{n+1}(z,dx):= Proba(X_{n+1}\in dx~|~X_{n}=z) $$
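On a finite state space, the infinite population equations are plain vector-matrix updates; a sketch with an illustrative two-state model:

```python
import numpy as np

def infinite_population_step(eta, g, M):
    """One step of the infinite population model on a finite state space:
    eta_hat = g * eta / <g, eta> (selection), eta_next = eta_hat @ M (mutation)."""
    eta_hat = g * eta / np.dot(g, eta)
    return eta_hat @ M, eta_hat

M = np.array([[0.9, 0.1],      # mutation transition matrix M(x, y)
              [0.2, 0.8]])
g = np.array([1.0, 0.5])       # fitness vector g(x)
eta = np.array([0.5, 0.5])     # initial distribution eta_0
for _ in range(50):
    eta, eta_hat = infinite_population_step(eta, g, M)
print(eta, eta_hat)
```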



Invariant/Limiting measures (a.k.a. quasi-invariant measures)

For time homogeneous models $(g_n,M_n)=(g,M)$, the infinite population evolution equations discussed above take the following form $$ \widehat{\eta}_n(dx)=\Psi_{g}(\eta_n)(dx):=\frac{g(x)~\eta_n(dx)}{\int~g(z)~\eta_n(dz)} \qquad {\mbox{and}}\qquad \eta_{n+1}(dx)=(\widehat{\eta}_nM)(dx):=\int~\widehat{\eta}_n(dz)~M(z,dx) $$ In a more synthetic form, we have $$ \widehat{\eta}_n=\widehat{\Phi}\left(\widehat{\eta}_{n-1}\right):=\Psi_{g}\left(\widehat{\eta}_{n-1}M\right) \quad\mbox{and}\quad \eta_n=\Phi\left(\eta_{n-1}\right):=\Psi_{g}(\eta_{n-1})M $$ Under some stability conditions 3,5,6,10,21,22 (see also the books 17,18 and the more recent articles27,30) whenever the time parameter $n\rightarrow\infty$ the above recursions converge (exponentially fast) toward a single limiting probability measure $$ \widehat{\eta}_n \longrightarrow_{n\rightarrow\infty} \widehat{\eta}_{\infty} =\widehat{\Phi}\left(\widehat{\eta}_{\infty}\right) \quad\mbox{and}\quad \eta_n\longrightarrow_{n\rightarrow\infty}\eta_{\infty}=\Phi\left(\eta_{\infty}\right) $$ These limiting probability measures are clearly connected by the formulae $$ \widehat{\eta}_{\infty} = \Psi_{g}\left(\eta_{\infty}\right) \quad\mbox{and}\quad \eta_{\infty}= \widehat{\eta}_{\infty}M $$ Assume that the mutation transition $M(x,dy)$ is reversible with respect to some probability measure $\mu(dx)$; that is, we have that $$ \mu(dx)~M(x,dy)=\mu(dy)~M(y,dx) $$ Also assume there exists some function $h(x)\geq 0$ and some parameter $\lambda>0$ such that $$ g(x)~\int~M(x,dz)~h(z)=\lambda~ h(x) $$ In this situation, as shown for instance in Section 12.4 in the book18 we have $$ \widehat{\eta}_{\infty}(dx)=\frac{h(x)}{\int h(z)\mu(dz)}~\mu(dx) $$
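In the finite-state setting, the condition $g(x)~\int~M(x,dz)~h(z)=\lambda~h(x)$ says that $h$ is a right Perron eigenvector of the matrix $Q(x,y)=g(x)M(x,y)$; the sketch below (reusing the illustrative two-state model, with $\mu$ chosen reversible for $M$) checks the fixed-point formula numerically:

```python
import numpy as np

M = np.array([[0.9, 0.1],
              [0.2, 0.8]])
g = np.array([1.0, 0.5])
mu = np.array([2/3, 1/3])                  # reversibility: mu(x) M(x,y) = mu(y) M(y,x)
assert np.allclose(mu[:, None] * M, (mu[:, None] * M).T)

Q = g[:, None] * M                         # Q(x,y) = g(x) M(x,y)
eigvals, eigvecs = np.linalg.eig(Q)
h = np.abs(eigvecs[:, np.argmax(eigvals.real)].real)   # g (Mh) = lambda h

eta_hat_inf = h * mu / np.dot(h, mu)       # claimed limiting measure

eta_hat = np.array([0.5, 0.5])             # iterate eta_hat_n = Psi_g(eta_hat_{n-1} M)
for _ in range(200):
    eta = eta_hat @ M
    eta_hat = g * eta / np.dot(g, eta)
print(np.allclose(eta_hat, eta_hat_inf))   # True: both fixed points agree
```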

Parallel Island models

The unbiasedness property can be rewritten as follows: $$ \mathbb{E}\left( F_n(\chi_n) \prod _{0\leq k<n}G_k(\chi_k)\right)~=~\mathbb{E}\left(f_n(X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right) $$ with the Markov chain $\chi_n=(\xi_n^i)_{1\leq i\leq N}\in S_n^N$, the function $$ F_n(\chi_n):= \frac {1}{N} \sum _{1\leq i\leq N}f_n(\xi _{n}^{i})\quad \mbox{and the fitness function}\quad G_n(\chi_n):= \frac {1}{N} \sum _{1\leq i\leq N}g_n(\xi _{n}^{i}) $$ The simple genetic algorithm associated with the Markov chain $\chi_n$ and the fitness function $G_n(\chi_n)$ can be interpreted as an island genetic algorithm12,13,14 (a.k.a. double bootstrap or $SMC^2$ in filtering theory and Bayesian inference).
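A sketch of the island interpretation, with the toy Gaussian model used in the earlier sketches: whole islands are selected in proportion to their average fitness, and the island-level mutation is one GA step inside each island:

```python
import numpy as np

def ga_step(pop, g, mutate, rng):
    """One selection/mutation step of the simple GA on one island."""
    w = g(pop)
    pop = pop[rng.choice(len(pop), size=len(pop), p=w / w.sum())]
    return mutate(pop, rng)

def island_ga_step(islands, g, mutate, rng):
    """Islands (shape [N2, N1]): select whole islands with probability
    proportional to their average fitness G_n, then evolve each one."""
    N2 = len(islands)
    G = g(islands).mean(axis=1)                       # island fitness G_n
    islands = islands[rng.choice(N2, size=N2, p=G / G.sum())]
    return np.stack([ga_step(isl, g, mutate, rng) for isl in islands])

rng = np.random.default_rng(2)
g = lambda x: np.exp(-x**2 / 2)
mutate = lambda pop, r: pop + r.standard_normal(pop.shape)
islands = np.zeros((20, 50))                          # N2 = 20 islands of N1 = 50
for _ in range(10):
    islands = island_ga_step(islands, g, mutate, rng)
```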



Genealogical trees

Tracing back in time the ancestral lines of each individual $\widehat {\xi }_{n}^{i}$ in the population after the $n$-th selection, we define the genealogical tree of the genetic population $$ \widehat {\xi }_{0,n}^{i}\longleftarrow ~{\widehat {\xi }}_{1,n}^{i}\longleftarrow ~\ldots \longleftarrow {\widehat {\xi }}_{k-1,n}^{i}\longleftarrow {\widehat {\xi }}_{k,n}^{i}\longleftarrow \ldots \longleftarrow{\widehat {\xi }}_{n-1,n}^{i}\longleftarrow {\widehat {\xi }}_{n,n}^{i}={\widehat {\xi }}_{n}^{i} $$ The lower indices $ (.)_{k,n}$ represent the level $k$ of the ancestor of $\widehat {\xi }_{n}^{i}$.

When the number of individuals (and the computational power) $ N\uparrow \infty $, for any function $ f_{n}~:~(x_{0},\ldots ,x_{n})\in (S_0\times\ldots \times S_n)\mapsto f_{n}(x_{0},\ldots ,x_{n})\in \mathbb {R} $ we have the Monte Carlo approximation $$ {\frac {1}{N}}\sum _{1\leq i\leq N}~f_{n}\left({\widehat {\xi }}_{0,n}^{i},{\widehat {\xi }}_{1,n}^{i},\ldots ,{\widehat {\xi }}_{n-1,n}^{i},{\widehat {\xi }}_{n,n}^{i}\right)~\approx _{N\uparrow \infty }~{\frac {\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}}$$

In the same vein, tracing back in time the ancestral lines of each individual $\xi_{n}^{i}$ in the population after the $n$-th mutation, we define the genealogical tree of the genetic population $$ \xi_{0,n}^{i}\longleftarrow ~\xi_{1,n}^{i}\longleftarrow ~\ldots \longleftarrow\xi_{k-1,n}^{i}\longleftarrow \xi_{k,n}^{i}\longleftarrow \ldots \longleftarrow\xi_{n-1,n}^{i}\longleftarrow \xi_{n,n}^{i}=\xi_{n}^{i} $$ The lower indices $ (.)_{k,n}$ represent the level $k$ of the ancestor of $\xi_{n}^{i}$.

When the number of individuals (and the computational power) $ N\uparrow \infty $, for any function $ f_{n}~:~(x_{0},\ldots ,x_{n})\in (S_0\times\ldots \times S_n)\mapsto f_{n}(x_{0},\ldots ,x_{n})\in \mathbb {R} $ we have the Monte Carlo approximation $$ {\frac {1}{N}}\sum _{1\leq i\leq N}~f_{n}\left(\xi_{0,n}^{i},\xi_{1,n}^{i},\ldots,\xi_{n-1,n}^{i},\xi_{n,n}^{i}\right)~\approx _{N\uparrow \infty }~{\frac {\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~\prod _{0\leq k< n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k< n}g_{k}(X_{k})\right)}}$$

In addition, we have the unbiasedness property $$ \mathbb{E}\left( \frac {1}{N} \sum _{1\leq i\leq N}f_{n}\left(\xi_{0,n}^{i},\xi_{1,n}^{i},\ldots,\xi_{n-1,n}^{i},\xi_{n,n}^{i}\right) ~~~\times~~~ \left( \prod _{0\leq k<n}{\frac {1}{N}}\sum _{1\leq j\leq N}g_{k}(\xi _{k}^{\,j})\right)\right)~=~\mathbb{E}\left(f_n(X_{0},\ldots ,X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right) $$ Choosing the historical process $$ {\cal X}_n:=(X_0,\ldots,X_n) \in {\cal S}_n:=(S_0\times\ldots \times S_n)\quad \mbox{and the fitness function}\quad {\cal G}_n({\cal X}_n):=g_n(X_n)$$ we clearly have $$ \mathbb{E}\left(f_{n}({\cal X}_n)~\prod _{0\leq k\leq n} {\cal G}_k({\cal X}_k)\right)=\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~\prod _{0\leq k\leq n}g_{k}(X_{k})\right)\quad \mbox{and}\quad \mathbb{E}\left(f_{n}({\cal X}_n)~\prod _{0\leq k< n} {\cal G}_k({\cal X}_k)\right)=\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~\prod _{0\leq k< n}g_{k}(X_{k})\right) $$

This shows that the genealogical tree evolution of the genetic population is itself a simple genetic algorithm with a mutation transition on path space representing the evolution of the historical process. In other words, the genealogical tree evolution of the simple genetic algorithm associated with the Markov chain $X_n$ and the fitness function $g_n$ on $S_n$ coincides with the simple genetic algorithm associated with the historical Markov chain ${\cal X}_n$ and the fitness function ${\cal G}_n$ on ${\cal S}_n$.
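In practice the genealogical tree is obtained for free by recording the parent index drawn at each selection step; a sketch on the toy Gaussian model used earlier:

```python
import numpy as np

def ga_with_genealogy(N, n, rng):
    """Run the simple GA, record parent indices at each selection, and
    return the N ancestral lines (rows) of the final population."""
    g = lambda x: np.exp(-x**2 / 2)
    xi = np.zeros(N)
    history, parents = [xi.copy()], []
    for _ in range(n):
        w = g(xi)
        idx = rng.choice(N, size=N, p=w / w.sum())   # parent of each child
        parents.append(idx)
        xi = xi[idx] + rng.standard_normal(N)        # mutation
        history.append(xi.copy())
    lines = np.empty((N, n + 1))
    anc = np.arange(N)                               # trace ancestors backward
    for k in range(n, -1, -1):
        lines[:, k] = history[k][anc]
        if k > 0:
            anc = parents[k - 1][anc]
    return lines

rng = np.random.default_rng(3)
lines = ga_with_genealogy(N=1000, n=20, rng=rng)     # lines[i, k] = xi^i_{k,n}
```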



Backward ancestral lines

To avoid unnecessary measure theoretic considerations, consider a Markov chain $X_n$ on a finite state space $S$ and set $$ p_{n+1}(x_{n},x_{n+1}):= Proba(X_{n+1}=x_{n+1}~|~X_{n}=x_{n})\quad \mbox{and}\quad h_{n+1}(x_{n},x_{n+1}):=g_{n}(x_{n})~p_{n+1}(x_{n},x_{n+1}) $$ Also consider the random backward transitions $$ \mathbb{M}_{n+1,m(\xi_{n})}(x_{n+1},x_{n}):=\frac{m(\xi_{n})(x_n)~h_{n+1}(x_{n},x_{n+1})}{\sum_{z_n\in S}m(\xi_{n})(z_n)~ h_{n+1}(z_{n},x_{n+1})} \quad \mbox{with}\quad m(\xi_{n})(x_n):=\frac{1}{N}\sum_{1\leq i\leq N}~1_{\xi^i_n}(x_n)\quad \mbox{where}\quad 1_{\xi^i_n}(x_n)=\left\{ \begin{array}{rcl} 1 &\mbox{if} & x_n=\xi^i_n\\ 0 &\mbox{if} & x_n\not=\xi^i_n \end{array}\right. $$ Note that $$ \sum_{x_n\in S} \mathbb{M}_{n+1,m(\xi_{n})}(x_{n+1},x_{n})=1 \quad \mbox{and}\quad \sum_{x_n\in S} m(\xi_{n})(x_n)=1 $$ In this notation, given the populations of individuals $(\xi_0,\ldots,\xi_n)$, the conditional probability distribution of a randomly chosen ancestral line $(\mathbb{X}_{0,n},\mathbb{X}_{1,n},\ldots,\mathbb{X}_{n,n})$ in the genealogical tree after the $n$-th mutation is given by the backward Markov chain formula11,15: $$ Proba\left(\mathbb{X}_{0,n}=x_0,\mathbb{X}_{1,n}=x_1,\ldots,\mathbb{X}_{n-1,n}=x_{n-1},\mathbb{X}_{n,n}=x_n~|~\xi_0,\ldots,\xi_n\right)= m(\xi_n)(x_n)~\mathbb{M}_{n,m(\xi_{n-1})}(x_{n},x_{n-1})\ldots \mathbb{M}_{1,m(\xi_0)}(x_{1},x_{0}) $$ In addition, for any function $ f_{n}~:~(x_{0},\ldots ,x_{n})\in S^{n+1}\mapsto f_{n}(x_{0},\ldots ,x_{n})\in \mathbb {R} $ we have $$ \mathbb{E}\left(f_n\left(\mathbb{X}_{0,n},\mathbb{X}_{1,n},\ldots,\mathbb{X}_{n,n}\right)~|~\xi_0,\ldots,\xi_n\right)= \mathbb{E}\left({\frac {1}{N}}\sum _{1\leq i\leq N}~f_{n}\left(\xi_{0,n}^{i},\xi_{1,n}^{i},\ldots,\xi_{n-1,n}^{i},\xi_{n,n}^{i}\right)~|~\xi_0,\ldots,\xi_n\right) ~\approx _{N\uparrow \infty }~\frac{\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~\prod _{0\leq k< n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k< n}g_{k}(X_{k})\right)} $$
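A sketch of the backward sampling of a single ancestral line, assuming the populations have been stored and the model is time-homogeneous with fitness g and transition probability p (both supplied as vectorized functions; the names are ours):

```python
import numpy as np

def backward_ancestral_line(populations, g, p, rng):
    """Sample one ancestral line (X_{0,n}, ..., X_{n,n}) backward in time.
    populations: list [xi_0, ..., xi_n] of 1-d particle arrays;
    g(x): fitness; p(x, y): transition probability of the mutation chain.
    Backward weights over the particles: g(xi_k^j) p(xi_k^j, x_{k+1}),
    the particle version of m(xi_k)(x_k) h_{k+1}(x_k, x_{k+1})."""
    n = len(populations) - 1
    line = [rng.choice(populations[n])]      # terminal state ~ m(xi_n)
    for k in range(n - 1, -1, -1):
        xi_k = populations[k]
        w = g(xi_k) * p(xi_k, line[-1])
        line.append(rng.choice(xi_k, p=w / w.sum()))
    return line[::-1]
```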



Continuous time GA models

Let $Y_t$ be a continuous time Markov process evolving in some state space $S$, indexed by a continuous time index $ t\in \mathbb{R}_+:=[0,\infty[$. Consider an increasing time mesh $t_0=0< t_1<\ldots < t_{n-1}< t_{n} $ with a time step $\Delta =t_n-t_{n-1}\approx 0$ and some potential function $v~:~x\in S\mapsto v(x)\in\mathbb{R}_+:=[0,\infty[$ on $S$. We associate with these objects the Markov chain $X_n$ and the $[0,1]$-valued fitness function $g_n$ defined by $$ X_n:= Y_{t_n} \qquad \mbox{and}\qquad g_n(x):=e^{-v(x)\Delta } $$ In this case, choosing $t_n=[t/\Delta]\Delta \approx _{\Delta \downarrow 0} t$ we have the continuous time Feynman-Kac path integral formulae $$ \mathbb{E}\left(f(X_{n})~\prod _{0\leq k<n}g_{k}(X_{k})\right) =\mathbb{E}\left(f(Y_{t_n})~ \exp{\left(-\sum_{0\leq k<n}v(Y_{t_k}) ~(t_{k+1}-t_{k})\right)}\right)\approx _{\Delta\downarrow 0} \mathbb{E}\left(f(Y_{t})~ \exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right) $$ When the time step $\Delta \downarrow 0$, the GA equipped with the killing/regeneration transition converges to a continuous time GA process $\xi_t:=(\xi^i_t)_{1\leq i\leq N}$ with killing rate $v(x)$. When an individual is killed, another individual in the pool instantly duplicates. More general killing/regeneration models associated with signed potential functions are discussed in the series of studies5,16,17,21,23. In theoretical physics, these killing/regeneration continuous time GA are often termed killing/cloning models; see the article by Angeli-Grosskinsky-Johansen and the one by Lecomte and Tailleur.
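A discretized sketch of the killing/regeneration dynamics, with an illustrative Brownian mutation and quadratic killing rate; in the small time-step regime each killed particle is replaced by a copy of a uniformly chosen particle in the pool:

```python
import numpy as np

def killing_cloning(N, T, dt, rng):
    """Discretized continuous-time GA: Brownian mutation and killing rate
    v(x) = x^2 / 2; killed particles instantly clone a uniformly chosen
    particle (donors are taken from the pre-killing pool)."""
    v = lambda x: x**2 / 2
    xi = np.zeros(N)
    for _ in range(int(T / dt)):
        xi = xi + np.sqrt(dt) * rng.standard_normal(N)   # mutation step
        killed = rng.random(N) < v(xi) * dt              # killing with rate v
        donors = rng.integers(0, N, size=N)              # uniform cloning
        xi = np.where(killed, xi[donors], xi)
    return xi
```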

Next we state the continuous time versions of some of the convergence results stated above; more detailed statements and proofs are provided in5,17,18,21,23,31, see also the article of Rousset on Schrödinger's ground states particle estimates. Under some regularity conditions, when the number of individuals (and the computational power) $ N\uparrow \infty $, for any continuous time index $t\in \mathbb{R}_+:=[0,\infty[$ and any function $ f~:~x\in S\mapsto f(x)\in \mathbb {R}$, we have the almost sure convergence results $$ {\frac {1}{N}}\sum _{1\leq i\leq N}~f\left(\xi _{t}^{\,i}\right)~\approx _{N\uparrow\infty }~{\frac {\mathbb{E}\left(f(Y_{t})~\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right)}{\mathbb{E}\left(\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right)}} $$ In addition, we have the unbiased estimate $$ \frac {1}{N} \sum _{1\leq i\leq N}f(\xi _{t}^{\,i}) ~~~\times~~~ \exp{\left(-\int_0^t \frac {1}{N} \sum _{1\leq j\leq N}v(\xi^j_s) ~ds\right)}~\approx _{N\uparrow \infty }~\mathbb{E}\left(f(Y_{t})~\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right) $$ The unbiasedness property can be rewritten as follows: $$ \mathbb{E}\left( F(\chi_t) \exp{\left(-\int_0^tV(\chi_{s}) ~ds\right)}\right)~=~\mathbb{E}\left(f(Y_{t})~\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right) $$ with the Markov process $\chi_t=(\xi_t^i)_{1\leq i\leq N}\in S^N$, the function $$ F(\chi_t):= \frac {1}{N} \sum _{1\leq i\leq N}f(\xi _{t}^{i})\quad \mbox{and the potential function}\quad V(\chi_t):= \frac {1}{N} \sum _{1\leq i\leq N} v(\xi _{t}^{i}). $$

Tracing back in time the ancestral lines of each individual $\widehat {\xi }_{t}^{i}$, we define the genealogical tree of the genetic population $$ \widehat {\xi }_{0,t}^{i}\longleftarrow ~\ldots \longleftarrow ~{\widehat {\xi }}_{s,t}^{i}\longleftarrow ~\ldots \longleftarrow {\widehat {\xi }}_{t,t}^{i}={\widehat {\xi }}_{t}^{i} $$ The lower indices $ (.)_{s,t}$ represent the level $s\leq t$ of the ancestor of $\widehat {\xi }_{t}^{i}$. For any function $f_t$ on the space of paths with terminal time horizon $t$ we have $$ \frac {1}{N} \sum _{1\leq i\leq N}f_t\left(\left(\widehat{\xi} _{s,t}^{\,i}\right)_{0\leq s\leq t}\right) \approx _{N\uparrow \infty }~ {\frac {\mathbb{E}\left(f_t\left((Y_{s})_{0\leq s\leq t}\right)~\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right)}{\mathbb{E}\left(\exp{\left(-\int_0^tv(Y_{s}) ~ds\right)}\right)}} $$




Some Illustrations

Feynman-Kac path integrals arise in a variety of scientific disciplines, including computational physics, biology, information theory and computer science. Their different interpretations depend on the application domain. We illustrate these path integration models with some concrete examples taken from the MCQMC-2012 slides and the ENS-Lyon-2012 slides:

  1. If we choose the indicator fitness function $g_{n}(x_{n})=1_{A_n}(x_{n})$ of some subset $A_n$ of the state space, the Feynman-Kac measures represent the conditional distribution of a Markov chain given that it stays in a given tube; that is, for any function $ f_{n}~:~(x_{0},\ldots ,x_{n})\in (S_0\times\ldots \times S_n)\mapsto f_{n}(x_{0},\ldots ,x_{n})\in \mathbb {R} $ we have that $$ \mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~|~X_{0}\in A_0,~\ldots ,X_{n}\in A_n\right)=\displaystyle {\frac {\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}}\quad {\mbox{and}}\quad \mathbb{E}\left(\prod _{0\leq k\leq n}g_{k}(X_{k})\right)=Proba\left(X_{0}\in A_0,~\ldots ,X_{n}\in A_n\right)$$ (as soon as the normalizing constant is strictly positive). In addition, we have the unbiased estimate $$ \prod _{0\leq k\leq n}{\frac {1}{N}}\sum _{1\leq i\leq N}1_{A_k}(\xi _{k}^{i})~\approx _{N\uparrow \infty }~Proba\left(X_{0}\in A_0,\ldots ,X_{n}\in A_n\right)$$
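A sketch of this first example for a Gaussian random walk and a fixed tube $A_k=[-a,a]$ (a toy instance; with indicator fitness, roulette-wheel selection reduces to uniform resampling among the particles still inside the tube):

```python
import numpy as np

def tube_probability(N, n, a, rng):
    """GA estimate of Proba(X_0 in A, ..., X_n in A) for the random walk
    X_{k+1} = X_k + W_k, W_k ~ N(0,1), and the tube A = [-a, a]."""
    xi = np.zeros(N)
    log_est = 0.0
    for _ in range(n + 1):                   # levels k = 0, ..., n
        alive = np.abs(xi) <= a
        frac = alive.mean()                  # (1/N) sum_i 1_A(xi_k^i)
        if frac == 0.0:
            return 0.0                       # whole population left the tube
        log_est += np.log(frac)
        xi = rng.choice(xi[alive], size=N)   # selection on {0,1} fitness
        xi = xi + rng.standard_normal(N)     # mutation
    return np.exp(log_est)

rng = np.random.default_rng(4)
print(tube_probability(N=10_000, n=20, a=2.0, rng=rng))
```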



  2. Let $X_n$ be a simple random walk on the $2$-dimensional lattice $S:=\mathbb{Z}^2$ starting at the origin. Choosing the historical process $$ {\cal X}_n:=(X_0,\ldots,X_n) \in {\cal S}_n:=S^{(n+1)}\quad \mbox{and the fitness function}\quad {\cal G}_n({\cal X}_n):=1_{S-\{X_0,\ldots,X_{n-1}\}}(X_n)$$ we clearly have the self-avoiding Markov chain formulae $$ \mathbb{E}\left(f_{n}(X_0,\ldots,X_n)~|~\forall 0\leq p< q \leq n\quad X_p\not=X_q\right)= \frac{\mathbb{E}\left(f_{n}({\cal X}_n)~\prod _{0\leq k\leq n} {\cal G}_k({\cal X}_k)\right)}{ \mathbb{E}\left(\prod _{0\leq k\leq n} {\cal G}_k({\cal X}_k)\right)} \quad \mbox{and}\quad \mathbb{E}\left(\prod _{0\leq k\leq n} {\cal G}_k({\cal X}_k)\right)=Proba\left(\forall 0\leq p< q \leq n\quad X_p\not=X_q\right) $$
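A sketch of the corresponding GA on path space, estimating the probability that the walk stays self-avoiding up to time $n$ (paths are stored explicitly, which is the historical-process mutation):

```python
import numpy as np

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def self_avoiding_probability(N, n, rng):
    """GA estimate of Proba(X_p != X_q, 0 <= p < q <= n) for the simple
    random walk on Z^2, with indicator fitness 1{new site unvisited}."""
    paths = [[(0, 0)] for _ in range(N)]
    log_est = 0.0
    for _ in range(n):
        # mutation on path space: extend each path by one random step
        steps = rng.integers(0, 4, size=N)
        paths = [p + [(p[-1][0] + MOVES[s][0], p[-1][1] + MOVES[s][1])]
                 for p, s in zip(paths, steps)]
        # selection: keep the still-self-avoiding paths, resample uniformly
        alive = [p for p in paths if p[-1] not in p[:-1]]
        if not alive:
            return 0.0
        log_est += np.log(len(alive) / N)
        paths = [list(alive[i]) for i in rng.integers(0, len(alive), size=N)]
    return np.exp(log_est)

rng = np.random.default_rng(5)
print(self_avoiding_probability(N=10_000, n=15, rng=rng))
```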



  3. In filtering problems, the Markov chain $X_{n}$ represents the signal process evolving on some state space $S$. This Markov chain can represent the evolution of some target (missile, plane, boat, and so on). Suppose the state $X_{n}$ is partially observed by some sensor of the form $$ Y_{n}=h(X_{n})+V_{n}$$ for some function $h~:~x\in S\mapsto h(x)\in \mathbb {R} $ and some sequence of independent random variables $V_{n}$ with common probability density $g(v)$. We fix the observations $Y_{n}=y_{n}$ and we set $ g_{n}(x)=g\left(y_{n}-h(x)\right)$. In this situation, the Feynman-Kac probability measures defined above coincide with the conditional distributions of the signal given the observation sequence; that is, we have that $$ \mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})~|~Y_{0}=y_{0},~\ldots ,Y_{n}=y_{n}\right)=\displaystyle {\frac {\mathbb{E}\left(f_{n}(X_{0},\ldots ,X_{n})\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}{\mathbb{E}\left(\prod _{0\leq k\leq n}g_{k}(X_{k})\right)}}$$ In addition, we have the unbiased estimate $$ \prod _{0\leq k\leq n}{\frac {1}{N}}\sum _{1\leq i\leq N}g_{k}(\xi _{k}^{i})~\approx _{N\uparrow \infty }~p(y_{0},\ldots ,y_{n})$$ where $p(y_{0},\ldots ,y_{n})$ stands for the probability density of the random variable $(Y_{0},\ldots ,Y_{n})$ evaluated at $(y_{0},\ldots ,y_{n})$. In signal processing and Bayesian inference, the mutation-selection genetic algorithm is also termed particle filter or Sequential Monte Carlo1,5.
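A sketch of the corresponding bootstrap particle filter on a linear-Gaussian toy model (chosen because the exact answer is available via the Kalman filter):

```python
import numpy as np

def bootstrap_filter(y, N, rng):
    """Bootstrap particle filter for the toy model
    X_n = 0.9 X_{n-1} + W_n, Y_n = X_n + V_n, with W_n, V_n ~ N(0, 1).
    Returns the filtering means E(X_n | y_0..y_n) and the log-likelihood."""
    g = lambda x, yn: np.exp(-0.5 * (yn - x) ** 2)   # likelihood, up to a constant
    xi = rng.standard_normal(N)                      # X_0 ~ N(0, 1)
    means, loglik = [], 0.0
    for yn in y:
        w = g(xi, yn)
        loglik += np.log(w.mean() / np.sqrt(2 * np.pi))  # p(y_0..y_n) factor
        w /= w.sum()
        means.append(np.dot(w, xi))                  # filtering mean estimate
        xi = xi[rng.choice(N, size=N, p=w)]          # selection
        xi = 0.9 * xi + rng.standard_normal(N)       # mutation: signal step
    return np.array(means), loglik

rng = np.random.default_rng(6)
y = rng.standard_normal(25)                          # illustrative observations
means, ll = bootstrap_filter(y, N=5_000, rng=rng)
```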


  4. Consider a sequence of Gibbs probability measures $ \pi _{\beta }$ on a finite solution space $S$ indexed by some inverse temperature parameter $ \beta >0$ and defined by $$ \pi _{\beta }(x)={\frac {e^{-\beta V(x)}}{\sum _{y\in S}e^{-\beta V(y)}}}\quad {\mbox{for some function}}~V~:~x\in S\mapsto V(x)\in [0,\infty [$$ These probability measures concentrate on the set of global minima of the function V(.). More precisely, we have $$ V_{\star }:=\inf _{x\in S}V(x)\quad {\mbox{and}}\quad {\mathcal {V}}:=\{x\in S~:~V(x)=V_{\star }\}\Rightarrow ~\forall x\in {\mathcal {V}}\quad \pi _{\beta }(x)={\frac {e^{-\beta \left(V(x)-V_{\star }\right)}}{\sum _{y\in S}e^{-\beta \left(V(y)-V_{\star }\right)}}}={\frac {1}{{\mbox{Card}}({\mathcal {V}})+\sum _{y\in S-{\mathcal {V}}}e^{-\beta \left(V(y)-V_{\star }\right)}}}\approx _{\beta \uparrow \infty }~{\frac {1}{{\mbox{Card}}({\mathcal {V}})}}$$ and for $ x\not \in {\mathcal {V}}$ $$ \pi _{\beta }(x)\approx _{\beta \uparrow \infty }~0$$ This implies that $$ \pi _{\beta }(x)~\approx _{\beta \uparrow \infty }~{\frac {1}{{\mbox{Card}}({\mathcal {V}})}}~1_{\mathcal {V}}(x)$$ In other words, sampling a random variable with probability distribution $ \pi _{\beta }$ is almost equivalent to sampling uniformly one of the global minima of the function V(.).
    Let $ \beta _{n}\geq \beta _{n-1}$ be an increasing sequence of parameters with $ \beta _{0}=0$. Consider a Markov chain $X_{n}$ starting with the uniform distribution $ \pi _{0}(x)=1/{\mbox{Card}}(S)$ on $ S$, and such that $ \pi _{\beta _{n}}$ is an invariant probability measure of the transition $X_{n-1}\rightarrow X_{n}$ at time n; that is, for any $n\geq 1$ we have that $$ \pi _{\beta _{n}}(x)=\sum _{y\in S}~Proba(X_{n}=x~|~X_{n-1}=y)~\pi _{\beta _{n}}(y)$$ These Markov transitions can be designed using Markov Chain Monte Carlo (MCMC) methods such as the Metropolis-Hastings algorithm or the Gibbs sampler. We consider the fitness function $$ g_{n}(x):=\exp {\left[-(\beta _{n+1}-\beta _{n})V(x)\right]}=\frac{\gamma_{\beta_{n+1}}(x)}{\gamma_{\beta_{n}}(x)} \quad\mbox{with}\quad \gamma _{\beta_n }(x)=e^{-\beta_n V(x)}~\lambda(x) \quad\mbox{and the normalized counting measure}\quad \lambda(x):=\frac {1}{\mbox{Card}(S)} $$ In the above displayed formulae $ \mbox{Card}(S)$ stands for the cardinality of the set $S$. In this situation, for any function $f~:~x\in S\mapsto f(x)\in \mathbb {R} $, we have $$ \frac{ \mathbb{E}\left(f(X_{n})\prod _{0\leq k<n}g_{k}(X_{k})\right) }{ \mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right) } = \sum _{x\in S}~f(x)~\pi _{\beta _{n}}(x) \quad \mbox{and}\quad \mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right) =\frac {1}{\mbox{Card}(S)}~\sum _{x\in S}~e^{-\beta _{n}V(x)}=\sum _{x\in S}~\gamma_{\beta_{n}}(x) $$ The corresponding genetic algorithm can be seen as an interacting simulated annealing algorithm. It is defined in terms of MCMC mutations; the selection transition consists of selecting the particles that are better adapted to the change of temperature. At every time step, we have the particle estimates $$ \frac {1}{N}\sum _{1\leq i\leq N}~f\left(\xi _{n}^{i}\right) ~\approx _{N\uparrow \infty } ~\sum _{x\in S}~f(x)~\pi _{\beta _{n}}(x) \quad \mbox{and the unbiased approximation}\quad \prod _{0\leq k<n}\frac {1}{N}\sum _{1\leq i\leq N}\exp { \left[-(\beta _{k+1}-\beta _{k})V(\xi _{k}^{i})\right] }~\approx _{N\uparrow \infty }~ \frac {1}{\mbox{Card}(S)}\sum _{x\in S}~e^{-\beta _{n}V(x)} $$ In particular, this implies that $$ \lim _{n\uparrow \infty }\beta _{n}=\infty ~\Longrightarrow ~\lim _{n\uparrow \infty }\lim _{N\uparrow \infty }{\frac {1}{N}}\sum _{1\leq i\leq N}~f\left(\xi _{n}^{i}\right)={\frac {1}{{\mbox{Card}}({\mathcal {V}})}}\sum _{x\in {\mathcal {V}}}~f(x) $$ The choice of the parameters $\beta_n$ as well as the number of MCMC/mutation moves can be done in an adaptive way20.
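A sketch of the interacting simulated annealing algorithm on a toy one-dimensional landscape with two global minima; the schedule, the Metropolis proposal and the landscape are all illustrative choices (the adaptive tuning of20 is not implemented):

```python
import numpy as np

def interacting_annealing(V, S, N, betas, n_mcmc, rng):
    """Selection with weights exp(-(beta_{k+1}-beta_k) V(x)), followed by
    n_mcmc Metropolis mutation moves targeting pi_{beta_{k+1}} on the
    integer grid S."""
    xi = rng.choice(S, size=N)                        # uniform initialisation
    for b0, b1 in zip(betas[:-1], betas[1:]):
        w = np.exp(-(b1 - b0) * V(xi))                # selection weights
        xi = xi[rng.choice(N, size=N, p=w / w.sum())]
        for _ in range(n_mcmc):                       # Metropolis mutations
            prop = np.clip(xi + rng.choice([-1, 1], size=N), S.min(), S.max())
            accept = rng.random(N) < np.exp(-b1 * (V(prop) - V(xi)))
            xi = np.where(accept, prop, xi)
    return xi

rng = np.random.default_rng(7)
S = np.arange(-50, 51)
V = lambda x: (np.abs(x) - 10.0) ** 2 / 10.0          # two global minima at -10, +10
xi = interacting_annealing(V, S, N=1_000, betas=np.linspace(0, 5, 40), n_mcmc=5, rng=rng)
```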
  5. Let $\lambda(x)=Proba(Z=x)$ be some probability distribution of some random variable $Z$ on a finite set $S$, that is $\lambda(x)\geq 0$ and $\sum_{x\in S}\lambda(x)=1$. Choose a decreasing sequence of subsets $A_{n+1}\subset A_n\subset A_0=S$ such that $\sum_{x\in A_n}\lambda(x)>0$, denote by $1_{A_n}$ the indicator function of the subset $A_n$ and set $$ \gamma_n(x):= 1_{A_n}(x)~\lambda(x)\qquad \eta_n(x):=\frac{ \gamma_n(x)}{\sum_{y\in S} \gamma_n(y)}=\frac{1_{A_n}(x)~\lambda(x)}{\sum _{y\in A_n}\lambda(y)}=Proba(Z=x~|~Z\in A_n) \quad \mbox{and the fitness function}\quad g_{n}(x):=\frac{\gamma_{n+1}(x)}{\gamma_{n}(x)}=1_{A_{n+1}}(x) $$ Consider a Markov chain $X_{n}$ starting with the distribution $ \eta _{0}=\lambda$ on $ S$, and such that $ \eta_n(x)$ is an invariant probability measure of the transition $X_{n-1}\rightarrow X_{n}$ at time n; that is, for any $n\geq 1$ we have that $$ \eta_n(x)=\sum _{y\in S}~Proba(X_{n}=x~|~X_{n-1}=y)~ \eta_n(y)$$ These Markov chains are sometimes called shakers of the sets $A_n$: starting from a random variable $X_{n-1}$ distributed according to $\eta_n$ on the set $A_n$, after a local random move the random variable $X_{n}$ is also distributed according to $\eta_n$ on the set $A_n$. In this situation, for any function $f~:~x\in S\mapsto f(x)\in \mathbb {R} $, we have $$ \frac{ \mathbb{E}\left(f(X_{n})\prod _{0\leq k<n}g_{k}(X_{k})\right) }{ \mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right) } = \sum _{x\in S}~f(x)~\eta_n(x) \quad \mbox{and}\quad \mathbb{E}\left(\prod _{0\leq k<n}g_{k}(X_{k})\right) =\sum _{x\in S}~\gamma_{n}(x)=\sum _{x\in A_n}\lambda(x)=Proba(Z\in A_n) $$ The corresponding genetic algorithm can be seen as an interacting MCMC/subset shaker. It is defined in terms of mutation/shakers of the set $A_n$; the selection transition consists of selecting the particles that have succeeded in entering the next level $A_{n+1}$. At every time step, we have the particle estimates $$ \frac {1}{N}\sum _{1\leq i\leq N}~f\left(\xi _{n}^{i}\right) ~\approx _{N\uparrow \infty } ~\sum _{x\in S}~f(x)~\eta_n(x)= \mathbb{E}\left(f(Z)~|~Z\in A_n\right) \quad \mbox{and the unbiased approximation}\quad \prod _{0\leq k<n}\frac {1}{N}\sum _{1\leq i\leq N}~1_{A_{k+1}}(\xi^i_k)~\approx _{N\uparrow \infty }~Proba(Z\in A_n) $$ The above methodology applies to not necessarily finite state space models and general probability measures on decreasing subsets. For instance, we can choose a multidimensional random variable $Z$ on $S= \mathbb{R}^d$ and some observable function $\varphi~:~z\in \mathbb{R}^d\mapsto \varphi(z) \in [0,\infty[$ such that $S=\{z~:~\varphi(z)\geq 0\}$. Given an increasing sequence of parameters $\beta_0=0< \beta_n< \beta_{n+1}$ we set $$ A_n=\{z\in S~:~\varphi(z)>\beta_n\} $$ The above rather elementary example coincides with the so-called Subset simulation method discussed in the reliability engineering and uncertainty propagation literature as well as in rare event simulation (see also the article14, section 1.5 in the article19, section 2.2 in the article12 as well as section 4.3.4 in the book17) and with multi-level splitting in rare event simulation. The choice of the parameters $\beta_n$ as well as the number of MCMC/mutation moves can be done in an adaptive way20.
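A sketch of the subset simulation instance for a two-dimensional Gaussian $Z$ with $\varphi(z)=|z|$ and a fixed sequence of levels (levels, proposal scale and number of shaker moves are illustrative; here the exact answer is $e^{-\beta_n^2/2}$):

```python
import numpy as np

def subset_simulation(N, betas, n_shake, rng):
    """GA estimate of Proba(|Z| > beta_n) for Z ~ N(0, I_2): selection
    keeps particles above the next level; mutation applies Metropolis
    'shaker' moves leaving the conditioned Gaussian law invariant."""
    phi = lambda z: np.linalg.norm(z, axis=-1)
    z = rng.standard_normal((N, 2))                  # level beta_0 = 0
    log_est = 0.0
    for b in betas[1:]:
        alive = phi(z) > b                           # fitness 1_{A_{k+1}}
        if not alive.any():
            return 0.0
        log_est += np.log(alive.mean())
        z = z[np.flatnonzero(alive)[rng.integers(0, alive.sum(), size=N)]]
        for _ in range(n_shake):                     # shakers of {phi > b}
            prop = z + 0.5 * rng.standard_normal((N, 2))
            ratio = np.exp(0.5 * (np.sum(z**2, -1) - np.sum(prop**2, -1)))
            ok = (phi(prop) > b) & (rng.random(N) < ratio)
            z = np.where(ok[:, None], prop, z)
    return np.exp(log_est)

rng = np.random.default_rng(8)
print(subset_simulation(N=10_000, betas=np.linspace(0.0, 4.0, 9), n_shake=5, rng=rng))
```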







    1. Non Linear Filtering: Interacting Particle Solution. Del Moral, Pierre. Markov Processes and Related Fields, vol. 2, no. 4, pp. 555--580 (1996). [preprint]

    2. Measure Valued Processes and Interacting Particle Systems. Application to Non Linear Filtering Problems. Del Moral, Pierre. Annals of Applied Probability, vol. 8 , no. 2, pp. 438-495 (1998).

    3. On the stability of Measure Valued Processes with Applications to filtering. Del Moral, Pierre; Guionnet Alice. C.R. Acad. Sci. Paris. t. 329, Série I, pp. 429--434 (1998).

    4. On the Convergence and the Applications of the Generalized Simulated Annealing. Del Moral, Pierre; Miclo, Laurent. SIAM Journal on Control and Optimization, vol. 37, no. 4, pp. 1222--1250 (1999).

    5. Branching and Interacting Particle Systems Approximations of Feynman-Kac Formulae. Del Moral, Pierre; Miclo, Laurent. Séminaire de Proba. XXXIV. Lecture Notes in Maths, Springer, Vol. 1729, pp. 1-145 (2000).

    6. On the stability of interacting processes with applications to filtering and genetic algorithms. Del Moral, Pierre; Guionnet Alice. Annales de l'Institut H. Poincaré Prob. & Stats., vol. 37, No. 2, pp. 155--194 (2001).

    7. Asymptotic Results for Genetic Algorithms. Del Moral, Pierre; Miclo, Laurent. Theoretical Aspects of Evolutionary Computing, Natural Computing Series, Springer (2001).

    8. Open-loop regulation and tracking control based on a genealogical decision tree. Najim, Karim; Ikonen, Enzo; Del Moral, Pierre. Neural Computing and Applications, vol. 15, no. 3/4, pp. 339--349 (2005).

    9. Genetic Genealogical models in rare event analysis. Cérou, Frédéric; Del Moral, Pierre; Le Gland, François; Lézaud, Pascal. ALEA, vol. 1, pp. 181--203 (2006).

    10. The convergence to equilibrium of neutral genetic models. Del Moral, Pierre; Miclo, Laurent; Patras, Frédéric; Rubenthaler, Sylvain. Stochastic Analysis and Applications, vol. 28, no. 1, pp. 123--143 (2009).

    11. A Backward Particle Interpretation of Feynman-Kac Formulae. Del Moral, Pierre; Doucet, Arnaud; Singh, Sumeet. M2AN. vol 44, no. 5, pp. 947--976 (2010).

    12. On the concentration properties of Interacting particle processes. Del Moral, Pierre; Hu, Peng; Wu, Li Ming. Foundations and Trends in Machine Learning, vol. 3, no. 3--4, pp. 225--389 (2012).

    13. On the Foundations and the Applications of Evolutionary Computing. Del Moral, Pierre; Tantar, Alexandru-Adrian; Tantar, Emilia. Studies in Computational Intelligence, Springer, vol. 447, pp. 3--89 (2013).

    14. An island particle Markov chain Monte Carlo algorithm for safety analysis. Vergé, Christelle; Morio, Jérôme; Del Moral, Pierre. Reliability Engineering & System Safety. vol. 149, pp. 63-75 (2016) .

    15. On particle Gibbs samplers. Del Moral, Pierre; Kohn, Robert; Patras, Frédéric. Annales de l'Institut Henri Poincaré Probab. & Statist. vol 52, no. 4, pp. 1687--1733 (2016).

    16. Stochastic Processes: From Applications to Theory. Del Moral, Pierre; Penev, Spiridon. Chapman & Hall/CRC Press (2014). [some slides]

    17. Mean field simulation for Monte Carlo integration. Del Moral, Pierre. Chapman & Hall/CRC Press (2013).

    18. Feynman-Kac formulae. Genealogical and interacting particle approximations. Del Moral, Pierre. Springer, series: Probability and Its Applications (2004).

    19. Particle Methods: An introduction with applications. Del Moral, Pierre. ESAIM Proceedings, Journées MAS (2012).

    20. On Adaptive Resampling Procedures for Sequential Monte Carlo Methods. Del Moral, Pierre; Doucet, Arnaud; Jasra, Ajay. Bernoulli. vol. 18, no. 1, pp. 252--278 (2012).

    21. Particle approximations of Lyapunov exponents connected to Schrödinger operators and Feynman-Kac semigroups. Del Moral, Pierre; Miclo, Laurent. ESAIM: Probab. and Stats. t. 7, pp. 171--208 (2003).

    22. Particle Motions in Absorbing Medium with Hard and Soft Obstacles. Del Moral, Pierre; Doucet, Arnaud. Stochastic Analysis and Applications, vol. 22, no. 5, pp. 1175--1207 (2004). [Preprint version]

    23. A duality formula and a particle Gibbs sampler for continuous time Feynman-Kac measures on path spaces. Arnaudon, Marc; Del Moral, Pierre. Electronic Journal of Probability, vol. 25, pp. 1--54 (2020).

    24. Discrete Filtering Using Branching and Interacting Particle Systems. Crisan, Dan; Del Moral, Pierre; Lyons, Terry. Markov Processes and Related Fields. vol. 5, no. 3, pp. 293--318 (1999).

    25. A Moran particle system approximation of Feynman-Kac formulae. Del Moral, Pierre; Miclo, Laurent. Stochastic Processes and their Applications, vol. 86, no.2, pp. 193--216 (2000). [preprint]

    26. Annealed Feynman-Kac Models. Del Moral, Pierre; Miclo, Laurent. Communications in Mathematical Physics (2003). [preprint]

    27. A Lyapunov approach to stability of positive semigroups: An overview with illustrations. Arnaudon, Marc; Del Moral, Pierre; Ouhabaz, El Maati. Stochastic Analysis and Applications (2023). [preprint]

    28. Branching and interacting particle interpretation of rare event probabilities. Del Moral, Pierre; Lézaud, Pascal. In: Stochastic Hybrid Systems: Theory and Safety Critical Applications, eds. H. Blom and J. Lygeros, Springer-Verlag, Heidelberg (2006).

    29. Modeling genetic algorithms with interacting particle systems. Del Moral, Pierre; Kallel, Leila; Rowe, Jonathan E. Revista de Matematica, Teoria y aplicaciones, vol.8, no. 2 (2001).

    30. On the Stability of Positive Semigroups. Del Moral, Pierre; Horton, Emma; Jasra, Ajay. Annals of Applied Probability (to appear, 2023).

    31. Strong propagations of chaos in Moran's type particle interpretations of Feynman-Kac measures. Del Moral, Pierre; Miclo, Laurent. Stochastic Analysis and Applications. vol. 25, no. 3, 519--575 (2007).

    32. Sequential Monte Carlo Samplers. Del Moral, Pierre; Doucet, Arnaud; Jasra, Ajay. Journal of the Royal Statistical Society, Series B., vol. 68, no. 3, pp. 411--436 (2006).

    33. On a class of genealogical and interacting Metropolis Models. Del Moral, Pierre; Doucet, Arnaud. Séminaire de Probabilités XXXVII, Lecture Notes in Mathematics, vol. 1832, Springer, pp. 415--446 (2003).

    34. The Monte-Carlo Method for filtering with discrete-time observations. Del Moral, Pierre; Jacod, Jean; Protter, Philip. Probability Theory and Related Fields, vol. 120, no. 3, pp. 346--368 (2001).

    35. Some recent improvements to importance splitting. Cérou, F.; Del Moral, P.; Guyader, A.; Le Gland, F.; Lézaud, P.; Topart, H. Proceedings of the 6th International Workshop on Rare Event Simulation, Bamberg (2006).