# Isolation By Distance Hypothesis Statement

Characterizing patterns of genetic differentiation within a species is a recurring task in population genetics. Wright (1943) introduced the model of isolation-by-distance (IBD), in which differences in allele frequencies between populations accumulate under local spatial dispersal. Because of local dispersal, IBD models predict the *pattern of IBD*, where population differentiation increases with increasing geographic distance (Slatkin 1993; Rousset 1997). This pattern is observed in many model and nonmodel organisms as well as in humans, suggesting that local dispersal is a leading evolutionary force (Sharbel et al. 2000; Ramachandran et al. 2005; Hardy et al. 2006; Hellberg 2009).

However, the pattern of IBD can mask complex variations of demographic parameters resulting in differential increases of genetic differentiation in different regions of the habitat. Variations of demographic parameters can arise when population densities or migration rates vary across space (Slatkin 1985). With the advent of landscape genetics (Manel et al. 2003; Manel and Holderegger 2013), the spatial variation of demographic parameters is an important topic because spatial heterogeneity (or landscape characteristics) is now recognized to be a key factor to explain population differentiation and gene flow (McRae and Beier 2007). Examples of spatial heterogeneity influencing population differentiation include varying local subpopulation size (Serrouya et al. 2012) as well as fragmented landscapes in urban and agricultural areas where there are “corridors” for gene flow (Arnaud 2003; Munshi-South 2012). Barriers to gene flow, which can be caused by anthropogenic or geographic factors, are also emblematic examples of spatial heterogeneity influencing population structure (e.g., Castella et al. 2000; Epps et al. 2005; Riley et al. 2006; Gauffre et al. 2008; Zalewski et al. 2009). Because the identification of barriers to gene flow has attracted considerable attention (Storfer et al. 2010), there is a large variety of statistical methods to detect them (Barbujani et al. 1989; Bocquet-Appel and Bacro 1994; Dupanloup et al. 2002; Manni et al. 2004; Cercueil et al. 2007; Crida and Manel 2007; Manel et al. 2007; Safner et al. 2011). Here, we propose a more general method that characterizes *nonstationary* patterns of IBD. A nonstationary pattern of IBD occurs when the rate at which differentiation between individuals or populations accumulates with distance depends on space.
Nonstationary patterns of IBD arise, for instance, when there is a barrier to gene flow, because genetic differentiation accumulates more rapidly with distance around the barrier, but they can also occur in other situations such as continuous variations of gene flow across the species range.

To characterize nonstationary patterns of IBD, our approach provides a measure of local differentiation at each location where genetic data are available. The principle of the method is to estimate, for each sampled location, a local pairwise measure of population differentiation or of dissimilarity between the population at that location and fictive neighboring populations located at a fixed distance *d* from it (see Fig. 1). Considering for instance *F*_{ST} as a pairwise measure of genetic differentiation, the method provides estimates of *F*_{ST} for pairs of populations separated by a distance *d* and located in the vicinity of the sampling sites. The distance *d* has to be set in advance and should be small compared to the dimension of the region under study. Fictive neighboring populations are introduced as a means to provide measures of local genetic differentiation (here, between populations separated by a distance *d*) that are comparable between sampling sites. Compared to common tests for IBD (Hardy and Vekemans 1999), the method is more informative because it quantifies how local genetic differentiation varies across space; the rate at which genetic differentiation increases with distance may vary across space and the proposed approach provides a quantitative assessment of this variation. To determine whether the variation of local differentiation is sufficiently large to reject stationary IBD, we additionally provide a hypothesis-testing procedure based on simulations. The method is not restricted to pairwise population measurements and can also accommodate individual pairwise measures. Working at the scale of individuals is a desirable feature because using individuals as the operational unit avoids potential bias in identifying populations in advance and offers the opportunity to conduct studies at a finer scale (Manel et al. 2003, 2007).
Using a detailed simulation study, we demonstrate that the method can correctly infer local variation of genetic differentiation and we present applications to human single nucleotide polymorphism (SNP) data (Humphreys et al. 2011) and amplified fragment length polymorphism (AFLP) from alpine plants (Gugerli et al. 2008).

## Methods

For the sake of the presentation, we assume that the data consist of allele frequencies in each population and that the method relies on the empirical correlation matrix between populations. In the Results section, we show that the proposed approach is also appropriate with other pairwise matrices such as *F*_{ST} matrices between populations or correlation matrices between individuals.

To assess local genetic differentiation around a given sampled site, we estimate the correlation of allele frequencies between the sampled population and fictive populations located in the neighborhood of the sampling sites. Neighboring populations are located at a fixed and short distance from the sampled populations, and we measure the expected local correlation (averaged over neighbors) of allele frequencies between the sampled population and the neighboring fictive populations (Fig. 1). Because we aim at providing local genetic differentiation values that should be larger in regions of abrupt genetic changes, we consider one minus the local correlation as a measure of local differentiation.

We estimate local correlation using a Gaussian process approach (Bishop 2006), which is known as *kriging* in geostatistics (Cressie 1993). Kriging refers to a set of interpolation methods where a variable of interest is estimated at unsampled locations based on values measured at the sampling sites. Interpolation relies on a weighted average of the values measured at the sampling sites and the weights depend on a parametric function *C*, which describes how the correlation or the covariance decreases with distance (Cressie 1993). A direct application of kriging would consist of interpolating the allele frequencies at the neighboring sites based on the allele frequencies estimated at the sampled sites. However, the proposed approach is nonstandard and requires methodological developments because we aim instead to estimate the correlation matrix between sampled and unsampled neighboring sites based on the correlation matrix between sampled sites. There is a vast literature on kriging procedures with nonstationary covariance, in which the function *C* describing the decay of correlation with distance varies in space (Nott and Dunsmuir 2002; Schmidt and O'Hagan 2003; Paciorek and Schervish 2006). The covariance between sampled and unsampled sites is usually estimated using a parametric model (Paciorek and Schervish 2006) or at least using a given functional model for the covariance function (Schmidt and O'Hagan 2003). However, compared to geostatistics, where only one or a few variables are observed at the sampling sites, we are in a favorable situation in population genetics to estimate how the covariance or the correlation varies across space. Because each locus is a statistical replicate, there is enough information to estimate the correlation between the sampled sites using, for instance, the empirical correlation matrix.
Estimating local correlation amounts to interpolating the correlation between sampled and neighboring sites from the correlation matrix between sampled sites. We explain below how we perform the interpolation step.

### THE KRIGING/GAUSSIAN PROCESS APPROACH

In the following, we denote by **X** and **Y** the vectors of allele frequencies at sampled and unsampled sites. We assume independence between loci and the vectors **X** and **Y** contain allele frequencies for an arbitrary locus. The objective of the kriging approach is to interpolate the covariance (or correlation) matrix between **X** and **Y** based on the empirical covariance matrix between sampled sites. The covariance matrix between **X** and **Y** is denoted $E[(\mathbf{Y}-\mathbf{m})(\mathbf{X}-\mathbf{m})^{T}]$, where **m** is a constant mean over the range. The main principle is to use weighted means of covariance values between sampled sites to estimate the covariance between sampled and unsampled sites. As usual in kriging, the weights depend on a parametric function *C* that gives the decay of correlation with distance. We explain below how we compute these weights.

The Gaussian process viewpoint of kriging is to model the joint values of the variable at sampled and unsampled sites as a multivariate Gaussian variable (Bishop 2006)

$$(\mathbf{X}, \mathbf{Y})^{T} \sim \mathcal{N}(\mathbf{m}, \Sigma), \qquad (1)$$

where

$$\Sigma = \begin{pmatrix} C_{XX} & C_{XY} \\ C_{YX} & C_{YY} \end{pmatrix},$$

where $C_{XX}$ (respectively, $C_{YY}$) denotes the covariance matrix between the sampled sites (respectively, unsampled sites) and $C_{XY}$ contains the covariances between the sampled and unsampled sites. The interpolation of the unknown allele frequencies **Y** is obtained using the conditional distribution of **Y** given **X**, which can be written in the following regression form

$$\mathbf{Y} = \mathbf{m} + C_{YX} C_{XX}^{-1} (\mathbf{X} - \mathbf{m}) + \boldsymbol{\varepsilon}, \qquad (2)$$

where $C_{YX} = C_{XY}^{T}$ and **ε** is a residual independent of **X** (Brown et al. 1994). A naive computation of local covariance would consist of simulating with equation (2) the vector **Y** containing the allele frequencies at the neighboring sites and then evaluating numerically the empirical covariance between allele frequencies at sampled and at neighboring sites. Although this is a valid approach, we can derive the expected covariance between sampled and neighboring sites directly from equation (2). Because **ε** is independent of **X**, we obtain

$$E[(\mathbf{Y}-\mathbf{m})(\mathbf{X}-\mathbf{m})^{T}] = C_{YX} C_{XX}^{-1} \, E[(\mathbf{X}-\mathbf{m})(\mathbf{X}-\mathbf{m})^{T}]. \qquad (3)$$

In the computations, we replace **m** by the empirical mean so that we estimate the covariance matrix with $C_{YX} C_{XX}^{-1} \hat{\Sigma}_{X}$, where $\hat{\Sigma}_{X}$ denotes the empirical covariance matrix of **X**. The matrix $C_{YX} C_{XX}^{-1}$ provides the weights of the weighted means, which are used to interpolate the covariance values between sampled and unsampled sites based on the covariance values between sampled sites.

More generally, we can estimate local similarities by multiplying the weight matrix with a similarity matrix between sampled sites. In the Results section, we consider similarity matrices that are not correlation or covariance matrices; for instance, we use the pairwise matrix of *F*_{ST} values. When using individuals as operational units, numerical problems can arise if there are multiple individuals per site because the covariance matrix between sampled sites can be difficult to invert. Potential solutions are to consider a population (with one or more individuals) at each sampling site or to add a small perturbation to the geographical coordinates of the individuals.
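The weighting step described above can be sketched in Python. This is a minimal illustration under stated assumptions rather than the implementation used for the analyses: the helper names (`pairwise_dist`, `kriging_weights`, `interpolate_similarity`) are hypothetical, the correlation function `corr` is a plain exponential chosen only for the example, the allele frequencies are synthetic, and both the standardization of Appendix A and the averaging over posterior replicates are omitted.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a (n x k) and the rows of b (m x k)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def kriging_weights(sampled, unsampled, corr):
    """Weight matrix C_YX C_XX^{-1} for a correlation function corr(distance)."""
    c_xx = corr(pairwise_dist(sampled, sampled))      # sampled x sampled
    c_yx = corr(pairwise_dist(unsampled, sampled))    # unsampled x sampled
    # C_XX is symmetric, so solving C_XX W^T = C_YX^T gives W = C_YX C_XX^{-1}
    # without forming the inverse explicitly.
    return np.linalg.solve(c_xx, c_yx.T).T

def interpolate_similarity(weights, similarity):
    """Interpolated similarity between unsampled (rows) and sampled (columns) sites."""
    return weights @ similarity

# Toy example: five sampled sites on a line and two fictive neighbours of site 2.
rng = np.random.default_rng(1)
sampled = np.arange(5, dtype=float)[:, None]
neighbours = np.array([[1.5], [2.5]])
corr = lambda d: np.exp(-d / 2.0)                 # assumed correlogram, illustration only

freqs = rng.uniform(0.05, 0.95, size=(200, 5))    # 200 loci x 5 populations (synthetic)
similarity = np.corrcoef(freqs, rowvar=False)     # empirical correlation between sampled sites

weights = kriging_weights(sampled, neighbours, corr)    # shape (2, 5)
local = interpolate_similarity(weights, similarity)     # shape (2, 5)
```

When the unsampled sites coincide with the sampled ones, the weight matrix reduces to the identity and the interpolated matrix equals the input similarity matrix, which provides a quick sanity check.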

Providing the correlation instead of the covariance between sampled and unsampled sites requires standardizing the covariance of equation (3); the renormalization formula is provided in Appendix A. The final estimate of the covariance matrix is obtained by averaging equation (3) over posterior replicates of the weight matrix. The parametric model for the correlogram *C*, which is needed to generate the posterior distribution of the weight matrix, is given below.

### A MODEL FOR THE CORRELOGRAM

To compute the weight matrix, we consider the standard model of *stationary* kriging, which assumes that the correlation between two points depends only on the distance between them. Under this assumption, we need to model how the correlation decreases with increasing distance. We assume that this function *C*, called the correlogram, decays exponentially

where *d* is the distance between two points, 1 denotes the indicator function, α determines the sill, which measures the limiting value of the correlation function, *r* is the *range* parameter, and λ is the *regularization* parameter. The parameter λ is introduced for numerical reasons because it ensures that the correlation matrix between sampled sites is invertible, which is required for the computation of the weight matrix (Bishop 2006). The range parameter *r* is inversely related to the rate at which correlation decays with distance. Denoting by *d*_{ij} the geographical distance between the *i*th and *j*th sites, the entry of this matrix at the *i*th row and *j*th column is given by *C*(*d*_{ij}). We sample the triplet (α, *r*, λ) from the posterior distribution using an MCMC algorithm that contains both Gibbs and Metropolis–Hastings updating steps, and the details of the algorithm are provided in Appendix B (Handcock and Stein 1993).
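To make the roles of the three parameters concrete, the sketch below codes one exponential correlogram that is consistent with the verbal description. The functional form is an assumption made for illustration and should not be read as equation (4) itself: *C*(*d*) = (1 − λ)[α + (1 − α) exp(−*d*/*r*)] + λ 1{*d* = 0}. The function names `correlogram` and `correlation_matrix` and the parameter values are likewise hypothetical.

```python
import numpy as np

def correlogram(d, alpha, r, lam):
    """Assumed form: (1 - lam) * (alpha + (1 - alpha) * exp(-d / r)) + lam * 1[d == 0]."""
    d = np.asarray(d, dtype=float)
    smooth = alpha + (1.0 - alpha) * np.exp(-d / r)   # decays from 1 towards alpha
    return (1.0 - lam) * smooth + lam * (d == 0)      # lam acts on the diagonal only

def correlation_matrix(coords, alpha, r, lam):
    """Matrix whose (i, j) entry is C(d_ij) for the sites in coords (n x 2)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return correlogram(dist, alpha, r, lam)

# Four hypothetical sampling sites and arbitrary parameter values.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
c_xx = correlation_matrix(coords, alpha=0.2, r=1.5, lam=0.05)
```

Under this assumed form the diagonal entries equal 1, a larger *r* gives a slower decay, and every eigenvalue of the matrix is at least λ, which is one way in which a regularization parameter can guarantee invertibility.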

### HYPOTHESIS-TESTING PROCEDURE

We introduce two test statistics to test whether the variation of local genetic differentiation is significant. The first test statistic is the coefficient of variation of local genetic differentiation values, that is, the ratio between the standard deviation and the mean of the local differentiation measures. The second statistic is the distance correlation statistic, which measures the dependence between local genetic differentiation and geographical coordinates. The distance correlation statistic extends the Pearson correlation coefficient because it can measure nonlinear dependence (Székely et al. 2007). Because we use two test statistics, we consider the conservative Bonferroni correction and reject stationarity when one of the two observed values of the test statistics is larger than the 97.5% quantile obtained for the null distribution of stationarity.
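Both statistics are straightforward to compute from a vector of local differentiation values and the site coordinates. The sketch below is a minimal illustration with hypothetical function names; it uses the sample distance correlation built from double-centered distance matrices (Székely et al. 2007) and, as an assumption, the standard deviation with `ddof=1` in the coefficient of variation.

```python
import numpy as np

def coefficient_of_variation(values):
    """Ratio of the standard deviation (ddof=1, an assumption) to the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()

def _centered_distances(x):
    """Double-centered matrix of pairwise Euclidean distances between observations."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between x (n or n x p) and y (n or n x q)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                      # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))

# Nonlinear dependence: Pearson correlation vanishes, distance correlation does not.
x = np.linspace(-1.0, 1.0, 21)
y = x ** 2
pearson = np.corrcoef(x, y)[0, 1]
dcor = distance_correlation(x, y)
```

In the setting of the test, `x` would hold the geographical coordinates (one row per site) and `y` the local differentiation values.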

We consider two options for generating the distribution of the test statistics under the null hypothesis. In the first option, we consider the parametric model of equation (4). We compute *M* pairwise covariance matrices using the stationary correlogram of equation (4). The parameters of equation (4), which are used to compute the covariance matrices, are sampled according to the posterior distribution. The values of the test statistics are then obtained after running the MCMC algorithm (Appendix B) *M* times, once for each of the simulated covariance matrices. When the sample size is too large, we have to limit the computational burden of the procedure, and we do not perform *M* MCMC runs. Instead, for the *i*th covariance matrix, we use the *i*th sampled triplet (α, *r*, λ) to compute the weight matrix and to obtain values of local genetic differentiation. However, equation (4) is only an approximation of the correlation pattern found for IBD models. It is exact, for instance, in the one-dimensional stepping-stone model with infinite range (Kimura and Weiss 1964). To avoid the approximation of equation (4), we also consider explicit simulations of a stationary stepping-stone model using *ms* (Hudson 2002). We consider migration rates sampled uniformly, and we choose a sampling scheme that mimics the sampling of the data.
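Once the test statistics have been computed for the observed data and for the *M* data sets simulated under stationarity, the decision rule reduces to a comparison with Bonferroni-corrected quantiles. The following sketch illustrates that final step only; the function name `reject_stationarity` and the toy null values are hypothetical, and the simulation of the null data sets is not shown.

```python
import numpy as np

def reject_stationarity(observed, null_stats, level=0.05):
    """Bonferroni decision rule for k test statistics.

    observed   : length-k sequence of observed statistics
    null_stats : M x k array of statistics simulated under stationarity
    Returns (reject, thresholds), where thresholds are the (1 - level / k) quantiles.
    """
    observed = np.asarray(observed, dtype=float)
    null_stats = np.asarray(null_stats, dtype=float)
    k = observed.shape[0]
    thresholds = np.quantile(null_stats, 1.0 - level / k, axis=0)
    return bool(np.any(observed > thresholds)), thresholds

# Toy null distribution: M = 1000 replicates of two statistics.
null_stats = np.column_stack([np.arange(1000.0), np.arange(1000.0)])
reject_a, thresholds = reject_stationarity([980.0, 10.0], null_stats)
reject_b, _ = reject_stationarity([100.0, 100.0], null_stats)
```

With two statistics and an overall level of 5%, each threshold is the 97.5% quantile of the corresponding null distribution, as in the procedure described above.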

## Results

### SIMULATION STUDY

In the simulation study, we consider two different models for generating nonstationary patterns of IBD. First, we consider nonhomogeneous stepping-stone models in one and two dimensions. We simulate with *ms* 2000 independent SNPs using a spatially dependent effective migration rate 4*N*_{0}*m*, where *N*_{0} is the population size of each deme and *m* is the migration rate per generation between two neighboring demes. Because we assume independence between SNPs, each SNP is simulated with a coalescent simulation that is conditioned on having one segregating site. The second model is analytic and has been developed for performing nonstationary kriging when the correlogram function (equation (4)) is assumed to vary across space (Paciorek and Schervish 2006). The range parameter *r* of equation (4), which measures the rate at which correlation decays with distance, is assumed to be a function of space. For the second model, zones of abrupt changes such as genetic barriers correspond to regions with a smaller range parameter because correlation decays more rapidly with distance in these regions.

#### Barrier in a one-dimensional model

We investigate an example of a one-dimensional model with a genetic barrier. We simulate a stepping stone model with 100 populations of effective sizes diploid individuals. Depending on the simulations, we sample either 20 equidistant populations or 20 uniformly sampled populations. We consider 20 chromosomes in each of the population. Migrations are constant between neighboring populations and we consider and . The barrier is located between populations 50 and 51 and arose 8 units of time ago () where time is counted in units of 4*N*_{0} generations. As similarity matrix, we consider the pairwise correlation of allele frequencies for 20 sampled populations. For each sampled deme, local genetic differentiation corresponds to one minus the expected correlation between the sampled deme and its two neighbors.

With equidistant sampling, we find that the parameters of the correlogram function (equation (4)) affect the estimated values of local genetic differentiation (Fig. S1). However, for all values of the correlogram parameters we consider, local differentiation is larger in the middle of the range, which is consistent with the presence of a barrier to gene flow. Nonetheless the detailed trajectory of local differentiation as a function of space depends on the correlogram parameters and edge effects can be large for some parameter values (Fig. S1). To account for the uncertainty associated with the parameters of the correlogram function, we integrate the values of local genetic differentiation over the posterior distribution of (Fig. S1). To investigate if a barrier to gene flow is still detectable with irregular sampling, we also sampled randomly 20 populations among the 100 populations. For both intensities of barrier ( or except at the barrier where ) and for each replicate of population sampling, local differentiation is larger around the barrier to gene flow (Fig. 2). However, for the less stringent and more difficult to detect barrier, local differentiation increases less markedly around the barrier when sampling in the vicinity of the barrier is sparse (Fig. 2).

## Isolation-by-distance in landscapes: considerations for landscape genetics

M J van Strien,^{1,2,*} R Holderegger,^{2,3} and H J Van Heck^{4,5}

^{1}Planning of Landscape and Urban Systems, ETH Zurich, Stefano-Franscini-Platz 5, Zurich, Switzerland

^{2}WSL Swiss Federal Research Institute, Zürcherstrasse 111, Birmensdorf, Switzerland

^{3}Department of Environmental System Sciences, ETH Zurich, Universitätsstrasse 16, Zurich, Switzerland

^{4}Earth and Ocean Sciences, Cardiff University, Main Building, Park Place, Cardiff, UK

^{5}Institute of Earth Sciences, Utrecht University, Budapestlaan 4, Utrecht, The Netherlands

^{*}Planning of Landscape and Urban Systems, ETH Zurich, Stefano-Franscini-Platz 5, IRL—HIL H 42.3, Zurich, ZH CH-8093, Switzerland. E-mail: ln.neirtsv@netraam


Received 2013 Oct 15; Revised 2014 May 13; Accepted 2014 May 27.

Copyright © 2015 The Genetics Society

Heredity (Edinb). 2015 Jan; 114(1): 27–37.

Published online 2014 Jul 23. doi: 10.1038/hdy.2014.62


## Abstract

In landscape genetics, isolation-by-distance (IBD) is regarded as a baseline pattern that is obtained without additional effects of landscape elements on gene flow. However, the configuration of suitable habitat patches determines deme topology, which in turn should affect rates of gene flow. IBD patterns can be characterized either by monotonically increasing pairwise genetic differentiation (for example, F_{ST}) with increasing interdeme geographic distance (case-I pattern) or by monotonically increasing pairwise genetic differentiation up to a certain geographical distance beyond which no correlation is detectable anymore (case-IV pattern). We investigated if landscape configuration influenced the rate at which a case-IV pattern changed to a case-I pattern. We also determined at what interdeme distance the highest correlation was measured between genetic differentiation and geographic distance and whether this distance corresponded to the maximum migration distance. We set up a population genetic simulation study and assessed the development of IBD patterns for several habitat configurations and maximum migration distances. We show that the rate and likelihood of the transition of case-IV to case-I F_{ST}–distance relationships was strongly influenced by habitat configuration and maximum migration distance. We also found that the maximum correlation between genetic differentiation and geographic distance was not related to the maximum migration distance and was measured across all deme pairs in a case-I pattern and, for a case-IV pattern, at the distance where the F_{ST}–distance curve flattens out. We argue that in landscape genetics, separate analyses should be performed to either assess IBD or the landscape effects on gene flow.

## Introduction

Ever since Wright (1943) described isolation-by-distance (IBD), patterns of spatial genetic structure have been extensively studied in population genetic simulation models (Epperson, 2003; Epperson *et al.*, 2010) and in natural populations (Crispo and Hendry, 2005; Jenkins *et al.*, 2010; Storfer *et al.*, 2010). In most of these studies, migration probability is a function of geographic straight-line distance. Recently, landscape genetic studies have incorporated more complex landscape measures than straight-line distance aiming to give a more realistic estimate of the effective distance between demes (Holderegger and Wagner, 2006). In general, the variation in estimates of pairwise genetic distances explained by these effective distances is compared with that explained by geographic distances alone (that is, IBD). The latter is regarded as the most simple landscape genetic pattern that would be obtained even if there were no landscape effects and migration was thus only constrained by distance between demes (Spear *et al.*, 2005; Balkenhol *et al.*, 2009; Jenkins *et al.*, 2010). This notion may have originated from spatially explicit simulation studies of IBD patterns, in which demes or individuals are usually placed in regular lattices throughout homogeneous spaces (Guillot *et al.*, 2009; Epperson *et al.*, 2010). Indeed, distance-constrained migration in such models produces IBD patterns that are not influenced by any landscape elements. However, a heterogeneous landscape will not only affect migration probabilities between demes, but also the spatial arrangement of demes (that is, deme topology). Only few studies have examined the effect of deme topology on patterns of IBD. Doligez *et al.* (1998) concluded that strong spatial clumping of individuals leads to slight increases in spatial genetic autocorrelation, and Ezard and Travis (2006) found fixation time to be greater in long and narrow habitats. 
These simulation studies thus suggest that the spatial arrangement of habitat patches has an influence on genetic patterns in general and on IBD patterns in particular. Although Robledo-Arnuncio and Rousset (2010) found that effective population density (*D*_{e}) and effective dispersal rate ‘depend in a complex way on the spatial and temporal demographic heterogeneities of the population,’ in equilibrium situations, their product was still related to the slope of the correlation between genetic distance (that is, F_{ST}/(1−F_{ST})) and log-transformed geographic distance between demes or individuals. However, landscape geneticists usually do not know whether the genetic data sampled from natural populations reflect an equilibrium or non-equilibrium state. In the present study, we therefore assess to what degree habitat configuration influences equilibrium and non-equilibrium IBD patterns. We abandon the regular lattice setup of demes or individuals, as commonly used in population genetic simulation studies (Doligez *et al.*, 1998; Epperson *et al.*, 2010), and instead use irregular habitat configurations and deme topologies, which better reflect the natural study systems typically used in landscape genetics.
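The slope referred to above can be estimated by an ordinary least-squares regression of F_{ST}/(1−F_{ST}) on log-transformed distance over all deme pairs. The sketch below is an illustration only: the function name `ibd_slope` is hypothetical, the data are synthetic and noise-free, and no significance test (for example, a Mantel test) is included.

```python
import numpy as np

def ibd_slope(fst, dist):
    """Regress F_ST / (1 - F_ST) on ln(distance) using each deme pair once."""
    fst = np.asarray(fst, dtype=float)
    dist = np.asarray(dist, dtype=float)
    iu = np.triu_indices_from(fst, k=1)       # upper triangle, diagonal excluded
    y = fst[iu] / (1.0 - fst[iu])
    x = np.log(dist[iu])
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Synthetic check: 16 demes on a 4 x 4 grid with a known intercept and slope.
xs, ys = np.meshgrid(np.arange(4), np.arange(4))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
dist = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
n = dist.shape[0]
t = 0.05 + 0.02 * np.log(dist + np.eye(n))    # np.eye avoids log(0) on the diagonal
fst = t / (1.0 + t)                           # invert y = F_ST / (1 - F_ST)
np.fill_diagonal(fst, 0.0)

slope, intercept = ibd_slope(fst, dist)
```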

Hutchison and Templeton (1999) described hypothetical and empirical IBD patterns and showed that F_{ST}–distance relationships are not always of ‘case-I’ type, which is characterized by monotonically increasing pairwise F_{ST} values with increasing interdeme distance due to an equilibrium of gene flow and random genetic drift (Figure 1; Hutchison and Templeton, 1999). Namely, beyond a certain interdeme distance, gene flow (including indirect gene flow over several generations) can be so limited compared with genetic drift or mutation that there is no significant F_{ST}–distance slope anymore. The corresponding ‘case-IV’ type F_{ST}–distance relationship is characterized by monotonically increasing F_{ST} values up to a certain distance, beyond which the plot flattens out and F_{ST} values cease to increase (Figure 1; Hutchison and Templeton, 1999). Because Hutchison and Templeton's example plots and description of potential F_{ST}–distance relationships are simple and intuitive references, we adopt their ‘case-I’ and ‘case-IV’ terminology in the present study (Figure 1). The case-IV situation described by Hutchison and Templeton (1999) is a transitional state between a situation of panmixia and a case-I situation, which implies that case-IV patterns represent non-equilibrium states, whereas case-I patterns refer to equilibrium states. Such non-equilibrium case-IV situations should not be confused with equilibrium states of F_{ST}–distance relationships that exhibit an asymptotic curve, which can resemble a case-IV IBD pattern (Rousset, 1997). Hutchison and Templeton (1999) explained regional differences in IBD patterns with time-since-colonization and the (historical) presence of barriers (that is, forests). Their results have, however, not been discussed in the light of differences in habitat configuration between study regions.
Therefore, the first goal of the present study is to determine if habitat configuration has an effect on the rate at which IBD patterns change from case-IV to case-I.

Figure 1

Two hypothetical F_{ST}–distance relationships modified from Hutchison and Templeton (1999). The case-I relationship (left) is characterized by a monotonically increasing F_{ST} and scatter with interdeme geographic distance. The same can be observed**...**

Most landscape genetic studies measure the level of IBD by quantifying the linear correlation between genetic (for example, F_{ST}) and geographic distances from all deme pairs in a data set (Crispo and Hendry, 2005; Balkenhol *et al.*, 2009; Jenkins *et al.*, 2010). However, to highlight dispersal barriers due to landscape elements, several authors have recently argued that landscape genetic analyses should be restricted to only those pairs of demes between which direct gene flow is possible (Angelone *et al.*, 2011; Keller *et al.*, 2013). As most landscape genetic studies make use of historical gene flow measures (for example, F_{ST}; Jenkins *et al.*, 2010; Storfer *et al.*, 2010), gene flow is not only measured between deme pairs that are within each other's maximum migration distance, but also between demes beyond this distance that experience indirect gene flow (over several generations) via intermediate demes. However, the inhibiting or facilitating effect on movement of certain landscape elements can only be detected between those demes that potentially exchange migrants (that is, deme pairs separated by a distance lower than the maximum migration distance). If there is no possibility for demes to exchange direct migrants (that is, their interdeme geographical distance is larger than the maximum migration distance), the lack of gene flow is due to physical limitations of the focal species and not necessarily caused by any landscape effect on migration. For landscape genetics, it is thus important to differentiate between these two types of limitations. Indeed, in a landscape genetic analysis on a grasshopper species Keller *et al.* (2013) found that, compared with considering all deme pairs, a much higher model fit and a better distinguishability of the most likely migration habitat was obtained when only deme pairs separated up to 3 km were considered. 
This distance threshold corresponded to the distance at which the F_{ST}–distance plot flattened out in a case-IV IBD pattern (Keller *et al.*, 2013), but it also corresponded to the estimated maximum migration distance of this grasshopper (van Strien *et al.*, 2014). Theoretical population genetic studies have shown that the distance at which an F_{ST}–distance plot flattens out can be larger than the maximum migration distance in equilibrium (that is, asymptotic case-I F_{ST}–distance relationship) and non-equilibrium situations (Slatkin, 1993). However, for the analysis of IBD in landscape genetics, it has yet to be determined at what interdeme distance the highest correlation between F_{ST} and interdeme geographic distance can be measured and whether this distance is determined by the maximum migration distance of a given species. If the highest F_{ST}–distance correlation is measured across all deme pairs separated by distances up to the maximum migration distance, then the distance threshold of an IBD analysis corresponds to the recommended threshold for the detection of landscape effects on gene flow (that is, maximum migration distance). However, if the highest correlation value is measured, for a case-IV pattern, at the distance where the F_{ST}–distance curve flattens out and, for a case-I pattern, across all deme pairs, there is a discrepancy between the two thresholds. The second goal of this study is thus to determine at what interdeme distance the highest F_{ST}–distance correlation is measured.
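The threshold analysis described above amounts to recomputing the F_{ST}–distance correlation on the subset of deme pairs separated by no more than a given distance. The following sketch illustrates this with a hypothetical function `correlation_by_threshold` and a synthetic, noise-free case-IV-like pattern (F_{ST} rising linearly up to 10 distance units and constant beyond); it is not the post-processing code of the present study.

```python
import numpy as np

def correlation_by_threshold(fst, dist, thresholds):
    """Pearson correlation between F_ST and distance for pairs with dist <= threshold."""
    fst = np.asarray(fst, dtype=float)
    dist = np.asarray(dist, dtype=float)
    iu = np.triu_indices_from(fst, k=1)       # each deme pair once
    f, d = fst[iu], dist[iu]
    out = []
    for t in thresholds:
        keep = d <= t
        # The correlation is undefined with fewer than 3 pairs or no variation.
        if keep.sum() < 3 or d[keep].max() == d[keep].min() or f[keep].max() == f[keep].min():
            out.append(np.nan)
        else:
            out.append(np.corrcoef(d[keep], f[keep])[0, 1])
    return np.array(out)

# 41 demes on a line; F_ST increases with distance up to 10 units, then flattens out.
pos = np.arange(41, dtype=float)
dist = np.abs(pos[:, None] - pos[None, :])
fst = np.minimum(dist, 10.0) / 100.0
thresholds = [5, 10, 20, 40]
r = correlation_by_threshold(fst, dist, thresholds)
```

In this toy pattern the correlation is perfect for thresholds up to the distance at which the curve flattens out and drops once pairs from the flat part of the curve are included.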

To address the above two study goals, we set up an agent-based population genetic simulation model. For various habitat configurations and maximum migration distances (MMD), we examined the development of IBD patterns, determined whether these patterns could best be described as case-I or case-IV F_{ST}–distance relationships and at what interdeme distance the highest F_{ST}–distance correlation was measured. In contrast to other population genetic simulation studies that have focussed on IBD patterns from a theoretical point of view (Epperson, 2003; Epperson *et al.*, 2010), we specifically designed a simulation model that accommodated current landscape genetic practice (Balkenhol *et al.*, 2009; Jenkins *et al.*, 2010; Storfer *et al.*, 2010), so that our findings can easily be integrated into future landscape genetic studies. For instance, we do not assess IBD patterns with Moran's *I* correlograms, as often done in population genetics (Epperson, 2003), but use linear correlations between F_{ST} and geographic distance instead, as is done in landscape genetics (Jenkins *et al.*, 2010). Also, the regular lattice setup of landscapes is abandoned, and replaced with habitat configurations that better reflect natural landscapes.

## Materials and Methods

We developed a stochastic agent-based numerical model to simulate genetic differentiation between demes that were placed in the habitat cells of two-dimensional landscape grids. Diploid individuals (agents) were allowed to migrate between demes. Migration probabilities between demes were drawn from Gaussian or exponential migration functions (see below). After a certain number of non-overlapping migration–reproduction cycles (that is, generations), we extracted matrices of pairwise F_{ST} and geographic distances, which were post processed to determine IBD patterns. Genetic patterns in population genetic studies have traditionally been simulated with either discrete demes (that is, stepping-stone models; Kimura and Weiss, 1964) or a more or less continuous distribution of individuals (Wright, 1943; Guillot *et al.*, 2009; Epperson *et al.*, 2010). We chose to structure our individuals in discrete demes, because in more than 80% of landscape genetic studies, individuals were sampled from demes and, subsequently, genetic differentiation was determined between demes (Storfer *et al.*, 2010). We anticipate that this indicates that most landscape genetic studies focus on species that occur in discrete demes. Mutation was not considered as a source of genetic variation. The numerical code, named Concordia, was written in MATLAB (The MathWorks, Natick, MA, USA) and is available online (Dryad data repository).

To create different configurations of habitats, we generated neutral landscapes, which are frequently used in landscape ecology to test hypotheses about habitat configuration and fragmentation on ecological processes (Gardner and Urban, 2007). Neutral landscapes have also been used in landscape genetics (Ezard and Travis, 2006). With the programme QRULE 4.2 (Gardner and Urban, 2007), we generated five binary habitat-matrix landscapes of 128 × 128 cells, of which 50% were classified as habitat and 50% as matrix (±max. 0.51%; Figure 2). This resulted in landscapes with ∼8192 habitat cells. The level of habitat fragmentation (measured with a spatial autocorrelation parameter *H*) was the same in all landscapes (*H*=0.5; ‘next nearest neighbourhood rule’ setting in QRULE). The five neutral landscapes were finite; that is, they had edges. Distance units in the present study were expressed as number of cells on the landscape grid.

Figure 2

The five neutral landscapes that were used in the present population genetic simulation study. Simulated demes were randomly placed in habitat cells. As an example, we show a random topology of 100 demes in landscape A (black cells represent demes).

At the beginning of each simulation, we randomly placed 100 equally sized demes in habitat cells of a neutral landscape. The total density of demes within habitat and within the total landscape was equal in all simulations. The locations of demes remained constant during the course of a simulation. We chose to randomly place demes within the habitat, because we wanted to determine if certain habitat configurations consistently produce a certain type of IBD, regardless of random deme topology. Furthermore, in natural circumstances, species may be bound to a certain habitat type (for example, forests, grasslands, wetlands), but within these habitats exhibit a non-regular deme topology due to, for instance, microclimatic heterogeneity (Corney *et al.*, 2004), competition (Meineri *et al.*, 2012) or stochastic processes (Hubbell, 2001). Each deme consisted of 50 diploid individuals, which were characterized by their genotypes at 10 neutral bi-allelic loci. At the beginning of a simulation, genotypes were defined by randomly allocating 2 out of 10 alleles to each individual's loci, simulating an initial state of panmixia.
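The initialization described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Concordia code: the function names, the toy 8 × 8 landscape and the seed are hypothetical, and we assume that the two allele copies at each locus are drawn independently and uniformly from the 10 allelic states (the text does not specify whether sampling was with or without replacement).

```python
import random

def place_demes(habitat, n_demes, rng):
    """Pick n_demes distinct habitat cells (row, col) uniformly at random."""
    cells = [(r, c) for r, row in enumerate(habitat)
             for c, is_habitat in enumerate(row) if is_habitat]
    return rng.sample(cells, n_demes)

def init_deme(n_individuals, n_loci, n_alleles, rng):
    """Each individual is a list of loci; each locus holds two allele copies
    drawn independently from n_alleles states (assumption: with replacement)."""
    return [[(rng.randrange(n_alleles), rng.randrange(n_alleles))
             for _ in range(n_loci)]
            for _ in range(n_individuals)]

rng = random.Random(1)
# Toy landscape (hypothetical): left half habitat, right half matrix.
# The study itself used 128 x 128 landscapes generated with QRULE.
habitat = [[c < 4 for c in range(8)] for r in range(8)]
deme_cells = place_demes(habitat, n_demes=5, rng=rng)
# 50 diploid individuals per deme, 10 loci, 10 allelic states, as in the text.
demes = [init_deme(50, 10, 10, rng) for _ in deme_cells]
```

Because every deme draws from the same uniform allele pool, expected allele frequencies are identical across demes at the start, which corresponds to the initial state of panmixia.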

The first step in each migration–reproduction cycle was the migration of individuals from their natal deme to other demes in the landscape. For a broad range of species, mark-recapture or trapping studies have shown that a large proportion of individuals is sedentary (or philopatric) and will not disperse far from their location of birth, whereas a smaller group of individuals is rather vagile and migrates much further from the natal location (Paradis *et al.*, 1998; Chapman *et al.*, 2011). We simulated such migration behaviour by selecting, at the beginning of each migration–reproduction cycle, the individuals that would leave their natal deme. Each individual had a probability of 0.2 of being selected, so that, on average, 20% of deme members emigrated from their natal deme. The remaining ∼80% were sedentary individuals, which remained in their natal deme and became, together with new immigrants, the parents of the next generation. Each migrating individual had a certain probability to reach other demes as dictated by a probability density function, which determined the migration probability over a certain geographic distance. As we implemented no difference between migration probabilities through habitat or matrix, migration between demes was purely a function of geographic distance. The sum of the probabilities of migration from a natal deme to other demes could be lower than 1, meaning that some migrants never reached a new deme. These migrants were removed from the simulation before reproduction.
Note that by applying this two-phase approach of first selecting migrants and only then selecting their destinations, we ensured both a statistically stable emigration rate for all demes and an immigration probability that was only dictated by a distance-dependent migration function. This is in accordance with the metapopulation dynamics theory described by Hanski (1998), who stated that ‘because it is reasonable to assume that mortality within the habitat patches does not depend on isolation, unlike mortality during migration, one can tease apart, at least in principle, the two kinds of mortality’. For every random deme topology, we also checked that the sum of the probabilities of migration from a natal deme to other demes never exceeded 1.
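The two-phase migration step can be illustrated with the following Python sketch. It is not the original MATLAB implementation: the function names, the toy coordinates and the choice to track only counts of individuals (rather than their genotypes) are assumptions made for brevity. A one-dimensional Gaussian density is used here as the migration function.

```python
import math
import random

def gaussian_pdf(distance, variance):
    """One-dimensional Gaussian density evaluated at a given distance."""
    return math.exp(-distance ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def migrate(deme_sizes, coords, variance, p_emigrate, rng):
    """Return (stayers, immigrants, lost) for one migration step.
    stayers and immigrants are per-deme counts; lost is a single total."""
    n = len(coords)
    stayers, immigrants, lost = [0] * n, [0] * n, 0
    for i in range(n):
        # Destination probabilities depend only on distance from the natal deme.
        probs = [0.0 if j == i else gaussian_pdf(math.dist(coords[i], coords[j]), variance)
                 for j in range(n)]
        if sum(probs) > 1.0:
            raise ValueError("destination probabilities must not sum to more than 1")
        for _ in range(deme_sizes[i]):
            # Phase 1: decide whether this individual emigrates.
            if rng.random() >= p_emigrate:
                stayers[i] += 1
                continue
            # Phase 2: pick a destination; the leftover probability mass
            # corresponds to migrants that never reach a new deme.
            u, cumulative, dest = rng.random(), 0.0, None
            for j, p in enumerate(probs):
                cumulative += p
                if u < cumulative:
                    dest = j
                    break
            if dest is None:
                lost += 1
            else:
                immigrants[dest] += 1
    return stayers, immigrants, lost

rng = random.Random(42)
coords = [(0, 0), (3, 4), (10, 0), (40, 40)]  # hypothetical deme locations
stayers, immigrants, lost = migrate([50] * 4, coords, variance=15,
                                    p_emigrate=0.2, rng=rng)
```

Separating the two phases keeps the emigration rate independent of deme isolation, while the number of immigrants a deme receives depends only on its distances to the other demes.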

Two types of probability density functions (pdf) were used to simulate the migration probability over a certain distance between demes. We used exponential and Gaussian migration functions. Two-dimensional pdfs are typically used when migration is simulated in two-dimensional landscapes (Austerlitz *et al.*, 2004). However, preliminary tests of our simulations showed that gene flow with two-dimensional pdfs was generally too low for IBD to emerge. Therefore, we chose to use one-dimensional pdfs, which resulted in a range of immigration rates per generation (see Results) that were comparable to the range typically observed in natural populations (that is, ∼0–10%; Bowne and Bowers, 2004). For the sake of clarity, we characterize the different probability functions by a maximum migration distance. We define maximum migration distance as the distance at which the probability of migration equalled 0.0001. Thus, we did not set an absolute maximum migration distance, but a distance above which migration is unlikely. This is analogous to natural situations where there is no absolute maximum migration distance and where it is probable that there is variability in the movement capabilities of individuals. For the Gaussian pdf, we experimented with variances of 15, 40, 110 and 200, resulting in MMD of 14.4, 22.7, 36.1 and 47.4 distance units. For the exponential pdf, we experimented with *μ*-values of 2, 3, 4 and 5, which resulted in MMD of 17.0, 24.3, 31.2 and 38.0 distance units.
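As a worked check of the MMD definition, the distance at which a one-dimensional density drops to 0.0001 can be solved in closed form. The sketch below assumes the standard parameterizations (a zero-centred Gaussian with the stated variance, and an exponential density with mean *μ*), which the text does not spell out. Under these assumptions the computed distances agree with the reported MMD values to within about 0.1 distance units.

```python
import math

THRESHOLD = 1e-4  # probability density that defines the MMD in the text

def mmd_gaussian(variance, threshold=THRESHOLD):
    """Distance x at which exp(-x^2 / (2 var)) / sqrt(2 pi var) equals threshold."""
    return math.sqrt(-2 * variance * math.log(threshold * math.sqrt(2 * math.pi * variance)))

def mmd_exponential(mu, threshold=THRESHOLD):
    """Distance x at which exp(-x / mu) / mu equals threshold."""
    return -mu * math.log(threshold * mu)

# Parameter values taken from the text.
gaussian_mmd = {v: round(mmd_gaussian(v), 1) for v in (15, 40, 110, 200)}
exponential_mmd = {m: round(mmd_exponential(m), 1) for m in (2, 3, 4, 5)}
print(gaussian_mmd)
print(exponential_mmd)
```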

After the migration step, random mating took place within each deme. Within demes, each individual was randomly paired with another individual to form mating pairs that produced 10 diploid offspring. In case of an odd number of deme members (resulting from unequal numbers of emigrants and immigrants), one random individual did not mate. Mating was thus sexual, in the sense that it was always between two individuals, resulting in Mendelian inheritance. From all offspring in a deme, 50 individuals were randomly drawn to reach maturity and form the next generation of parents, thereby maintaining a constant deme size at the beginning of each cycle. At this point, the end of a migration–reproduction cycle was reached and the next cycle started. Each simulation was stopped after 500 such cycles.
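The reproduction step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: individuals are represented as lists of two-allele tuples, loci are assumed to segregate independently, and the function name and seed are hypothetical.

```python
import random

def reproduce(parents, offspring_per_pair, deme_size, rng):
    """Random pairing, Mendelian inheritance, then random survival to deme_size."""
    shuffled = parents[:]
    rng.shuffle(shuffled)
    offspring = []
    # With an odd number of parents, zip leaves the last shuffled individual unmated.
    for mother, father in zip(shuffled[0::2], shuffled[1::2]):
        for _ in range(offspring_per_pair):
            # At each locus the child receives one randomly chosen copy from each parent.
            child = [(rng.choice(m_locus), rng.choice(f_locus))
                     for m_locus, f_locus in zip(mother, father)]
            offspring.append(child)
    # Raises ValueError if fewer than deme_size offspring were produced.
    return rng.sample(offspring, deme_size)

rng = random.Random(7)
# A deme of 51 parents (odd on purpose) with 10 loci and 10 allelic states.
parents = [[(rng.randrange(10), rng.randrange(10)) for _ in range(10)]
           for _ in range(51)]
next_generation = reproduce(parents, offspring_per_pair=10, deme_size=50, rng=rng)
```

With 51 parents, 25 pairs produce 250 offspring, of which 50 are drawn to form the next generation, so deme size is reset at the start of every cycle regardless of the net migration balance.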

For each combination of the five landscapes and the eight MMD values (that is, four for the Gaussian pdf and four for the exponential pdf), we performed 50 replicated simulations (that is, 50 times a new random topology of 100 demes followed by 500 migration–reproduction cycles). For each simulation, we analysed the genetic differentiation between demes after 50 and 500 migration–reproduction cycles (that is, generations). Genetic differentiation was quantified by F_{ST}, because its formulation is based on two focal microevolutionary processes (that is, gene flow and genetic drift) relevant to IBD patterns and because it is the most frequently used estimate of genetic distance in landscape genetic studies (Jenkins *et al.*, 2010; Storfer *et al.*, 2010). Multiallelic pairwise F_{ST} values were calculated following Nei (1977). Hutchison and Templeton (1999) used F_{ST} values calculated with Weir and Cockerham's (1984) approach (Goudet, 1995). For our analyses, an unbiased estimator of genetic differentiation was not necessary, as we maintained a fixed number of demes and a fixed deme size throughout our simulations. Furthermore, pairwise values of different estimators of historical gene flow have been shown to be highly correlated (Van Strien *et al.*, 2012; Keller *et al.*, 2013). We therefore assume that the difference in calculation between the two estimates had a negligible effect.
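For readers who want to see the calculation, a minimal pairwise F_{ST} in the spirit of Nei (1977) is F_{ST} = (H_T − H_S)/H_T, where H_S is the mean expected heterozygosity within the two demes and H_T the expected heterozygosity of their averaged allele frequencies. The sketch below assumes equal deme weights, no sample-size correction, and averaging of H_S and H_T over loci before taking the ratio; whether Concordia handles multiple loci in exactly this way is not stated in the text.

```python
from collections import Counter

def allele_freqs(deme, locus):
    """Allele frequencies at one locus in one deme (two copies per individual)."""
    counts = Counter(a for individual in deme for a in individual[locus])
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def pairwise_fst(deme_a, deme_b):
    """F_ST = (H_T - H_S) / H_T, with H_S and H_T summed over loci."""
    n_loci = len(deme_a[0])
    h_s_sum = h_t_sum = 0.0
    for locus in range(n_loci):
        pa, pb = allele_freqs(deme_a, locus), allele_freqs(deme_b, locus)
        # Expected heterozygosity within each deme, averaged over the pair.
        h_a = 1 - sum(p ** 2 for p in pa.values())
        h_b = 1 - sum(p ** 2 for p in pb.values())
        h_s_sum += (h_a + h_b) / 2
        # Expected heterozygosity of the averaged allele frequencies.
        alleles = set(pa) | set(pb)
        h_t_sum += 1 - sum(((pa.get(x, 0) + pb.get(x, 0)) / 2) ** 2 for x in alleles)
    return (h_t_sum - h_s_sum) / h_t_sum if h_t_sum > 0 else 0.0

# Two toy demes fixed for different alleles are maximally differentiated;
# two demes with identical frequencies are not differentiated at all.
fixed_0 = [[(0, 0)] for _ in range(4)]
fixed_1 = [[(1, 1)] for _ in range(4)]
mixed = [[(0, 1)] for _ in range(4)]
fst_fixed = pairwise_fst(fixed_0, fixed_1)
fst_same = pairwise_fst(mixed, mixed)
```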

Subsequently, we determined the type of F_{ST}–distance correlation (that is, case-I or case-IV) and the interdeme distance at which the F_{ST}–distance correlation was highest, using an approach in which we created groups of deme pairs and determined the F_{ST}–distance correlation for each group. These groups were defined by selecting all deme pairs that were separated by 0 to *d* distance units. *d* was increased from 1.81 to 181 distance units in intervals of 1.8 (that is, 99 values of *d*). A *d* of 181 represented the full diagonal length of the landscape, which resulted in all deme pairs being selected. To prevent biased correlation estimates due to small sample size (Montgomery and Morrison, 1973), we only considered *d* values that resulted in groups of 50 or more deme pairs. To quantify the linear F_{ST}–distance correlation in landscape genetics, the Mantel *r* correlation coefficient is usually used (Storfer *et al.*, 2010). We thus determined the value of *d* at which Mantel *r* between F_{ST} and geographic distance was maximized and referred to this *d* as the distance of maximum correlation (DMC; Figure 3). We correlated F_{ST} to untransformed geographic distances and F_{ST}/(1−F_{ST}) to log-transformed geographic distances. The latter transformation is often used in landscape genetics following Rousset (1997). Post-processing of F_{ST} values was performed in R (R Development Core Team, 2012).
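The DMC search can be sketched as a scan over distance thresholds. The Mantel *r* coefficient is the Pearson correlation computed over the pairwise entries of the two distance matrices, so the sketch below uses a plain Pearson correlation and omits the permutation test that normally accompanies a Mantel analysis. The original post-processing was done in R; the function names, the synthetic data and the threshold grid here are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def distance_of_maximum_correlation(pairs, thresholds, min_pairs=50):
    """pairs: list of (geographic_distance, fst) tuples, one per deme pair.
    Returns (dmc, r_max), or (None, None) if no threshold has enough pairs."""
    best_d, best_r = None, None
    for d in thresholds:
        subset = [(g, f) for g, f in pairs if g <= d]
        if len(subset) < min_pairs:
            continue  # avoid unstable correlations from small groups
        r = pearson_r([g for g, _ in subset], [f for _, f in subset])
        if not math.isnan(r) and (best_r is None or r > best_r):
            best_d, best_r = d, r
    return best_d, best_r

# Synthetic case-IV-like data: F_ST rises up to 60 distance units, then plateaus.
pairs = [(float(g), 0.2 * min(g, 60) / 60) for g in range(1, 181)]
thresholds = [10.0 * k for k in range(1, 19)]
dmc, r_max = distance_of_maximum_correlation(pairs, thresholds)
```

In this noise-free toy example the maximum correlation is found at or just below the distance where the curve flattens, whereas the correlation over all pairs is weaker because the plateau contributes no further increase.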

Figure 3

Demonstration of how the distance of maximum correlation (DMC) was calculated. From plots of pairwise genetic (F_{ST}) and geographic distances, we selected deme pairs with an interdeme distance up to a maximum distance *d*.

The distribution of DMC values was displayed as bean plots (Kampstra, 2008), which are analogous to mirrored, smoothed vertical histograms. These plots are suitable for determining whether an F_{ST}–distance correlation was of case-I or case-IV type (see Results section). Furthermore, we displayed the maximum F_{ST}–distance correlation (that is, correlation at DMC) in relation to the correlation that was obtained across all deme pairs.

## Results

The five neutral landscapes generated were labelled A–E (Figure 2). In landscape A, there was a large central area of habitat, surrounded by matrix, whereas in landscapes D and E, the habitat was distributed around a central patch of matrix (Figure 2). Thus in landscape A, most demes were clustered together in the central habitat, whereas in landscapes D and E, most demes were scattered on the periphery of the study area. These differences in deme topology were reflected by the differences in mean geographic distance between the 100 demes randomly placed in habitat (median of the average distances: landscape A=55.6; landscape B=62.2; landscape C=63.0; landscape D=73.8; landscape E=74.9). The distributions of mean distances largely overlapped between landscapes B and C as well as between D and E (Supplementary Figure 1).

We obtained similar results for simulations with Gaussian and exponential migration functions. For the sake of clarity, we only discuss the results from simulations with the Gaussian migration function here. However, results from simulations with the exponential pdf are given in the Supplementary Material (Supplementary Figures 2–4).

Although emigration probability was fixed for each individual (that is, 0.2), the immigration probability of a migrant was dependent on geographic distance (see Methods section) and, thus, the differences in mean geographic distances between all demes in the five neutral landscapes also resulted in differences in the average immigration probability of a migrant. For instance, with the Gaussian migration function and MMD=14.4, the average probabilities that a migrant reached another deme (that is, immigration probability) in landscapes A, B, C, D and E were 0.0938, 0.0922, 0.0938, 0.0871 and 0.0838, respectively, and for MMD=47.4, these immigration probabilities changed to 0.2721, 0.2398, 0.2508, 0.2075 and 0.2014, respectively.

The DMC was measured at generations 50 and 500 in each simulation. DMC distributions from correlations between F_{ST} and untransformed geographic distances were similar to the DMC distributions from correlations between F_{ST}/(1−F_{ST}) and log-transformed geographic distances (Figure 4 and Supplementary Figure 5). We therefore chose to only show the results from untransformed geographic distance measures (Figure 4). Examining the distributions of DMC values, we distinguished two types of unimodal distributions. First, we observed a unimodal distribution with a peak of DMC values just below or slightly above MMD (Figures 4a–c and landscapes D and E in Figure 4d). The second unimodal DMC distribution (landscapes A, B and C in Figures 4f–h) peaked between 140 and 170 distance units. The bimodal distribution of DMC values that was visible in some cases (for example, landscapes A, B and C in Figure 4d) was a combination of both unimodal distributions.

Figure 4

Bean plots showing the DMC for five different neutral landscapes (Figure 2) and a range of MMD at generations 50 (**a**, **c**, **e**, **g**) and 500 (**b**, **d**, **f**, **h**). Gaussian migration probability functions were used in these simulations.

Next we examined what kind of F_{ST}–distance plot resulted in the two unimodal distributions of DMC values. We regarded the DMC distribution of landscape E at generation 500 and MMD=36.1 (Figure 4f) as characteristic for the first type of unimodal distribution (that is, peak of DMC values just below or above MMD) and that of landscape A with the same settings (Figure 4f) as characteristic for the second unimodal distribution (that is, peak of DMC between 140 and 170 distance units). For both these distributions, we created F_{ST}–distance plots by averaging the median and lower and upper bounds of the scatter of F_{ST}–distance plots from 50 random topologies of 100 demes (Figure 5). The first type of DMC distributions resulted in a monotonic increase of F_{ST} and scatter up to a certain geographic distance, beyond which the plot flattened out and lost its correlative character (Figure 5a). The distance at which the plot flattened coincided with the peak of the first type of DMC distribution (that is, at ∼68 distance units; Figures 4f and 5a). We referred to this F_{ST}–distance relationship as a case-IV type (Hutchison and Templeton, 1999). The second type of DMC distribution resulted in monotonically increasing F_{ST} values and scatter with increasing geographic distance up to the maximum (diagonal) length of the study area (that is, between 128 and 180 distance units; Figure 5b). We termed this type of F_{ST}–distance relationship a case-I type (Hutchison and Templeton, 1999). For both plots, we observed a slightly increasing slope of median F_{ST} values at larger geographic distances (Figure 5).

Figure 5

Two examples of genetic and geographic distance plots obtained after 500 generations. Gaussian migration probability functions were used. The plots display the median (black line) and the lower and upper bounds of the scatter (grey area) of the F_{ST} values.

The configuration of habitat in the landscape had a strong effect on the type of IBD pattern (that is, case-I or case-IV). In some landscapes only case-I or case-IV distributions were observed for a certain MMD and generation. For instance, with MMD=36.1 (Figures 4e and f), the random deme topology in habitat of landscape A produced mainly case-I F_{ST}–distance relationships at generations 50 and 500, whereas in landscape E, predominantly case-IV relationships were observed for both generations. Landscapes where both case-I and case-IV relationships were found for a certain MMD and generation (landscapes B and C in Figures 4d and e) indicated that not only habitat configuration, but also deme topology in the habitat had an effect on IBD patterns. Landscapes in which mean geographic distance between all deme pairs was similar (that is, landscapes B and C as well as D and E; Supplementary Figure 1) also showed similar patterns of DMC distributions (Figure 4). Exceptions were the differences in DMC distributions between landscapes D and E for MMD=36.1 at generation 500 (Figure 4f) and MMD=47.4 at generation 50 (Figure 4g).

The distance at which the F_{ST}–distance plot flattened out in case-IV relationships (that is, peak of the DMC distribution) could not be predicted from MMD. In some cases, the DMC distribution peaked below the MMD (Figures 4a and c), whereas in other cases, the peak was located above the MMD (landscapes D and E in Figures 4b, d and e). No consistency in the difference in distance between the MMD and the peak of the DMC distribution could be found between landscapes with similar mean geographic distances between demes (for example, landscapes B and C as well as D and E in Figure 4d). Furthermore, for case-I relationships, the F_{ST} values kept on monotonically increasing far beyond the MMD (for example, landscapes A, B and C in Figure 4h).

MMD had a clear effect on the rate at which case-I F_{ST}–distance relationships appeared for certain landscapes. For instance, in landscape A at generation 50 (Figures 4a, c, e and g) there were hardly any simulations that resulted in case-I relationships with MMD=14.4 or MMD=22.7, whereas practically all relationships were case-I type with MMD=36.1 or MMD=47.4. We observed that case-IV relationships at generation 50 either evolved to a case-I relationship over time or remained in a case-IV state for more than 450 generations. For an MMD of 14.4 (Figures 4a and b), the vast majority of simulations resulted in case-IV F_{ST}–distance relationships at generations 50 and 500. A similar pattern was observed in landscape E with MMD=36.1 (Figures 4e and f), where the peak of the DMC distribution increased only slightly after 450 generations. A case-IV changing to a case-I relationship could be observed in the DMC distribution of landscape A at MMD=22.7 (Figures 4c and d).

The F_{ST}–distance correlation coefficients calculated for all deme pairs were different from the maximum correlation coefficients (that is, the correlation at DMC; Figure 6). For case-IV relationships, negative F_{ST}–distance correlations were measured between all deme pairs for a certain MMD and landscape, whereas the maximum correlation was always positive (Figures 6a and c). It is noteworthy that the range of maximum correlation coefficients for a certain MMD at a certain generation showed a strong overlap between landscapes (Figure 6), regardless of the type of F_{ST}–distance relationship. For instance, at generation 500 and MMD=36.1 (Figure 6f), all landscapes had similar ranges of maximum correlations, whereas we observed case-I relationships in landscape A and case-IV in landscape E (Figure 4f).

Figure 6

Box plots showing the Mantel *r* correlation coefficient between genetic and geographic distances measured at the DMC (grey plots) and across all deme pairs (white plots) for each of the 50 random deme topologies.

There was a decrease in average F_{ST} values with increasing MMD at both generations 50 and 500. At a given MMD, the average F_{ST} value resulting from a landscape in which the demes were relatively close together (landscape A) was lower than those from landscapes where the demes were relatively far apart (landscapes D and E; see Supplementary Figures 1 and 6).

## Discussion

The results of the present simulation study showed that habitat configuration has a clear influence on patterns of IBD. The rate and likelihood of the appearance of case-I or case-IV F_{ST}–distance relationships (Hutchison and Templeton, 1999) clearly differed between different landscapes with the same amount of habitat and the same degree of fragmentation and was, thus, affected by habitat configuration and/or deme topology. This result holds true for several types of migration functions (that is, Gaussian and exponential). The common assumption in landscape genetics that patterns of IBD are not influenced by any landscape effects and arise through distance-constrained migration alone (Spear *et al.*, 2005; Balkenhol *et al.*, 2009; Jenkins *et al.*, 2010) is debatable, as habitat configuration is an intrinsic aspect of landscapes and is a main determinant of deme topology. The maximum correlation between F_{ST} and geographic distance or F_{ST}/(1−F_{ST}) and ln(distance) was measured across nearly all pairs of demes in case-I F_{ST}–distance plots and at the distance at which the plot flattened in case-IV F_{ST}–distance plots. The maximum migration distance did not correspond to the distance at which the maximum F_{ST}–distance correlation was measured. There is thus a discrepancy between the distance threshold at which the maximum F_{ST}–distance correlation was measured and the recommended distance threshold for determining landscape effects on gene flow (that is, the maximum migration distance) when the response variable is an estimate of historical gene flow (for example, F_{ST}) and landscape predictor variables are measured directly between demes (as in most landscape genetic studies). Below, we recommend approaches to determine these distance thresholds for both the assessment of the presence and strength of IBD and determining landscape effects on gene flow.

### Measuring IBD

To obtain an indication of the presence of IBD, we recommend that landscape geneticists determine the maximum F_{ST}–distance correlation (that is, correlation at DMC) by calculating the correlation at a range of maximum distances (Figure 3). Although in the current study the maximum F_{ST}–distance correlation was measured at different distance thresholds for case-I and case-IV IBD patterns, the correlation coefficients were comparable at both distance thresholds. In some situations, we found F_{ST}–distance correlations to be absent or negative when calculated over all demes, whereas the maximum correlation calculated over more closely located demes was clearly positive. A lack of an F_{ST}–distance correlation can, for instance, be interpreted as a complete lack of gene flow between demes or as the absence of distance-limited gene flow (as in the island model). Whether (meta)populations exhibit an IBD pattern (case-I or case-IV) or are completely isolated is important information for species conservation.

Here we assessed the correlation between F_{ST} and geographic distance across all pairs of demes up to a certain threshold distance (that is, DMC). However, in the field of spatial population genetics, IBD is commonly assessed with correlograms that depict the genetic similarity or differentiation values for different distance lags (for example, distance classes 0–1 km, 1–2 km, 2–3 km, and so on). Typically Moran's *I* correlograms are used (Sokal and Wartenberg, 1983; Doligez *et al.*, 1998; Epperson, 2003, 2005), but Mantel correlograms could also be used (Borcard and Legendre, 2012). An advantage of using correlograms for assessing IBD is that it could enhance the comparability of results from empirical landscape genetic studies with those from population genetic theory and simulations. However, empirical (Keller *et al.*, 2013) and simulated (Figure 6) F_{ST}–distance plots show that scatter of pairwise F_{ST} values is generally high and increases with interdeme distance. In addition, the number of sampled demes is generally low in landscape genetic studies (mean=11; Jenkins *et al.*, 2010). Therefore, in empirical landscape genetic studies, it is likely that the number of samples is too low and the scatter of pairwise F_{ST} values too high to get correlograms that do not display erratic fluctuations in Moran's *I* or Mantel *r* values, especially in higher distance lags. Such fluctuations would hamper the interpretation of the correlograms.

### Measuring landscape effects on gene flow

The optimal distance threshold to determine landscape effects on gene flow or migration rates should be determined differently to the distance threshold to assess the level of IBD. For such analyses, we support the recommendation of earlier studies (Angelone *et al.*, 2011; Keller *et al.*, 2013) to restrict the analysis to those deme pairs that are within maximum migration distance from each other. This distance can be estimated from, for instance, mark–release–recapture studies (Hassall and Thompson, 2012) or genetic paternity analysis (Kamm *et al.*, 2009). Assuming that the maximum migration distance is determined by physical or behavioural limits of the focal species, intervening landscape should positively or negatively influence migration rates only up to the maximum migration distance. Thus, calculating common interdeme landscape measures, like least cost distances (Rayfield *et al.*, 2010), resistance distances (McRae and Beier, 2007) or quantities of landscape elements within transects (Angelone *et al.*, 2011) between deme pairs that are outside of direct migration distance, would result in landscape information that is irrelevant for migrating individuals and could, thus, bias results. Jaquiéry *et al.* (2011) used a landscape genetic simulation model and found that landscape effects on gene flow were better detected if only adjacent demes were considered in a regular lattice setup of demes. Thus, in cases where no estimate of the maximum migration distance is available, a more feasible solution could be to only consider neighbouring deme pairs in landscape genetic analysis. In natural settings, where demes are not arranged in regular patterns, neighbouring demes could also be defined by, for instance, Delaunay (Goldberg and Waits, 2010) or Gabriel (Keller *et al.*, 2013) triangulation. 
The latter has the added benefit that only pairs of demes will be selected that have no intermediate demes and fluctuations in gene flow can thus not be caused by increased (or decreased) gene flow via intermediate demes (Keller *et al.*, 2013). These recommendations are only valid if gene flow is measured over several generations with historical estimates of gene flow such as F_{ST}. We expect that deme topology will have less effect on measures of contemporary gene flow, stemming for instance from paternity analysis (Kamm *et al.*, 2009) or assignment tests (Manel *et al.*, 2005).
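To make the neighbour definition concrete, the following sketch selects Gabriel neighbours by brute force: two demes are Gabriel neighbours if no third deme lies inside the circle whose diameter is the segment connecting them. The function name and coordinates are hypothetical, and points exactly on the circle are treated as outside, which is one of several conventions.

```python
import math
from itertools import combinations

def gabriel_pairs(coords):
    """Pairs (i, j) such that no third point lies inside the circle whose
    diameter is the segment between points i and j."""
    pairs = []
    for i, j in combinations(range(len(coords)), 2):
        mid = ((coords[i][0] + coords[j][0]) / 2, (coords[i][1] + coords[j][1]) / 2)
        radius = math.dist(coords[i], coords[j]) / 2
        if all(math.dist(mid, coords[k]) >= radius
               for k in range(len(coords)) if k not in (i, j)):
            pairs.append((i, j))
    return pairs

# Three demes on a line: the outer two are not neighbours because the
# middle deme lies between them.
line_pairs = gabriel_pairs([(0, 0), (1, 0), (2, 0)])
```

The brute-force check scales cubically with the number of demes, which is unproblematic for the small numbers of demes typical of landscape genetic studies.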

Based on a simple island model, it can be expected that F_{ST}=1/(4*Nm*+1), where *N* is the effective deme size and *m* is the immigration rate (Wright, 1931; Whitlock and McCauley, 1999). In accordance with this formula, we found that habitat configurations that resulted in higher average immigration probabilities also resulted in slightly lower average F_{ST} values. For the discipline of landscape genetics it is, however, important to realize that these fluctuations in immigration probabilities were caused by deme topology and habitat configuration in the different landscapes and are not analogous to the fluctuations in immigration probability that are caused either by the ‘resistance to movement’ values of certain landscape elements or by non-linear migration routes (Spear *et al.*, 2010). The matrix in landscape E (Figure 2) is a ‘barrier’ to gene flow because the agents in our model are ‘physically incapable’ of crossing it (due to very low migration probabilities at long distances) and not because the landscape in the matrix is unfavourable to migration (that is, matrix and habitat had equal migration probabilities in our simulations).
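As a rough numerical illustration of the direction of this effect, the island-model expectation can be evaluated for the immigration probabilities reported for landscape A. Two strong simplifications are assumed here and are not taken from the study: the effective deme size is set equal to the census size of 50, and the immigration rate *m* is approximated as the emigration probability (0.2) multiplied by the probability that a migrant reaches another deme. The simulated system is also not an island model, so the resulting values should not be read as predictions of the simulated F_{ST}.

```python
def island_fst(n_effective, m):
    """Expected equilibrium F_ST under Wright's island model: 1 / (4 N m + 1)."""
    return 1.0 / (4.0 * n_effective * m + 1.0)

P_EMIGRATE = 0.2  # emigration probability per individual (from the Methods)
DEME_SIZE = 50    # census deme size, used as a stand-in for effective size

# Probability that a migrant reaches another deme in landscape A (from the Results).
expected = {}
for mmd, p_arrive in ((14.4, 0.0938), (47.4, 0.2721)):
    m = P_EMIGRATE * p_arrive  # assumed approximation of the immigration rate
    expected[mmd] = island_fst(DEME_SIZE, m)
    print(f"MMD={mmd}: m={m:.4f}, island-model F_ST={expected[mmd]:.3f}")
```

Under these assumptions the expected F_{ST} drops from roughly 0.21 to roughly 0.08 as MMD increases from 14.4 to 47.4, which is in line with the qualitative pattern of lower average F_{ST} at higher immigration probabilities.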

### Patterns of IBD

Hutchison and Templeton (1999) state that ‘given sufficient time and stability of conditions, the case-IV pattern should come to resemble more and more a case-I pattern of IBD […]’. This statement implies that a case-IV state is a non-equilibrium condition that will change to an equilibrium case-I pattern. We can confirm that a case-IV pattern can be an intermediate state between panmixia (that is, our initial condition in the simulations) and a case-I pattern. However, we also found situations in which case-IV patterns appeared to be in a fairly stable state, still apparent after 500 generations. Perhaps, for these cases, it will take many more generations before case-IV relationships change to case-I relationships. Theoretical population genetic studies have shown that equilibrium situations can also produce asymptotic F_{ST}–distance curves that resemble case-IV patterns. Rousset (1997) determined that, in equilibrium situations, F_{ST}/(1−F_{ST}) plotted against log-transformed geographic distance should theoretically produce close to linear correlations for interdeme distances larger than *σ* and smaller than an upper bound that depends on *σ* and *μ*, where *σ* is the square root of the variance in the parent–offspring distance and *μ* is the mutation rate (Rousset, 1997; Ehrich and Stenseth, 2001). Beyond this distance, the F_{ST}–distance curve flattens out to form an asymptotic curve. However, given the fact that *μ*=0 in the present study, we anticipate that our case-IV F_{ST}–distance relationships were non-equilibrium situations and not asymptotic equilibrium situations (that is, with *μ*=0 the distance at which an asymptotic equilibrium curve flattens out is expected to be very large). Interestingly, we generally detected similar values of the distance threshold at which we found a maximum correlation between F_{ST} and untransformed distances or F_{ST}/(1−F_{ST}) and log-transformed distances.
Case-I relationships could also result from non-equilibrium situations when demes are sampled from such a small area that even the furthest demes are closer than DMC. It is thus important for landscape geneticists to realize that it is difficult to determine the state of the study system (that is, equilibrium or non-equilibrium) from case-I or case-IV F_{ST}–distance relationships, especially because the generally great amount of scatter in F_{ST}–distance plots (Epperson *et al.*, 2010) may make the pattern hard to distinguish and because accurate estimates of *σ* and/or *μ* are usually not available in empirical studies (Ehrich and Stenseth, 2001). Nevertheless, more simulation studies should focus on non-equilibrium states of F_{ST}–distance relationships, as these are probably as important in landscape genetics as equilibrium states (Manel and Holderegger, 2013).

We observed a slightly increasing slope of median F_{ST} values in the F_{ST}–distance plots at larger geographic distances (Figure 5), which might have been caused by some demes that were randomly placed in relatively isolated habitat patches or near the edge of the study area. In our simulation model, these demes got relatively few immigrants from surrounding demes, which probably caused their allele frequencies to be more affected by genetic drift leading to higher genetic differentiation between these and other demes. This peripheral population effect has also been documented for many natural (meta)populations (Eckert *et al.*, 2008). Such an effect will not be detectable in population genetic simulation models that use toroidal landscape grids, where the top and bottom as well as the two sides of the study landscape are connected (Doligez *et al.*, 1998) and perhaps also not in those that use a regular lattice setup of finite demes where there are no isolated demes on the edges of the landscape. On the other hand, results from our simulations are only applicable to areas for which it is unlikely that there are many immigrants from outside the study area.

As we were mainly interested in the effects of habitat shape and maximum migration distance on IBD patterns, we kept other variables, like deme size, deme location and number of demes, constant throughout our simulations. However, we acknowledge that spatial or temporal variation in any of these variables would potentially influence the outcome of simulations. Metapopulation dynamics in natural situations (Harrison, 1991) are more complex than the simple model presented in this article. Classical metapopulation theory predicts that extinction and (re)colonization of demes has profound effects on genetic differentiation among demes (Hastings and Harrison, 1994). Fluctuations in population size may not only affect the number of emigrants leaving a deme, but also the rate at which new alleles establish in a deme, which might complicate the interpretation of IBD patterns (Bjorklund *et al.*, 2010).

## Conclusions

We conclude that the type of IBD pattern that emerges after a certain number of generations is strongly affected by the habitat configuration, deme topology and the maximum migration distance. Therefore, IBD patterns should not be regarded as resulting from distance-constrained migration alone, but also from the deme topology and habitat configuration. As the distance at which we measured the highest F_{ST}–distance correlation did not correspond to the maximum migration distance, we recommend that IBD and landscape effects on gene flow be assessed separately and possibly at different distance thresholds. The rate at which case-I IBD patterns emerge is influenced by an interplay of habitat configuration and maximum migration distance. Landscape geneticists should thus be more aware of (1) the effect of the spatial deme topology on gene flow and (2) the effect that habitat configuration has on this topology. It may be necessary to (3) assess the presence and intensity of IBD by searching for the maximum F_{ST}–distance correlation from a subset of deme pairs. The effect of landscape on gene flow can then be assessed separately by (4) using only those pairs of demes that are within migration range of each other. Considering the importance of deme topology for gene flow, (5) more emphasis should be placed on complete sampling of all demes within a study landscape.
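Recommendation (3) can be sketched as a search over distance thresholds. The example below is only one possible reading of that recommendation, in which the subset consists of all deme pairs separated by no more than a given distance. The function names, the candidate thresholds and the synthetic pairwise values are hypothetical and do not come from the simulations reported here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_by_threshold(pairs, thresholds, min_pairs=3):
    """F_ST-distance correlation using only deme pairs no farther apart than each threshold.

    pairs: list of (geographic_distance, pairwise_fst) tuples.
    Thresholds that leave fewer than min_pairs pairs are skipped.
    """
    results = {}
    for t in thresholds:
        subset = [(d, f) for d, f in pairs if d <= t]
        if len(subset) < min_pairs:
            continue
        dists, fsts = zip(*subset)
        results[t] = pearson(dists, fsts)
    return results

# Synthetic, purely illustrative data: F_ST rises with distance up to 10 units,
# then levels off with larger scatter (drift dominating over gene flow).
pairs = []
for d in range(1, 21):
    sign = 1 if d % 2 == 0 else -1
    fst = 0.01 * d + 0.002 * sign if d <= 10 else 0.10 + 0.03 * sign
    pairs.append((float(d), fst))

by_threshold = correlation_by_threshold(pairs, thresholds=[5, 10, 15, 20])
best_threshold = max(by_threshold, key=by_threshold.get)

for t, r in sorted(by_threshold.items()):
    print(f"pairs within {t:>2} units: r = {r:.3f}")
print("threshold with the highest correlation:", best_threshold)
```

Because pairwise F_{ST} values are not independent, the correlation coefficients from such a search describe the strength of the pattern but would need a permutation approach, such as a Mantel test, before any significance is attached to them. The same filtering step (`d <= t`) could serve recommendation (4) if `t` were set to an independently estimated migration range.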

## Data archiving

The simulation programme, Concordia, is available from the Dryad Digital Repository: doi:10.5061/dryad.m8q30.

## Acknowledgments

This article forms part of the ENHANCE project, which was funded by the Competence Centre Environment and Sustainability of the ETH Domain. We thank Dave Jenkins, Corey Anderson, the subject editor and one anonymous referee for valuable comments on the manuscript, which greatly improved its contents.

## Notes

The authors declare no conflict of interest.

## References

- Angelone S, Kienast F, Holderegger R. (2011). Where movement happens: scale-dependent landscape effects on genetic differentiation in the European tree frog. Ecography 34: 714–722.
- Austerlitz F, Dick CW, Dutech C, Klein EK, Oddou-Muratorio S, Smouse PE et al. (2004). Using genetic markers to estimate the pollen dispersal curve. Mol Ecol 13: 937–954.
- Balkenhol N, Waits LP, Dezzani RJ. (2009). Statistical approaches in landscape genetics: an evaluation of methods for linking landscape and genetic data. Ecography 32: 818–830.
- Bjorklund M, Bergek S, Ranta E, Kaitala V. (2010). The effect of local population dynamics on patterns of isolation by distance. Ecol Inform 5: 167–172.
- Borcard D, Legendre P. (2012). Is the Mantel correlogram powerful enough to be useful in ecological analysis? A simulation study. Ecology 93: 1473–1481.
- Bowne D, Bowers M. (2004). Interpatch movements in spatially structured populations: a literature review. Landscape Ecol 19: 1–20.
- Chapman BB, Brönmark C, Nilsson J-Å, Hansson L-A. (2011). The ecology and evolution of partial migration. Oikos 120: 1764–1775.
- Corney PM, Le Duc MG, Smart SM, Kirby KJ, Bunce RGH, Marrs RH. (2004). The effect of landscape-scale environmental drivers on the vegetation composition of British woodlands. Biol Conserv 120: 491–505.
- Crispo E, Hendry A. (2005). Does time since colonization influence isolation by distance? A meta-analysis. Conserv Genet 6: 665–682.
- Doligez A, Baril C, Joly HI. (1998). Fine-scale spatial genetic structure with nonuniform distribution of individuals. Genetics 148: 905–920.
- Eckert CG, Samis KE, Lougheed SC. (2008). Genetic variation across species' geographical ranges: the central–marginal hypothesis and beyond. Mol Ecol 17: 1170–1188.
- Ehrich D, Stenseth NC. (2001). Genetic structure of Siberian lemmings (*Lemmus sibiricus*) in a continuous habitat: large patches rather than isolation by distance. Heredity 86: 716–730.
- Epperson BK. (2003). Geographical Genetics. Princeton University Press: Princeton, USA.
- Epperson BK. (2005). Estimating dispersal from short distance spatial autocorrelation. Heredity 95: 7–15.
- Epperson BK, McRae BH, Scribner K, Cushman SA, Rosenberg MS, Fortin MJ et al. (2010). Utility of computer simulations in landscape genetics. Mol Ecol 19: 3549–3564.
- Ezard THG, Travis JMJ. (2006). The impact of habitat loss and fragmentation on genetic drift and fixation time. Oikos 114: 367–375.
- Gardner R, Urban D. (2007). Neutral models for testing landscape hypotheses. Landscape Ecol 22: 15–29.
- Goldberg CS, Waits LP. (2010). Comparative landscape genetics of two pond-breeding amphibian species in a highly modified agricultural landscape. Mol Ecol 19: 3650–3663.
- Goudet J. (1995). FSTAT (version 1.2): a computer program to calculate F-statistics. J Hered 86: 485–486.
- Guillot G, Leblois R, Coulon A, Frantz AC. (2009). Statistical methods in spatial genetics. Mol Ecol 18: 4734–4756.
- Hanski I. (1998). Metapopulation dynamics. Nature 396: 41–49.
- Harrison S. (1991). Local extinction in a metapopulation context: an empirical evaluation. Biol J Linnean Soc 42: 73–88.
- Hassall C, Thompson DJ. (2012). Study design and mark-recapture estimates of dispersal: a case study with the endangered damselfly *Coenagrion mercuriale*. J Insect Conserv 16: 111–120.
- Hastings A, Harrison S. (1994). Metapopulation dynamics and genetics. Annu Rev Ecol Syst 25: 167–188.
- Holderegger R, Wagner HH. (2006). A brief guide to landscape genetics. Landscape Ecol 21: 793–796.
- Hubbell SP. (2001). The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press: Princeton, USA.
- Hutchison DW, Templeton AR. (1999). Correlation of pairwise genetic and geographic distance measures: inferring the relative influences of gene flow and drift on the distribution of genetic variability. Evolution 53: 1898–1914.
- Jaquiéry J, Broquet T, Hirzel AH, Yearsley J, Perrin N. (2011). Inferring landscape effects on dispersal from genetic distances: how far can we go? Mol Ecol 20: 692–705.
- Jenkins DG, Carey M, Czerniewska J, Fletcher J, Hether T, Jones A et al. (2010). A meta-analysis of isolation by distance: relic or reference standard for landscape genetics? Ecography 33: 315–320.
- Kamm U, Rotach P, Gugerli F, Siroky M, Edwards P, Holderegger R. (2009). Frequent long-distance gene flow in a rare temperate forest tree (*Sorbus domestica*) at the landscape scale. Heredity 103: 476–482.
- Kampstra P. (2008). Beanplot: a boxplot alternative for visual comparison of distributions. J Stat Softw Code Snippets 28: 1–9.
- Keller D, Holderegger R, Van Strien MJ. (2013). Spatial scale affects landscape genetic analysis of a wetland grasshopper. Mol Ecol 22: 2467–2482.
- Kimura M, Weiss GH. (1964). The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 49: 561–576.
- Manel S, Gaggiotti OE, Waples RS. (2005). Assignment methods: matching biological questions with appropriate techniques. Trends Ecol Evol 20: 136–142.
- Manel S, Holderegger R. (2013).
