Dugs Papers

A collection of Douglas Racionzer's thinking on a variety of topics including assignments in ethics.

Wednesday, November 15, 2006

Doing the math on DNA

Common ancestors of all humans (using mathematical models)
Main Sources:
Mathematical models:
Chang, Joseph T. (1999), Recent common ancestors of all present-day individuals, Advances in Applied Probability 31(4), 1002-26. Followed by discussion and author's reply, 1027-38. The discussion includes comments by:
Carsten Wiuf and Jotun Hein.
Montgomery Slatkin.
W.J. Ewens.
J.F.C. Kingman.

Neil O'Connell
Branching and Inference in Population Genetics (1994)
N. O'Connell. The genealogy of branching processes and the age of our most recent common ancestor. Advances in Applied Probability, 27:418-42, 1995.

Computer simulations:
Douglas L.T. Rohde
His paper:
Somewhat Less-Recent Common Ancestors of all present-day individuals (title is reference to Chang's paper), draft, 2002.
On the common ancestors of all living humans, updated version, 2003, submitted to American Journal of Physical Anthropology.

Modelling the recent common ancestry of all living humans, Douglas L. T. Rohde, Steve Olson and Joseph T. Chang, Nature 431 (7008), 562-6, (September 30, 2004), "Letters to Nature".
News and views
Supplementary Information
Nature news release
Yale news release

Sources yet to be consulted:
Mathematical models:
A.M. Zubkov. Limiting distributions for the distance to the closest mutual ancestor. Theory Probab. Appl., 20(3):602-12, 1975.
P. Jagers, O. Nerman, and Z. Taib. When did Joe's great ...grandfather live? Or on the timescale of evolution. In I.V. Basawa and R.L. Taylor, editors, Selected Proceedings of the Sheffield Symposium on Applied Probability, volume 18 of IMS Lecture Notes-Monograph Series, 1991.

Branching Processes
V.A. Vatutin. Distance to the nearest common ancestor in Bellman-Harris branching processes. Math. Notes, 25:378-87, 1979.
Coalescent Theory
Martin Moehle


Summary
Mathematical models of populations struggle to capture cleanly the complex, non-random mating patterns caused by geography, population movement, religion and social status, so they usually fall back on a simplifying assumption like random mating.
To actually model the quirks of the world's history and geography, you really need a computer simulation.




--------------------------------------------------------------------------------

With random mating, the MRCA would be c.1200 AD
Many mathematical models are of 1-parent genealogies - which is basically like modelling only the female-female or male-male common ancestors (CAs).
[Chang, 1999] builds a 2-parent rather than 1-parent model - in pursuit of the real most recent common ancestor (MRCA), rather than just the female-female or male-male one. In his model, if we assume a constant population size n, 2 parents per individual, and random mating, then we expect the MRCA to be about log2(n) generations in the past. This is incredibly recent. For example, take the population size as (a generous) 500 million to approximate the world population over recent history. Then log2(500 million) is about 29, so the MRCA is 29 generations ago - say around 1200 AD!
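
The arithmetic behind that date can be checked in a few lines. This is only a sketch: the generation length of 27.5 years and the "present" of 2000 AD are my own illustrative assumptions, not figures taken from Chang's paper.

```python
import math

def mrca_generations(population_size):
    """Generations back to the MRCA under random mating: log2(n)."""
    return math.log2(population_size)

def approx_year(generations, years_per_generation=27.5, present_year=2000):
    # years_per_generation and present_year are illustrative assumptions.
    return present_year - generations * years_per_generation

gens = mrca_generations(500_000_000)
print(round(gens, 1))            # a little under 29 generations
print(round(approx_year(gens)))  # lands close to 1200 AD
```

A shorter or longer generation length moves the date by a century or so either way, but it stays firmly medieval.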

[Ewens, 1999] notes this is basically the mirror image of the familiar fact that your number of ancestor slots doubles every generation, so you only need to go back log2(n) generations before you would need n separate ancestors.



Non-random mating would push the MRCA further back
Chang says this medieval MRCA is implausible (though as my Royal Descents page illustrates, it is not that implausible at all) and notes that one problem with applying the model to humans is the assumption of random mating. In reality, mating is of course local. The model does allow for "unlucky" random mating which could push the MRCA back, but Chang notes that it is very unlikely in a large population that you get unlucky enough mating to push it even twice as far back. Perhaps local mating behaves just like unlucky random mating and makes little difference in the long run. But this needs to be proved (by constructing a local-mating mathematical model). It may be that the pattern of local mating is extremely distorted by Earth's specific geography, so perhaps only a computer simulation (rather than a general mathematical model) can settle this issue.
An extreme example of Earth's geography would be total isolation. Many human populations, especially in Australia, the Pacific, the Americas and the Arctic, seem to have been isolated from each other until modern times. If populations were truly isolated, then the probability of 2 individuals from either side of the barrier mating may truly have been zero for thousands of years, in which case the MRCA for the world would be pushed back thousands of years. Apparently [Nei and Roychoudhury, 1982] and [Goldstein et al, 1995] use DNA to estimate ages for the MRCA of 116,000 and 156,000 years ago. One wonders if they are aware that DNA cannot be used to estimate the MRCA.

In theory, cases of extreme religious isolation (or ethnic or linguistic or social isolation) could also push back the MRCA. But we know that extreme religious (or other ethnic or cultural) reproductive isolation simply does not last for hundreds of years. If people share the same territory, some of them will interbreed no matter what. A tiny minority perhaps, but that's all we need to rapidly get everyone descended from an MRCA. The only thing that will stop people interbreeding is total geographical isolation.

Whatever about the world as a whole, Chang's model does suggest that the MRCA for Europe, where populations constantly mixed, may be well within historical times. Quite likely (as is suggested by other independent evidence on my Royal Descents page) the entire population of the West descends from Charlemagne.

One wonders what Chang's model would predict for the most recent strict female-female or strict male-male ancestor. Comparing this with the DNA figures might give us a handle on how unrealistic random mating is as a model.




--------------------------------------------------------------------------------

In the past, ancestor of some means ancestor of all
Chang's second result is that when you go far enough back, every individual is either an ancestor of the whole world today, or else is an ancestor of no one alive today. In nature, it is obvious that this state must be reached as you go back, see [Dawkins, 1992] - just consider ancestral fish. If I am descended from a particular one, then so are all humans.
In Chang's mathematical model this state is reached very quickly, about 1.77 times as many generations back as the MRCA, i.e. using our numbers above, roughly 51 generations ago, perhaps c.700 AD. So it would look like this:

Before 700 AD, every single human is either ancestor of no one alive today, or ancestor of everyone alive today. [Rohde, 2002] refers to this as the "All Common Ancestors", or ACA, point. Obviously if someone in this period is a proven ancestor of someone alive today then they must be ancestor of everyone alive today. So, for example, Charlemagne, because he is a proven ancestor of some people alive today, is probably the ancestor of everyone alive today in the West.

Between 700 AD and 1200 AD, every single human is either ancestor of no one alive today, ancestor of everyone alive today, or ancestor of some people alive today.

After 1200 AD, every single human is either ancestor of no one alive today, or ancestor of some people alive today.
Strictly, it is wrong to draw the above conclusions for locally-mating humans. Despite that, these figures are in fact quite plausible (if restricted to the Western world at least).
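
The random-mating model is simple enough to try out directly on a small population. The following is a toy sketch, not Chang's own code: the function name `simulate_chang`, the population of 500 and the 20 runs are my choices, and I assume each child draws its two parents independently (so occasionally the same individual twice). It tracks, for each person in a past generation, which present-day individuals descend from them, and reports how many generations back the MRCA and the ACA point are.

```python
import math
import random

def simulate_chang(n, rng, max_generations=1000):
    """One run of a toy random-mating model with constant population n.

    Returns (t_mrca, t_aca): generations back to the first common ancestor
    of everyone, and to the point where every individual is an ancestor of
    either everyone or no one in the present generation.
    """
    full = (1 << n) - 1
    # desc[i] is a bitmask of the present-day individuals descended from i.
    desc = [1 << i for i in range(n)]
    t_mrca = None
    for t in range(1, max_generations + 1):
        parents = [0] * n
        for child_mask in desc:
            if child_mask == 0:
                continue  # no living descendants, nothing to pass back
            for _ in range(2):  # two parents, chosen uniformly at random
                parents[rng.randrange(n)] |= child_mask
        desc = parents
        if t_mrca is None and any(m == full for m in desc):
            t_mrca = t
        if all(m == 0 or m == full for m in desc):
            return t_mrca, t
    return t_mrca, None

rng = random.Random(0)
n = 500
runs = [simulate_chang(n, rng) for _ in range(20)]
mean_mrca = sum(r[0] for r in runs) / len(runs)
mean_aca = sum(r[1] for r in runs) / len(runs)
print("log2(n):", round(math.log2(n), 2))
print("mean generations to MRCA:", mean_mrca)
print("mean generations to ACA point:", mean_aca)
```

The log2(n) and 1.77 figures are large-population results, so a population of a few hundred should only be expected to follow them roughly.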



In the past, you are descended from most of the population
In fact, Chang's model predicts that around 80 percent of the population before the ACA point is an ancestor of everyone alive today (and 20 percent are ancestors of no one alive today). But Chang's model has no realistic treatment of mate choice. [Rohde, 2002] has a better model of mate choice, and comes up with a more convincing figure of around 60 percent of those who survive to adulthood and have children. This is discussed below.
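
One way to see where an 80/20 split can come from (this is my gloss using a standard branching-process approximation, not a quotation from the paper): with random mating in a large constant population, each person's number of children is roughly Poisson with mean 2, since every child picks 2 parents. The probability rho that a line eventually dies out then satisfies rho = exp(-2(1 - rho)), which can be solved by simple iteration.

```python
import math

def extinction_probability(mean_children=2.0, iterations=200):
    """Smallest solution of rho = exp(-m * (1 - rho)) by fixed-point iteration.

    Assumes a Poisson number of children with the given mean; this is an
    approximation chosen for illustration.
    """
    rho = 0.0
    for _ in range(iterations):
        rho = math.exp(-mean_children * (1.0 - rho))
    return rho

rho = extinction_probability()
print("lines that die out:", round(rho, 3))       # about one in five
print("ancestors of everyone:", round(1 - rho, 3))  # about four in five
```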



--------------------------------------------------------------------------------

Computer simulations
[Rohde, 2002] has run the first ever serious computer simulation of the history of the world's genealogy.
He makes a serious attempt to model non-random mating. He sets up an abstract model of "continents", "countries" and "towns", which can be viewed not merely as geographic position but more abstractly as the pool from which one is more or less likely to choose a mate - whether that pool be geographic, religious or whatever.

He even simulates the historical growth of the world population - adjusting the birth and survival rate so that population growth matches the real numbers over the centuries from 1000 BC to 2000 AD. Interestingly, he found this made little difference to the MRCA date.

Given a reasonable choice of parameters, he estimates the MRCA for the world at c. 300 AD, with bounds of c. 150 BC to c. 800 AD.

The lowest rate of migration (and hence lowest rate of cross-breeding) he tried was: probability of leaving the "country" 0.05 percent and probability of leaving the "continent" 0.001 percent. Even with this extreme local-breeding model he still gets an MRCA for the whole world in historical times at c. 150 BC.
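
To get a feel for how local mating delays the MRCA, here is a much cruder toy than Rohde's simulation. Everything in it is an assumption of mine for illustration: the name `generations_to_mrca`, equal-sized "demes" standing in for his countries, and a single `migration` parameter giving the chance that a parent is drawn from a randomly chosen deme rather than the child's own. That parameter is not directly comparable to Rohde's migration rates.

```python
import random

def generations_to_mrca(n_demes, deme_size, migration, rng, max_generations=2000):
    """Generations back to the MRCA in a toy subdivided population.

    Returns None if no common ancestor of everyone appears within
    max_generations (which is what happens under total isolation).
    """
    n = n_demes * deme_size
    full = (1 << n) - 1
    # desc[i] is a bitmask of the present-day individuals descended from i.
    desc = [1 << i for i in range(n)]
    for t in range(1, max_generations + 1):
        parents = [0] * n
        for i, mask in enumerate(desc):
            if mask == 0:
                continue
            home = i // deme_size
            for _ in range(2):  # two parents per child
                if rng.random() < migration:
                    deme = rng.randrange(n_demes)  # parent from any deme
                else:
                    deme = home                    # parent from the home deme
                parents[deme * deme_size + rng.randrange(deme_size)] |= mask
        desc = parents
        if any(m == full for m in desc):
            return t
    return None

rng = random.Random(1)
for migration in (0.5, 0.05, 0.005):
    t = generations_to_mrca(n_demes=4, deme_size=50, migration=migration, rng=rng)
    print("migration", migration, "-> MRCA", t, "generations back")
```

With `migration=0.0` and more than one deme, the function returns None: no amount of time produces a common ancestor across a barrier that nobody ever crosses.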




Growing Artificial Societies: Social Science from the Bottom Up by Joshua M. Epstein and Robert L. Axtell.

Before the ACA point, ancestor of some means ancestor of all
Very interestingly, Rohde empirically confirms Chang's prediction that not long (only a few centuries) before the MRCA, we reach the ACA point, where everyone is either a CA (ancestor of everyone) or else their line is extinct.


Before the ACA point, you are descended from most of the population that has children
Rohde does, however, correct Chang's figure of 80 percent of people being CAs before the ACA point. He uses non-random mating, a realistic birth rate, and a model of male-female mate choice, to get a more convincing figure of around 60 percent for the percentage of people whose lines do not go extinct. This is restricted to those who have lines in the first place, i.e. 60 percent of those who survive to adulthood and have children.
In other words, if you go back before the ACA point, which may be as recent as classical times, you are descended from around 60 percent of any ancient population that has children.
