### Disclaimer

Authors are solely responsible for the content of their articles on PandasThumb.org. Linked material is the responsibility of the party who created it. Commenters are responsible for the content of comments. The opinions expressed in articles, linked materials, and comments are not necessarily those of PandasThumb.org. See our full disclaimer.

PvM posted Entry 416 on August 12, 2004 06:26 PM.

**Note that Dembski has uploaded a revised manuscript which now correctly attributes the measure to Renyi and thanks the many critics for their contributions**

I am not a mathematician, but let me give it a try; others can amend and revise my comments.

The Kantorovich/Wasserstein distance metric is also known by other names, such as the Dudley, Fortet-Mourier, and Mallows metric, and is defined as follows:

d_p(F,G) = \inf_{\tau_{X,Y}} \left\lbrace E \lvert X-Y \rvert^{p} \right\rbrace^{\frac{1}{p}}

where E denotes expectation and \inf means that the infimum (the greatest lower bound) is sought over all pairs of random variables (X, Y) in which X has distribution F and Y has distribution G.

Equivalently, \tau_{X,Y} is the set of all joint distributions of random variables X and Y whose marginal distributions are F and G.
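As a small worked example (my own illustration, not taken from the papers cited here), take the definition in its usual form, d_p(F,G) = (\inf E|X-Y|^p)^{1/p}, and assume F is a point mass at 0 while G puts probability 1/2 on each of 0 and 1. Because X is constant, there is only one joint distribution with these marginals, so the infimum is trivial:

```latex
% Assumed toy distributions: F = point mass at 0, G = fair coin on {0, 1}
\begin{aligned}
F &= \delta_0, \qquad G = \tfrac{1}{2}\,\delta_0 + \tfrac{1}{2}\,\delta_1 \\
E\lvert X - Y \rvert^{p} &= \tfrac{1}{2}\,\lvert 0 - 0 \rvert^{p} + \tfrac{1}{2}\,\lvert 0 - 1 \rvert^{p} = \tfrac{1}{2} \\
d_p(F,G) &= \left(\tfrac{1}{2}\right)^{1/p},
\qquad d_1(F,G) = \tfrac{1}{2},
\qquad d_2(F,G) = \tfrac{1}{\sqrt{2}} \approx 0.707
\end{aligned}
```

The same pair of distributions is therefore at a different 'distance' depending on the choice of p.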

These metrics define a ‘distance’ between two probability distributions and are among the many such metrics that have been mathematically defined. A good survey of many of them is the paper On Choosing and Bounding Probability Metrics. Different circumstances call for different distance metrics.

These metrics have found applications in non-linear equations, variational approaches to entropy dissipation, phase transitions and symmetry breaking in singular diffusion, random walks, Markov processes, and many more areas. The applications to Markov processes may be of particular interest to evolutionary theory.

Adapted from Central Limit Theorem and convergence to stable laws in Mallows distance

Another way of looking at this is by assuming one has two samples X and Y of the same size X=\lbrace x_1,…,x_n \rbrace and Y=\lbrace y_1,…,y_n \rbrace. The Mallows distance between empirical distributions is

d_p(X,Y) = \left( \frac{1}{n} \min_{(j_1,…,j_n)} \sum_{i=1}^{n} \lvert x_i - y_{j_i} \rvert^{p} \right)^{\frac{1}{p}}

where the minimum is taken over all possible permutations (j_1,…,j_n) of \lbrace 1, …, n \rbrace

Rachev, S. T. (1984), The Monge-Kantorovich problem on mass transfer and its applications in stochastics, Theor. Probab. Appl., 29, 647-676.
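To make the permutation formula concrete, here is a minimal Python sketch. It assumes the usual form of the empirical Mallows distance, with each x_i paired with a permuted y_{j_i} and the differences raised to the power p. The function names `mallows_bruteforce` and `mallows_sorted` are my own, and the sorted shortcut relies on the standard fact that, for real-valued samples and p >= 1, pairing the sorted values gives the optimal permutation.

```python
from itertools import permutations


def mallows_bruteforce(xs, ys, p=1):
    """Empirical Mallows distance by trying every pairing.

    Computes ((1/n) * min over permutations of sum |x_i - y_{j_i}|^p)^(1/p).
    The cost grows like n!, so this is only usable for tiny samples.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    best = min(
        sum(abs(x - ys[j]) ** p for x, j in zip(xs, perm))
        for perm in permutations(range(n))
    )
    return (best / n) ** (1 / p)


def mallows_sorted(xs, ys, p=1):
    """Shortcut for real-valued samples and p >= 1: pair the sorted values."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (total / n) ** (1 / p)


# Hypothetical toy samples, chosen only for illustration.
x = [0.0, 2.0, 5.0]
y = [6.0, 1.0, 3.0]
print(mallows_bruteforce(x, y, p=2), mallows_sorted(x, y, p=2))
```

For these toy samples the best pairing matches 0 with 1, 2 with 3, and 5 with 6, so both functions should agree.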

As far as some interesting applications are concerned

### Comment #6405

Posted by Pim van Meurs on August 13, 2004 11:09 AM (e)

As I am starting to ‘understand’ these issues, the Wasserstein metric induces a weak topology on the underlying space. While the original Renyi measure provides what Dembski calls “variational information”, it needs to be tied in with the nature of the actual path(s) taken.

Thus he attempts to “coordinate the variational information with the topology of the underlying probability space”.

Where is Dembski going with this? I see this as working towards a measure that may help indicate whether “probability paths” exist, probably in order to show that there are ‘irreducibly complex’ systems for which such paths are lacking. But so far I have failed to see how any of these measures would help here.

### Comment #6418

Posted by steve on August 13, 2004 3:29 PM (e)

Pim, I think the broad outline of the argument is:

1. The dead ends so vastly outnumber the workable arrangements that evolution, if it searches randomly, can’t find the workable ones quickly enough

2. Evolution randomly searches all possibilities

3. Therefore, it couldn’t have found all these workable arrangements

His critics seem to have said that 1 isn’t certain, but 2 is just plain wrong. Since lots of smart people like Wolpert say his arguments are failures, I haven’t bothered to study them in depth myself. Life is short.

### Comment #6499

Posted by hmmm on August 16, 2004 4:41 PM (e)

that paper on minimum entropy probability paths between genome families is pretty funny. (I figure you probably googled it and didn’t have the chance to read it).

those guys have a total lack of understanding of biology. their basic idea is to take a DNA sequence, compute an AGTC frequency vector, and then talk about “minimizing the entropy path integral” during a sequence of base pair substitutions en route to a destination DNA sequence.

“Minimizing the entropy path integral”?!?

where did that come from? it seems these guys really think that the overall composition is influenced *not* by horizontal transfer conditions, or, say, the rest of the genome’s base-pair content (reasonable speculations)….but by the idea that base pair frequencies stay away from an entropy maximizing (.25, .25, .25, .25) during the move from one sequence to another. DUBIOUS, to say the least. it’s more likely that base pair frequencies will track the genome content as a whole, or that local area of the chromosome.

But the best part of the whole paper is when they claim that the *advantage* of their approach is that “this allows us to compare sequences based on their composition as a whole, rather than by sequence alignment”. jeeez…as if throwing away all the order information is going to *improve* your understanding of evolutionary relationships between sequences! In these guys’ world, AGTC and ACTG are equivalent. Forget sequencing the genome, guys…let’s just go back to chargaff’s rules!
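For readers following along, the point about composition can be checked with a short Python sketch. This is only an illustration of a base-frequency vector and its Shannon entropy, not the paper's actual method. The helper names `composition` and `shannon_entropy` are made up for this example.

```python
from collections import Counter
from math import log2


def composition(seq, alphabet="ACGT"):
    """Return the frequency of each base in `alphabet`, ignoring order."""
    counts = Counter(seq.upper())
    n = sum(counts[base] for base in alphabet)
    if n == 0:
        raise ValueError("sequence contains no bases from the alphabet")
    return tuple(counts[base] / n for base in alphabet)


def shannon_entropy(freqs):
    """Shannon entropy in bits; zero frequencies contribute nothing."""
    return -sum(f * log2(f) for f in freqs if f > 0)


# Two different sequences with identical composition.
print(composition("AGTC") == composition("ACTG"))
# The uniform vector (.25, .25, .25, .25) has the maximum entropy for 4 letters.
print(shannon_entropy(composition("AGTC")))
# A single repeated base has entropy zero.
print(shannon_entropy(composition("AAAA")))
```

Any measure built only on such a vector cannot tell AGTC from ACTG, which is the objection raised above.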

### Comment #6500

Posted by Pim van Meurs on August 16, 2004 5:10 PM (e)

I have looked at the paper and its approaches seem quite interesting.

They state

The variational principle which we are formulating is that the most efficient transition between probability state vectors is the probability path which minimizes the line integral of the entropy function.

An interesting working assumption which makes sense from the perspective that an admissible path is likely not going through a stage in which the genome is at maximum entropy. Seems to me that they are looking at a genome mostly under selective pressures.

measures. Preferred functions will take into account the ‘conserved nature’ of the distributions. A distribution is least conserved when each of the t possible alphabet characters occurs with frequency 1/t and more conserved when one or two alphabet characters dominate. Two distributions should be judged closer if they share the same conserved characters than otherwise. A simple example of a function which disregards conserved nature is variational distance, the sum of differences in frequency for each letter in the distributions.

They then point out that a variety of entropy based distance measures have been proposed but most of them seem to suffer from a variety of problems.

The idea that a probability path will avoid randomizing the genome is quite interesting. Perhaps Hmmm can help us understand his objections better?
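To illustrate the remark in the quoted passage that variational distance disregards ‘conserved nature’, here is a minimal Python sketch. It follows the quoted description (the sum of absolute differences in frequency for each letter); some texts include an extra factor of 1/2, which I leave out. The frequency vectors are hypothetical values over (A, C, G, T) that I picked for the example.

```python
def variational_distance(p, q):
    """Sum of absolute frequency differences, letter by letter."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same alphabet")
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical frequency vectors over (A, C, G, T).
reference = (0.4, 0.2, 0.2, 0.2)       # A dominates
same_dominant = (0.6, 0.0, 0.2, 0.2)   # A still dominates
other_dominant = (0.2, 0.4, 0.2, 0.2)  # C dominates instead

# Both comparisons give (up to rounding) the same distance, even though only
# the first pair shares its dominant character.
print(variational_distance(reference, same_dominant))
print(variational_distance(reference, other_dominant))
```

A measure that respects conservation would, on the quoted argument, be expected to rank the first pair as closer than the second.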

### Comment #6503

Posted by Pim van Meurs on August 16, 2004 5:50 PM (e)

See also “A new distance measure for comparing sequence profiles based on path lengths along an entropy surface” by Gary Benson, which explains the motivations for this new measure.

### Comment #6400

Posted by Michael Buratovich on August 13, 2004 9:24 AM (e)

PvM,

Thanks for that explanation. I quit math after linear algebra and differential equations, which means that I am not a mathematician either. I am still trying to figure out why Dembski brought this up in his reply. I am also completely unsure why this method has more probabilistic power than the Renyi equation that was discussed. I know I’m a math idiot, but can one of you demi-gods out there help on this one?

MB