2 Comments
Aug 31·edited Aug 31

Interesting and thought-provoking. I agree with the author that one of the key distinguishing questions of predictive biology is:

Can the outcome of an experiment Y be predicted from observable features X?

However, if this is the question that drives predictive biologists, then the following statement cannot be true:

"Predictive Biologists are more concerned with measuring the mutual information between two biological phenomena than they are with measuring direct causality."

Please let me explain why.

If I have two molecules A and B that have high mutual information, and I perform two experiments in which I separately perturb A and B, there are four potential outcomes:

1. A changes when B is perturbed, but B does not change when A is perturbed.

2. B changes when A is perturbed, but A does not change when B is perturbed.

3. A does not change when B is perturbed, and B does not change when A is perturbed.

4. A changes when B is perturbed and B changes when A is perturbed.

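I think you would agree that predictions based on mutual information alone cannot distinguish among these four outcomes: mutual information is symmetric in A and B, so it carries no information about the direction of an effect. But I would claim that predictions based on combining mutual information with causal information can.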
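To make this concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything from the article: three hypothetical linear-Gaussian wiring diagrams ("A->B", "B->A", and a shared upstream regulator C in "A<-C->B"), a correlation of 0.8, and a perturbation modeled as adding 1.0 to the perturbed molecule. The mutual information is estimated with the closed form for a bivariate Gaussian, I = -0.5 * ln(1 - r^2). The feedback case (outcome 4) is left out to keep the sketch short.

```python
import math
import random

RHO = 0.8  # assumed correlation between A and B, identical in every model


def simulate(model, n, rng, shift_a=0.0, shift_b=0.0):
    """Draw n samples of (A, B) from a hypothetical linear-Gaussian model.

    shift_a / shift_b emulate a perturbation: the shift is added to the
    perturbed variable and propagates only to variables downstream of it.
    """
    k = math.sqrt(1 - RHO ** 2)
    a_vals, b_vals = [], []
    for _ in range(n):
        e1, e2, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        if model == "A->B":
            a = e1 + shift_a
            b = RHO * a + k * e2 + shift_b
        elif model == "B->A":
            b = e1 + shift_b
            a = RHO * b + k * e2 + shift_a
        elif model == "A<-C->B":  # C is an unobserved common cause
            a = math.sqrt(RHO) * c + math.sqrt(1 - RHO) * e1 + shift_a
            b = math.sqrt(RHO) * c + math.sqrt(1 - RHO) * e2 + shift_b
        else:
            raise ValueError(f"unknown model: {model}")
        a_vals.append(a)
        b_vals.append(b)
    return a_vals, b_vals


def mean(xs):
    return sum(xs) / len(xs)


def gaussian_mi(a, b):
    """Mutual information in nats, assuming (A, B) is bivariate Gaussian."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    r = cov / math.sqrt(var_a * var_b)
    return -0.5 * math.log(1 - r * r)


def summarize(model, n=50_000, seed=0):
    """Observational MI plus the mean shift seen in each perturbation experiment."""
    rng = random.Random(seed)
    a0, b0 = simulate(model, n, rng)
    _, b_after = simulate(model, n, rng, shift_a=1.0)  # perturb A, read B
    a_after, _ = simulate(model, n, rng, shift_b=1.0)  # perturb B, read A
    return {
        "mi": gaussian_mi(a0, b0),
        "b_shift": mean(b_after) - mean(b0),
        "a_shift": mean(a_after) - mean(a0),
    }


if __name__ == "__main__":
    for model in ("A->B", "B->A", "A<-C->B"):
        s = summarize(model)
        print(f"{model:8s} MI={s['mi']:.3f}  "
              f"B shift when A perturbed={s['b_shift']:+.3f}  "
              f"A shift when B perturbed={s['a_shift']:+.3f}")
```

Under these assumptions, all three diagrams should give essentially the same mutual information, yet they correspond to outcomes 2, 1, and 3 respectively once the perturbations are applied. The only thing that differs between them is the wiring, which is exactly the part the observational data does not reveal.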

What is causal information? It turns out that the systems biology wiring diagrams assembled from arduously obtained molecular biology experiments provide precisely the causal assumptions needed to distinguish among the four potential outcomes.

In other words, without the causal assumptions encoded in those systems biology models, data-driven machine learning alone is insufficient to succeed in predicting the outcome of an unknown experiment.

Therefore, I would suggest that predicting the outcome of an unknown experiment is fundamentally a causal estimation problem, not a machine learning prediction problem.


Do you think there are any limits to predictive biology (i.e., subfields of molecular biology that are fundamentally not amenable to this approach)? Similarly, do you have any predictions for subfields that will soon be transformed by predictive biology but haven't been yet?
