Kell et al. (2003): Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era
This paper is about the scientific method and argues "that data- and technology-driven programmes are not alternatives to hypothesis-led studies in scientific knowledge discovery but are complementary and iterative partners with them".
Many fields are data-rich but hypothesis-poor. Here, computational methods of data analysis, which may be automated, provide the means of generating novel hypotheses, especially in the postgenomic era. [...] Our motivation, in part, is to understand the failure of the prevailing scientific practices to have predicted the existence of so many genes (many of them essential) that were uncovered by the systematic genome sequencing programs, and to rehearse the relative roles of inductive expression profiling methods, technology development and scientific hypothesis testing in post-genomic systems biology.
The authors write that classical genetics and classical genomics assumed that "the phenotype is caused by the genotype, not vice versa, although it is possible to infer the genotype from the phenotype". This hypothesis-driven approach failed to find "the approximately 40% of the genes that were uncovered, even in well-worked model organisms, after whole-genome sequencing methods were applied". The authors think "that the main reason is that classical molecular genetics was both reductionist and qualitative", and they explain both terms as follows:
At least two strategies for understanding complex systems can be envisaged. The reductionist view would have it that if we can break the system into its component parts and understand them and their interactions in vitro, then we can reconstruct the system physically or intellectually. This might be seen as a ‘bottom-up’ approach. The holistic approach takes the opposite view, that the complexities and interactions in the intact system mean that we must study the system as a whole. Although these ideas are far from new, such strategies are nowadays often referred to as ‘systems biology’. The molecular biology agenda was explicitly reductionist. The other chief attribute of the molecular biology of the last 50 years is that it was largely qualitative. The aim was to make statements that were either true or false.
The authors state clearly that much of biology is not hypothesis-driven science:
[E]ngineering strategies and (by extension) Systems Biology do not represent hypothesis-driven science. [...] [W]hat is meant by Professor Allen is that there is no specific hypothesis, as clearly one can always cast the hypothesis in terms of a view (‘hypothesis’) that generating such data from a specific set of samples will at least be of value. Thus, throughout, we use ‘hypothesis’ to mean a specific proposition about the behaviour of a (biological or other) system, based on a logical reasoning that leads to an experimentally verifiable prediction that is either confirmed to be consistent with it or otherwise. [...] [Epidemiology and] almost all kinds of data mining equivalently search for patterns, and generalise rules as inductive inferences from associations or patterns that occur regularly. Indeed, data mining is practically synonymous with ‘knowledge discovery’ in databases. To this extent, a significant part of the scientific discovery process involves establishing regularities of this type. [...] In biological chemistry, the development of methods for sequencing proteins and nucleic acids by Sanger or of the polymerase chain reaction by Mullis and of soft-ionisation mass spectrometric methods are three obvious examples [of hypothesis-free science that led to a Nobel Prize-winning discovery]. [...] A recent UK initiative in ‘Basic Technology’ explicitly recognises that the results of the technology development that it is promoting are not hypothesis-driven, but that excellent hypothesis-driven science could result from it.
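To make the data-mining remark in the quote above concrete, here is a minimal sketch of how a regularity might be induced from associations in data. It is not taken from the paper: the function `rule_stats`, the toy `samples` and the labels `heat_shock` and `gene_x_up` are all hypothetical, and support and confidence are simply the standard association-rule measures chosen for illustration.

```python
def rule_stats(observations, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = fraction of observations containing both sides of the rule
    confidence = fraction of observations with the antecedent that also
                 contain the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(observations)
    # Count observations that contain every antecedent item.
    n_ante = sum(1 for obs in observations if antecedent <= set(obs))
    # Count observations that contain antecedent and consequent together.
    n_both = sum(
        1 for obs in observations if (antecedent | consequent) <= set(obs)
    )
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical toy data: each set lists what was observed in one sample.
samples = [
    {"heat_shock", "gene_x_up"},
    {"heat_shock", "gene_x_up"},
    {"heat_shock"},
    {"gene_x_up"},
    {"control"},
]

support, confidence = rule_stats(samples, {"heat_shock"}, {"gene_x_up"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule that scores well here is only an induced regularity. In the spirit of the paper, it would then be turned into a specific proposition and tested experimentally.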
Further, the authors emphasize that "[b]efore the advent of reductionist molecular biology, biology was largely an observational science" and that "much of post-genomic biology is, in this sense, observational in character". They also write that "[m]odern biology rests on three major pillars - the Theory of Evolution by Natural Selection, Mendel’s Laws of Inheritance and the double helical structure of DNA" and examine "how these pillars were built and whether hypothetico-deductive or inductive reasoning was involved".
Regarding the future of biological science, the authors express the following opinion:
Intellectual activity, including that which produces patentable inventions and other outcomes commonly recognised as ‘intellectual property’, can be seen as the navigation of a complex search space or ‘landscape’ in search of ideas or material inventions that are, in a quasi-evolutionary sense, ‘better’ or ‘fitter’ than those pre-existing. The only hypotheses here, then, are that a knowledge of the landscape will help in guiding the search, and that there are tools which can improve the chances of getting to the top of Everest rather than being stuck on Snowdon.
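The Everest-versus-Snowdon image in the quote above corresponds to the familiar problem of local optima in search. The sketch below is an illustration, not something from the paper: `hill_climb`, `best_of_restarts` and the one-dimensional `landscape` are hypothetical names and toy data. It shows how a purely greedy search can stop on a small peak, and how one simple tool (restarting from several places) improves the chance of finding the highest one.

```python
def hill_climb(heights, start):
    """Greedy local search on a 1-D landscape.

    Moves to the higher neighbour until no neighbour is higher, then
    returns the index of the peak that was reached.
    """
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(heights)]
        if not neighbours:
            return pos  # single-point landscape: nowhere to move
        best = max(neighbours, key=lambda i: heights[i])
        if heights[best] <= heights[pos]:
            return pos  # no uphill step left: a local (maybe global) peak
        pos = best


def best_of_restarts(heights, starts):
    """Run hill_climb from each start and keep the highest peak found."""
    peaks = [hill_climb(heights, s) for s in starts]
    return max(peaks, key=lambda i: heights[i])


# Toy landscape (illustrative heights in metres): a small peak at index 2
# and a much higher peak at index 6.
landscape = [0, 400, 1085, 600, 200, 3000, 8849, 5000, 1000]

stuck = hill_climb(landscape, start=1)           # climbs only the small peak
best = best_of_restarts(landscape, starts=[1, 4])
print(stuck, landscape[stuck])
print(best, landscape[best])
```

Starting at index 1, the greedy climber stops on the small peak because every step away from it leads downhill. Restarts are only one of many possible search tools, and knowing more about the shape of the landscape would let the starting points be chosen less blindly.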
Finally, they mention artificial intelligence projects such as DENDRAL and METADENDRAL which "sought explicitly to enquire as to whether scientific reasoning could be mechanised".