Multi-Scale Complex Genomics (MuG)

Genomics is probably the fastest-evolving field in current science. A decade ago our main concern was to obtain the sequence (the 1D code) of the genome; today the big challenges are to determine how genotype information is transferred into phenotype, and how pathological phenotypic changes can be predicted from genome alterations. While investigating these questions, we have realized that part of the regulation of gene expression is implicitly encoded in the way chromatin is folded. As technology has advanced and information on the folded state of chromatin has become available, a new branch of genomics (3D/4D genomics) has emerged. Hundreds of laboratories now form a young and active community that, though ultimately concerned with the same scientific problem, studies it with many different approaches, each targeting radically different length and time scales. The community faces severe practical problems: i) how to integrate huge, noisy, and diverse data spanning widely different size and time scales, ii) the lack of standardized analysis and simulation tools, iii) the complete disconnection of the associated informatics databases, and iv) the lack of validated and flexible visualization engines. This timely proposal, born at a critical point in the evolution of the field and developed through a user-driven approach (with the biologists who actually suffer these IT problems), will bring the power and potential of High Performance Computing to the development of 3D/4D genomics, and help to give structure to this new and exciting field.