DNAlive is a tool for the analysis of structural and physical characteristics of genomic DNA. The web server implements:
Large-scale genome projects have revealed the sequences of nearly 50 eukaryotic genomes, including those of several mammals (humans among them), and many more will become available in the coming years. Analysis of this massive amount of data is revealing the location and one-dimensional structure of genes, but little information is available on their physical nature. Inside the cell, DNA is not an infinite, uniform, extended polymer in which all sequences are equally accessible. On the contrary, it displays a complex structure that depends strongly on sequence, epigenetic modifications, environmental changes, and the presence of proteins or small ligands. Sequence-dependent physical properties of the DNA fiber are crucial in modulating DNA function: for example, specific deformability or helical properties in a given region of DNA can facilitate or impair the formation of nucleosomes hundreds of base pairs away, or can affect the dimerization of two DNA-binding proteins whose binding sites may be separated by thousands of bases in sequence. In summary, sequence information needs to be complemented with physical data on the structure and dynamics of DNA fibers.
The DNAlive web server has been developed to give a complete description of the physical properties of genomic DNA in a simple way, providing data that can be easily understood by users who are not experts in structural biology. Among other things, DNAlive allows the user to:
Click here to see the table with information on all descriptors
Once a FASTA sequence is entered, the programme computes the profiles of the 29 physical properties available for the fiber. All properties are represented in a two-dimensional plot, using the UCSC Genome Browser when possible and Gnuplot otherwise (see Figure 1 and Supplementary Figure 1).
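As a rough illustration of what a per-position profile is, the sketch below reads a single-record FASTA string and computes one value per sliding window along the sequence. It is not DNAlive's code: GC content stands in for a physical descriptor because it needs no parameter table, and the function names `read_fasta` and `sliding_profile` and the window size are assumptions made for this example.

```python
def read_fasta(text):
    """Return the sequence of a single-record FASTA string (header lines are skipped)."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">")).upper()


def sliding_profile(sequence, window=4):
    """Return the GC fraction of every window of the given size along the sequence.

    GC fraction is only a stand-in descriptor; a real physical property would
    be looked up from a parameter table instead.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    profile = []
    for start in range(len(sequence) - window + 1):
        chunk = sequence[start:start + window]
        gc_count = sum(1 for base in chunk if base in "GC")
        profile.append(gc_count / window)
    return profile


if __name__ == "__main__":
    seq = read_fasta(">example\nGGCC\nAATT\n")
    for position, value in enumerate(sliding_profile(seq, window=4), start=1):
        print(position, value)
```

Each output value can be plotted against its window position to give the kind of two-dimensional profile described above.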
To combine the visualization of DNA physical properties with public annotations of the human genome, the coordinates of the input FASTA sequence can be found by running a search on our local Blat server. The user can annotate transcription factor PDB structures at specific positions of the DNA input sequence, but we have also implemented an automatic method to perform this step using the TFBS Perl library.
The average three-dimensional structure of DNA is reconstructed from sequence-dependent base step parameters derived from accurate atomistic molecular dynamics (MD) simulations, using a local adaptation of the X3DNA script. When structural information on protein-DNA complexes is available, the modeled structures in the corresponding segment are replaced by the experimental geometries, and the junctions are refined if required.
3D structures are visualized with Jmol Java applets integrated in the HTML page. All physical descriptors can be mapped onto the three-dimensional structure to help detect potential correlations between conformation, functional annotations and physico-chemical properties.
The server also includes unique tools for a rapid representation of chromatin dynamics. In extensive tests performed in our laboratory against our database of more than 100 trajectories, the method reproduced the essential deformation pattern of DNA with surprisingly high accuracy. It uses a mesoscopic Metropolis Monte Carlo algorithm in which the geometry of each base pair is defined by three local rotations (roll, tilt and twist) and three translations (slide, shift and rise), and the conformational energy is estimated from the deformation matrix using a harmonic model.
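The general idea of a Metropolis Monte Carlo sampler on a harmonic energy can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the equilibrium values (all zeros) and stiffness matrices (identity) are placeholders rather than real DNA parameters, the same move size is used for rotations and translations for simplicity, and the names `harmonic_energy` and `metropolis`, the value of `kT` and the move size are choices made for this example.

```python
import math
import random


def harmonic_energy(x, x0, stiffness):
    """E = 0.5 * dx^T K dx for one step described by 6 helical coordinates."""
    dx = [a - b for a, b in zip(x, x0)]
    return 0.5 * sum(dx[i] * stiffness[i][j] * dx[j]
                     for i in range(6) for j in range(6))


def metropolis(equilibrium, stiffness, n_sweeps=1000, kT=0.593, max_move=0.1, seed=0):
    """Sample helical coordinates with the Metropolis acceptance rule.

    equilibrium: list of 6-value equilibrium geometries, one per step
    stiffness:   list of 6x6 stiffness matrices, one per step
    Returns (final coordinates, total energy, acceptance ratio).
    """
    rng = random.Random(seed)
    coords = [list(x0) for x0 in equilibrium]   # start at equilibrium
    energies = [0.0] * len(coords)              # so every energy starts at zero
    accepted = 0
    for _ in range(n_sweeps):
        for s in range(len(coords)):
            k = rng.randrange(6)                # pick one coordinate to perturb
            trial = list(coords[s])
            trial[k] += rng.uniform(-max_move, max_move)
            e_new = harmonic_energy(trial, equilibrium[s], stiffness[s])
            delta = e_new - energies[s]
            # Metropolis rule: always accept downhill moves, and accept uphill
            # moves with probability exp(-delta / kT)
            if delta <= 0 or rng.random() < math.exp(-delta / kT):
                coords[s] = trial
                energies[s] = e_new
                accepted += 1
    return coords, sum(energies), accepted / (n_sweeps * len(coords))


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    n_steps = 3  # placeholder fragment with three steps
    coords, energy, ratio = metropolis([[0.0] * 6] * n_steps, [identity] * n_steps)
    print("total energy:", energy, "acceptance ratio:", ratio)
```

In a realistic setting the placeholder zeros and identity matrices would be replaced by sequence-dependent equilibrium geometries and stiffness matrices for each step type.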
When you open DNAlive's main page, you will see three main sections. At the top there is a help bar and at the bottom a navigation bar. In the middle is the actual page, where you can interact with the web server.
In this main window, the first thing you should do is enter a sequence. You can either use your own FASTA file or enter human genome coordinates. If you have a FASTA sequence that corresponds to a region of the human genome but you do not know its coordinates, you can run a Blat search on it.
More information at Input Help
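Coordinates on this page are written in the form `chr2:220,195,530-220,205,531`. A minimal sketch of how such a string could be split into chromosome, start and end is shown below; the function name `parse_coordinates` and the accepted pattern are assumptions made for illustration, not the server's actual input validation.

```python
import re

# Hypothetical pattern: "chr" + name, a colon, then start-end with optional commas
_COORD_PATTERN = re.compile(r"^(chr[0-9A-Za-z_]+):([\d,]+)-([\d,]+)$")


def parse_coordinates(text):
    """Split a 'chrN:start-end' string into (chromosome, start, end)."""
    match = _COORD_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid coordinate string: {text!r}")
    chromosome = match.group(1)
    start = int(match.group(2).replace(",", ""))  # drop thousands separators
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not be greater than end")
    return chromosome, start, end


if __name__ == "__main__":
    print(parse_coordinates("chr2:220,195,530-220,205,531"))
```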
If you require further information, click on any other help link.
A sequence like:
Generates the outputs:
These examples illustrate the utility of the web server by describing the gene structure of RGS11 (a regulator of G-protein signaling), MYADM (myeloid-associated differentiation marker) and SLC4A3 (a solute carrier family gene).
In the human telomeric region (16p13.3), the promoter of RGS11 shows a strong perturbation of the DNA helical force constants. Interestingly, its 5'UTR region overlaps with the 5'UTR region of a gene on the reverse strand (ITFG3). This increase in flexibility for Tilt and Shift (and decrease for Slide) is not associated with a CpG island. Coordinates: chr16:263,727-263,727
The MYADM gene has multiple (but closely spaced) annotated transcription start sites (TSS), suggesting complex regulatory mechanisms. The DNA-structure-based algorithm ProStar reveals the possible presence of a new TSS around position chr19:59,063,300. Although two CpG islands govern this region, triple-helix and quadruplex motifs are abundant and may play a key role in regulation (see Goñi et al. 2006). This is more obvious around position chr19:59,065,000
The TSS of SLC4A3 (a CpG promoter) is a nucleosome-free region. The sequences upstream of the 5'-end of genes normally show a low affinity for nucleosomes. DNAlive makes it possible, for the first time, to correlate structural parameters such as DNA stacking energies with higher-order DNA structures. A decrease in the stacking energy increases the stability of the DNA double helix. Coordinates: chr2:220,195,530-220,205,531