Google matrix analysis of DNA sequences
by V.Kandiah and D.L.Shepelyansky
arXiv:1301.1626[q-bio.GN]






DNA Google matrix
(left: top 200 x 200 PageRank matrix elements for Homo sapiens for words of 6 letters; right: PageRank index proximity diagram of Canis familiaris vs. Homo sapiens)

      
Article download (here)
One DNA string of ACGT data for Bos taurus 834 Mb (here)
One DNA string of ACGT data for Canis familiaris 724 Mb (here)
One DNA string of ACGT data for Danio rerio 421 Mb (here)
One DNA string of ACGT data for Homo sapiens 4.4 Gb (here)
One DNA string of ACGT data for Loxodonta africana 960 Mb (here)
The above strings were generated from Ensembl release 62
(taken from ftp://ftp.ensembl.org/pub/release-62/genbank/ of http://www.ensembl.org/)
Each species string is cut into several intervals stored in separate files; the whole string is obtained by joining the files in order of increasing index
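The junction of the files can be done with any concatenation tool. The following is a minimal Python sketch, not the authors' own script. It assumes that each part file carries its index as the last integer in its file name and that the parts can be joined byte for byte with no separator. The function names `part_index` and `join_parts` and the file names in the usage example are hypothetical. The index is compared numerically so that part 10 follows part 2 rather than part 1.

```python
import re
import shutil
from pathlib import Path


def part_index(path):
    """Return the last integer in the file name, used as the sort key.

    Assumption: the part index is the last run of digits in the name.
    """
    numbers = re.findall(r"\d+", path.stem)
    if not numbers:
        raise ValueError(f"no index found in file name: {path.name}")
    return int(numbers[-1])


def join_parts(part_paths, output_path):
    """Concatenate the part files into output_path in increasing index order."""
    ordered = sorted((Path(p) for p in part_paths), key=part_index)
    with open(output_path, "wb") as out:
        for part in ordered:
            # Stream each part so multi-gigabyte strings never sit in memory.
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return ordered
```

A possible call, with hypothetical file names, would be `join_parts(Path("downloads").glob("homo_sapiens_*.txt"), "homo_sapiens_full.dna")`. The output name should not match the input pattern, otherwise a second run would treat the joined file as a part.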
This paper and web page should be cited if these data are used

This webpage was created on January 9, 2013 and is maintained
by V.Kandiah and D.L.Shepelyansky