DNA digital data storage

DNA digital data storage is the process of encoding binary data into synthesized DNA strands and decoding it back. DNA molecules are the genetic blueprints of living cells and organisms. Although DNA data storage has attracted wide attention only recently, the idea is not new. Its origins date back to 1964–65, when Mikhail Neiman, a Soviet physicist, published his work in the journal Radiotehnika. Neiman wrote about general considerations regarding the possibility of recording, storing, and retrieving information on DNA molecules. He explained that he had taken the idea from an interview with Norbert Wiener, an American cyberneticist, mathematician, and philosopher, published in 1964.

History

Among the early examples of DNA data storage, in 2007 a device was created at the University of Arizona that used addressing molecules to encode mismatch sites within a DNA strand. These mismatches could then be read out by performing a restriction digest, thereby recovering the data.[1]

On August 16, 2012, the journal Science published research by George Church and colleagues at Harvard University, in which DNA was encoded with digital information that included an HTML draft of a 53,400-word book written by the lead researcher, eleven JPG images, and one JavaScript program. Multiple copies were added for redundancy, and 5.5 petabits can be stored in each cubic millimeter of DNA.[2] The researchers used a simple code in which bits were mapped one-to-one to bases. Its shortcoming was that it led to long runs of the same base, which are error-prone to sequence. The result showed that, besides its other functions, DNA can also serve as a storage medium, like hard drives and magnetic tape.[3]
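The shortcoming of a one-bit-per-base code can be illustrated with a minimal sketch. The fixed mapping below (0 to A, 1 to G) and the helper names are assumptions chosen for illustration, not the scheme published by Church and colleagues; the point is only that runs of identical bits become runs of identical bases.

```python
from itertools import groupby

# Hypothetical fixed mapping, one bit per base (illustration only).
BIT_TO_BASE = {"0": "A", "1": "G"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn each byte into 8 bits, then each bit into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BIT_TO_BASE[bit] for bit in bits)


def decode(strand: str) -> bytes:
    """Invert encode(): bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BIT[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def longest_run(strand: str) -> int:
    """Length of the longest run of one repeated base (homopolymer)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


strand = encode(b"\x00\xff")
print(strand)               # eight A bases followed by eight G bases
print(longest_run(strand))  # a run of zero bits becomes a run of A bases
```

With this kind of mapping, any file containing long stretches of zeros or ones (padding, blank image regions) yields long homopolymer runs, which is the weakness that later encoding schemes were designed to avoid.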

An improved system was reported in the journal Nature in January 2013, in an article led by researchers from the European Bioinformatics Institute (EBI) and submitted at around the same time as the paper of Church and colleagues. Over five million bits of data were stored, retrieved, and reproduced. All the DNA files reproduced the information with between 99.99% and 100% accuracy.[4] The main innovations in this research were the use of an error-correcting encoding scheme to ensure an extremely low data-loss rate, and the idea of encoding the data in a series of overlapping short oligonucleotides identifiable through a sequence-based indexing scheme.[3] The sequences of the individual strands of DNA also overlapped in such a way that each region of data was repeated four times to avoid errors. Two of these four strands were constructed backwards, also with the goal of eliminating errors.[4] The costs per megabyte were estimated at $12,400 to encode data and $220 for retrieval. However, it was noted that the exponential decrease in DNA synthesis and sequencing costs, if it continued, should make the technology cost-effective for long-term data storage within about ten years.[3]
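The fourfold overlap described above can be sketched as follows. The fragment length of 100 bases, the step of 25 bases, and the function names are assumptions made for this illustration; reversing every second fragment via its reverse complement is one plausible reading of "constructed backwards", not a confirmed detail of the published method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Complement every base, then reverse the strand."""
    return strand.translate(COMPLEMENT)[::-1]


def split_overlapping(sequence: str, length: int = 100, step: int = 25):
    """Cut a long sequence into overlapping fragments.

    With step = length / 4, every interior base falls into four
    consecutive fragments. Odd-numbered fragments are stored as their
    reverse complement, so two of those four copies run backwards.
    """
    fragments = []
    starts = range(0, len(sequence) - length + 1, step)
    for index, start in enumerate(starts):
        piece = sequence[start:start + length]
        if index % 2 == 1:
            piece = reverse_complement(piece)
        fragments.append((index, piece))
    return fragments


def coverage(seq_len: int, length: int = 100, step: int = 25):
    """Count how many fragments cover each position of the sequence."""
    counts = [0] * seq_len
    for start in range(0, seq_len - length + 1, step):
        for pos in range(start, start + length):
            counts[pos] += 1
    return counts


sequence = "ACGT" * 50  # 200 bases of placeholder data
fragments = split_overlapping(sequence)
print(len(fragments))             # number of overlapping fragments
print(coverage(len(sequence))[100])  # copies covering an interior base
```

In this simplified version the first and last bases of the sequence are covered fewer than four times; a real scheme would need padding or some other treatment of the ends.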

The long-term stability of data encoded in DNA was reported in February 2015, in an article by researchers from ETH Zurich. The team added redundancy via Reed–Solomon error-correction coding and protected the DNA by encapsulating it within silica glass spheres via sol-gel chemistry.[5]

In 2016, research by Church and Technicolor Research and Innovation was published in which 22 MB of an MPEG-compressed movie sequence were stored in and recovered from DNA.[6]

In March 2017, Yaniv Erlich and Dina Zielinski of Columbia University and the New York Genome Center published a method known as DNA Fountain that stored data at a density of 215 petabytes per gram of DNA. The technique approaches the Shannon capacity of DNA storage, achieving 85% of the theoretical limit. The method was not ready for large-scale use, as it cost $7,000 to synthesize 2 megabytes of data and another $2,000 to read it.[7][8][9]
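For comparison with the per-megabyte estimates quoted for the 2013 EBI study ($12,400 to encode and $220 to retrieve), the DNA Fountain figures can be normalized to the same unit. This is plain arithmetic on the numbers given in this article; the function name is hypothetical, and the two studies' costs are not necessarily measured on the same basis.

```python
def cost_per_megabyte(total_cost_usd: float, megabytes: float) -> float:
    """Normalize a total cost to US dollars per megabyte."""
    return total_cost_usd / megabytes


# DNA Fountain (2017): $7,000 to synthesize and $2,000 to read 2 MB.
fountain_write = cost_per_megabyte(7000, 2)
fountain_read = cost_per_megabyte(2000, 2)

# EBI study (2013): estimates already quoted per megabyte.
ebi_write, ebi_read = 12400.0, 220.0

print(f"write: ${fountain_write:,.0f}/MB vs ${ebi_write:,.0f}/MB")
print(f"read:  ${fountain_read:,.0f}/MB vs ${ebi_read:,.0f}/MB")
```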

In March 2018, researchers at the University of Washington and Microsoft published results demonstrating storage and retrieval of approximately 200 MB of data. The research also proposed and evaluated a method for random access to data items stored in DNA.[10][11]

Davos Bitcoin Challenge

On January 21, 2015, Nick Goldman (EBI), one of the original authors of the 2013 Nature paper, announced the Davos Bitcoin Challenge at the World Economic Forum annual meeting in Davos.[12][13] During his presentation, tubes of DNA were handed out to the audience with the message that each tube contained the private key of exactly one bitcoin, encoded in DNA. The first person to sequence and decode the DNA could claim the bitcoin and win the challenge. The challenge was set to run for three years and would close if nobody claimed the prize before January 21, 2018.[13]

Almost three years later, on January 19, 2018, the EBI announced that a Belgian PhD student, Sander Wuyts of the University of Antwerp and Vrije Universiteit Brussel, was the first to complete the challenge.[14][15] In addition to the instructions on how to claim the bitcoin (stored as plain-text and PDF files), the logo of the EMBL-EBI, the logo of the company that printed the DNA (CustomArray), and a sketch of James Joyce were retrieved from the DNA.[16]

References

  1. Skinner, Gary M.; Visscher, Koen; Mansuripur, Masud (2007-06-01). "Biocompatible Writing of Data into DNA". Journal of Bionanoscience. 1 (1): 17–21. doi:10.1166/jbns.2007.005.
  2. Church, G. M.; Gao, Y.; Kosuri, S. (2012). "Next-Generation Digital Information Storage in DNA". Science. 337 (6102): 1628. doi:10.1126/science.1226355. PMID 22903519.
  3. Yong, E. (2013). "Synthetic double-helix faithfully stores Shakespeare's sonnets". Nature. doi:10.1038/nature.2013.12279.
  4. Goldman, N.; Bertone, P.; Chen, S.; Dessimoz, C.; Leproust, E. M.; Sipos, B.; Birney, E. (2013). "Towards practical, high-capacity, low-maintenance information storage in synthesized DNA". Nature. 494 (7435): 77–80. doi:10.1038/nature11875. PMC 3672958. PMID 23354052.
  5. Grass, R. N.; Heckel, R.; Puddu, M.; Paunescu, D.; Stark, W. J. (2015). "Robust Chemical Preservation of Digital Information on DNA in Silica with Error-Correcting Codes". Angewandte Chemie International Edition. 54 (8): 2552. doi:10.1002/anie.201411378. PMID 25650567.
  6. Blawat, M.; Gaedke, K.; Hütter, I.; Chen, X.-M.; Turczyk, B.; Inverso, S.; Pruitt, B. W.; Church, G. M. (2016). "Forward Error Correction for DNA Data Storage". Procedia Computer Science. 80: 1011–1022. doi:10.1016/j.procs.2016.05.398.
  7. Yong, Ed. "This Speck of DNA Contains a Movie, a Computer Virus, and an Amazon Gift Card". The Atlantic. Retrieved 3 March 2017.
  8. "DNA could store all of the world's data in one room". Science Magazine. 2 March 2017. Retrieved 3 March 2017.
  9. Erlich, Yaniv; Zielinski, Dina (2 March 2017). "DNA Fountain enables a robust and efficient storage architecture". Science. 355 (6328): 950–954. doi:10.1126/science.aaj2038. Retrieved 3 March 2017.
  10. Organick, Lee; Ang, Siena Dumas; Chen, Yuan-Jyue; Lopez, Randolph; Yekhanin, Sergey; Makarychev, Konstantin; Racz, Miklos Z; Kamath, Govinda; Gopalan, Parikshit; Nguyen, Bichlien; Takahashi, Christopher N; Newman, Sharon; Parker, Hsing-Yeh; Rashtchian, Cyrus; Stewart, Kendall; Gupta, Gagan; Carlson, Robert; Mulligan, John; Carmean, Douglas; Seelig, Georg; Ceze, Luis; Strauss, Karin (2018-02-19). "Random access in large-scale DNA data storage". Nature Biotechnology. 36 (3). doi:10.1038/nbt.4079. ISSN 1546-1696. Retrieved 2018-09-08.
  11. Patel, Prachi (2018-02-20). "DNA Data Storage Gets Random Access". IEEE Spectrum: Technology, Engineering, and Science News. Retrieved 2018-09-08.
  12. World Economic Forum (2015-03-10), Future Computing: DNA Hard Drives | Nick Goldman, retrieved 2018-05-19
  13. "DNA storage | European Bioinformatics Institute". www.ebi.ac.uk. Retrieved 2018-05-19.
  14. "Belgian PhD student decodes DNA and wins a Bitcoin | European Bioinformatics Institute". www.ebi.ac.uk. Retrieved 2018-05-19.
  15. "A Piece of DNA Contained the Key to 1 Bitcoin and This Guy Cracked the Code". Motherboard. 2018-01-24. Retrieved 2018-05-19.
  16. "From DNA to bitcoin: How I won the Davos DNA-storage Bitcoin Challenge". Sander Wuyts. 2018-01-16. Retrieved 2018-05-19.

Further reading

  • Mardis, E. R. (2008). "Next-Generation DNA Sequencing Methods". Annual Review of Genomics and Human Genetics. 9: 387–402. doi:10.1146/annurev.genom.9.081307.164359. PMID 18576944.
  • Cole, Adam (January 24, 2013). "Shall I Encode Thee In DNA? Sonnets Stored On Double Helix?" (Download article and audio is available). National Public Radio.
  • Naik, Gautam (January 24, 2013). "Storing Digital Data in DNA". The Wall Street Journal. New York City: Dow Jones & Company. Retrieved 2012-01-25.
  • DNA Sequencing Caught in Deluge of Data. The New York Times (NYTimes.com).
  • Aron, Jacob (February 15, 2015). "Glassed-in DNA makes the ultimate time capsule". New Scientist. Retrieved February 19, 2015.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.