The idea was developed with the help of Utopiah, Kanzure, and others in irc://irc.freenode.net/##hplusroadmap
and was added to the awesome (but sadly now historical) Seedea project, here:
where details were thrashed out by various people.
This idea depends on DNA printing technology.
EDIT: Here is the text from the Seedea page (around 2009)
The goal is to encode data as DNA, allowing vast quantities of data to be stored 'in a cupboard'. Advantage: we get massive data transfer rates by shipping DNA (e.g. using UPS).
New advances in DNA sequencing technology promise to revolutionize the fields of biology and health care. The human genome project, initiated in 1990, took just 13 years to complete at a cost of approximately $3 Bn [cite HGP]. Today, obtaining the complete sequence of an individual costs 100,000 times less at approximately $30,000, and takes approximately 1 day to obtain [cite 1000 GP]. The acceleration in the progress of DNA sequencing technology shows no sign of slowing. It is estimated that capacity increases by a factor of 100 every year [cite Richard Durbin, personal communication].
However, these phenomenal technological advances in the field of molecular biology have created a new bottleneck in the scientific discovery pipeline: the cost of data storage.
DNA can store 10^21 bits per gram (http://www.sciencemag.org/cgi/reprint/296/5567/478.pdf). This compares favorably with conventional storage at around 10^14 bits per gram (Blu-ray: 200 GB/16 g), for roughly a ten-million-fold improvement. How can we effectively utilize this awesome storage capacity?
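As a rough sanity check, the densities above can be worked through directly. This is only a sketch: it reads the two quoted figures as 10^21 and 10^14 bits per gram, and it assumes 1 GB = 10^9 bytes for the Blu-ray example. The variable names are illustrative.

```python
# Figures as quoted in the text, read as powers of ten (assumption).
DNA_BITS_PER_GRAM = 10**21
CONVENTIONAL_BITS_PER_GRAM = 10**14

# Blu-ray example from the text: a 200 GB disc weighing 16 g.
# Assumes 1 GB = 10**9 bytes and 8 bits per byte.
bluray_bits_per_gram = 200 * 10**9 * 8 // 16

ratio_vs_quoted = DNA_BITS_PER_GRAM // CONVENTIONAL_BITS_PER_GRAM
ratio_vs_bluray = DNA_BITS_PER_GRAM // bluray_bits_per_gram

print(f"Blu-ray density:       {bluray_bits_per_gram:.0e} bits/g")
print(f"DNA vs quoted figure:  {ratio_vs_quoted:.0e}x")
print(f"DNA vs Blu-ray figure: {ratio_vs_bluray:.0e}x")
```

On these assumptions the Blu-ray example alone works out to about 10^11 bits per gram, so the advantage of DNA depends noticeably on which conventional medium is taken as the baseline.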
Here we propose an alternative storage medium for long-term archival of data: DNA. We present a DNA encoding algorithm that is optimized for data recovery, outline a novel design for a microfluidic DNA sequencing chip, and describe a DNA protectant that will allow for long-term storage of DNA in ambient conditions.
The problem with 'next generation' DNA sequencing (nextGen) is that it is too good. The technologies are generating too much data too quickly. Simply put, we don't have enough hard disks to keep pace with the data storage requirements.
How do you cope with this situation?
In situ example
- Company A gathers a sample S from a living organism
- Company A studies it and produces a result R, a very large amount of data that includes specific DNA samples (original and modified)
- Company A works for a Client K that requires additional work on R, and eventually on S, by Company B
- R+S is information that needs to be shipped as fast as possible from A to B
- We encode R+S in P using our specific method and ship it to B
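To put a rough number on this scenario, the effective transfer rate of a shipped package can be estimated as capacity divided by transit time. The sketch below assumes the 10^21 bits per gram density quoted earlier, a one-gram package, and a 24-hour courier transit. All three are illustrative assumptions, and the time needed to synthesize and sequence the DNA is ignored even though it would add to the total in practice.

```python
def shipping_bandwidth_bits_per_s(grams, bits_per_gram, transit_hours):
    """Effective bandwidth of a shipped package: total bits / transit seconds.

    Ignores encoding (synthesis) and decoding (sequencing) time.
    """
    return grams * bits_per_gram / (transit_hours * 3600)


# Hypothetical example: 1 g of DNA, 10**21 bits/g, overnight (24 h) delivery.
rate = shipping_bandwidth_bits_per_s(grams=1, bits_per_gram=10**21, transit_hours=24)
print(f"{rate:.2e} bits/s")
```

Under these assumptions the package moves on the order of 10^16 bits per second, which is the sense in which shipping DNA offers "massive data transfer rates".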
Design a 'DNA encoding' that maximizes ease of reading
- the DNA encoding - lots of checksums and handling of repeat regions
- the 'DNA protectant' molecule that we use to store data at RTP (room temperature and pressure)
- design a microfluidic DNA sequencing chamber
Microfluidics is getting very cheap, so it's easy to design and print a 'chip' that will control the flow of ATCG into a reaction chamber.
- cost optimization to advertise during the difficult time of "pipe clogging" and energy cost (logistics, network congestion)
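The encoding item above (checksums plus handling of repeat regions) can be illustrated with a small sketch. It is not the encoding from the original page, which gives no details. It swaps in a rotating base-3 code, where each base-3 digit selects one of the three bases that differ from the previous base, so no base ever repeats back to back, and it appends a CRC-32 checksum to the payload. The function names, the six-digits-per-byte framing, and the start base "A" are all assumptions made for illustration.

```python
import zlib

BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def _byte_to_trits(value):
    """Return the six base-3 digits of a byte, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        value, digit = divmod(value, 3)
        trits.append(digit)
    return trits[::-1]


def encode(payload: bytes, start: str = "A") -> str:
    """Encode payload + CRC-32 as DNA with no two identical adjacent bases."""
    framed = payload + zlib.crc32(payload).to_bytes(4, "big")
    prev, out = start, []
    for byte in framed:
        for trit in _byte_to_trits(byte):
            # Choose among the three bases that differ from the previous one.
            prev = [b for b in BASES if b != prev][trit]
            out.append(prev)
    return "".join(out)


def decode(dna: str, start: str = "A") -> bytes:
    """Invert encode(); raise ValueError on malformed or corrupted input."""
    prev, trits = start, []
    for base in dna:
        if base not in BASES or base == prev:
            raise ValueError("invalid or repeated base")
        trits.append([b for b in BASES if b != prev].index(base))
        prev = base
    if len(trits) % TRITS_PER_BYTE:
        raise ValueError("length is not a whole number of bytes")
    data = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        if value > 255:
            raise ValueError("digit group does not map to a byte")
        data.append(value)
    if len(data) < 4:
        raise ValueError("too short to contain a checksum")
    payload, crc = bytes(data[:-4]), int.from_bytes(data[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload


strand = encode(b"hello")
print(strand, decode(strand))
```

The trade-off in this sketch is density: six bases per byte instead of the four that a plain two-bits-per-base mapping would need, in exchange for never producing a run of identical bases. A real design would likely add per-block checksums and error correction rather than a single CRC over the whole payload.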
Market and trends
- Familybuilder DNA on Sale: Familybuilder Introduces Low Cost Testing
- AsperBio How to send DNA samples?
- efficient data compression. One human is much like another at the genetic level. Perhaps we can simply compress the data produced by the personal genomics initiative (for example) against a 'reference' genome.
- "retarded polymerase"
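The compression idea mentioned above, storing a genome as its differences from a 'reference', can be sketched as follows. This is a deliberately minimal illustration: it assumes the sample and the reference have the same length and differ only by single-base substitutions, whereas real genomes also contain insertions and deletions. The function names are hypothetical.

```python
def diff_against_reference(reference: str, sample: str):
    """List (position, base) pairs where the sample differs from the reference.

    Assumes equal lengths and substitutions only (no insertions or deletions).
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def apply_diff(reference: str, variants):
    """Rebuild the sample from the reference plus its list of substitutions."""
    seq = list(reference)
    for pos, base in variants:
        seq[pos] = base
    return "".join(seq)


reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
variants = diff_against_reference(reference, sample)
print(variants)
print(apply_diff(reference, variants) == sample)
```

Because one human genome is much like another, the list of variants should be far shorter than the full sequence, which is what makes compressing against a shared reference attractive.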
Important related patents
- GENEART - Excellence in DNA Engineering and Processing : Gene Synthesis, Directed Evolution, Plasmid Services
- US Patent 6,472,184: Method for producing nucleic acid polymers
- Method for producing nucleic acid by Peter Hegemann
- Method for producing a synthetic gene or other DNA sequence by Richard H. Lathrop et al
Contacts who might be interested
- Paola (positive about the idea but doubtful about the ethical or moral slippery slope)
- Laurent (feedback not yet requested)
- Kenza (feedback not yet requested)
- Contacts from KAO (genetic engineering in the Bay Area)
- metaDNA doesn't really sum it up.
- DNA bank
- Molecular Storage
- DNA backup
- DNA Storage
- DNA data
- DNA Logistics
DNA is a fantastic storage medium. It has a track record of 4 billion years.
DNA, it'll store your ass off.
- Is IT ready for the Dreaded DNA Data Deluge? by Andras Pellionisz for Google, October 2008
- Information Theory and Evolution by John Avery, World Scientific Publishing Company, August 2003 (ISBN-13: 9789812384003)