Information Theory for DNA Storage

At DNAalgo, we leverage information theory to make DNA storage fast and reliable.

DNA Data Storage

DNA storage refers to storing digital files (i.e., sequences of 1 and 0 bits) in synthetic DNA (a sequence of A, C, G, and T nucleobases). Every digital file can be represented as a sequence of 1s and 0s, which can be encoded (i.e., translated) into a sequence of nucleobases. DNA synthesis then takes place, and the resulting DNA strand is ready to retain the information for a very long time. When needed, the stored information is retrieved through DNA sequencing, which returns a string of nucleobases (A, C, G, T). The process of going back to the digital domain is called decoding: in a nutshell, the decoding algorithm rebuilds the original digital file, making sure that the sequencing errors are corrected.
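As a minimal sketch of the encoding and decoding steps described above, the following Python example maps every two bits to one nucleobase. The mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode_bytes` and `decode_bases` are assumptions chosen for illustration; this is not DNAalgo's encoding, and it includes no error correction at all.

```python
# Illustrative mapping only (an assumption, not DNAalgo's scheme):
# 00 -> A, 01 -> C, 10 -> G, 11 -> T
BASES = "ACGT"


def encode_bytes(data: bytes) -> str:
    """Translate bytes into a nucleobase string, two bits per base."""
    out = []
    for byte in data:
        # Walk the byte from the most significant bit pair to the least.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode_bases(strand: str) -> bytes:
    """Rebuild bytes from an error-free nucleobase string (four bases per byte)."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode_bytes(b"DNA")
    print(strand)
    print(decode_bases(strand))
```

This round trip only works when the sequenced string is identical to the synthesized one. A real decoder also has to cope with substituted, inserted, and deleted bases, which is where error correction codes come in.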

What we offer

Thanks to more than a century of cumulative experience in Error Correction Code (ECC) design, data analysis, data protection, and the development of electronic systems, DNAalgo has developed DNAssim, a full-system simulator for fast and reliable DNA storage. By leveraging the full design-space exploration enabled by DNAssim, DNAalgo can develop the most efficient Encoding/Decoding IPs for each specific combination of Synthesis and Sequencing technologies.

We build stochastic models of the storage errors associated with any Synthesis/Sequencing technology; these models can be used to run software simulations in place of long and expensive Synthesis/Sequencing experiments.
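To give a flavor of what a stochastic error model can look like, here is a minimal Python sketch of a channel that independently substitutes, inserts, or deletes bases. The function name `noisy_read`, the three probabilities, and their default values are hypothetical; they are not DNAalgo's models, and real synthesis and sequencing errors are generally not independent and identically distributed in this way.

```python
import random

BASES = "ACGT"


def noisy_read(strand, p_sub=0.01, p_ins=0.005, p_del=0.005, rng=None):
    """Return a corrupted copy of `strand`.

    Each base independently suffers at most one event:
    deletion, insertion of a random base before it, or substitution.
    All probabilities are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for base in strand:
        r = rng.random()  # uniform in [0, 1)
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_ins:
            out.append(rng.choice(BASES))  # insertion of a random base
            out.append(base)
        elif r < p_del + p_ins + p_sub:
            # substitution: replace with one of the three other bases
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)  # base read correctly
    return "".join(out)


if __name__ == "__main__":
    original = "ACGT" * 10
    print(original)
    print(noisy_read(original, rng=random.Random(42)))
```

Passing a seeded `random.Random` instance makes a simulation run reproducible, which helps when comparing two decoding strategies on the same set of corrupted reads.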

Using synthetic DNA for data storage requires two digital data processing steps: Encoding and Decoding. We combine a full set of stochastic error models with a proprietary simulator (DNAssim) to develop the most efficient IPs for both Encoding and Decoding.

Because storage errors are intrinsically statistical, a simulator is required to assess the impact of Encoding/Decoding algorithms. We have built from scratch DNAssim, a full-system simulator for DNA storage that enables a complete design exploration of Encoding/Decoding.
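The role of simulation can be illustrated with a small Monte Carlo experiment in Python. This sketch is not DNAssim: it assumes a substitution-only channel and uses a per-position majority vote across several reads as a stand-in for a real decoder. All names (`substitute`, `consensus`, `estimate_failure_rate`) and parameter values are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"


def substitute(strand, p_sub, rng):
    """Substitution-only channel: each base is replaced with probability p_sub."""
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < p_sub else base
        for base in strand
    )


def consensus(reads):
    """Per-position majority vote across equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


def estimate_failure_rate(n_reads, p_sub=0.05, length=100, trials=200, seed=0):
    """Fraction of trials in which the consensus differs from the original strand."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        original = "".join(rng.choice(BASES) for _ in range(length))
        reads = [substitute(original, p_sub, rng) for _ in range(n_reads)]
        if consensus(reads) != original:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    # Compare how often recovery fails as the number of reads grows.
    for n_reads in (1, 3, 5, 9):
        print(n_reads, estimate_failure_rate(n_reads))
```

Sweeping `n_reads` or `p_sub` in software takes seconds, whereas the equivalent laboratory experiment would require a new synthesis and sequencing run for every configuration. Insertions and deletions break the position-by-position alignment assumed here, which is one reason practical decoders are considerably more involved.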

About Us

DNAalgo is a start-up located in the Milan area, in northern Italy. A team of storage industry veterans, including mathematicians, data scientists, and engineers, is paving the way for applying the most sophisticated techniques of information theory to DNA storage. At DNAalgo we believe that data “manipulation” is the only way to make DNA storage reliable and fast enough for the storage industry; without reliability and speed, DNA storage won’t go far beyond today’s proof-of-concept stage.

Latest News

  • October 3, 2022: DNAalgo releases the second generation of its simulator, DNAssim (Release 2.0), which significantly improves performance and adds support for GPUs.

  • September 12, 2022: DNAalgo presents DNAssim at the Storage Developer Conference, showing how decoding can be accelerated through a hardware-based computation of the edit distance.

  • September 12, 2022: Alessia Marelli, DNAalgo CTO, co-presents the “Rosetta Stone Initiative” at the Storage Developer Conference.

  • August 4, 2022: DNAalgo presents “DNAssim: a full system simulator for DNA storage” at the Flash Memory Summit in the DNA Data Storage track.