Electricity and CRISPR are used to write data to bacterial DNA

False color image of bacteria

In recent years, researchers have used DNA to encode everything from an operating system to malware. Rather than being a technological curiosity, these efforts were serious attempts to exploit DNA's properties for the long-term storage of data. DNA can remain chemically stable for hundreds of thousands of years, and we're unlikely to lose the technology to read it, something you cannot say about things like ZIP drives and MO disks.

But so far, writing data to DNA has involved converting the data to a series of bases on a computer and then having that sequence made on a chemical synthesizer; living things are not really in the picture. Separately, a group of researchers had worked out how to record biological events by modifying a cell's DNA, so that the history of the cell could be read back later. A group at Columbia University has now figured out how to combine the two efforts and write data to DNA using voltage differences applied to live bacteria.

CRISPR and data storage

The CRISPR system has been developed into a means of editing genes or cutting out DNA entirely. But the system first came to the attention of biologists because it inserted new sequences into DNA. For all the details, see our Nobel coverage; for now, just know that part of the CRISPR system involves identifying DNA from viruses and inserting copies of it into the bacterial genome, so that the virus can be recognized if it ever appears again.

The group at Columbia figured out how to use this to record a memory in bacteria. Suppose you have a process that activates genes in response to a specific chemical, such as a sugar. The researchers redirected it so that it activated a system that makes copies of a circular piece of DNA called a plasmid. Once the copy number was high, they activated the CRISPR system. Under those conditions, it was very likely that a copy of the plasmid DNA would be inserted into the genome. When the sugar was not present, CRISPR would usually insert something else.

Using this system, it is possible to tell whether a bacterium had been exposed to the sugar in the past. It is not perfect, as the CRISPR system does not always insert something when you want it to, but it works on average. So you only need to sequence enough bacteria to determine the average sequence of events.

To adapt it for data storage, the researchers used two plasmids. One is the same as described above: present at low levels when a specific signal is absent, and present at very high levels when the signal is around. The second is always at moderate levels. When CRISPR was activated, it tended to insert sequences from whichever plasmid was at higher levels, as shown in the diagram below.

On the left, without any signal, the red plasmid is at low levels. When CRISPR is activated, the sequence of the blue plasmid is more likely to be inserted into the genome. On the right, when the signal is present, there are many more red plasmids, so the red sequence is more likely to be inserted into the genome.

John Timmer

On its own, that stores only a single bit. But the process can be repeated, creating a stretch of DNA made up of a series of inserts derived from the red and blue plasmids, with the identity of each insert determined by whether the signal was present at the time.
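The biased insertion described above can be sketched as a small simulation. This is only an illustration: the copy numbers, the function names, and the majority-vote readout are assumptions made for the example, not values or methods reported by the researchers.

```python
import random

def insert_one(red_copies, blue_copies, rng):
    """Pick which plasmid supplies the insert, weighted by copy number."""
    p_red = red_copies / (red_copies + blue_copies)
    return "R" if rng.random() < p_red else "B"

def record_round(signal_on, n_cells, rng):
    # Hypothetical copy numbers: red is scarce without the signal and
    # abundant with it; blue always sits at a moderate level.
    red = 90 if signal_on else 10
    blue = 50
    return [insert_one(red, blue, rng) for _ in range(n_cells)]

def read_bit(inserts):
    """Call the bit for a population: 1 if red inserts are the majority."""
    return 1 if inserts.count("R") > len(inserts) / 2 else 0

rng = random.Random(0)  # fixed seed so the sketch is repeatable
with_signal = record_round(True, 10_000, rng)
without_signal = record_round(False, 10_000, rng)
print(read_bit(with_signal), read_bit(without_signal))
```

No single cell is reliable in this model, but averaging over many cells makes the population-level call stable, which is the same reason the researchers sequence many bacteria.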

It’s a shock

It’s a neat system, but quite far removed from the kind of things we normally associate with data production; the output of a sensor reading or calculation is rarely a sugar or antibiotic mixed with a bunch of bacteria. But making bacteria respond to an electrical signal turns out to be relatively simple. E. coli is able to alter the activity of genes depending on whether it is in an oxidizing or reducing chemical environment. And the researchers can change that environment by applying voltage differences to a specific chemical in the culture with the bacteria.

More specifically, the voltage difference changes the oxidation state of a chemical called ferrocyanide. This leads the bacteria to change the activity of genes. By designing the plasmid so that it responds to the same signal as these genes, the researchers were able to control the plasmid levels by applying different voltages. They were then able to record the level of the plasmid by activating the CRISPR system in these cells.

It’s pretty easy to see how each of the inserts in a series can be considered a zero or one, depending on the identity of the insert. But remember that this system is not perfect; fairly frequently, CRISPR would not insert anything when activated, which would shift all subsequent bits. Since this process is random, the longer the series of bits you are trying to encode, the greater the chance that at least one of them will be skipped.
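To make the scaling concrete, here is a minimal sketch of how a skipped round shifts the record and how the chance of at least one skip grows with length. It assumes each round fails independently with the same probability; the 0.1 failure rate and the function names are made up for illustration, not measured values.

```python
def record(bits, skipped_rounds):
    """Inserts actually laid down when the given rounds (0-based) fail."""
    return [b for i, b in enumerate(bits) if i not in skipped_rounds]

def p_any_skip(p_skip, n_bits):
    """Chance that at least one of n_bits independent rounds inserts nothing."""
    return 1 - (1 - p_skip) ** n_bits

# Skipping the middle round of 1, 0, 1 leaves a shorter, shifted record.
print(record([1, 0, 1], {1}))

# p_skip = 0.1 is a hypothetical per-round failure rate.
for n in (1, 3, 8):
    print(n, round(p_any_skip(0.1, n), 3))
```

Under these assumptions the failure chance climbs quickly with the number of rounds, which is one way to see why short series are attractive.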

To limit this problem, the researchers kept their data to three bits per bacterial population. Even then, they had to train a supervised algorithm to reconstruct the most probable series of bits, based on the average of the series found in the population. And even with that, the system failed to recognize the series of bits about six percent of the time. Eventually, they decided to use a parity bit that was the sum of the first two bits to allow error correction, and then they edited many populations in parallel.
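A parity scheme of this kind can be sketched in a few lines. Two things here are assumptions rather than details from the study: that "sum" means the sum modulo 2 (so the result is still a single bit), and the way text is split into two-bit chunks, one chunk per population. Note that a single parity bit on its own only flags a corrupted triple; how the researchers turned that into a correction is not described here.

```python
def add_parity(b0, b1):
    """Return the two data bits plus a parity bit: their sum modulo 2."""
    return (b0, b1, (b0 + b1) % 2)

def parity_ok(triple):
    """True if the third bit matches the parity of the first two."""
    b0, b1, p = triple
    return (b0 + b1) % 2 == p

def text_to_triples(text):
    """Hypothetical layout: split ASCII text into 2-bit chunks plus parity."""
    bits = [int(c) for byte in text.encode("ascii") for c in format(byte, "08b")]
    return [add_parity(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]

triples = text_to_triples("Hi")   # 16 data bits, so 8 three-bit populations
corrupted = (1, 1, 1)             # a triple with one flipped bit
print(len(triples), all(parity_ok(t) for t in triples), parity_ok(corrupted))
```

With this layout every character costs four populations, which gives a feel for why many populations had to be edited in parallel.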

(By giving each population’s plasmids a unique sequence called a ‘barcode’, it was possible to mix many of them into a single population after the bits were encoded and to untangle everything once the DNA was recovered.)
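Sorting a mixed pool back into its populations amounts to grouping reads by barcode. The sketch below assumes each sequencing read has already been reduced to a (barcode, bits) pair, and it swaps in a plain majority vote where the researchers used a trained algorithm; the barcodes and bit patterns are invented for the example.

```python
from collections import Counter, defaultdict

def demultiplex(reads):
    """Group (barcode, bits) reads by their barcode."""
    by_barcode = defaultdict(list)
    for barcode, bits in reads:
        by_barcode[barcode].append(bits)
    return by_barcode

def consensus(reads):
    """Most common bit string per barcode (a simple majority vote)."""
    return {bc: Counter(group).most_common(1)[0][0]
            for bc, group in demultiplex(reads).items()}

# Made-up reads; "01" stands in for a cell that skipped one insert.
reads = [("AAC", "011"), ("GTT", "101"), ("AAC", "011"),
         ("AAC", "01"), ("GTT", "101")]
result = consensus(reads)
print(result)
```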

With everything in place, they successfully stored and read “Hello world!” They even put the bacteria in potting soil for a week and showed that they could still recover the message. (Storing it in the freezer, of course, works better.) They estimate that the message could be preserved for at least 80 generations of bacteria.

Let’s be clear: as a storage medium, in its current form, it’s pretty awful. If you want to put data into DNA, it would be better to have the DNA synthesized chemically. But it’s interesting to think that we can go directly from electrical signals to altered DNA, and there may be some ways to improve the system now that it’s set up.

Nature Chemical Biology, 2021. DOI: 10.1038/s41589-020-00711-4.
