Wednesday, October 24, 2007

Information comes full circle

DNA--most of us think of it in some vague way as the stuff of life, some enormous string of data encoding every characteristic of the individual wrapped around it as some impossibly long Gödelian nightmare of a quaternary number. (The human genome is about 3 billion digits long, which allows on the order of 10^1,800,000,000 unique codes, a one followed by almost two billion zeros.) It's that mystery stuff that is the blueprint for everything an organism is, from which proteins its cells express, to which organs it has and how many limbs, to its behavioral tendencies. The genome dictates how organisms process external information, which is to say that it influences everything the organism does, whether it's at a high level, with brains (the human brain has about two orders of magnitude more neurons than a DNA strand has base pairs, but unlike natural DNA, they connect with one another nonlinearly) or at a low level, bending toward the sun, say. The organism's reaction to the outside world is part of its extended phenotype, and that response to the environment may include changing the environment, creating a cyclical relationship: the code creates external information that changes the environment, which the code must then reinterpret. In the metaphysical extreme, the expression of DNA on an ecosystem level, or on an evolutionary level (over as many years as there are digits in the human genome), is a metastable means for the universe to dissipate energy into uselessness, information into meaninglessness. That sort of determinism doesn't lessen the burden on our freakish consciousness. In order to achieve that metastable species-wide state, or one of them, we still have to act, even if it's just the genes talking, and it's hard to tell where the story is going.

But at the bottom of that somewhat comprehensible tangle of life is a somewhat comprehensible tangle of sugars and phosphates. Actually, the molecule itself isn't quite so exotic as all that. The information capacity is vast, and many aspects of stringing it together remain mysterious, but it's held together according to some well-known chemical rules. It's composed of a phosphate/carbohydrate polymer chain with interacting base groups hanging off the side at regular intervals. Each of these bases is specific to only one other base, and for a single DNA strand to find a partner and make a double helix, the whole sequence of bases has to correspond in the right order. A chain of bases will only stick to its complement (A and a in the figure), rejecting all others. Even a DNA strand of only ten base units can hold about a million different combinations (4^10 = 1,048,576), each of which has only one destined partner. If you're a chemist, that's as specific a reaction as exists, and it's where the fun with DNA begins.
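The pairing rule above is simple enough to sketch in a few lines of Python. This is only a toy model: the function names and the example sequence are made up for illustration, and real hybridization depends on temperature, salt, and partial mismatches, all of which are ignored here.

```python
# Toy model of base pairing: each base sticks to exactly one other base.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the one strand that pairs with `strand` (hypothetical helper)."""
    # The partner runs in the opposite direction, hence the reversal.
    return "".join(PAIR[base] for base in reversed(strand))

def hybridizes(a: str, b: str) -> bool:
    """All-or-nothing rule: b sticks to a only if it is a's exact complement."""
    return b == reverse_complement(a)

strand = "ATGCCGTAGC"                  # an arbitrary ten-base example
partner = reverse_complement(strand)

print(partner)                         # the single "destined partner"
print(hybridizes(strand, partner))     # the complement sticks
print(hybridizes(strand, strand))      # anything else is rejected
print(4 ** len(strand))                # number of distinct ten-base strands
```

With four possible bases at each of ten positions, the last line counts the "about a million" combinations mentioned above.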


Over the last couple of decades, artificial synthesis techniques have become good enough to manufacture short DNA strands (called oligonucleotides) in a controllable way and in high quantity. Such an oligomer can capture a small and characteristic portion of a longer chain, and this specificity can be used to test for the presence of longer chains of interest. Oligonucleotides can be anchored (more chemistry), so that one end is permanently attached to a solid surface such as gold or glass. The sequence of the oligonucleotide is engineered to complement a small portion of a DNA strand of interest, for example from a known pathogen. An assay can be constructed to detect that target DNA. To do this, a solution which is thought to contain the molecule of interest (the analyte) is passed over the immobilized DNA oligomer, which will grab the target sequence and anchor its whole strand to the surface. A solution of a second oligonucleotide, which is complementary to another portion of the target molecule, is then passed over the surface. Instead of being anchored to a surface, this second molecule pulls along a label at its end, which can be viewed with instrumentation. If the label is detected on the surface after it has been exposed to the analyte and the reporting molecule, then the test is positive. Chad Mirkin's group at Northwestern is using tiny metal particles as labels in this type of assay, which when properly treated are visible to the naked eye or can be measured electrically. The goal is to make spot tests for a variety of dangerous or important microorganisms.
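The logic of that two-probe assay can be sketched as string matching. This is a rough illustration under loud assumptions: the sequences are invented, `binds` and `sandwich_assay` are hypothetical names, and binding is treated as exact matching rather than real chemistry.

```python
from typing import Sequence

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    return "".join(PAIR[base] for base in reversed(strand))

def binds(probe: str, strand: str) -> bool:
    """Toy rule: a probe binds if its exact complement occurs in the strand."""
    return reverse_complement(probe) in strand

def sandwich_assay(sample: Sequence[str], capture: str, reporter: str) -> bool:
    """Positive only if some strand is held by the anchored capture probe
    and also picks up the labeled reporter probe."""
    # Step 1: the surface keeps only strands the capture probe grabs;
    # everything else is rinsed away.
    captured = [s for s in sample if binds(capture, s)]
    # Step 2: the label stays on the surface only if the reporter
    # finds its own site on a captured strand.
    return any(binds(reporter, s) for s in captured)

# Invented sequences, for illustration only.
target = "TTGACCGTAAGGCTTACGATCGGATCCA"
decoy = "AAAAACCCCCGGGGGTTTTTAAAAACCC"
capture = reverse_complement(target[2:10])     # complements one stretch
reporter = reverse_complement(target[18:26])   # complements another stretch

print(sandwich_assay([decoy, target], capture, reporter))  # target present
print(sandwich_assay([decoy], capture, reporter))          # target absent
```

Requiring two independent probes to match the same strand is what makes the test selective: a strand that happens to match only one of them produces no signal.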


You can further manipulate oligonucleotides to make branched structures, as shown in the third figure. Here, the single strands are designed so that they complement only part of another strand. If you arrange the pattern just so (and if you have an army of grad students with brilliant lab skills), then you can make branched DNA structures. If you arrange the overlap of sequences just a little differently, then you can make branched structures with overhanging sticky ends, each of which will bind only to complementary sticky ends, much like in the assay experiment. The four-armed beastie in the lower figure will bind to itself, and should make a random sort of network (and will probably turn the mixture into a little vial of snot). You're not restricted to three or four arms, and you can have any variety of sticky ends. People have made closed shapes, three-dimensional objects, and "tiles", which spontaneously form as each sticky end finds its own soulmate.


Self-assembled DNA structures are pretty cool, but they aren't just laboratory curiosities. If a DNA code is used as a code, that is, used by people to store information, then self-assembled DNA structures can be used to read it, pinning objects of known shape or with known labels onto a more complicated molecular scaffold, with the sticky end of the known unit attracted to a piece of the code DNA.

You can go further than this and imagine branched DNA structures as a series of jigsaw tiles, such that each edge is attractive only to certain other edges. In this situation, the identity of a given tile depends on the identity of adjacent tiles. Given an initial soup of molecular Legos, or a given "input" series that starts concatenating a pattern, tile structures can emerge such that unique structures assemble from given inputs. Known inputs-->organization-->reproducible outputs. You've made a computer. It doesn't necessarily matter how that organization occurs, and you may not even need to know what edges you have in the mix of tiles, but Wang's carpet has just done your thinking for you. Molecular computers using DNA have already been shown to solve known mathematical problems that are computationally intensive by conventional means. Can any pile of jelly hold a thought? I've got to go with yes, Stan.
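To make "known inputs, organization, reproducible outputs" concrete, here is a minimal sketch of tile assembly as computation. The matching rule is my own choice for illustration: each new tile must fit the exposed edges of the two tiles above it, and the tile that fits carries the XOR of their labels. The function names are hypothetical, and nothing here models actual DNA chemistry.

```python
def grow_row(previous):
    """Add one layer of tiles. Each new tile sits below two neighbors, and
    the only tile that fits carries the XOR of their edge labels."""
    padded = [0] + previous + [0]      # the bare edge of the crystal reads as 0
    return [padded[i] ^ padded[i + 1] for i in range(len(padded) - 1)]

def assemble(seed, rows):
    """Start from a seed row (the "input") and let layers accrete."""
    layers = [seed]
    for _ in range(rows):
        layers.append(grow_row(layers[-1]))
    return layers

# A single labeled seed tile grows into a nested triangular pattern.
for layer in assemble([1], 7):
    print("".join("#" if tile else "." for tile in layer).center(16))
```

The same seed always produces the same structure, and a different seed row produces a different one, which is the sense in which the pile of tiles computes: the local fitting rule is the program and the finished pattern is the output.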

I've been entertaining myself lately by thinking of the properties of the universe as purely informational. Entropy can be considered as the availability of alternate states of being for any datum you can name, and the second law of thermodynamics says that nature pushes for all states to eventually become interchangeable, and therefore meaningless. It may be more accurate to say, however, that information is a property of the universe. Information requires the existence of the stuff of nature configured a certain way--if information is encoded in atoms, then the nuclear force and electrical forces control the arrangement of its components. If it's stored in the motion of the spheres, then it's masses arranging themselves gravitationally; if it's in molecules, the thoughts in your head, or the words on your screen, then dynamic patterns of electrical motion are holding your information. Human thoughts (and computers too) may in a broad sense be considered expressions of a few billion molecular codes, each a few billion spaces long. The human genome has reached an oddly recursive point where it can manipulate itself as well as its environment. Information comes full circle.

