Theoretical Glycobiology: The Search for a Third Language

From Santa Fe Institute Events Wiki

The “lock and key” concept introduced into biochemistry by Emil Fischer over a century ago has now been reified by proteins (lectins) that are specific recognition molecules for highly ordered sugar multimers (glycans). This specificity permits the large array of interactions seen in morphogenesis and immunology. Glycans formed from complex arrays of sugars thus constitute a language for multicellular biology that some have designated the ‘third language of life and morphogenesis.’ The past fifty years have witnessed a substantial and growing body of work on the structural analysis and sequencing of glycans, as well as on the chemical and enzymatic synthesis of these molecules. The time is at hand for the emergence of a theoretical glycobiology and a search for the third language.

The central dogma of molecular biology describes DNA, RNA and proteins as the key molecules in biological information flow. It tends to ignore other essential biomolecules, including glycans (multimers, oligomers, or polymers of sugars), which are the most plentiful biomolecules on the planet in terms of mass. Every cell is covered with glycans, and most of the secreted proteins in higher organisms are glycosylated. In the latter half of the last century, significant progress was made in the genomic and proteomic sciences, yielding the sequencing of genomes and the characterization of proteomes of several organisms. However, this information alone has not been sufficient to decipher and define the complexity of multicellular life. In the past decade, glycans have received more attention because of their universal presence in biological systems and their involvement in intermolecular and intercellular communication in almost all biological and physiological processes, including morphogenesis, immune regulation and diseases such as cancer, inflammation and microbial infections.

Glycans are found in every organism and are used by all life forms as energy production and storage molecules (glucose, starch, glycogen) and as structural molecules (cellulose, peptidoglycans and proteoglycans). In addition, because of their structural diversity, they also act as key information molecules and chemical messengers (glycoconjugates and lectin receptors). Four nucleotides and twenty amino acid monomers form all known nucleic acids and proteins, respectively, seen in living organisms. The total number of sugar monomers in living organisms (which can exist in three-carbon to nine-carbon forms) is much larger. Unlike nucleotides and amino acids, sugars are polyfunctional. Glycans can therefore polymerize in a branched as well as a linear fashion, at a number of linkage positions and with different geometries, which endows them with significant complexity and with a higher level of structural, functional and informational sophistication and flexibility. Sugars are indirectly controlled by the genome, because monosaccharide diversity arises from conversions mediated by enzymes, which are gene products. Glycans can exist in free form or conjugated to secondary metabolites, proteins or lipids, on which they confer distinct and often crucial structural and functional properties. The number and structural diversity of sugars (and their polymers) vary greatly across the evolutionary scale (i.e., the pools of monosaccharides and their oligomeric forms overlap between lineages, but each also includes several distinct sugars).
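The combinatorial argument in the preceding paragraph can be made concrete with a rough count. The sketch below is only an illustration under stated assumptions: it counts linear chains only (branching would add further structures), it treats every linkage as a free choice between two anomeric configurations and a fixed number of acceptor positions, and the sugar alphabet size of 10 and the 4 acceptor positions are hypothetical values chosen for the example, not figures from this page. The function names are likewise invented for illustration.

```python
def linear_sequences(alphabet_size: int, length: int) -> int:
    """Number of linear chains when only the order of monomers matters
    (the case for nucleic acids and proteins)."""
    return alphabet_size ** length


def linear_glycans(monomers: int, length: int, positions: int, anomers: int = 2) -> int:
    """Number of linear oligosaccharides in a deliberately simplified model.

    Each of the (length - 1) glycosidic bonds is assumed to allow
    `anomers` configurations (alpha or beta) and `positions` acceptor
    hydroxyls on the next residue. Branched structures are ignored.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return monomers ** length * (anomers * positions) ** (length - 1)


if __name__ == "__main__":
    # Hypothetical parameters: 10 distinct sugars, 4 acceptor positions
    # (e.g. the hydroxyls at C2, C3, C4 and C6 of a hexopyranose).
    for n in (2, 3, 6):
        print(
            f"n={n}: nucleic acids {linear_sequences(4, n)}, "
            f"peptides {linear_sequences(20, n)}, "
            f"linear glycans {linear_glycans(10, n, 4)}"
        )
```

Even in this restricted model, a trimer drawn from ten sugars has more possible structures than a tripeptide drawn from twenty amino acids, because each bond contributes its own choices in addition to the choice of monomer.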

In unicellular organisms, glycans predominantly play structural roles (in the form of peptidoglycans and cell walls), protecting the cells from chemical and physical environmental stresses. However, glycans are also key determinants of multicellularity and morphogenesis, and are a major class of molecules that distinguishes prokaryotes from eukaryotes. Glycans coat the surface and extracellular matrix of every cell and are therefore the chemical keys that carry the information required for regulated intercellular interactions at the cell surface. Glycans from one cell bind to corresponding glycan-binding proteins (called lectins) on the interacting cells. This newer version of the lock-and-key paradigm can be considered a highly complex ‘Velcro,’ in which complementary molecules of several different shapes and sizes interact to exchange chemical messages. It is likely that, with the origin of multicellularity and morphogenesis, a need arose for complex intercellular (and intermolecular) communication, which was met by covalently conjugating glycans to proteins and lipids on the cell surface and on secreted molecules.

From the point of view of the chemical evolution of function, it should be noted that glycans are made of monosaccharides derived from the core metabolites glucose, fructose, and ribose. In addition, the synthesis of glycoconjugates requires activated sugar nucleotides, which use the same purines and pyrimidines found in RNA, again going back to core metabolism. Thus, the language of glycobiology makes use of components already present in the central dogma and its metabolic roots. This reflects a unity of biochemical networks and cellular function that suggests a deeper relationship. Glycans and lectins represent new classes of molecules encoding and expressing biological complexity. The structural diversity and functional importance of glycans have led some researchers to propose that glycans represent the third language of biology. Cells communicate using glycans (on the cell surface or on secreted molecules) in a chemical language that has so far been studied only by experimental scientists, and in a far from comprehensive fashion. Currently, very little is known about the information content of this newly recognized syntax. Because sugars can form both linear and branched ‘words,’ if one were to compare the glyco-code to an existing form of human language representation, one could perhaps consider depicting the language of glycans in kanji or hieroglyphic forms.
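One way to picture a branched ‘word’ is as a tree rather than a string. The sketch below is a minimal illustration, not a proposed standard: the `Residue` class and `to_string` function are hypothetical names, the bracketed output only loosely follows the condensed style commonly used for writing glycans, and the model assumes that every residue links through its C1 anomeric carbon (which does not hold for all sugars, sialic acids being one exception). The example structure is the well-known branched core of N-linked glycans.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Residue:
    """One monosaccharide in a glycan tree (hypothetical, simplified model)."""
    name: str                        # abbreviation, e.g. "Man" or "GlcNAc"
    anomer: Optional[str] = None     # "a" or "b"; None for the root (reducing end)
    position: Optional[int] = None   # parent hydroxyl this residue is attached to
    children: List["Residue"] = field(default_factory=list)


def to_string(residue: Residue) -> str:
    """Write the tree from non-reducing ends to the reducing end.

    The first child is written as the main chain; every further child
    is a branch and is wrapped in square brackets.
    """
    prefix = ""
    for index, child in enumerate(residue.children):
        text = to_string(child)
        prefix += text if index == 0 else f"[{text}]"
    # Simplification: the donor carbon is always written as 1.
    link = f"({residue.anomer}1-{residue.position})" if residue.anomer else ""
    return f"{prefix}{residue.name}{link}"


def branch_points(residue: Residue) -> int:
    """Count residues that carry more than one substituent."""
    here = 1 if len(residue.children) > 1 else 0
    return here + sum(branch_points(child) for child in residue.children)


# Core pentasaccharide of N-linked glycans: two mannose arms on a central mannose.
core = Residue("GlcNAc", children=[
    Residue("GlcNAc", "b", 4, children=[
        Residue("Man", "b", 4, children=[
            Residue("Man", "a", 3),
            Residue("Man", "a", 6),
        ]),
    ]),
])

if __name__ == "__main__":
    print(to_string(core))
    print(branch_points(core))
```

A linear sequence such as a peptide is the special case of this tree in which every residue has at most one child, which is why a plain string suffices for the first two languages but not for the third.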