Thursday, October 20, 2011

Dark matter of the genome revealed through analysis of 29 mammals

An international team of researchers has discovered the vast majority of the so-called “dark matter” in the human genome, by means of a sweeping comparison of 29 mammalian genomes. The team, led by scientists from the Broad Institute, has pinpointed the parts of the human genome that control when and where genes are turned on. This map is a critical step in interpreting the thousands of genetic changes that have been linked to human disease. Their findings appear online October 12 in the journal Nature.

Early comparison studies of the human and mouse genomes led to the surprising discovery that the regulatory information that controls genes dwarfs the information in the genes themselves. But these studies were indirect: they could infer that these regulatory sequences existed, yet could locate only a small fraction of them. These mysterious sequences have been referred to as the dark matter of the genome, by analogy with the unseen matter and energy that make up most of the universe.

This new study enlisted a menagerie of mammals – including rabbit, bat, elephant, and more – to reveal these mysterious genomic elements.

Over the last five years, the Broad Institute, the Genome Institute at Washington University, and the Baylor College of Medicine Human Genome Sequencing Center have sequenced the genomes of 29 placental mammals. The research team compared all of these genomes, 20 of which are first reported in this paper, looking for regions that remained largely unchanged across species.

“With just a few species, we didn’t have the power to pinpoint individual regions of regulatory control,” said Manolis Kellis, last author of the study and associate professor of computer science at MIT. “This new map reveals almost 3 million previously undetectable elements in non-coding regions that have been carefully preserved across all mammals, and whose disruptions appear to be associated with human disease.”

These findings could yield a deeper understanding of disease-focused studies, which look for genetic variants closely tied to disease.

“Most of the genetic variants associated with common diseases occur in non-protein coding regions of the genome. In these regions, it is often difficult to find the causal mutation,” said first author Kerstin Lindblad-Toh, scientific director of vertebrate genome biology at the Broad and a professor in comparative genomics at Uppsala University, Sweden. “This catalog will make it easier to decipher the function of disease-related variation in the human genome.”

This new map helps pinpoint the mutations most likely to be responsible for disease: those that fall in sequences preserved across millions of years of evolution but commonly disrupted in individuals who suffer from a given disease. Knowing the causal mutations and their likely functions can then help uncover the underlying disease mechanisms and reveal potential drug targets.

The scientists were able to suggest possible functions for more than half of the 360 million DNA letters contained in the conserved elements, revealing the hidden meaning behind the As, Cs, Ts, and Gs. Their analysis revealed:
  • Almost 4,000 previously undetected exons, or segments of DNA that code for protein
  • 10,000 highly conserved elements that may be involved in how proteins are made
  • More than 1,000 new families of RNA secondary structures with diverse roles in gene regulation
  • 2.7 million predicted targets of transcription factors, proteins that control gene expression

“We can use this treasure trove of new elements to revisit disease association studies, focusing on those that disrupt conserved elements and trying to discern their likely functions,” said Kellis. “Using a single genome, the language of DNA seems cryptic. When studied through the lens of evolution, words light up and gain meaning.”

The researchers were also able to harness this collection of genomes to look back in time, across more than 100 million years of evolution, to uncover the fundamental changes that shaped mammalian adaptation to different environments and lifestyles. The researchers revealed specific proteins under rapid evolution, including some related to the immune system, taste perception, and cell division. They also uncovered hundreds of protein domains within genes that are evolving rapidly, some of which are related to bone remodeling and retinal functions.

“The comparison of mammalian genomes reveals the regulatory controls that are common across all mammals,” said Eric Lander, director of the Broad Institute and the third corresponding author of the paper. “These evolutionary innovations were devised more than 100 million years ago and are still at work in the human population today.”

In addition to finding the DNA controls that are common across all mammals, the comparison highlighted areas that have been changing rapidly only in the human and primate genomes. Researchers had previously uncovered 200 of these regions, some of which are linked to brain and limb development. The expanded list – which now includes more than 1,000 regions – will give scientists new starting points for understanding human evolution.

The comparison of many complete genomes is beginning to offer a clear view of once indiscernible genomic regions, and with additional genomes, that resolution will only increase. “The power of this resource is that it continues to improve with the inclusion of more species,” said Lindblad-Toh. “It’s a very systematic and unbiased approach that will only become more powerful with the inclusion of additional genomes.”

Other Broad researchers who contributed to this work include Manuel Garber, Or Zuk, Michael F. Lin, Pouya Kheradpour, Jason Ernst, Evan Mauceli, Lucas D. Ward, Michele Clamp, Sante Gnerre, Jessica Alföldi, Jean Chang, Federica Di Palma, Mitchell Guttman, David B. Jaffe, Irwin Jungreis, Marcia Lara, Jim Robinson, Xiaohui Xie, Michael C. Zody, and members of the Broad Institute Sequencing Platform and Whole Genome Assembly Team.

This project was supported by the National Human Genome Research Institute, National Institute for General Medicine, the European Science Foundation, National Science Foundation, the Sloan Foundation, an Erwin Schrödinger Fellowship, the Gates Cambridge Trust, Novo Nordisk Foundation, University of Copenhagen, the David and Lucile Packard Foundation, the Danish Council for Independent Research Medical Sciences, and The Lundbeck Foundation.

29 Mammals Project

Identification of the functional elements in the human genome — including both coding and non-coding — is a key foundation for biomedical research. One of the most powerful ways to discover these elements is through cross-species comparisons with other mammalian genomes — in effect, deciphering evolution's laboratory notebook containing the results of 100 million years of evolution.

The mammalian genome project is an NIH-funded effort to expand the current genome coverage of the mammals (human, chimpanzee, mouse, dog, opossum) by sequencing 24 additional mammals at low coverage (2x). The goal is to create low-coverage genome assemblies and align the resulting sequence to the human genome to permit comparative genomic analysis.

The Broad Institute has sequenced 15 mammals, while two other centers have sequenced the other 9 mammals. We have developed algorithms to identify regions of sequence similarity across species, which have persisted through evolution and are indicative of genomic functionality. These regions include genes and smaller regulatory elements, such as transcription factor binding sites, which play key roles in determining the activation of genes and pathways in different cellular contexts.
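The idea of looking for sequence that has persisted across species can be illustrated with a small sketch. This is not the project's algorithm, which is not described here and presumably weights species by evolutionary distance; it is a toy version that scores each column of a made-up alignment by how many species share the same base, then reports runs of fully conserved columns. The alignment, the function names, the threshold, and the minimum run length are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def column_identity(alignment: List[str]) -> List[float]:
    """For each alignment column, return the fraction of species that
    share the most common base (gaps count against conservation)."""
    scores = []
    for column in zip(*alignment):
        bases = [b for b in column if b != "-"]
        if not bases:
            scores.append(0.0)
            continue
        most_common = max(bases.count(b) for b in set(bases))
        scores.append(most_common / len(column))
    return scores


def conserved_runs(scores: List[float],
                   threshold: float = 1.0,
                   min_length: int = 4) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals of consecutive columns
    whose score is at least `threshold` and whose length is at least
    `min_length`."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_length:
        runs.append((start, len(scores)))
    return runs


# Hypothetical four-species alignment (one row per species).
toy_alignment = [
    "ACGTTGCAAT",
    "ACGTTGCTAT",
    "ACGTTGCGA-",
    "TCGTTGCAAC",
]

print(conserved_runs(column_identity(toy_alignment)))
```

With only four rows, a perfectly matching stretch can easily arise by chance; with 29 species, the same stretch is far less likely to be a coincidence, which is the intuition behind adding more genomes.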

The mammals receiving low coverage sequence were chosen primarily to maximize the total branch length of the evolutionary tree. Emphasis was also placed on organisms that represent the diversity of the mammalian tree and, where possible, are biologically useful models.
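One way to picture the branch-length criterion is a greedy selection: repeatedly add the species that contributes the most evolutionary branch length not already covered by the species in hand. The sketch below assumes this greedy strategy and a tiny invented tree; the source does not say how the selection was actually carried out, and the species, branch lengths, and function names here are hypothetical.

```python
# Hypothetical toy tree: child -> (parent, branch length).
TREE = {
    "human": ("primates", 0.1),
    "chimp": ("primates", 0.1),
    "primates": ("root", 0.3),
    "mouse": ("rodents", 0.4),
    "rat": ("rodents", 0.3),
    "rodents": ("root", 0.5),
    "elephant": ("root", 1.0),
}


def path_to_root(node):
    """Return the (child, length) edges on the path from node to the root."""
    edges = []
    while node in TREE:
        parent, length = TREE[node]
        edges.append((node, length))
        node = parent
    return edges


def greedy_pick(candidates, already_sequenced, k):
    """Pick k candidates, each time choosing the species that adds the
    most branch length not yet covered by the chosen set."""
    covered = {child for sp in already_sequenced
               for child, _ in path_to_root(sp)}
    remaining = list(candidates)
    chosen = []
    for _ in range(k):
        def gain(species):
            return sum(length for child, length in path_to_root(species)
                       if child not in covered)
        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)
        covered.update(child for child, _ in path_to_root(best))
    return chosen


print(greedy_pick(["chimp", "mouse", "rat", "elephant"], ["human"], 2))
```

In this toy tree the chimpanzee is picked last, because nearly all of its path to the root is already covered by human; distant lineages add more new branch length per genome.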

Though low-coverage assemblies are effective for identifying features of the human genome shared across most mammals, we recognize the inherent limitations of low-coverage genome analyses. We have obtained higher-quality sequence data (6-7x coverage) from a limited set (8 of 24) of the mammals picked for low coverage, which will significantly aid in the annotation and understanding of the human genome.

For constraint elements and other datasets related to this project, files are available here.
Rhesus Macaque (Macaca mulatta)
Cow (Bos taurus)
Dog, Domestic (Canis familiaris)
Guinea Pig (Cavia porcellus)
Sloth, Two-toed (Choloepus hoffmanni)
Nine-banded Armadillo (Dasypus novemcinctus)
Kangaroo Rat (Dipodomys ordii)
Tenrec (Echinops telfairi)
Horse (Equus caballus)
Hedgehog, European (Erinaceus europaeus)
Cat, Domestic (Felis catus)
Human (Homo sapiens)
Elephant, African Savannah (Loxodonta africana)
Mouse Lemur (Microcebus murinus)
Mouse (Mus musculus)
Little Brown Bat (Microbat) (Myotis lucifugus)
Pika (Ochotona princeps)
Rabbit (Oryctolagus cuniculus)
Bushbaby (Northern Greater Galago) (Otolemur garnettii)
Chimpanzee (Pan troglodytes)
Hyrax, Rock (Procavia capensis)
Fruit Bat (Megabat, Flying Fox) (Pteropus vampyrus)
Rat (Rattus norvegicus)
Shrew, Common (Sorex araneus)
Squirrel, Thirteen-lined Ground (Spermophilus tridecemlineatus)
Tarsier (Tarsius syrichta)
Tree Shrew (Tupaia belangeri)
Dolphin, Bottlenosed (Tursiops truncatus)
Alpaca (Vicugna pacos)




About the Broad Institute of Harvard and MIT
The Eli and Edythe L. Broad Institute of Harvard and MIT was launched in 2004 to empower this generation of creative scientists to transform medicine. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods and data openly to the entire scientific community.
Founded by MIT, Harvard and its affiliated hospitals, and the visionary Los Angeles philanthropists Eli and Edythe L. Broad, the Broad Institute includes faculty, professional staff and students from throughout the MIT and Harvard biomedical research communities and beyond, with collaborations spanning over a hundred private and public institutions in more than 40 countries worldwide.

Monday, October 17, 2011

The mechanism that gives shape to life

Why don’t our arms grow from the middle of our bodies? The question is not as trivial as it seems. Vertebrae, limbs, ribs, tailbone... in just two days, all of these elements take their place in the embryo, in the right spot and with the precision of a Swiss watch. Intrigued by the extraordinary reliability of this mechanism, biologists have long wondered how it works. Now, researchers at EPFL (Ecole Polytechnique Fédérale de Lausanne) and the University of Geneva (UNIGE) appear to have solved the mystery. Their discovery was published in the October 2011 issue of the journal Science.


The spatial and temporal control of Hox gene transcription is essential for patterning the vertebrate body axis. Although this process involves changes in histone posttranslational modifications, the existence of particular three-dimensional (3D) architectures remained to be assessed in vivo. Using high-resolution chromatin conformation capture methodology, we examined the spatial configuration of Hox clusters in embryonic mouse tissues where different Hox genes are active. When the cluster is transcriptionally inactive, Hox genes associate into a single 3D structure delimited from flanking regions. Once transcription starts, Hox clusters switch to a bimodal 3D organization where newly activated genes progressively cluster into a transcriptionally active compartment. This transition in spatial configurations coincides with the dynamics of chromatin marks, which label the progression of the gene clusters from a negative to a positive transcription status. This spatial compartmentalization may be key to process the colinear activation of these compact gene clusters.


The embryo is built one layer at a time
During the development of an embryo, everything happens at a specific moment. In about 48 hours, it will grow from the top to the bottom, one slice at a time – scientists call this the embryo’s segmentation. “We’re made up of thirty-odd horizontal slices,” explains Denis Duboule, a professor at EPFL and UNIGE. “These slices correspond more or less to the number of vertebrae we have.”

Every hour and a half, a new segment is built. The genes corresponding to the cervical vertebrae, the thoracic vertebrae, the lumbar vertebrae and the tailbone become activated at exactly the right moment one after another. “If the timing is not followed to the letter, you’ll end up with ribs coming off your lumbar vertebrae,” jokes Duboule. How do the genes know how to launch themselves into action in such a perfectly synchronized manner? “We assumed that the DNA played the role of a kind of clock. But we didn’t understand how.”

When DNA acts like a mechanical clock
Very specific genes, known as “Hox,” are involved in this process. Responsible for the formation of limbs and the spinal column, they have a remarkable characteristic. “Hox genes are situated one exactly after the other on the DNA strand, in four groups. First the neck, then the thorax, then the lumbar, and so on,” explains Duboule. “This unique arrangement inevitably had to play a role.”

The process is astonishingly simple. In the embryo’s first moments, the Hox genes are dormant, packaged like a spool of wound yarn on the DNA. When the time is right, the strand begins to unwind. When the embryo begins to form the upper levels, the genes encoding the formation of cervical vertebrae come off the spool and become activated. Then it is the thoracic vertebrae’s turn, and so on down to the tailbone. The DNA strand acts a bit like an old-fashioned computer punchcard, delivering specific instructions as it progressively goes through the machine.

“A new gene comes out of the spool every ninety minutes, which corresponds to the time needed for a new layer of the embryo to be built,” explains Duboule. “It takes two days for the strand to completely unwind; this is the same time that’s needed for all the layers of the embryo to be completed.” This system is the first “mechanical” clock ever discovered in genetics. And it explains why the system is so remarkably precise.
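The figures quoted here are consistent with each other, as a quick calculation shows. The sketch below uses only the two numbers given in the article (one gene every ninety minutes, two days in total); the constant and function names are made up for illustration, and real embryonic timing is of course less tidy than integer arithmetic.

```python
MINUTES_PER_SEGMENT = 90   # "a new gene comes out of the spool every ninety minutes"
TOTAL_HOURS = 48           # "it takes two days for the strand to completely unwind"

# Number of segments the clock can lay down in the full two days.
segments = TOTAL_HOURS * 60 // MINUTES_PER_SEGMENT


def hours_until_segment(n: int) -> float:
    """Hours elapsed when the n-th segment (1-based) is completed."""
    return n * MINUTES_PER_SEGMENT / 60


print(segments)                 # segments built in two days
print(hours_until_segment(8))   # hours elapsed after eight segments
```

Two days at ninety minutes per step gives 32 steps, which matches the “thirty-odd horizontal slices” Duboule describes.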

This discovery is the result of many years of work. Under the direction of Duboule and Daniël Noordermeer, the team analyzed thousands of Hox gene spools. With assistance from the Swiss Institute for Bioinformatics, the scientists were able to compile huge quantities of data and model the structure of the spool and how it unwinds over time.

The snake: a veritable vertebral assembly line
The process discovered at EPFL is shared by numerous living beings, from humans to some kinds of worms, from blue whales to insects. The structure of all these animals – the distribution of their vertebrae, limbs and other appendages along their bodies – is programmed like a sheet of player-piano music by the sequence of Hox genes along the DNA strand.

The sinuous body of the snake is a perfect illustration. A few years ago, Duboule discovered in these animals a defect in the Hox gene that normally stops the vertebrae-making process. “Now we know what’s happening. The process doesn’t stop, and the snake embryo just keeps on making vertebrae, all identical, until the process just runs out of steam.”

The Hox clock is a demonstration of the extraordinary complexity of evolution. One notable property of the mechanism is its extreme stability, explains Duboule. “Circadian or menstrual clocks involve complex chemistry. They can thus adapt to changing contexts, but in a general sense are fairly imprecise. The mechanism that we have discovered must be infinitely more stable and precise. Even the smallest change would end up leading to the emergence of a new species.”
 
                    

Monday, March 14, 2005

Genetics and Spirituality

2004

Thanks to genetics, we now know that human beings are 99.99 percent alike; that only 0.01 percent consists of inherited characteristics; and that, for example, we still carry 223 genes similar to those of bacteria.
In addition, the results of the Human Genome Project and of the company Celera Genomics demonstrated the validity of Charles Darwin's theory of evolution, that is, life as a constant reconstruction or random recombination of its parts.

On the other hand, thanks to data from the Submillimeter Wave Astronomy Satellite and the Infrared Space Observatory, we also know that the conditions that led to the formation of life (complex molecules of carbon and water) are present in many places and not exclusively on Earth; in technical terms, the complex chemistry of carbon is not unique to our planet.

Moreover, we are now able not only to clone endangered species but also to create new ones, as was recently reported when scientists from the firms Stem Cell Sciences and BioTransplant injected, in 1999, the nucleus of a human cell into a pig egg, producing a hybrid that developed to the 32-cell stage. As early as 1998, though, the Argentine José Cibelli had managed to produce, for the first time, a human-cow hybrid.

That is, now that we know that life is not the result of any divine design, why don't we devote ourselves to rebuilding our relationship with our surroundings? Why don't we abandon, once and for all, the sacralization of faith and the bureaucrats of divinity? Why don't we strengthen a spirituality that responsibly rebuilds our role in the natural environment? Would that not be more worthwhile than continuing to insist on guilt and punishment?