Scientists at Gladstone Institutes are using machine learning to target genetic disorders in so-called genomic “dark matter”.
The computational method being used, called TargetFinder, predicts where non-coding DNA – the DNA that does not code for proteins – interacts with genes. By analysing big data, researchers are able to connect mutations in genomic “dark matter” with the genes they affect, potentially revealing new targets for genetic disorders.
In the study, published in Nature Genetics, the team from Gladstone Institutes looked at fragments of non-coding DNA called enhancers, which act like an instruction manual for a gene, dictating when and where it is turned on.
“Most genetic mutations that are associated with disease occur in enhancers, making them an incredibly important area of study,” said the study’s senior author, Katherine Pollard. “Before now, we struggled to understand how enhancers find the distant genes they act upon.”
The new study revealed that, on a strand of DNA, enhancers can be millions of letters away from the gene they influence.
However, using machine learning technology, the researchers were able to analyse hundreds of existing datasets to look for patterns in the genome and identify where a gene and enhancer interact.
They discovered that when an enhancer is far away from the gene it affects, the two connect by forming a three-dimensional loop, like a bow on the genome.
“It’s remarkable that we can predict complex three-dimensional interactions from relatively simple data,” said Sean Whalen, a biostatistician at Gladstone. “No one had looked at the information stored on loops before, and we were surprised to discover how important that information is.”
The new computational approach is a much cheaper and less time-consuming way to identify gene-enhancer connections in the genome, as performing experiments in the lab can cost millions of dollars and take years of research.
The technology also gives an insight into how DNA loops form and how they might break in disease.
“Our ability to predict the gene targets of enhancers so accurately enables us to link mutations in enhancers to the genes they target,” said Pollard. “Having that link is the first step towards using these connections to treat diseases.”
Gladstone is set to offer all of the code and data from TargetFinder online for free.