July 16, 2021

“There’s a lot of buzz about quantum computing,” says Yale University researcher Mark Gerstein, whose projects traverse biology and informatics. Enthusiasm among his colleagues about the prospects of quantum computing is especially high in the physical sciences, and interest is growing in computational biology and biology more generally.

Gerstein co-authored a paper (ref. 4) that grew from a series of discussions at the National Institute of Mental Health (NIMH), part of the US National Institutes of Health (NIH). It’s part of the NIH’s way of exploring how to support biologists interested and involved in quantum computing, he says. The wider neuroscience community, for example, is interested in how quantum approaches can be applied to deep learning and machine learning.


July 16, 2021

“Ironically, a lot of these tools are about not having people sit in front of a screen all the time,” says computational biologist Mark Gerstein at Yale University in New Haven, Connecticut. “I don’t think that helps people think.” Instead, he says, researchers spawn creativity when talking and scribbling down ideas together, be that on a phone, tablet, laptop or in person.

Like Brown, Gerstein prizes face-to-face conversation and collaboration in his group, which works on large-scale analyses of biosensor and wearable data. As such, it attracts “hard-core computer geeks”, he says, so he’s thought deeply about how to entice them out from behind their screens.

“Computers now let us dictate, write and draw with our hands in much more relaxing and natural ways,” he says. Gerstein sets his phone on a nearby table and uses Google Recorder to capture discussions; the app (which is available only on Pixel phones) transcribes them in real time. The transcript is coupled to the audio and can be searched by keyword. Another dictation app, known as Rev, offers quick-turnaround manual transcriptions for $1.25 per minute of recording. Gerstein also uses the app Grammarly to “take the yucky voice-to-text transcript and fix the language up quickly”.

Gerstein describes his group’s combined use of these tools as a “stack” that takes them from conversation to a rough draft of a manuscript in just a few clicks. He estimates that the tools have halved the time the group spends on that task.

Gerstein has also investigated tools that digitally recreate the experience of scientists gathered around a whiteboard. Zoom’s Annotate feature is one option, which he has deployed during remote meetings both before and during the pandemic. Another is Rocketbook, a reusable physical notebook ($16–45) that has whiteboard-like paper paired with a mobile-phone app that converts photos of notebook scribbles, cartoons and diagrams into digital files. Both Rocketbook and Google Lens use optical character recognition to interpret handwriting and translate it into searchable text. “I’ve saved thousands of sheets of paper this way,” says Gerstein.


June 28, 2021

January 16, 2021


The new data ‘sanitization’ technique obscures regions of a participant’s genome in a dataset to secure her privacy, and may encourage more people to participate in genetic studies, says lead investigator Mark Gerstein, professor of biomedical informatics at Yale University.

“If someone hacks into your email, you can get a new email address; or if someone hacks your credit card, you can get a new credit card,” Gerstein says. “If someone hacks your genome, you can’t get a new one.”

To determine which information and how much of it should remain private to prevent a linkage attack, Gerstein and his colleagues performed linkage attacks on existing genetic datasets. In one sample attack, they compared two publicly available databases and RNA sequencing results to successfully identify 421 individuals.

In another linkage attack, Gerstein’s team sequenced the RNA of two volunteers and shuffled these data into a larger dataset. They then obtained DNA samples from the volunteers’ used coffee cups and sequenced their genomes. Again, they could link the two individuals to their genomes with a high degree of certainty.
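The matching step of such a linkage attack might be sketched as follows. This is a minimal illustration, not the team's actual method: it assumes genotypes have already been called from the RNA data, and the function names (`concordance`, `link`), the dictionary layout and the toy genotypes are all hypothetical.

```python
def concordance(a, b):
    """Fraction of shared positions at which two genotype maps agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[p] == b[p] for p in shared) / len(shared)


def link(query, database):
    """Return (name, score) for the database entry most concordant with query."""
    scored = ((name, concordance(query, genotypes))
              for name, genotypes in database.items())
    return max(scored, key=lambda item: item[1])


# Toy genome database keyed by (chromosome, position); values are made up.
database = {
    "person_A": {("chr1", 1000): "AG", ("chr2", 2500): "CC", ("chr3", 400): "TT"},
    "person_B": {("chr1", 1000): "AA", ("chr2", 2500): "CT", ("chr3", 400): "TT"},
}

# Hypothetical genotypes recovered from an "anonymous" RNA sample.
query = {("chr1", 1000): "AA", ("chr2", 2500): "CT"}

best_name, best_score = link(query, database)
```

With only a handful of positions many people would tie; the attack becomes convincing as the number of compared variants grows, which is why masking a subset of variants can blunt it.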

Based on what they learned from the mock linkage attacks, Gerstein’s team developed a technique to mask some variants from a person’s genetic data while preserving where those variants are located in the genome. To do this, they replace the genetic variant of concern with one from a reference genome; which variants are removed depends on the genetic conditions or predispositions that the person’s genetic data reveal.
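The masking idea described above can be illustrated with a short sketch. It is a simplified stand-in rather than the team's implementation: the `Variant` record, the `sensitive` flag and the `sanitize` function are hypothetical names, and real data would come from sequencing files rather than a hand-written list.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str         # allele in the reference genome
    alt: str         # allele observed in the participant
    sensitive: bool  # hypothetical flag: participant wants this variant hidden


def sanitize(variants):
    """Return a copy in which sensitive variants carry the reference allele.

    Chromosome and position are left untouched, so the location of each
    record is preserved while the revealing allele is not.
    """
    return [replace(v, alt=v.ref) if v.sensitive else v for v in variants]


variants = [
    Variant("chr1", 1000, "A", "G", sensitive=True),
    Variant("chr2", 2500, "C", "T", sensitive=False),
]
sanitized = sanitize(variants)
```

Here the participant's choice is modelled as a per-variant flag; the more variants that are flagged, the more private but less informative the resulting dataset becomes.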

Introducing too many of these privacy-masking variants can decrease the usefulness of the data. But Gerstein’s team struck a balance that lets researchers obtain data on gene-expression values while letting study participants dictate how much of their genetic information they wish to keep hidden.


December 22, 2020

Interview with Mark Gerstein, Yale University

September 20, 2020

August 26, 2020

August 9, 2020

June 2, 2020

