Posts Tagged ‘from’

Highlights of 2016

March 26, 2017

I recently had to complete my 2016 Faculty Activity Report (FAR), summarizing key lab “activities” of the past year.

* Here are dump directories with some excerpts:–cv57/–cv57/

These include:

* A full updated CV describing my lab’s activities (in too much detail):–cv57/M-Gerstein-Public-CV–bld1dec16.cv57.pdf

The CV is based on :

– Compiling the people in the lab, viz:–cv57/cv57-22-bld1dec16-AdaptedFrom–Gerstein_Lab_Personnel_112016.pdf–cv57/cv57-23-bld1dec16–EditOn–Corrected-Past-Postdoctoral-Associates-and-Fellows.notrkchg.pdf

– A dump up to the end of ’16 of all of our scientific papers and our “other writings” too.–cv57/cv57-26-bld1dec16–addendum-Rest–Just-Other-Writings.pdf–cv57/cv57-30-bld1dec16–papers-simple–reformatted.pdf

– There’s also an update on lectures in ’16:

* Finally, I’ve done a little write-up of some highlights, viz:

During 2016 the lab had a number of research highlights. We have published three interlinked tools, Stress, Frustration, and Intensification, for assessing the impact of rare genomic variants using knowledge of molecular structure. The tools are of particular interest to the medical genetics community because they can help explain various cancer mutations as well as variants associated with genetic diseases. Another highlight is our publication of a framework for quantifying the privacy risks that arise from linking clinical and phenotype variables. This paper is timely given the ongoing debate on data sharing. Apart from these works, we published a few research papers on topics in genomics, such as analyzing allele-specific binding and gene expression, and several review articles on the role of non-coding variants, network comparison, and the cost of sequencing.

Regarding service, I worked on further developing the computational biology program at Yale. In particular, I co-chaired a committee about moving toward a Center for Biomedical Data Science at the Medical School. My lab served the research community by participating in many consortiums, such as PCAWG (the Pan-Cancer Analysis of Whole Genomes project), the ENCODE consortium, PsychENCODE, 1000 Genomes’ structural variation group (and its follow-ons), and the Extracellular RNA Communication Consortium. In 2016, I gave talks and participated in many meetings, including an important data-science education forum at the Cold Spring Harbor Laboratory.

Regarding teaching, I further developed my course in Bioinformatics by including more practical hands-on materials. For example, we introduced a collaborative programming assignment utilizing the GitHub site.

(Private link, with authentication only for my reference:

For reference, this involved updating a variety of places on the wiki, viz:

Pseudogene derived chimeric CEL-HYB protein

March 14, 2017

Recombined allele of..lipase gene CEL & its #pseudogene CELP confers susceptibility to..pancreatitis CEL-HYB chimera

Is American Pet Health Care (Also) Uniquely Inefficient?

March 11, 2017

Is American Pet Health Care (Also) Uniquely Inefficient? High correlation betw. #healthcare costs for people & pets

NYTimes: The Compost King of New York

March 9, 2017

The Compost King Interesting #biogas business model for trash, revolving around the different grades of garbage

New Haven Crime Rates By Neighborhood

March 5, 2017

Nice demographic #maps (property values, crime &c) of NHV + other cities Related RedZone app

Avoid navigating through dangerous areas
RedZone Map by Zone Technologies Inc.

But if you have to go there keep your finger on the button
SafeTrek – Hold Until Safe℠ by SafeTrek, Inc.

Transmissible Dog Cancer Genome Reveals the Origin and History of an Ancient Cell Lineage | Science

March 4, 2017

Transmissible Dog Cancer Genome Reveals…History of an Ancient Cell Lineage After 11k yrs 2M SNVs & 646 genes KO’ed

Elizabeth P. Murchison, David C. Wedge, Ludmil B. Alexandrov, Beiyuan Fu, Inigo Martincorena, Zemin Ning, Jose M. C. Tubio, Emma I. Werner, Jan Allen, Andrigo Barboza De Nardi, Edward M. Donelan, Gabriele Marino, Ariberto Fassati, Peter J. Campbell, Fengtang Yang, Austin Burt, Robin A. Weiss, Michael R. Stratton


Science 24 Jan 2014:
Vol. 343, Issue 6169, pp. 437-440
DOI: 10.1126/science.1247167

bioRxiv statistics

March 4, 2017

#bioRxiv: a progress report Great stats on the archive’s 1st years: 134 days from deposit until journal publication


“The median interval is 134 days. Authors choose to post preprints at a variety of times in the publication cycle of a manuscript, ranging from first draft to simultaneous submission of a completed paper at bioRxiv and a journal. bioRxiv declines papers that have been published or already assigned a journal DOI.”

TP53 copy number expansion is associated with the evolution of increased body size and an enhanced DNA damage response in elephants | eLife

March 4, 2017

TP53 copy number expansion is associated w…enhanced DNA damage response in elephants 18 p53 retro- & pseudo- genes

IOT asthma inhaler

March 2, 2017

Inferring chromatin-bound protein complexes from genome-wide binding assays – Genome Research

February 26, 2017

Inferring [w. NMF] chromatin-bound protein complexes [of TFs] from [ENCODE ChIP-seq] binding assays, by @ElementoLab

Giannopoulou E, Elemento O. 2013. Inferring chromatin-bound protein complexes from genome-wide binding assays. Genome Research, Published in Advance April 3, 2013, doi: 10.1101/gr.149419.112.

This study uses nonnegative matrix factorization (NMF) of ENCODE ChIP-seq data (transcription factors and histone modifications) to predict complexes of transcription factors that bind DNA together; it then assesses how these predicted complexes regulate gene expression. It goes beyond previous studies in that it attempts to treat the TFs as complexes rather than as individuals. A handful of the predicted complexes correspond to known regulatory complexes, e.g. PRC2, and overall, the complexes were enriched for known protein-protein interactions. Linear regression and random forest models were then used to predict the effects of the complexes on the expression of adjacent genes. In both models, the complexes performed better than those predicted from a scrambled TF read count matrix. Overall, this study provides a large set of hypotheses for combinations of TFs that may function together, as well as potential new components of known complexes.
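
To make the NMF idea above concrete, here is a minimal sketch in plain Python. It is not the paper's pipeline: it assumes a toy regions-by-TFs count matrix, uses standard Lee-Seung multiplicative updates (the paper's NMF variant may differ), and the names `toy_counts`, `tf_names`, `candidate_complexes` and the 0.5 loading threshold are all hypothetical choices made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iterations=500, seed=0, eps=1e-9):
    """Approximate non-negative V (regions x TFs) as W (regions x rank) times
    H (rank x TFs) using Lee-Seung multiplicative updates (squared error)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the updates can move every entry.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

def reconstruction_error(v, w, h):
    """Sum of squared differences between V and W times H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def candidate_complexes(h, tf_names, threshold=0.5):
    """Hypothetical read-out: for each factor, keep the TFs whose loading is
    at least `threshold` times that factor's largest loading."""
    complexes = []
    for row in h:
        top = max(row)
        complexes.append(sorted(
            name for name, x in zip(tf_names, row)
            if top > 0 and x >= threshold * top))
    return complexes

# Toy data (made up): rows are genomic regions, columns are TF read counts.
tf_names = ["TF_A", "TF_B", "TF_C", "TF_D"]
toy_counts = [
    [4, 4, 0, 0],
    [2, 2, 0, 0],
    [6, 6, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 5, 5],
    [0, 0, 1, 1],
]

if __name__ == "__main__":
    w, h = nmf(toy_counts, rank=2)
    print("squared error:", reconstruction_error(toy_counts, w, h))
    print("candidate complexes:", candidate_complexes(h, tf_names))
```

Each row of H can be read as one candidate group of co-binding TFs, and each row of W says how strongly a region uses each group. On real ChIP-seq data the choice of rank, the normalization of counts, and the cutoff for calling a TF part of a factor all matter and would need to follow the paper rather than this sketch.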