Archive for the 'SciLit' Category

Use and mis-use of supplementary material in science publications | BMC Bioinformatics | Full Text

March 23, 2016

http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0668-z

Staying Afloat in the Rising Tide of Science: Cell

March 19, 2016

Staying Afloat in the Rising Tide of Science by @CarlZimmer
http://www.Cell.com/cell/fulltext/S0092-8674(16)30192-1 How can this tide lift all boats & not drown us in Tb?

AlgoRun, a Docker-based packaging system for platform-agnostic implemented algorithms

March 19, 2016

http://dx.doi.org/10.1093/bioinformatics/btw120

http://AlgoRun.org, #Docker-based packaging [w/ web GUI & workflow mgt] for platform-agnostic implement[ations]
http://Bioinformatics.Oxfordjournals.org/content/early/2016/03/02/bioinformatics.btw120

Hosny, A. et al. AlgoRun, a Docker-based packaging system for platform-agnostic implemented algorithms. Bioinformatics Advance Access, Mar 2, 2016.

EM algorithm

March 11, 2016

What’s the EM #algorithm?
http://www.nature.com/nbt/journal/v26/n8/full/nbt1406.html Description of its essence in simple contexts (ie coin toss) & as soft version of kmeans

What is the expectation maximization algorithm? : Article : Nature Biotechnology

Primer
Nature Biotechnology 26, 897 – 899 (2008)
doi:10.1038/nbt1406

Chuong B Do & Serafim Batzoglou

Abstract
The expectation maximization algorithm arises in many computational biology applications that involve probabilistic models. What is it good for, and how does it work?

without too much math
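
A minimal sketch of the coin-toss setting the primer uses, assuming two coins of unknown bias and sets of tosses where the identity of the coin used is hidden. The function name `em_two_coins`, the head counts, and the starting guesses are illustrative assumptions, not values taken from this post. The E-step assigns each set a soft weight for each coin (the "soft k-means" view); the M-step re-estimates each bias from the weighted counts.

```python
def em_two_coins(head_counts, n_tosses, theta_a, theta_b, n_iter=20):
    """Estimate the head probabilities of two coins when the coin used
    for each set of tosses is unobserved (equal prior on either coin)."""
    for _ in range(n_iter):
        heads_a = tails_a = heads_b = tails_b = 0.0
        for h in head_counts:
            t = n_tosses - h
            # E-step: likelihood of this set under each coin
            # (the binomial coefficient cancels in the ratio).
            like_a = theta_a ** h * (1 - theta_a) ** t
            like_b = theta_b ** h * (1 - theta_b) ** t
            w_a = like_a / (like_a + like_b)
            w_b = 1.0 - w_a
            # Accumulate expected (soft) counts for each coin.
            heads_a += w_a * h
            tails_a += w_a * t
            heads_b += w_b * h
            tails_b += w_b * t
        # M-step: maximum-likelihood bias given the soft counts.
        theta_a = heads_a / (heads_a + tails_a)
        theta_b = heads_b / (heads_b + tails_b)
    return theta_a, theta_b


if __name__ == "__main__":
    # Hypothetical data: number of heads in five sets of ten tosses.
    print(em_two_coins([5, 9, 8, 4, 7], 10, theta_a=0.6, theta_b=0.5))
```

Replacing the soft weights with a hard 0/1 assignment to the more likely coin would turn this into the k-means-like variant.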

CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription: Cell

March 4, 2016

CTCF-Mediated…3D Genome Architecture
http://www.cell.com/cell/abstract/S0092-8674(15)01504-4 SNPs give different #chromatin topologies, including strong #allelic effects

Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins : Nature Genetics : Nature Publishing Group

March 3, 2016

Gene-gene & gene-env interactions…by #transcriptome…in twins by @dermitzakis lab
http://www.nature.com/ng/journal/v47/n1/full/ng.3162.html Nice model for ASE HT @cjieming

Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins
Alfonso Buil, Andrew Anand Brown, Tuuli Lappalainen, Ana Viñuela, Matthew N Davies, Hou-Feng Zheng, J Brent Richards, Daniel Glass, Kerrin S Small, Richard Durbin, Timothy D Spector & Emmanouil T Dermitzakis

http://www.nature.com/ng/journal/v47/n1/full/ng.3162.html

Circadian patterns of gene expression in the human brain and disruption in major depressive disorder

February 29, 2016

http://www.pnas.org/content/110/24/9950.full

PLOS Genetics: A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures

February 27, 2016

Model-Based Approach to Inferring…#Cancer Mutation Signatures http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005657 Assuming independence betw 3 NTs, 11 v 95 parameters

QT:{{”
The first contribution of this paper is to suggest a more parsimonious approach to modelling mutation signatures, with the benefit of producing both more stable estimates and more easily interpretable signatures. In brief, we substantially reduce the number of parameters per signature by breaking each mutation pattern into “features”, and assuming independence across mutation features. For example, consider the case where a mutation pattern is defined by the substitution and its two flanking bases. We break this into three features (substitution, 3′ base, 5′ base), and characterize each mutation signature by a probability distribution for each feature (which, by our independence assumption, are multiplied together to define a distribution on mutation patterns). Since the number of possible values for each feature is 6, 4, and 4 respectively this requires 5 + 3 + 3 = 11 parameters instead of 96 − 1 = 95 parameters. Furthermore, extending this model to account for ±n neighboring bases requires only 5 + 6n parameters instead of 6 × 4^(2n) − 1. For example, considering ±2 positions requires 17 parameters instead of 1,535. Finally, incorporating transcription strand as an additional feature adds just one parameter, instead of doubling the number of parameters. “}}
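
The parameter counts in the quote can be checked with a short sketch. The function names and the uniform example distributions below are assumptions for illustration; only the counting formulas come from the quoted text.

```python
from itertools import product


def independent_params(n_flank):
    """Free parameters when features are independent: 5 for the
    substitution (6 values) plus 3 for each of the 2n flanking bases."""
    return 5 + 3 * 2 * n_flank


def full_params(n_flank):
    """Free parameters for an unconstrained distribution over all
    6 * 4^(2n) mutation patterns."""
    return 6 * 4 ** (2 * n_flank) - 1


def pattern_distribution(feature_dists):
    """Multiply per-feature probabilities to get a distribution over
    full mutation patterns (the independence assumption)."""
    dist = {}
    for combo in product(*(d.items() for d in feature_dists)):
        labels = tuple(label for label, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        dist[labels] = prob
    return dist


if __name__ == "__main__":
    for n in (1, 2):
        print(n, independent_params(n), full_params(n))
    # Hypothetical uniform signature: substitution, 5' base, 3' base.
    subs = {s: 1 / 6 for s in ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")}
    base = {b: 0.25 for b in "ACGT"}
    print(len(pattern_distribution([subs, base, base])))
```

For n = 1 this gives 11 versus 95, and for n = 2 it gives 17 versus 1,535, matching the quote.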

Identification of neutral tumor evolution across cancer types : Nature Genetics : Nature Publishing Group

February 27, 2016

Neutral tumor #evolution across #cancer types
http://www.nature.com/ng/journal/v48/n3/full/ng.3489.html Initial burst of driver events followed by random mutations

TIGRA: a targeted iterative graph routing assembler for breakpoint assembly. – PubMed – NCBI

February 21, 2016

TIGRA: Targeted Iterative Graph Routing Assembler for breakpoint[s] http://GENOME.CSHLP.org/content/24/2/310.long key steps: read extraction & de Bruijn #assembly

This paper presents a breakpoint assembler used for many projects, including 1000 Genomes. It uses a targeted iterative graph routing approach. The program consists of two steps: read extraction and then assembly. The assembly step uses a de Bruijn graph-based approach to create contigs from the selected reads. A shortcoming of TIGRA is that it depends on the success of the first step, the selection of reads that span breakpoints. Thus TIGRA is sensitive to the accuracy of the input breakpoint annotation. Breakpoints determined from discordant paired-end or split-read alignments, and by predictors like BreakDancer, DELLY, and GenomeSTRiP, work well with TIGRA, but those determined only by read depth, such as from CNVnator and RDX, perform poorly.
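
As a rough illustration of the de Bruijn graph idea (not TIGRA's actual implementation), the sketch below builds a graph from a handful of reads and walks unambiguous paths to produce contigs. The function names and the toy reads are hypothetical, and it assumes error-free reads from one strand.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Nodes are (k-1)-mers; each k-mer in a read adds one edge."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_contig(graph, start):
    """Extend a contig while the current node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on cycles
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


def assemble(reads, k):
    """Return one contig per node that has no incoming edge."""
    graph = build_de_bruijn(reads, k)
    targets = {t for succs in graph.values() for t in succs}
    starts = sorted(n for n in graph if n not in targets)
    return [walk_contig(graph, s) for s in starts]


if __name__ == "__main__":
    # Toy reads standing in for reads extracted around a breakpoint.
    reads = ["ATGGCGTG", "GCGTGCAA", "TGCAATCC"]
    print(assemble(reads, k=5))
```

A real assembler also has to handle sequencing errors, both strands, and branching paths, none of which this sketch attempts.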

As input, TIGRA requires putative breakpoint annotations/predictions (preferably at nucleotide level, or at least within 100 bp resolution) and BAM files (sequence reads aligned to the reference genome).
In the read extraction step, TIGRA tries to select all reads that are likely associated with the breakpoint, as long as they have at least one end or subsegment that is confidently mapped. For known SV types, TIGRA extracts reads selectively to reduce over-representation of the reference allele. The assembly step uses a de Bruijn graph-based approach to create contigs from the selected reads. For this, TIGRA first uses an iterative procedure to explore multiple k-mer sizes, which increases the chance of assembling low-coverage reads. Next, it records alternative paths in the contig graph