Posts Tagged ‘mynotes0mg’

Deep learning and process understanding for data-driven Earth system science | Nature

March 4, 2019

https://www.nature.com/articles/s41586-019-0912-1
Perspective | Published: 13 February 2019
Deep learning and process understanding for data-driven Earth system science Markus Reichstein, Gustau Camps-Valls, Bjorn Stevens, Martin Jung, Joachim Denzler, Nuno Carvalhais & Prabhat
Nature volume 566, pages 195–204 (2019)

QT:[[”
Figure 3 presents a system-modelling view that seeks to integrate machine learning into a system model. As an alternative perspective, system knowledge can be integrated into a machine learning framework. This may include design of the network architecture [36,79], physical constraints in the cost function for optimization [58], or expansion of the training dataset for undersampled domains (that is, physically based data augmentation) [80].

(5) Surrogate modelling or emulation
See Fig. 3 (circle 5). Emulation of the full (or specific parts of) a physical model can be useful for computational efficiency and tractability reasons. Machine learning emulators, once trained, can achieve simulations orders of magnitude faster than the original physical model without sacrificing much accuracy. This allows for fast sensitivity analysis, model parameter calibration, and derivation of confidence intervals for the estimates.

(2) Replacing a ‘physical’ sub-model with a machine learning model
See Fig. 3 (circle 2). If formulations of a submodel are of semi-empirical nature, where the functional form has little theoretical basis (for example, biological processes), this submodel can be replaced by a machine learning model if a sufficient number of observations are available. This leads to a hybrid model, which combines the strengths of physical modelling (theoretical foundations, interpretable compartments) and machine learning (data-adaptiveness).

Integration with physical modelling
Historically, physical modelling and machine learning have often been treated as two different fields with very different scientific paradigms (theory-driven versus data-driven). Yet, in fact these approaches are complementary, with physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data and are amenable to finding unexpected patterns (surprises).

A success story in the geosciences is weather prediction, which has greatly improved through the integration of better theory, increased computational power, and established observational systems, which allow for the assimilation of large amounts of data into the modelling system [2]. Nevertheless, we can accurately predict the evolution of the weather on a timescale of days, not months.
“]]
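
A minimal sketch of the "physical constraints in the cost function" idea from the quote above. This is my own toy illustration, not code from the paper: the linear model, the data, the non-negativity constraint, and every name here (`predict`, `loss_and_grad`, `fit`, `x_phys`, `lam`) are assumptions chosen for illustration. The cost is the usual mean squared error plus a penalty on negative predictions at points where there are no observations.

```python
def predict(w, b, x):
    """Toy linear model standing in for a trainable ML model."""
    return w * x + b


def loss_and_grad(w, b, xs, ys, x_phys, lam):
    """Data MSE plus lam * mean squared negative part of predictions on x_phys.

    The penalty encodes a hypothetical physical constraint: the predicted
    quantity (say, a flux) should not be negative.
    """
    loss = gw = gb = 0.0
    n, m = len(xs), len(x_phys)
    # Data term: ordinary mean squared error and its gradient.
    for x, y in zip(xs, ys):
        r = predict(w, b, x) - y
        loss += r * r / n
        gw += 2.0 * r * x / n
        gb += 2.0 * r / n
    # Physics term: only negative predictions contribute.
    for x in x_phys:
        v = min(0.0, predict(w, b, x))
        loss += lam * v * v / m
        gw += lam * 2.0 * v * x / m
        gb += lam * 2.0 * v / m
    return loss, gw, gb


def fit(xs, ys, x_phys, lam, lr=0.01, steps=20000):
    """Plain gradient descent from w = b = 0."""
    w = b = 0.0
    for _ in range(steps):
        _, gw, gb = loss_and_grad(w, b, xs, ys, x_phys, lam)
        w -= lr * gw
        b -= lr * gb
    return w, b


# Observations exist only for x in [2, 5]; x_phys covers the unsampled range.
xs = [2.0, 3.0, 4.0, 5.0]
ys = [0.5, 2.0, 3.5, 5.0]
x_phys = [0.0, 0.5, 1.0]

w_plain, b_plain = fit(xs, ys, x_phys, lam=0.0)   # data only
w_phys, b_phys = fit(xs, ys, x_phys, lam=10.0)    # data + constraint

print("prediction at x=0, data only:      ", predict(w_plain, b_plain, 0.0))
print("prediction at x=0, with constraint:", predict(w_phys, b_phys, 0.0))
```

The data-only fit extrapolates to a clearly negative value at x=0, while the penalized fit is pulled toward non-negative values there at the cost of a slightly worse fit to the observations. The weight `lam` sets that trade-off.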

# REFs that I liked
ref 80

ref 57
Karpatne, A. et al. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017).

# some key BULLETS

• Complementarity of physical & ML approaches
–“Physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data”

• Hybrid #1: Physical knowledge can be integrated into ML framework
–Network architecture
–Physical constraints in the cost function
–Expansion of the training dataset for undersampled domains (i.e., physically based data augmentation)

• Hybrid #2: ML into physical – e.g., emulation of specific parts of a physical model for computational efficiency
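
A rough sketch of the emulation idea in Hybrid #2, under loud assumptions: `slow_model` is a made-up stand-in for an expensive physical model, and the "emulator" is just piecewise-linear interpolation over sampled runs, swapped in for a trained ML model to keep the example dependency-free. The names `train_emulator` and `emulate` are hypothetical.

```python
import bisect
import math


def slow_model(x):
    """Hypothetical stand-in for an expensive physical model."""
    return math.exp(-x) * math.sin(3.0 * x)


def train_emulator(model, lo, hi, n):
    """Run the model at n evenly spaced points and return a cheap surrogate.

    The surrogate interpolates linearly between stored runs; a real
    emulator would typically be a trained regression model instead.
    """
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [model(x) for x in xs]

    def emulator(x):
        # Locate the bracketing pair of training points (clamped to the ends,
        # so inputs outside [lo, hi] are linearly extrapolated and should
        # not be trusted).
        i = bisect.bisect_right(xs, x)
        i = min(max(i, 1), n - 1)
        x0, x1 = xs[i - 1], xs[i]
        t = (x - x0) / (x1 - x0)
        return ys[i - 1] + t * (ys[i] - ys[i - 1])

    return emulator


emulate = train_emulator(slow_model, 0.0, 2.0, 41)

# Compare the emulator with the original model at held-out points.
test_points = [0.013 + 0.1 * k for k in range(20)]
max_err = max(abs(emulate(x) - slow_model(x)) for x in test_points)
print("max abs error on held-out points:", max_err)
```

Once built, the surrogate can be called many times (e.g., for sensitivity analysis or parameter calibration) without rerunning the original model. It is only reliable inside the sampled range, which echoes the extrapolation caveat in the quote.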

My Notes Related to the NHGRI strategic planning meeting

January 29, 2019

MAIN event page

https://www.genome.gov/27572552/from-genome-to-phenotype–genomic-variation-identification-association-and-function-in-human-health-and-disease

TWEETS related to the event

https://docs.google.com/spreadsheets/d/e/2PACX-1vR3TtIvR1OIYxHa5nTQE0kFz2Dc7d8RGts8WWf4NHbR7ZdhBH7BXSY8DjYLo23gNWEAK0GtcTGFqaw8/pubhtml

Archived copy of the above in the labdropbox

Liked-Tweets-Related-NHGRI-strategic-planning-meeting–i0g2p18-genome2020-conf0mg in labdropbox

Liked-Tweets-Related-to–CBC-Twins-DNA-testing–csquare–and–NHGRI-strategic-planning-meeting–i0g2p18-genome2020.pdf

TAGGED links

https://linkstream2.gerstein.info/tag/i0g2p18/

Random links related to the PsychENCODE ’18 Rollout

December 23, 2018

Cleaning up stuff from the psychENCODE rollout (my tag “pecrollout”)

* Some “liked” tweets (not exhaustive, mostly positive)

https://docs.google.com/spreadsheets/d/e/2PACX-1vRxRW7_Cq4GgacbPjV9f1un3pXZD092p48I04_aaXMMr4o7nbROdFDKxeFkq7BvrdSk1tWd-jrRlnDX/pubhtml

Private archives of the above :
http://meetings.gersteinlab.org/2018/12.23/Tweet-stuff-from-pecrollout-n-rsgdream18/

* Tagged articles

https://linkstream2.gerstein.info/tag/pecrollout/

* Papers

associated with the Gerstein Lab
http://papers.gersteinlab.org/subject/pecrollout

Science magazine collection
http://www.sciencemag.org/collections/psychencode?_ga=2.143857020.873191909.1545622068-923654032.1534125785

PEC website collection
http://www.psychencode.org/?page_id=227

* Yale pre-print site

http://info.gersteinlab.org/PEC_package_preprints

* Random private archived material

https://www.dropbox.com/home/01-NOT-TOP-LEVEL/ARCHIVE/random-archived-materials-from-pecrollout.x57k

BioData18

November 25, 2018

Biological Data Science ’18
https://meetings.cshl.edu/meetings.aspx?meet=DATA&year=18

FAVORITE TWEETS (public)

https://docs.google.com/spreadsheets/d/e/2PACX-1vTV8Oa4DeI9RkFa0qJNSyflh783if2RecT1naeMmwFQzuBNJqP48SmzzsmKg1ixOfFbQ7Tht5uUAOUV/pubhtml

FAVORITE TWEETS (private)

http://meetings.gersteinlab.org/2018/11.23/Favorite-tweets-from-Biological-Data-Science-2018–i0biodata18-biodata18.xlsx

http://meetings.gersteinlab.org/2018/11.25/Printout-of–Favorite-tweets-from-Biological-Data-Science-2018–i0biodata18-biodata18.pdf

SLIDE PICS

http://meetings.gersteinlab.org/2018/11.17/MG-Pics-from-i0biodata18-incl-many-slides/

Sigma Xi Conference this Week

October 30, 2018

My notes from the Sigma Xi Conference

LECTURES

http://lectures.gersteinlab.org/summary/Using-population-scale-functional-genomics-mental-disease-n-exploiting-data-exhaust-20181025-i0sigma/

http://lectures.gersteinlab.org/summary/Thoughts-on-Annotation-Variants-Application-disease-context–20182310-i0sigma+ucsf/

TWEETS

Favorited-Tweets-from-i0sigma-meeting.xlsx

https://docs.google.com/spreadsheets/d/e/2PACX-1vQ9yzRc3_Dl0bUJ5K4oxNAdLjNdZYiMk-LuZTYxn-3Spmli94nc15x4zd_lUiw3NX4BxOZNqryQ462J/pubhtml

http://meetings.gersteinlab.org/2018/10.30/Favorited-Tweets-from-i0sigma-meeting.xlsx

http://meetings.gersteinlab.org/2018/10.30/printout-of-liked-tweets-from-i0sigma.pdf

LINKS

https://linkstream2.gerstein.info/tag/i0sigma/

OTHER

Slides (Private)

http://meetings.gersteinlab.org/2018/10.30/For%20labdropbox%20-%20Many%20slide%20pics%20from%20SigmaXi%20meeting%20-%20i0sigma%20–/

Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data: Biophysical Journal

August 18, 2018

Modeling RNA Secondary Structure with Sequence Comparison & Experimental Mapping Data
https://www.Cell.com/biophysj/fulltext/S0006-3495(17)30689-6 combines TurboFold sec. structure prediction w/ results of SHAPE assays

Zhen Tan
Gaurav Sharma
David H. Mathews

Open Archive Published: July 20, 2017
DOI:https://doi.org/10.1016/j.bpj.2017.06.039

Journal Club by JG

July 27, 2018

TWAS of 229k women identifies new candidate susceptibility genes for breast cancer https://www.Nature.com/articles/s41588-018-0132-x Metascan approach. Derive model for imputing transcriptome from #GTEx & validate against TCGA. Application to BCAC.

Genome-wide prediction and functional characterization of the genetic basis of autism spectrum disorder | Nature Neuroscience

July 27, 2018

Genome-wide prediction & functional characterization of the genetic basis of autism spectrum disorder, by @OlgaTroyanskaya lab
https://www.Nature.com/articles/nn.4353 Intersected candidate #ASD genes w/ #BrainSpan gene expression to find a pre-natal signal for the disease

My notes from ISMB 2018 – i0ismb18

July 15, 2018

Liked Tweets

https://docs.google.com/spreadsheets/d/e/2PACX-1vRRXndiFoAnxu5A6NThwRBwcWhKuvShUtjL-ol0LE3NfVMW7KC_hC17-512LmJhsMAeZIPm4beRVtIa/pubhtml

My talk

http://lectures.gersteinlab.org/summary/RADAR-Annot-prioritization-var-20180708-i0ismb18/

Links

https://linkstream2.gerstein.info/tag/i0ismb18

The creation and selection of mutations resistant to a gene drive over multiple generations in the malaria mosquito

June 29, 2018

Creation & selection of mutations resistant to a #GeneDrive over mult. generations in the malaria mosquito
http://journals.PLoS.org/plosgenetics/article?id=10.1371/journal.pgen.1007039 Indels created by #NHEJ-based repair of DS breaks give rise to resistance (probably resulting from mis-matches due to micro-homologies near breakpts)