Posts Tagged ‘mynotes0mg’

ISMB 2020 Virtual Event Now Open

July 23, 2020

Liked Tweets

Tagged items

Slide pics

in labdropbox on 22 July 2020

updated attachments: next week: GSP-TOPMed Analysis Workshop Feb 12-13 in NYC

February 23, 2020

my notes


Genome Informatics meeting at CSHL: abstract book and arrival information

November 10, 2019

Various files in labdropbox:

2019-11-06 19.11.56.i0gi19-gi2019-pics.jpg
2019-11-06 22.01.30.i0gi19-gi2019-pics.jpg

Tagged items

AnVIL ECC Meeting (Sep 9 & 10 – Cambridge, MA)

September 10, 2019

archived meeting materials (encrypted) highlighted notes for discussion

Tagged items (with tag i0anv19)

Meeting Materials for Perspectives in Comparative Genomics & Evolution – August 15-16, 2019

August 19, 2019


* MATERIAL in labdropbox


Liked-tweets-from-i0cmp19.pdf + Liked-Tweets-from-i0cmp19.xlsx i0cmp19-slides

* MATERIAL in labdropbox3

Materials from recent conferences – i0bog19 (incl. HGSV), i0ox19, i0recomb19, i0bhi

May 22, 2019


archived as:

## TALKS–20190505-i0recomb19/–20190515-i0ox19/–20190521-i0bhi/


## NOTES (Private)

Archived notes for the lab:

Also, slide pics in labdropbox on 19 & 21 May in folders :


notes from recent meetings – i0mcbios, i0brd19, i0hnb, i0aisoc

April 7, 2019

Deep learning and process understanding for data-driven Earth system science | Nature

March 4, 2019
Perspective | Published: 13 February 2019
Deep learning and process understanding for data-driven Earth system science
Markus Reichstein, Gustau Camps-Valls, Bjorn Stevens, Martin Jung, Joachim Denzler, Nuno Carvalhais & Prabhat
Nature volume 566, pages 195–204 (2019)

Figure 3 presents a system-modelling view that seeks to integrate machine learning into a system model. As an alternative perspective, system knowledge can be integrated into a machine learning framework. This may include design of the network architecture [36,79], physical constraints in the cost function for optimization [58], or expansion of the training dataset for undersampled domains (that is, physically based data augmentation) [80].
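
Toy sketch (mine, not from the paper) of the "physical constraints in the cost function" idea: a data-misfit term plus a penalty that grows when predictions break a known physical rule. The non-negativity rule, the function names, and the weight `lam` are all assumptions picked for illustration.

```python
def mse(preds, targets):
    """Ordinary data-misfit term: mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def negativity_penalty(preds):
    """Assumed physical rule: the predicted quantity (say, a concentration)
    cannot be negative. Only negative predictions contribute."""
    return sum(min(p, 0.0) ** 2 for p in preds) / len(preds)


def constrained_loss(preds, targets, lam=10.0):
    """Cost = data misfit + lam * physics violation (lam is a tuning knob)."""
    return mse(preds, targets) + lam * negativity_penalty(preds)


if __name__ == "__main__":
    # Physically valid predictions: the penalty term vanishes.
    print(constrained_loss([1.0, 2.0], [1.0, 2.0]))
    # One negative prediction: the penalty is added on top of the misfit.
    print(constrained_loss([-1.0, 2.0], [0.0, 2.0]))
```

An optimizer minimizing this cost is pushed toward fits that respect the rule even where observations are sparse, which is the appeal of putting the constraint in the cost rather than filtering outputs afterwards.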

Surrogate modelling or emulation
See Fig. 3 (circle 5). Emulation of the full (or specific parts of) a physical model can be useful for computational efficiency and tractability reasons. Machine learning emulators, once trained, can achieve simulations orders of magnitude faster than the original physical model without sacrificing much accuracy. This allows for fast sensitivity analysis, model parameter calibration, and derivation of confidence intervals for the estimates.
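
Toy sketch (mine, not from the paper) of emulation: run a slow simulator at a few parameter values, then answer new queries from a cheap surrogate. The logistic-growth "physical model" and all names are assumptions, and the surrogate is a plain piecewise-linear interpolator standing in for a trained ML regressor.

```python
from bisect import bisect_right


def slow_model(r, x0=0.1, t_end=5.0, n_steps=2000):
    """Stand-in 'physical model': Euler integration of dx/dt = r*x*(1-x)."""
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += dt * r * x * (1.0 - x)
    return x


def train_emulator(r_min=0.0, r_max=2.0, n_train=21):
    """Run the slow model on a grid of r values and return a fast surrogate."""
    step = (r_max - r_min) / (n_train - 1)
    rs = [r_min + i * step for i in range(n_train)]
    ys = [slow_model(r) for r in rs]  # the expensive part, done once

    def emulator(r):
        if not rs[0] <= r <= rs[-1]:
            raise ValueError("emulator is only valid inside its training range")
        # Locate the grid interval containing r and interpolate linearly.
        i = max(1, min(bisect_right(rs, r), len(rs) - 1))
        w = (r - rs[i - 1]) / (rs[i] - rs[i - 1])
        return (1.0 - w) * ys[i - 1] + w * ys[i]

    return emulator


if __name__ == "__main__":
    emulate = train_emulator()
    r = 0.73
    print("slow model:", slow_model(r))
    print("emulator:  ", emulate(r))
```

The refusal to extrapolate outside the training range is deliberate: a surrogate is only trustworthy where it has seen the simulator's behaviour.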

(2) Replacing a ‘physical’ sub-model with a machine learning model
See Fig. 3 (circle 2). If formulations of a submodel are of semi-empirical nature, where the functional form has little theoretical basis (for example, biological processes), this submodel can be replaced by a machine learning model if a sufficient number of observations are available. This leads to a hybrid model, which combines the strengths of physical modelling (theoretical foundations, interpretable compartments) and machine learning (data-adaptiveness).
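
Toy sketch (mine, not from the paper) of such a hybrid: a storage "bucket" whose mass balance stays hard-coded, while the semi-empirical outflow law is fitted from observations. The bucket setup and the names are assumptions, and the fitted part is a least-squares line through the origin standing in for a real ML model.

```python
def fit_submodel(storages, fluxes):
    """Data-driven piece: least-squares fit of flux ~ k * storage."""
    k = sum(s * f for s, f in zip(storages, fluxes)) / sum(s * s for s in storages)
    return lambda storage: k * storage


def simulate(storage0, inflows, outflow_model):
    """Physical piece: mass balance, storage_next = storage + inflow - outflow.
    The learned outflow is clamped so the bucket never goes negative."""
    trajectory = [storage0]
    storage = storage0
    for inflow in inflows:
        outflow = outflow_model(storage)
        outflow = max(0.0, min(outflow, storage + inflow))
        storage = storage + inflow - outflow
        trajectory.append(storage)
    return trajectory


if __name__ == "__main__":
    # Hypothetical observations of storage and the flux leaving it.
    model = fit_submodel([1.0, 2.0, 4.0], [0.5, 1.0, 2.0])
    print(simulate(10.0, [0.0, 0.0], model))
```

The conservation law is never learned, only the part with little theoretical basis, so the compartments stay interpretable while the flux law adapts to data.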

Integration with physical modelling
Historically, physical modelling and machine learning have often been treated as two different fields with very different scientific paradigms (theory-driven versus data-driven). Yet, in fact these approaches are complementary, with physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data and are amenable to finding unexpected patterns (surprises).

A success story in the geosciences is weather prediction, which has greatly improved through the integration of better theory, increased computational power, and established observational systems, which allow for the assimilation of large amounts of data into the modelling system [2]. Nevertheless, we can accurately predict the evolution of the weather on a timescale of days, not months.

# REFs that I liked
ref 80

ref 57
Karpatne, A. et al. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017).

# some key BULLETS

• Complementarity of physical & ML approaches
–“Physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data”

• Hybrid #1: Physical knowledge can be integrated into ML framework
–Network architecture
–Physical constraints in the cost function
–Expansion of the training dataset for undersampled domains (ie physically based data augmentation)

• Hybrid #2: ML into physical – eg Emulation of specific parts of a physical model for computational efficiency

My Notes Related to the NHGRI strategic planning meeting

January 29, 2019

MAIN event page–genomic-variation-identification-association-and-function-in-human-health-and-disease

TWEETS related to the event

Archived copy of the above in the labdropbox

Liked-Tweets-Related-NHGRI-strategic-planning-meeting–i0g2p18-genome2020-conf0mg in labdropbox


TAGGED links

Random links related to the PsychENCODE ’18 Rollout

December 23, 2018

Cleaning up stuff from the psychENCODE rollout (my tag “pecrollout”)

* Some “liked” tweets (not exhaustive, mostly positive)

Private archives of the above :

* Tagged articles

* Papers

associated with the Gerstein Lab

Science magazine collection

PEC website collection

* Yale pre-print site

* Random private archived material