Posts Tagged ‘teaching’

Deep learning and process understanding for data-driven Earth system science | Nature

March 4, 2019
Perspective | Published: 13 February 2019
Deep learning and process understanding for data-driven Earth system science Markus Reichstein, Gustau Camps-Valls, Bjorn Stevens, Martin Jung, Joachim Denzler, Nuno Carvalhais & Prabhat
Nature volume 566, pages195–204 (2019)

Figure 3 presents a system-modelling view that seeks to integrate machine learning into a system model. As an alternative perspective, system knowledge can be integrated into a machine learning framework. This may include design of the network architecture (refs 36, 79), physical constraints in the cost function for optimization (ref. 58), or expansion of the training dataset for undersampled domains (that is, physically based data augmentation; ref. 80).
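
To make the "physical constraints in the cost function" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration and not taken from the paper: a linear model, synthetic observations, a hypothetical constraint that the predicted quantity cannot be negative, and a weight `lam` on the penalty.

```python
def predict(w, b, x):
    return w * x + b

def fit(data, collocation, lam, lr=0.01, steps=5000):
    """Gradient descent on: mean squared error + lam * mean(max(0, -prediction)^2)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        # Data term: ordinary squared error on the observations.
        for x, y in data:
            err = predict(w, b, x) - y
            grad_w += 2.0 * err * x / len(data)
            grad_b += 2.0 * err / len(data)
        # Physics term: penalize negative predictions at unlabeled points.
        for x in collocation:
            p = predict(w, b, x)
            if p < 0.0:
                grad_w += lam * 2.0 * p * x / len(collocation)
                grad_b += lam * 2.0 * p / len(collocation)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic observations (assumed): only available for x >= 1.
data = [(x, 2.0 * x - 1.5) for x in (1.0, 1.5, 2.0, 2.5, 3.0)]
# Undersampled region where only the physical constraint is known.
collocation = [0.0, 0.25, 0.5]

w_free, b_free = fit(data, collocation, lam=0.0)
w_phys, b_phys = fit(data, collocation, lam=10.0)
print("prediction at x=0 without constraint:", predict(w_free, b_free, 0.0))
print("prediction at x=0 with constraint:   ", predict(w_phys, b_phys, 0.0))
```

The penalty term needs no labels, so it can be evaluated in regions where no observations exist. The price is a somewhat worse fit to the observed points, controlled by `lam`.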

Surrogate modelling or emulation
See Fig. 3 (circle 5). Emulation of the full (or specific parts of) a physical model can be useful for computational efficiency and tractability reasons. Machine learning emulators, once trained, can achieve simulations orders of magnitude faster than the original physical model without sacrificing much accuracy. This allows for fast sensitivity analysis, model parameter calibration, and derivation of confidence intervals for the estimates.
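
A rough sketch of the emulation workflow, under stated assumptions: the "expensive" simulator is a stand-in (logistic growth integrated with many small Euler steps), and the emulator is a simple piecewise-linear interpolator over a handful of training runs rather than a real machine learning regressor. The function names and the parameter range are hypothetical.

```python
from bisect import bisect_right

def physical_model(r, steps=20000, dt=0.0005):
    """Stand-in 'expensive' simulator: logistic growth, explicit Euler, up to t = 10."""
    n = 0.1
    for _ in range(steps):
        n += dt * r * n * (1.0 - n)
    return n

def train_emulator(r_values):
    """Run the simulator at a few parameter values and interpolate between the runs."""
    xs = sorted(r_values)
    ys = [physical_model(r) for r in xs]

    def emulator(r):
        # Locate the training interval containing r (clamped to the outer intervals).
        i = min(max(bisect_right(xs, r) - 1, 0), len(xs) - 2)
        t = (r - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return emulator

# Training design (assumed): 19 growth rates between 0.1 and 1.0.
emulator = train_emulator([0.1 + 0.05 * i for i in range(19)])

r_query = 0.37
print("simulator:", physical_model(r_query))  # 20,000 Euler steps
print("emulator: ", emulator(r_query))        # one table lookup
```

Once trained, each emulator call replaces a full simulator run, which is what makes sweeps for sensitivity analysis or calibration cheap. The emulator is only trustworthy inside the parameter range covered by the training runs.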

(2) Replacing a ‘physical’ sub-model with a machine learning model
See Fig. 3 (circle 2). If formulations of a submodel are of semi-empirical nature, where the functional form has little theoretical basis (for example, biological processes), this submodel can be replaced by a machine learning model if a sufficient number of observations are available. This leads to a hybrid model, which combines the strengths of physical modelling (theoretical foundations, interpretable compartments) and machine learning (data-adaptiveness).
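
One way to picture such a hybrid model, as a sketch only: a single-bucket water balance keeps the physical part (mass conservation), while the semi-empirical evaporation term is learned from observations. The bucket model, the tiny k-nearest-neighbour regressor, and the synthetic (storage, evaporation) pairs are all assumptions made for illustration.

```python
def fit_knn_regressor(samples, k=3):
    """Tiny k-nearest-neighbour regressor standing in for the ML sub-model."""
    def predict(x):
        nearest = sorted(samples, key=lambda s: abs(s[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def simulate(storage, rainfall, evap_model):
    """Physical part: mass balance of one bucket; the evaporation law is plugged in."""
    trajectory, total_evap = [], 0.0
    for rain in rainfall:
        available = storage + rain
        # Keep the learned flux physically admissible: non-negative and
        # never larger than the water actually available.
        evap = min(max(evap_model(storage), 0.0), available)
        storage = available - evap
        total_evap += evap
        trajectory.append(storage)
    return trajectory, total_evap

# Synthetic "observations" of (storage, evaporation) pairs (assumed).
observations = [(s, 0.1 * s) for s in range(0, 101, 5)]
evap_model = fit_knn_regressor(observations)

rainfall = [5.0, 0.0, 0.0, 12.0, 3.0, 0.0, 0.0, 8.0]
trajectory, total_evap = simulate(50.0, rainfall, evap_model)
print([round(s, 2) for s in trajectory])
```

The storage update stays interpretable and conserves mass by construction, whatever the learned sub-model predicts. Only the evaporation law adapts to the data.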

Integration with physical modelling
Historically, physical modelling and machine learning have often been treated as two different fields with very different scientific paradigms (theory-driven versus data-driven). Yet, in fact these approaches are complementary, with physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data and are amenable to finding unexpected patterns (surprises).

A success story in the geosciences is weather prediction, which has greatly improved through the integration of better theory, increased computational power, and established observational systems, which allow for the assimilation of large amounts of data into the modelling system (ref. 2). Nevertheless, we can accurately predict the evolution of the weather on a timescale of days, not months.

# REFs that I liked
ref 80

ref 57
Karpatne, A. et al. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017).

# some key BULLETS

• Complementarity of physical & ML approaches
–“Physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data”

• Hybrid #1: Physical knowledge can be integrated into ML framework
–Network architecture
–Physical constraints in the cost function
–Expansion of the training dataset for undersampled domains (ie physically based data augmentation)

• Hybrid #2: ML into physical – eg Emulation of specific parts of a physical model for computational efficiency

MIT Computational Biology: Genomes, Networks, Evolution, Health – Fall 2018 – 6.047/6.878/HST.507 – YouTube

December 22, 2018

Explaining Odds Ratios

November 16, 2018

Calendar adjustments and course demand

September 3, 2018

course stats

specifically, for 752:

UCI-CCBS: Big Data Image Processing & Analysis Workshop Course–UC Irvine (please share)

June 26, 2018

UC Irvine’s Center for Complex Biological Systems is pleased to announce the annual short course in Big Data Image Processing & Analysis (BigDIPA), September 17-21, 2018.

This 1-week workshop course is geared towards graduate students, postdocs, faculty and industry professionals with research interests in navigating, manipulating and extracting information from “Big Data” image sources. The course is designed to cover the complete “vertical integration” of the image data to knowledge pipeline.

The course will provide a mix of strategies for dealing with biological/biomedical big data image sources, using examples of image analyses drawn from advanced cell fluorescence microscopy techniques and neurobiology to highlight fundamental concepts and skills. Processing and analysis techniques will be generalizable and relevant to other model systems and biomedical input data sources.

For more information and to apply please visit:

Comparing Classifiers · Martin Thoma

June 6, 2018

Great talk today @Yale by @MooreJH. He describes flow of calculations in biomed. #DataScience, including feature construction, machine learning & downstream interpretation.

Great slide on ML derived from

Points of significance: Machine learning: supervised methods

March 3, 2018

Points of significance – #MachineLearning: supervised methods. Nice discussion of the k in k-NN & the slack parameter C (penalizing misclassified points in SVM), both of which act somewhat analogously as regularizers. Good for #teaching
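
A small sketch of that analogy in plain Python, not taken from the article: a toy 1-D data set (assumed) with one negative point sitting close to the positives, a majority-vote k-NN, and a linear soft-margin SVM trained by simple subgradient descent. Larger k smooths over the odd point; smaller C tolerates margin violations and keeps the weight small (a wider margin).

```python
from collections import Counter

# Toy 1-D data (assumed): the point at x = 0.5 is a negative close to the positives.
data = [(-2.0, -1), (-1.5, -1), (-1.0, -1), (0.5, -1), (1.0, 1), (1.5, 1), (2.0, 1)]

def knn_predict(x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def train_linear_svm(C, lr=0.001, steps=5000):
    """Subgradient descent on 0.5*w^2 + C * sum(max(0, 1 - y*(w*x + b)))."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = w, 0.0
        for x, y in data:
            if y * (w * x + b) < 1.0:  # point violates the margin
                grad_w -= C * y * x
                grad_b -= C * y
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# k-NN: k = 1 follows the odd point, k = 3 votes it down.
print("k=1:", knn_predict(0.6, k=1), " k=3:", knn_predict(0.6, k=3))

# SVM: small C gives a small |w| (wide margin), large C a larger |w|.
w_small_c, _ = train_linear_svm(C=0.01)
w_large_c, _ = train_linear_svm(C=10.0)
print("w for C=0.01:", w_small_c, " w for C=10:", w_large_c)
```

In both cases the knob trades fidelity to individual training points against a smoother decision rule, which is the sense in which k and C behave like regularizers.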


December 23, 2017

Great #movie introducing Evo-Devo by @AcapellaScience. Lots of complex concepts (cis-reg elements to gradients & patterning) summarized in seconds. Via @hoondy

Naive Bayes Classification explained with Python code

May 15, 2017

Naive #Bayes Classification explained with Python code. Nice worked example; good for #teaching. HT @KirkDBorne
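
For readers who want the gist without following the link, here is a generic sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. It is not the code from the linked post; the four training sentences and the function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for text, label in docs:
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab

def predict(model, text):
    """Pick the class with the highest log prior + sum of log word likelihoods."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training data.
docs = [
    ("win cash prize now", "spam"),
    ("cheap prize win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
model = train(docs)
print(predict(model, "win a prize"))
print(predict(model, "monday meeting"))
```

The "naive" part is the assumption that words are independent given the class, which is what lets the per-word log likelihoods simply be added.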

Learning and earning: Lifelong learning is becoming an economic imperative | The Economist

April 8, 2017

Lifelong Learning: Future for colleges? Microcredentials & Nanodegrees, inspired by albums unbundled into iTunes songs

interesting view of where short “workshops” fit relative to the traditional course

Scott DeRue, the dean of the Ross School of Business at the University of Michigan, says the unbundling of educational content into smaller components reminds him of another industry: music. Songs used to be bundled into albums before being disaggregated by iTunes and streaming services such as Spotify. In Mr DeRue’s analogy, the degree is the album, the course content that is freely available on MOOCs is the free streaming radio service, and a “microcredential” like the nanodegree or the specialisation is paid-for iTunes.

How should universities respond to that kind of disruption? For his answer, Mr DeRue again draws on the lessons of the music industry. Faced with the disruption caused by the internet, it turned to live concerts, which provided a premium experience that cannot be replicated online. The on-campus degree also needs to mark itself out as a premium experience, he says.