Posts Tagged ‘dsg’

Machine-learning-assisted modeling: Physics Today: Vol 74, No 7

November 23, 2021

QT:{{”
One of the most successful applications of machine learning to scientific modeling is in MD. Researchers study the properties of materials and molecules by using classical Newtonian dynamics to track the nuclei in a system. One critical issue in MD is how to model the PES that describes the interaction between the nuclei. Traditionally, modelers have dealt with the problem in several ways. One approach, ab initio MD, was developed in 1985 by Roberto Car and Michele Parrinello7 and computes the interatomic forces on the fly using models based on first principles, such as density functional theory.8 Although the approach accurately describes the system under consideration, it’s computationally expensive: The maximum system size that one can handle is limited to thousands of atoms. Another approach uses empirical formulas to model a PES. The method is efficient, but guessing the right formula that can model the PES accurately enough is a difficult task, particularly for complicated systems, such as multicomponent alloys. In 2007 Jörg Behler and Parrinello introduced the idea of using neural networks to model the PES.9 In that new paradigm, a quantum mechanics model generates data that are used to train a neural-network-based PES model.

7. R. Car, M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985).
https://doi.org/10.1103/PhysRevLett.55.2471

9. J. Behler, M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007). https://doi.org/10.1103/PhysRevLett.98.146401
“}}

https://physicstoday.scitation.org/doi/10.1063/PT.3.4793

mentions Parrinello’s classic paper:
#8 has QM & NN (NN models for PES)
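The Behler–Parrinello scheme described in the quote can be caricatured in a few lines: a small network, shared across atoms, maps a descriptor of each atom’s local environment to an atomic energy, and the PES value is the sum of those contributions. This is a toy numpy sketch with made-up random weights and random descriptors, not the real symmetry functions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Behler-Parrinello-style model: each atom gets a descriptor vector
# of its local environment; one shared small network maps descriptor ->
# atomic energy; total potential energy is the sum over atoms.
D, H = 4, 8                       # descriptor size, hidden width (arbitrary)
W1 = rng.normal(size=(D, H)) * 0.1
b1 = np.zeros(H)
W2 = rng.normal(size=(H, 1)) * 0.1
b2 = np.zeros(1)

def atomic_energy(g):
    """Energy contribution of one atom from its descriptor g."""
    h = np.tanh(g @ W1 + b1)
    return (h @ W2 + b2)[0]

def total_energy(descriptors):
    """PES value for a configuration: sum of per-atom network outputs."""
    return sum(atomic_energy(g) for g in descriptors)

# Fake 5-atom configuration; in practice the descriptors would be
# symmetry functions computed from the atomic positions.
G = rng.normal(size=(5, D))
E = total_energy(G)
print(E)
```

One property falls out for free: because the total is a sum of identical per-atom terms, the predicted energy is invariant to atom ordering, which a single monolithic network would not guarantee.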

Artificial intelligence alone won’t solve the complexity of Earth sciences

September 2, 2019

Artificial intelligence alone won’t solve the complexity of Earth sciences http://www.nature.com/articles/d41586-019-00556-5

Quantifying the impact of public omics data.

August 11, 2019

similar idea to quantifying the value of the data
https://www.ncbi.nlm.nih.gov/pubmed/31383865

A Decade Ago, a Scientist Promised a Brain Simulation in a Decade

August 3, 2019

QT:{{”
“In a recent paper titled “The Scientific Case for Brain Simulations,” several HBP scientists argue that big simulations “will likely be indispensable for bridging the scales between the neuron and system levels in the brain.” In other words: Scientists can look at the nuts and bolts of how neurons work, and they can study the behavior of entire organisms, but they need simulations to show how the former creates the latter. The paper’s authors draw a comparison to weather forecasts, in which an understanding of physics and chemistry at the scale of neighborhoods allows us to accurately predict temperature, rainfall, and wind across the whole globe.”
“}}

https://www.theatlantic.com/science/archive/2019/07/ten-years-human-brain-project-simulation-markram-ted-talk/594493/

Deep learning and process understanding for data-driven Earth system science | Nature

March 4, 2019

https://www.nature.com/articles/s41586-019-0912-1
Perspective | Published: 13 February 2019
Deep learning and process understanding for data-driven Earth system science
Markus Reichstein, Gustau Camps-Valls, Bjorn Stevens, Martin Jung, Joachim Denzler, Nuno Carvalhais & Prabhat
Nature volume 566, pages 195–204 (2019)

QT:[[”
Figure 3 presents a system-modelling view that seeks to integrate machine learning into a system model. As an alternative perspective, system knowledge can be integrated into a machine learning framework. This may include design of the network architecture36,79, physical constraints in the cost function for optimization58, or expansion of the training dataset for undersampled domains (that is, physically based data augmentation)80.

Surrogate modelling or emulation
See Fig. 3 (circle 5). Emulation of the full (or specific parts of a) physical model can be useful for computational efficiency and tractability reasons. Machine learning emulators, once trained, can achieve simulations orders of magnitude faster than the original physical model without sacrificing much accuracy. This allows for fast sensitivity analysis, model parameter calibration, and derivation of confidence intervals for the estimates.

(2) Replacing a ‘physical’ sub-model with a machine learning model
See Fig. 3 (circle 2). If formulations of a submodel are of semi-empirical nature, where the functional form has little theoretical basis (for example, biological processes), this submodel can be replaced by a machine learning model if a sufficient number of observations are available. This leads to a hybrid model, which combines the strengths of physical modelling (theoretical foundations, interpretable compartments) and machine learning (data-adaptiveness).

Integration with physical modelling
Historically, physical modelling and machine learning have often been treated as two different fields with very different scientific paradigms (theory-driven versus data-driven). Yet, in fact these approaches are complementary, with physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data and are amenable to finding unexpected patterns (surprises).

A success story in the geosciences is weather prediction, which has greatly improved through the integration of better theory, increased computational power, and established observational systems, which allow for the assimilation of large amounts of data into the modelling system2. Nevertheless, we can accurately predict the evolution of the weather on a timescale of days, not months.
“]]
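The surrogate-modelling idea in the quote can be sketched with any cheap regressor: run the expensive model on a modest sample, fit the emulator to the input-output pairs, then sweep the emulator. In this sketch the "physical model" is a hypothetical 1-D function and the emulator is a plain polynomial fit; in practice a neural network or Gaussian process is typical:

```python
import numpy as np

def physical_model(x):
    """Stand-in for an expensive simulator (hypothetical 1-D example)."""
    return np.sin(3 * x) + 0.5 * x**2

# 1) Run the expensive model on a modest training sample.
x_train = np.linspace(-1, 1, 40)
y_train = physical_model(x_train)

# 2) Fit a cheap emulator to the input-output pairs.
emulator = np.poly1d(np.polyfit(x_train, y_train, deg=8))

# 3) Use the emulator for fast parameter sweeps; check its fidelity
#    against the original model on a dense grid.
x_dense = np.linspace(-1, 1, 2000)
err = np.max(np.abs(emulator(x_dense) - physical_model(x_dense)))
print(f"max emulation error on [-1, 1]: {err:.2e}")
```

The emulator is only trustworthy inside the sampled input range, which is exactly the extrapolation caveat the Perspective raises about data-driven approaches.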

# REFs that I liked
ref 80

ref 57
Karpatne, A. et al. Theory-guided data science: a new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 29, 2318–2331 (2017).

# some key BULLETS

• Complementarity of physical & ML approaches
–“Physical approaches in principle being directly interpretable and offering the potential of extrapolation beyond observed conditions, whereas data-driven approaches are highly flexible in adapting to data”

• Hybrid #1: Physical knowledge can be integrated into ML framework –Network architecture
–Physical constraints in the cost function
–Expansion of the training dataset for undersampled domains (ie physically based data augmentation)

• Hybrid #2: ML into physical – eg Emulation of specific parts of a physical model for computational efficiency
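The "physical constraints in the cost function" item under Hybrid #1 can be sketched minimally: add a penalty term that is zero exactly when the model obeys the known physics. In this hypothetical example the process is known on physical grounds to pass through the origin, so the intercept is penalized during training (all data and parameters here are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy data from a process known, physically, to pass through the
# origin (e.g. zero flux at zero forcing). True slope = 2, intercept = 0.
x = rng.uniform(0, 1, 50)
y = 2.0 * x + rng.normal(scale=0.1, size=50)

def fit(lam, lr=0.05, steps=5000):
    """Gradient descent on MSE + lam * b**2 for the model y ~ a*x + b.
    The lam * b**2 term encodes the physical constraint f(0) = 0
    directly in the cost function."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        pred = a * x + b
        grad_a = 2 * np.mean((pred - y) * x)
        grad_b = 2 * np.mean(pred - y) + 2 * lam * b
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b

a0, b0 = fit(lam=0.0)    # unconstrained fit
a1, b1 = fit(lam=10.0)   # physics-constrained fit
print(a1, b1)            # constrained intercept is pulled toward zero
```

The same pattern scales up: in the papers the authors cite, the penalty is a physical residual (e.g. an energy-balance violation) evaluated on the network's predictions rather than a single coefficient.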

Artificial intelligence alone won’t solve the complexity of Earth sciences

March 4, 2019

https://www.nature.com/articles/d41586-019-00556-5

Google’s DeepMind aces protein folding | Science | AAAS

December 15, 2018

Thought @RobertFService’s news piece had a good angle on this: AlphaFold won a lot but by small margins
https://www.ScienceMag.org/news/2018/12/google-s-deepmind-aces-protein-folding Google’s DeepMind aces protein folding CC @wgibson

Why “Many-Model Thinkers” Make Better Decisions

November 23, 2018

Why “Many-Model Thinkers” Make Better Decisions
https://HBR.org/2018/11/why-many-model-thinkers-make-better-decisions Intuitive description of #MachineLearning concepts. Focuses on practical business contexts (eg hiring) & explains how #ensemble models & boosting can make better choices

QT:{{”
“The agent-based model is not necessarily better. Its value comes from focusing attention where the standard model does not.

The second guideline borrows the concept of boosting, …Rather than look for trees that predict with high accuracy in isolation, boosting looks for trees that perform well when the forest of current trees does not.

A boosting approach would take data from all past decisions and see where the first model failed. …The idea of boosting is to go searching for models that do best specifically when your other models fail.

To give a second example, several firms I have visited have hired computer scientists to apply techniques from artificial intelligence to identify past hiring mistakes. This is boosting in its purest form. Rather than try to use AI to simply beat their current hiring model, they use AI to build a second model that complements their current hiring model. They look for where their current model fails and build new models to complement it.”
“}}
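The boosting idea in the quote (go looking for models that do best where the current model fails) reduces, in its simplest gradient-boosting form, to fitting the next model to the current model's residuals. A minimal numpy sketch with made-up data:

```python
import numpy as np

# Target with two regimes: a linear trend plus a localized bump
# that a simple first model will miss.
x = np.linspace(0, 1, 200)
y = 2 * x + np.exp(-((x - 0.7) ** 2) / 0.005)

# Model 1: a deliberately simple fit (straight line).
a, b = np.polyfit(x, y, deg=1)
pred1 = a * x + b

# Boosting step: fit the NEXT model to the residuals, i.e. look for
# a model that does best exactly where the current one fails.
residual = y - pred1
pred2 = np.poly1d(np.polyfit(x, residual, deg=8))(x)

ensemble = pred1 + pred2
mse1 = np.mean((y - pred1) ** 2)
mse2 = np.mean((y - ensemble) ** 2)
print(mse1, mse2)
```

This mirrors the hiring example in the quote: the second model is not trying to beat the first on its own, only to complement it where it errs.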

Cloud computing for genomic data analysis and collaboration | Nature Reviews Genetics

October 30, 2018

https://www.nature.com/articles/nrg.2018.8

A brief history of data science

September 22, 2018

https://twitter.com/YaleData/status/1043196384403443712