Posts Tagged ‘bigdata’

Network analytics in the age of big data | Science

April 2, 2017

#Network analytics in the age of #BigData Emphasizes analyzing connectivity of graph structures (e.g., motifs) vs. nodes

“To mine the wiring patterns of networked data and uncover the functional organization, it is not enough to consider only simple descriptors, such as the number of interactions that each entity (node) has with other entities (called node degree), because two networks can be identical in such simple descriptors, but have a very different connectivity structure (see the figure). Instead, Benson et al. use higher-order descriptors called graphlets (e.g., a triangle) that are based on small subnetworks obtained on a subset of nodes in the data that contain all interactions that appear in the data (3). They identify network regions rich in instances of a particular graphlet type, with few of the instances of the particular graphlet crossing the boundaries of the regions. If the graphlet type is specified in advance, the method can uncover the nodes interconnected by it, which enabled Benson et al. to group together 20 neurons in the nematode worm neuronal network that are known to control a particular type of movement. In this way, the method unifies the local wiring patterning with higher-order structural modularity imposed by it, uncovering higher-order functional regions in networked data.”
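To make the node-degree vs. graphlet distinction concrete: the toy sketch below (not Benson et al.'s actual method, just an illustration) counts one graphlet type, the triangle, per node. Two nodes can have the same degree yet participate in very different numbers of triangles; the adjacency-set representation here is an assumption for the example.

```python
from itertools import combinations

def triangle_counts(adj):
    """Count, for each node, the triangles (3-node graphlets) it participates in.

    adj: dict mapping node -> set of neighbor nodes (undirected graph).
    Returns dict mapping node -> number of triangles containing it.
    """
    counts = {v: 0 for v in adj}
    for v in adj:
        # Every pair of v's neighbors that are themselves connected
        # closes a triangle through v.
        for u, w in combinations(adj[v], 2):
            if w in adj.get(u, set()):
                counts[v] += 1
    return counts

# A 4-node graph: one triangle (a, b, c) plus a pendant edge c-d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(triangle_counts(adj))  # {'a': 1, 'b': 1, 'c': 1, 'd': 0}
```

Note that nodes "a" and "d" could have equal degree in a larger graph while still differing in triangle counts, which is exactly the kind of wiring-pattern information plain node degree misses.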

Big Data: Astronomical or Genomical?

March 3, 2017

#BigData: Astronomical or Genomical? Est. current storage in EB/yr: Astro .1, omics .1, Twitter .001, YouTube .1-1

“Data storage requirements for all four domains are projected to be enormous. Today, the largest astronomy data center devotes ~100 petabytes to storage, and the completion of the Square Kilometre Array (SKA) project is expected to lead to a storage demand of 1 exabyte per year. YouTube currently requires from 100 petabytes to 1 exabyte for storage and may be projected to require between 1 and 2 exabytes additional storage per year by 2025. Twitter’s storage needs today are estimated at 0.5 petabytes per year, which may increase to 1.5 petabytes in the next ten years. (Our estimates here ignore the “replication factor” that multiplies storage needs by ~4, for redundancy.) For genomics, we have determined more than 100 petabytes of storage are currently used by only 20 of the largest institutions ().”
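The figures quoted above mix petabytes and exabytes; a quick unit-conversion sketch (using the decimal convention 1 EB = 1,000 PB, and treating the values as the rough estimates they are) puts them on a common EB/yr scale:

```python
PB = 1          # petabyte as the base unit
EB = 1000 * PB  # 1 exabyte = 1,000 petabytes (decimal convention)

# Annual storage figures quoted in the excerpt, in PB/yr.
figures_pb_per_year = {
    "astronomy (SKA, projected)": 1 * EB,
    "YouTube (projected, 2025, midpoint of 1-2 EB/yr)": 1.5 * EB,
    "Twitter (today)": 0.5 * PB,
}

REPLICATION = 4  # redundancy factor the excerpt says it ignores

for domain, pb in figures_pb_per_year.items():
    print(f"{domain}: {pb / EB:.4f} EB/yr "
          f"(~{pb * REPLICATION / EB:.4f} EB/yr with replication)")
```

The spread is striking: on this scale Twitter's 0.5 PB/yr is three to four orders of magnitude below the astronomy and video projections.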

Public v. Private Polling – PredictWise

November 27, 2016

Public v Private Polling Meta-prediction from extrapolating group characteristics limited; need raw individual data

Big Data’s Mathematical Mysteries | Quanta Magazine

December 18, 2015

#BigData’s Mathematical Mysteries Nice description of unsupervised analysis as ink diffusing from drops

“In the last 15 years or so, researchers have created a number of tools to probe the geometry of these hidden structures. For example, you might build a model of the surface by first zooming in at many different points. At each point, you would place a drop of virtual ink on the surface and watch how it spread out. Depending on how the surface is curved at each point, the ink would diffuse in some directions but not in others. If you were to connect all the drops of ink, you would get a pretty good picture of what the surface looks like as a whole. And with this information in hand, you would no longer have just a collection of data points. Now you would start to see the connections on the surface, the interesting loops, folds and kinks. This would give you a map for how to explore it.”
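The ink-diffusion picture quoted above is the intuition behind diffusion maps. The rough NumPy sketch below (a generic illustration, not the article's specific construction; `eps` and `t` are illustrative parameters) builds a random-walk transition matrix from Gaussian affinities and embeds the points using its leading non-trivial eigenvectors:

```python
import numpy as np

def diffusion_map(X, eps=1.0, t=1, k=2):
    """Sketch of a diffusion map: embed points by how 'ink' spreads between them.

    X:   (n, d) array of points assumed to lie near a hidden surface.
    eps: kernel bandwidth (how far one drop of ink spreads per step).
    t:   number of diffusion steps.
    k:   number of embedding coordinates to return.
    """
    # Pairwise squared distances -> Gaussian affinities ("ink" between nearby points).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / eps)
    # Row-normalize into a random-walk (diffusion) transition matrix.
    P = K / K.sum(axis=1, keepdims=True)
    # Eigendecompose; the top non-trivial eigenvectors give the coordinates.
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    idx = order[1:k + 1]  # skip the trivial eigenvalue 1
    # Weight each coordinate by its eigenvalue raised to the diffusion time t.
    return vecs.real[:, idx] * (vals.real[idx] ** t)

# Points on a circle: the embedding recovers the 1-D loop structure.
theta = np.linspace(0, 2 * np.pi, 50, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
emb = diffusion_map(X, eps=0.5)
print(emb.shape)  # (50, 2)
```

Connecting the diffusion behavior at every point, as the quote describes, is exactly what the eigenvectors of `P` summarize: points between which ink flows easily end up close in the embedding, exposing loops and folds in the data.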

Most Hyped Tech: Big Data Out, IoT In

July 24, 2015

Core services: Reward bioinformaticians

May 9, 2015

QT:{{"The research system does not recognize bioinformaticians for doing what the scientific community needs most. “People realize the importance, but currently there are no real solutions,” says Xiaole Liu, a bioinformatician at the Dana-Farber Cancer Institute in Boston, Massachusetts, and at Tongji University in Shanghai, China. This is why it can take more than six months to fill positions at a core, why many of biology’s brightest are leaving science for technology companies, and why conventional biologists wait nine months to get help to dissect their data."}}

Reward bioinformaticians [for collaboration] Despite #bigdata boom, biomedical analysis could be made more appealing

My public notes from the Yale Day of Data (#ydod2014, i0dataday)

September 30, 2014

The Institute for Data Intensive Engineering and Science – The Data-Scope

September 30, 2014

Coppi mentions: JHU’s Data-scope ( ), which has a specialized architecture for astronomical computation #ydod2014

4 PB / yr

What Big Data means to me — Bourne 21 (2): 194 — Journal of the American Medical Informatics Association

September 30, 2014

Bourne mentions “What Big Data means to me” ( ) in connection with the creation of a digital ecosystem #ydod2014

The Parable of Google Flu: Traps in Big Data Analysis

September 29, 2014

Parable of #Google Flu: Traps in #BigData Analysis Replicating results is hard, w/ an ever-changing search algorithm