Posts Tagged ‘data’

The Era of Borderless Data Is Ending – The New York Times

June 9, 2022

https://www.nytimes.com/2022/05/23/technology/data-privacy-laws.html

Liked the article & esp. the quote: “The core idea of digital sovereignty is that the digital exhaust created by a person…should be stored inside the country where it originated, or at least handled in accordance with privacy & other standards set by a government.”

However, shouldn’t a person (“a digital sovereign”?) have the right to store their data where they see fit – e.g. in a country other than the one they live in?

Big Data’s Promise and Limitations : The New Yorker

May 4, 2013

http://www.newyorker.com/online/blogs/elements/2013/04/steamrolled-by-big-data.html

Facebook ‘Likes’ reveal more about you than you think | Detroit Free Press | freep.com

March 17, 2013

http://www.freep.com/usatoday/article/1975777

Twitter users forming tribes with own language, tweet analysis shows

March 17, 2013

http://m.guardiannews.com/news/datablog/2013/mar/15/twitter-users-tribes-language-analysis-tweets

Thoughts on “A few useful things to know about machine learning”

February 14, 2013

Some thoughts on a good paper giving intuition on machine learning approaches

http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
http://dl.acm.org/citation.cfm?id=2347755

In particular, the paper gives good intuition about:

– overfitting (e.g. how it relates to multiple testing & to the bias vs. variance trade-off)
– the curse of dimensionality (in high-D all neighbors look nearly equidistant)
– the limited practical value of theoretical guarantees
– how different frontiers can give the same predictions
– ensembles (which greatly reduce variance while increasing bias only slightly)
– ensembles vs. Bayesian model averaging (the latter’s weights are so skewed that it essentially selects the single best model)
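
To make the curse-of-dimensionality point concrete, here is a minimal sketch (mine, not from the paper) that measures how much farther the farthest random point is than the nearest one as the dimension grows. The function name distance_contrast, the uniform unit-hypercube data, the 200-point sample size and the fixed seed are all assumptions chosen for illustration.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Relative gap between the farthest and nearest neighbor of a random query.

    Points and query are drawn uniformly from the unit hypercube [0, 1]^dim
    (an illustrative assumption). Returns (farthest - nearest) / nearest,
    using Euclidean distance.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # The contrast should shrink as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={distance_contrast(dim):.3f}")
```

In low dimensions the nearest point is many times closer than the farthest one; in high dimensions the ratio shrinks toward zero, which is the sense in which all neighbors start to look the same and nearest-neighbor style reasoning loses its grip.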

Illumina Platinum Genomes

February 10, 2013

http://www.illumina.com/platinumgenomes/
A family trio (NA12877, NA12878, and NA12882) sequenced on a HiSeq 2000 system. An individual (NA18507) sequenced on a HiSeq 2500 system.

A few useful things to know about machine learning

February 9, 2013

http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
http://dl.acm.org/citation.cfm?id=2347755