Posts Tagged ‘privacy’

IBM Research: Preserving Validity in Adaptive Data Analysis

September 23, 2015

Preserving Validity in Adaptive Data Analysis http://ibmresearchnews.blogspot.com/2015/08/preserving-validity-in-adaptive-data_6.html Using differential #privacy for correct #stats even w/ test-set reuse

QT:{{"
“A common next step would be to use the least-squares linear regression to check whether a simple linear combination of the three strongly correlated foods can predict the grade. It turns out that a little combination goes a long way: we discover that a linear combination of the three selected foods can explain a significant fraction of variance in the grade (plotted below). The regression analysis also reports that the p-value of this result is 0.00009 meaning that the probability of this happening purely by chance is less than 1 in 10,000.

Recall that no relationship exists in the true data distribution, so this discovery is clearly false. This spurious effect is known to experts as Freedman’s paradox. It arises since the variables (foods) used in the regression were chosen using the data itself.


We found that challenges of adaptivity can be addressed using techniques developed for privacy-preserving data analysis. These techniques rely on the notion of differential privacy that guarantees that the data analysis is not too sensitive to the data of any single individual. We rigorously demonstrated that ensuring differential privacy of an analysis also guarantees that the findings will be statistically valid. We then also developed additional approaches to the problem based on a new way to measure how much information an analysis reveals about a dataset.

The Thresholdout Algorithm

Using our new approach we designed an algorithm, called Thresholdout, that allows an analyst to reuse the holdout set of data for validating a large number of results, even when those results are produced by an adaptive analysis.

"}}
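
A minimal Python sketch of the idea the excerpt describes: answer each query from the training set unless it disagrees with the holdout set by more than a noisy threshold, and only then release a noised holdout value. The function names, the default `threshold` and `sigma`, and the Laplace noise scales are illustrative assumptions, not values from the post. As far as I understand it, the published Thresholdout also tracks a budget of holdout accesses, which this simplified version omits.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean(query, data):
    """Average value of a query function over a dataset."""
    return sum(query(x) for x in data) / len(data)


def thresholdout(train, holdout, query, threshold=0.04, sigma=0.01, rng=None):
    """Answer one query while limiting what is revealed about the holdout set.

    Simplified sketch: no budget on holdout accesses; noise scales are assumed.
    """
    rng = rng or random.Random()
    train_val = mean(query, train)
    holdout_val = mean(query, holdout)
    # Compare the train/holdout gap against a noisy threshold.
    if abs(train_val - holdout_val) > threshold + laplace(2 * sigma, rng):
        # Overfitting suspected: release the holdout value, but only with noise.
        return holdout_val + laplace(sigma, rng)
    # Otherwise trust the training estimate; the holdout value stays hidden.
    return train_val
```

Because the holdout value is returned only when the two sets disagree, and only with noise added, an adaptive analyst learns little about the holdout set from each query.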

Isp-fellows another privacy tool

August 28, 2015

Browse More Privately With the Privacy Badger

Jason B. Jones looks at a new plug-in from the Electronic Frontier Foundation that blocks companies from tracking your behavior across multiple sites.

http://chronicle.com/blogs/profhacker/browse-more-privately-with-the-privacy-badger/60825

We’ll see you, anon

August 28, 2015

We’ll see you, anon
http://www.economist.com/news/science-and-technology/21660966-can-big-databases-be-kept-both-anonymous-and-useful-well-see-you-anon “A dilemma. People want perfect #privacy & all the benefits of openness.” Math to the rescue?

Good for an intro. on privacy & attacks

QT:{{”

This is a true dilemma. People want both perfect privacy and all the benefits of openness. But they cannot have both.

“While some level of anonymisation will remain part of any resolution of the dilemma, mathematics may change the overall equation. One approach that would shift the balance to the good is homomorphic encryption, whereby queries on an encrypted data set are themselves encrypted. The result of any inquiry is the same as the one that would have been obtained using a standard query on the unencrypted database, but the questioner never sets eyes on the data. Or there is secure multiparty computation, in which a database is divided among several repositories. Queries are thus divvied up so that no one need have access to the whole database.

These approaches are, on paper, absolute in their protections. But putting them to work on messy, real-world data is proving tricky. Another set of techniques called differential privacy seems further ahead. The idea behind it is to ensure results derived from a database would look the same whether a given individual’s data were in it or not. It works by adding a bit of noise to the data in a way that does not similarly fuzz out the statistical results.


America’s Census Bureau has used differential privacy in the past for gathering commuters’ data. Google is employing it at the moment as part of a project in which a browser plug-in gathers lots of data about a user’s software, all the while guaranteeing anonymity. Cynthia Dwork, a differential-privacy pioneer at Microsoft Research, suggests a more high-profile proving ground would be data sets—such as some of those involving automobile data or genomes—that have remained locked up because of privacy concerns.”
“}}
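
To make "adding a bit of noise to the data" concrete, here is a small Python sketch of randomized response, one classic technique that satisfies differential privacy. It is an illustration I chose, not necessarily the mechanism used by any organization named in the excerpt. The population, the 30% rate, and the function names are hypothetical.

```python
import random


def randomized_response(truth, rng):
    """Report a yes/no answer with coin-flip noise (classic randomized response)."""
    if rng.random() < 0.5:
        return truth               # first coin heads: answer honestly
    return rng.random() < 0.5      # first coin tails: answer with a second coin


def estimate_rate(reports):
    """Undo the known noise: E[reported yes] = 0.5 * true_rate + 0.25."""
    observed = sum(reports) / len(reports)
    return (observed - 0.25) / 0.5


rng = random.Random(1)
truths = [i < 12000 for i in range(40000)]   # hypothetical population, 30% "yes"
reports = [randomized_response(t, rng) for t in truths]
print(round(estimate_rate(reports), 3))      # close to 0.3, yet each report is deniable
```

Any single report is unreliable, which protects the individual, but the noise averages out over many reports, so the aggregate statistic survives.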

‘Devious Defecator’ Case Tests Genetics Law

June 5, 2015

Devious Defecator Case Tests Genetics Law http://www.nytimes.com/2015/06/02/health/devious-defecator-case-tests-genetics-law.html Non-obvious outcome of GINA protects employees from non-medical DNA testing

NSA Snooping Was Only the Beginning. Meet the Spy Chief Leading Us Into Cyberwar | WIRED

May 25, 2015

#NSA Snooping Was Only the Beginning
http://www.wired.com/2013/06/general-keith-alexander-cyberwar/ Overview of the activities of Alexander the Geek, spy master behind #stuxnet

As A Major Retraction Shows, We’re All Vulnerable To Faked Data

May 22, 2015

Major #Retraction Shows…Vulnerab[ility] To Faked Data
http://fivethirtyeight.com/datalab/as-a-major-retraction-shows-were-all-vulnerable-to-faked-data Highlights tension betw private data & #reproducibleresearch

Microbiome Fingerprints | The Scientist Magazine(R)

May 17, 2015

http://www.the-scientist.com/?articles.view/articleNo/42950/title/Microbiome-Fingerprints/

QT:{{”

As microbiome signatures mature, law enforcement or intelligence agents could theoretically track people by looking for traces of them left in the microbes they shed. Mark Gerstein, who studies biomedical informatics at Yale University and was not involved in the new study, suggested, for instance, that one could imagine tracking a terrorist’s movements through caves using their microbiome signature.

Huttenhower and his colleagues were identifying individuals out of pools of just hundreds of project participants, however. It is currently unclear how well the algorithm will perform when applied to the general population, though the researchers estimate that their code could likely pick someone out from a group of 500 to 1,000. “I would expect that number to get bigger in the future as we get more data and better data and better coding strategies,” Huttenhower said.

But the work raises privacy concerns similar to those faced by scientists gathering human genomic data. Microbiome researchers are already wary of the human genomic DNA that gets caught up in microbiome sequences, but it increasingly appears that the microbiome sequences themselves are quite personal.

In the genomics field, researchers have increasingly limited access to databases containing human genomic sequencing data. Researchers must apply to use these data. “People might increasingly want to put the microbiome data under the same type of protection that they put normal genomic variants under,” said Gerstein. “Your microbiome is associated with various disease risks and proclivities for X and Y. I don’t think it’s a completely neutral identification. It potentially says things about you.”

“}}

Identifying personal microbiomes using metagenomic codes

May 17, 2015

Identifying personal microbiomes using metagenomic codes
http://www.pnas.org/content/early/2015/05/08/1423854112.abstract Pot. tracking & #privacy implications
http://www.the-scientist.com/?articles.view/articleNo/42950/title/Microbiome-Fingerprints

doi: 10.1073/pnas.1423854112

Identifying personal microbiomes using metagenomic codes

Eric A. Franzosa
Katherine Huang
James F. Meadow
Dirk Gevers
Katherine P. Lemond
Brendan J. M. Bohannan
Curtis Huttenhower

Longitudinal analysis of microbial interaction between humans and the indoor environment

May 3, 2015

Microbial interaction betw humans & the indoor environment http://www.sciencemag.org/content/345/6200/1048.abstract
Unique personal signatures w/ implications for #forensics

Places change to conform to their occupants' microbial signature.

Summarizing 4 conferences last week: AACR ’15, ISEV ’15, BioIT ’15 & ICEBEM 2015

April 28, 2015

AACR 2015
http://www.aacr.org/Meetings/Pages/MeetingDetail.aspx?EventItemID=25#.VT8JXa1Viko
https://linkstream2.gerstein.info/tag/i0pcawg15/

ISEV/ERCC Education Day – ISEV – International Society for Extracellular Vesicles
http://www.isevmeeting.org/isevercc-education-day.html
https://linkstream2.gerstein.info/tag/i0isev/

2015 Bio-IT World Conference & Expo
http://www.bio-itworldexpo.com/
https://linkstream2.gerstein.info/tag/i0bioit15/
http://lectures.gersteinlab.org/summary/Progressive-summarization-large-scale-data-interpret-cancer–20150423-i0bioIT15/

8th International Conference on Ethics in Biology, Engineering & Medicine (ICEBEM 2015)
http://www.downstate.edu/orthopaedics/bioethics/
http://lectures.gersteinlab.org/summary/Soc-n-Tech-Soln-to-Privacy-in-Personal-Genomics–20150424-i0icebem15/

Tweets for all of them
https://storify.com/markgerstein/favorite-tweets-from-bioit-15-aacr-15-and-isev-15-16