- From: Adrian Walker <adriandwalker@gmail.com>
- Date: Fri, 20 Jul 2012 15:34:03 -0400
- To: David Booth <david@dbooth.org>
- Cc: Helena Deus <helena.deus@deri.org>, Melvin Carvalho <melvincarvalho@gmail.com>, "nathan@webr3.org" <nathan@webr3.org>, Michael Hausenblas <michael.hausenblas@deri.org>, "semantic-web@w3.org" <semantic-web@w3.org>, "public-lod@w3.org" <public-lod@w3.org>, "www-rdf-interest@w3.org" <www-rdf-interest@w3.org>, "protege-discussion@lists.stanford.edu" <protege-discussion@lists.stanford.edu>, "semanticweb@yahoogroups.com" <semanticweb@yahoogroups.com>, "dbworld@cs.wisc.edu" <dbworld@cs.wisc.edu>, "machine-learning@egroups.com" <machine-learning@egroups.com>, "taverna-users@lists.sourceforge.net" <taverna-users@lists.sourceforge.net>, "bbb@bioinformatics.org" <bbb@bioinformatics.org>
- Message-ID: <CABbsESfC-9oN3AutjZyy8yPWpX=CQf12yUCMbqA9ypLNzKqZtQ@mail.gmail.com>
Hi All,

Stefan Decker wrote:

> The discussion seem to point to a deeper question: how to enable crowd
> sourcing of the analysis of these kind of data sets? This may involve
> running of analysis code or maybe even manual work.
> What kind of computational infrastructure would we need to enable
> this? And how do we validate and aggregate results?

There is a system online [1] for crowdsourcing data analysis knowledge in Executable English, with examples, such as [2]. The knowledge is used to answer questions over web databases, with English explanations of the results for validation. In some cases, the explanations can be used as plans.

[3] is a short overview paper, and besides the live system [1], there are several presentations, movies etc. on the site.

Apologies if you have seen this before, and thanks for comments.

-- Adrian

[1] Internet Business Logic
    A Wiki and SOA Endpoint for Executable Open Vocabulary English Q/A over SQL and RDF
    Online at www.reengineeringllc.com
    Shared use is free, and there are no advertisements

[2] www.reengineeringllc.com/demo_agents/MedMine2.agent

[3] www.reengineeringllc.com/A_Wiki_for_Business_Rules_in_Open_Vocabulary_Executable_English.pdf

On Fri, Jul 20, 2012 at 10:00 AM, David Booth <david@dbooth.org> wrote:

> On Fri, 2012-07-20 at 10:22 +0100, Stefan Decker wrote:
> > The discussion seem to point to a deeper question: how to enable crowd
> > sourcing of the analysis of these kind of data sets? This may involve
> > running of analysis code or maybe even manual work.
> > What kind of computational infrastructure would we need to enable
> > this? And how do we validate and aggregate results?
>
> Unfortunately, in the USA at least, the biggest barriers are not
> technical, but social, because: (a) health information privacy laws such
> as HIPAA
> http://www.hhs.gov/ocr/privacy/
> make it difficult or impossible to publish the raw data that would be
> most useful for research; and (b) researchers do not have the incentive
> to publish their data that might allow other researchers to make
> discoveries.
>
> There is a tension between privacy and the usefulness of data for
> research, because full de-identification removes information that can be
> critical to determining cause and effect, such as dates, times and
> locations.
>
> We need better ways -- both bottom-up, such as http://weconsent.us/, and
> top-down, such as legal changes -- to both encourage the availability of
> research data and to facilitate appropriate access to it, such as
> establishing well-defined tiers of access for different purposes.
>
> We need technical solutions that will help us work through and around
> these social barriers.
>
> --
> David Booth, Ph.D.
> http://dbooth.org/
>
> Opinions expressed herein are those of the author and do not necessarily
> reflect those of his employer.
Received on Friday, 20 July 2012 19:34:40 UTC