UVM Theses and Dissertations
Format:
Print
Author:
Chatterjee, Somdeb
Dept./Program:
Computer Science
Year:
2012
Degree:
MS
Abstract:
The process of extracting knowledge and discovering new patterns from large data sets is becoming increasingly automated. Current methods for automating various aspects of knowledge processing require human expertise to specify what data to collect and which potentially predictive variables to study. This work presents a novel approach to machine science, demonstrating that non-domain experts can both generate predictors of a human behavioral outcome and provide values for them. The idea is implemented as a website where the crowd is responsible both for creating a questionnaire by posing new questions and for answering questions posed by themselves and by earlier users of the site. It was also tested whether the crowd can be motivated to participate in the experiment through competition rather than by financial incentives. The crowd successfully generated many of the known predictors of Body Mass Index.
Results show that: i. Non-domain experts were able to generate useful predictors of a human behavioral outcome. ii. Users can be motivated to contribute to online experiments without any financial incentive. iii. Competition may be useful in improving performance on such a task.
In the current work, a linear model attempts to predict the behavioral outcome using participants' responses. However, after the experiment concluded, it was found that many of the user-generated predictors are nonlinearly correlated with the outcome. This suggests that nonlinear models might be more useful in the future. It would also be of interest to use this method to crowdsource known (and possibly unknown) predictors of many other behavioral outcomes.
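The point about linear models and nonlinearly correlated predictors can be illustrated with a minimal sketch. It is not the thesis's model: the data are synthetic, the helper name `r_squared` is hypothetical, and the U-shaped relationship is an assumption chosen only to show how a straight-line fit can miss a predictor that a simple transformation recovers.

```python
def r_squared(xs, ys):
    """R^2 of a one-predictor least-squares line (squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Synthetic, illustrative data only (assumed; not taken from the thesis):
# the outcome depends on the predictor through a symmetric U-shaped curve.
predictor = [float(v) for v in range(-5, 6)]
outcome = [25.0 + 0.5 * v * v for v in predictor]

# A straight line in the raw predictor explains almost none of the variance.
linear_fit = r_squared(predictor, outcome)

# The same line fitted to the squared predictor explains nearly all of it.
squared_fit = r_squared([v * v for v in predictor], outcome)

print(f"R^2, raw predictor:     {linear_fit:.3f}")
print(f"R^2, squared predictor: {squared_fit:.3f}")
```

In this toy case the predictor is informative but looks useless to a linear model, which is one way a nonlinear model could help with crowd-generated predictors.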