InterPoll: Crowd-Sourced Internet Polls

Authors: Benjamin Livshits, Todd Mytkowicz



File

LIPIcs.SNAPL.2015.156.pdf
  • Filesize: 0.94 MB
  • 21 pages


Cite As

Benjamin Livshits and Todd Mytkowicz. InterPoll: Crowd-Sourced Internet Polls. In 1st Summit on Advances in Programming Languages (SNAPL 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 32, pp. 156-176, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)
https://doi.org/10.4230/LIPIcs.SNAPL.2015.156

Abstract

Crowd-sourcing is increasingly used to answer online polls and surveys. However, existing systems, while handling the mechanics of attracting crowd workers, building polls, and paying for them, do little to help the survey-maker or pollster obtain statistically significant results free of even the most obvious selection biases. This paper proposes InterPoll, a platform for programming crowd-sourced polls. Pollsters express polls as embedded LINQ queries, and the runtime reasons about the uncertainty in their results, polling only as many people as required to meet statistical guarantees. To reduce the cost of polls, InterPoll performs query optimization as well as bias correction and power analysis. The goal of InterPoll is to provide a system that can be reliably used for research into marketing, social, and political science questions. This paper highlights some of the existing challenges, explains how InterPoll is designed to address most of them, summarizes the work we have already done, and outlines future work.
Keywords
  • Crowdsourcing
  • Polling
  • LINQ

