
# Survey and Statistics

## Elementary

Before doing any statistics, it is essential to know what type of question you have, on a continuum from descriptive to mechanistic-explanatory:

• Leek, J.T. and Peng, R.D. (2015) What is the question? Science 347: 1314-1315.

Equally important is to realize that decisions on data collection (sample, variables and operationalizations) and modeling matter much more for the results than p values do. Moreover, “a p value of 0.05 does not mean that there is a 95% chance that a given hypothesis is correct. Instead, it signifies that if the null hypothesis is true, and all other assumptions made are valid, there is a 5% chance of obtaining a result at least as extreme as the one observed. And a p value cannot indicate the importance of a finding” (Monya Baker, Nature 2016, p.151). See:

• Leek, J.T. and Peng, R.D. (2015) P values are just the tip of the iceberg. Nature 520: 612.
• Brown, A.W., Kaiser, K.A. and Allison, D.B. (2018) Issues with data and analysis: Errors, underlying themes, and potential solutions. Proceedings of the National Academy of Sciences 115: 2563-2570.
• Cumming, G. (2008) Replication and p intervals. Perspectives on Psychological Science 3: 286-300.
• Endogeneity [a lucid clip on YouTube].
• Significance and power
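
The definition of a p value quoted above can be checked with a small simulation. The sketch below is not taken from any of the references; the sample size, the number of simulated experiments, and the use of a z test with known standard deviation are all assumptions chosen to keep it short. It simulates many experiments in which the null hypothesis is true by construction and counts how often p falls below 0.05.

```python
import math
import random


def two_sided_p_from_z(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_p_values(n_experiments=5000, n_per_group=30, seed=1):
    """Simulate experiments in which both groups share the same mean.

    Both groups are drawn from a normal distribution with mean 0 and
    standard deviation 1 (an assumption made for illustration), so any
    'significant' difference is a false positive.
    """
    rng = random.Random(seed)
    # Standard error of the difference in means when sd = 1 is known
    se = math.sqrt(2 / n_per_group)
    p_values = []
    for _ in range(n_experiments):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(a) / n_per_group - sum(b) / n_per_group
        p_values.append(two_sided_p_from_z(diff / se))
    return p_values


p_values = simulate_null_p_values()
false_positive_rate = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"Share of null experiments with p < 0.05: {false_positive_rate:.3f}")
```

The printed share should be close to 0.05. That is all the threshold guarantees: it describes how often a true null yields such a result, not the probability that a hypothesis is correct, and not the size or importance of an effect.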

Statistics at a very elementary level:

• Paul D. Allison. (1999) Multiple Regression. Thousand Oaks: Sage.
• Wonnacott, R.H. and Wonnacott, R.J. (1990) Introductory Statistics. New York: Wiley.

It is much better to first learn a little math, after which everything else becomes far easier to understand:

• Fox, John (2009), A Mathematics Primer for Social Statistics. Thousand Oaks: Sage.

Quick introductions to, and overviews of, many statistical topics can be found in Cosma Shalizi’s notebooks. To see the forest for the trees:

• Kass, Robert E. (2011) Statistical Inference: The Big Picture. Statistical Science 26: 1-9.

A non-technical treatise on the most important statistical insights and techniques:

• Stigler, S.M. (2016) The Seven Pillars of Statistical Wisdom. Harvard U.P.

If you use R, which you should anyway:

• Snijders, Tom A.B. and Bosker, Roel J. (2012, 2nd ed.) Multilevel Analysis. Los Angeles: Sage.
• Wooldridge, Jeffrey M. (2012, 5th ed.) Introductory Econometrics. Mason: South-Western.
• Angrist, Joshua D. and Pischke, Jörn-Steffen (2009) Mostly Harmless Econometrics. Princeton: Princeton U.P.
• Gelman, Andrew and Hill, Jennifer (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge U.P. Features computer code in R.
• Stock, James H. and Watson, Mark W. (2017) Twenty years of time series econometrics in ten pictures. Journal of Economic Perspectives 31: 59-86. Has references to important studies and textbooks, with an emphasis on macro phenomena.
• Cameron, A. Colin and Trivedi, Pravin K. (2005) Microeconometrics. Cambridge U.P.
• Shalizi, C. (2010) The bootstrap. American Scientist 98: 186-190. Also contains an excellent one-page overview of what statistics is about.
• John Fox and Sanford Weisberg (2018, 3rd ed.) An R Companion to Applied Regression. Thousand Oaks: Sage.
• John Fox has course material on structural equation models on his website, and he wrote an R package for them. There is also the lavaan package in R, which has its own blog. Arguably the best introductory textbook on structural equation modeling: Bill Shipley (2000) Cause and Correlation in Biology. Cambridge U.P.

## Bayesian

• Efron, B. (2013) Bayes’ Theorem in the 21st Century. Science 340: 1177-1178.
• McElreath, R. (2018) Statistical Rethinking, 2nd ed. CRC Press.
• History of the conflict between frequentists and Bayesians: Mathias W. Madsen (2015) The Kid, the Clerk, and the Gambler. University of Amsterdam, PhD.

## On the internet

Introductory (and not so introductory) talks on YouTube

## Survey design

• Czaja, R. F. and Blair, J. E. (2005) Designing surveys. A guide to decisions and procedures. Thousand Oaks: Pine Forge.
• De Leeuw, E. D., Hox, J. J. and Dillman, D. A. (Eds.) (2008) International Handbook of Survey Methodology. New York: Lawrence Erlbaum Associates.
• Fowler, F. J. (1995) Improving Survey Questions. Design and Evaluation. Thousand Oaks: SAGE.
• Fowler, F. J. (2009) Survey Research Methods. Thousand Oaks: SAGE.
• Groves, R. M. et al. (2009) Survey Methodology. Hoboken: Wiley.
• Krosnick, J. A., and Fabrigar, L. R. (2013) The handbook of questionnaire design. New York: Oxford University Press.
• Schaeffer, N. C. and Presser, S. (2003) The Science of Asking Questions. Annual Review of Sociology 29: 65-88.
• Sudman, S., Bradburn, N. M. and Schwarz, N. (1996) Thinking about answers. The application of cognitive processes to survey methodology. San Francisco: Jossey-Bass.
• Tourangeau, R., Rips, L. J. and Rasinski, K. A. (2000) The psychology of survey response. Cambridge: Cambridge University Press.
• Kahneman, D. et al. (2004) A survey method for characterizing daily life experience: The day reconstruction method. Science 306: 1776-1780.
• Alan B. Krueger and Arthur A. Stone (2014) Progress in measuring subjective well-being. Science 346: 42-43.
• Zwane, A.P. et al. (2011) Being surveyed can change later behavior and related parameter estimates. PNAS 108: 1821-1826.
• Rogers, T., ten Brinke, L. and Carney, D.R. (2016) Unacquainted callers can predict which citizens will vote over and above citizens’ stated self-predictions. PNAS 113: 6449-6453.

## Data visualization

• Spiegelhalter, D., Pearson, M. and Short, I. (2011) Visualizing uncertainty about the future. Science 333: 1393-1400.
• Use bar charts instead of pie charts: W.S. Cleveland and R. McGill (1984) Graphical perception. J. Am. Stat. Assoc 79: 531-554.

## Meta analysis

• Jessica Gurevitch et al. (2018) Meta-analysis and the science of research synthesis. Nature 555: 175-182.
• Jop de Vrieze (2018) The metawars: meta-analyses were supposed to end scientific debates. Often, they only cause more controversy. Science 361: 1185-1188.

## Causality

The debate on causality has been going on for about 2500 years. The references below cover only a very small part of the relevant literature, but they touch on some of the most salient issues that social scientists have to deal with. See also experimental research.

• Hubert M. Blalock (1961) Causal inferences in nonexperimental research. Univ. North Car. Press: Chapel Hill. Arguably the most classical text for modern survey researchers.
• Andrew Gelman (2011) Causality and statistical learning. American Journal of Sociology 117: 955-966. A review of three books by one of the top statisticians around (see his own book above).
• Mott Greene (2001) A tool, not a tyrant. Nature 410: 875. On mechanisms.
• Kenneth A. Bollen and Mark D. Noble (2011) Structural equation models and the quantification of behavior. Proceedings of the National Academy of Sciences 108: 15639-15646. A brief introduction to structural equation models by their father, an approach that is also used by Judea Pearl:
• Judea Pearl (2010) The foundations of causal inference. Sociological Methodology 40: 75-149.
• Paul W. Holland (1986) Statistics and causal inference. J. Am. Stat. Association 81: 945-960.
• D.R. Cox and Nanny Wermuth (2001) Some statistical aspects of causality. European Sociological Review 17: 65-74.
• Donald B. Rubin (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688-701.
• Susan Athey and Guido Imbens (2016) Recursive partitioning for heterogeneous causal effects. PNAS 113: 7353-7360.
• Susan Athey and Guido Imbens (2017) The state of applied econometrics: causality and policy evaluation. Journal of Economic Perspectives 31: 3-32.

## (Im)proper use of statistics

• Gigerenzer, G. (2004) Mindless statistics. The Journal of Socio-Economics 33: 587-606.
• Simonsohn, U., Nelson, L.D. and Simmons, J.P. (2014) P-curve: a key to the file-drawer. Journal of Experimental Psychology: General 143: 534-547.
• Watts, D.J. (2014) Common sense and sociological explanation. American Journal of Sociology 120: 313-351.
• Loken, E. and Gelman, A. (2017) Measurement error and the replication crisis: The assumption that measurement error always reduces effect sizes is false. Science 355: 584-585.
• Young, C. (2009) Model uncertainty in sociological research. American Sociological Review 74: 380-397.
• Freese, J. (2014) Defending the decimals: Why foolishly false precision might strengthen social science. Sociological Science 1: 532-541.
• Nuzzo, R. (2014) Statistical errors. Nature 506: 150-152.
• Goodman, S.N. (2016) Aligning statistical and scientific reasoning. Science 352: 1180.
• Ellen Hamaker and Oisin Ryan (2019) A squared standard error is not a measure of individual differences. PNAS 116: 6544-6545.
• Cosma Shalizi on abuse of factor analysis