Data Analysis

Analyzing your data may seem like an insurmountable mountain. We can assure you that you don’t need a complete understanding of the mathematics behind the statistics to analyze the data you so industriously collected. A good understanding will give you a head start, but it is not a prerequisite. Still, students often don’t know where to begin; they seem to lack the “hands-on” skills to start their statistical endeavour. This page aims to provide some useful, practical information about how to conduct the analyses. It is organized around frequently asked questions and includes links to other websites, videos and academic literature.

Question: “I need to freshen up my memory on statistics. What shall I read?” 

One very good reference is Andy Field’s book Discovering Statistics Using SPSS (3rd ed.). Field presents statistics in an easy-to-understand yet thorough way. You can also use the book as a step-by-step guide when running your analyses in SPSS.


  • Field, A.P. (2009). Discovering statistics using SPSS (3rd ed.). London: Sage Publications.

Question: “What is the difference between moderation and mediation?”

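A moderator variable is one that influences the strength (or direction) of the relationship between two other variables: the effect of the predictor on the outcome differs depending on the level of the moderator. A mediator variable is one that explains the relationship between two other variables: the predictor affects the mediator, which in turn affects the outcome.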
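To make the distinction concrete, here is a minimal sketch in the common regression formulation. It assumes you have already estimated the coefficients in your statistics package; all numbers and function names below are hypothetical and serve only as illustration. For moderation, the outcome is regressed on the predictor, the moderator and their product, so the slope of the predictor changes with the moderator. For mediation, the indirect effect is commonly computed as the product of the path from predictor to mediator and the path from mediator to outcome.

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Slope of the predictor at a given moderator value.

    Based on y = b0 + b_predictor*x + b_moderator*m + b_interaction*x*m,
    so the effect of x on y is b_predictor + b_interaction*m.
    """
    return b_predictor + b_interaction * moderator_value


def indirect_effect(a_path, b_path):
    """Indirect (mediated) effect: predictor -> mediator -> outcome."""
    return a_path * b_path


# Hypothetical coefficients, NOT taken from any real study.
b_predictor, b_interaction = 0.30, 0.20

# Moderation: the predictor's effect differs for low vs. high moderator
# values (here -1 and +1, e.g. one SD below and above a centered mean).
slope_low = simple_slope(b_predictor, b_interaction, -1.0)
slope_high = simple_slope(b_predictor, b_interaction, 1.0)
print(f"slope at low moderator:  {slope_low:.2f}")
print(f"slope at high moderator: {slope_high:.2f}")

# Mediation: the effect is carried through the mediator.
mediated = indirect_effect(a_path=0.50, b_path=0.40)
print(f"indirect effect: {mediated:.2f}")
```

In short: a moderator changes how strong the slope is, whereas a mediator is a link in the chain through which the effect travels. Testing whether these quantities are statistically significant requires additional steps that this sketch does not cover.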



  • Baron, R.M., & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

Question: “How many respondents do I need for my analyses?”

Great question. The short (and simple) answer is: the more respondents you have, the better. The more precise answer is: “It depends”. The question can be answered from two perspectives: generalizability and statistical power. To draw generalizable conclusions from statistically significant findings, you need a large sample. The good news, however, is that the required sample size does not grow in proportion to the population: the larger the population, the smaller the fraction of it you need to sample to obtain reliable estimates.

Regarding statistical power, you want your sample to be large enough to detect significant effects (if they exist). To compute the necessary sample size, we recommend performing an a priori power analysis. In an a priori power analysis, researchers specify the size of the effect to be detected (i.e., a measure of the “distance” between H0 and H1), the alpha level, and the desired power level (1-beta) of the test (Erdfelder, Faul, & Buchner, 1996). Power analysis can be performed with the freely available program G*Power.
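To give a feeling for how the three ingredients of an a priori power analysis (effect size, alpha level and desired power) translate into a sample size, here is a rough sketch for the simplest case: comparing the means of two independent groups with a two-sided test. It assumes a standardized effect size (Cohen’s d) and uses the normal approximation, so it is not a substitute for G*Power; programs that use the exact t distribution give slightly larger numbers. The function name is our own.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate respondents needed PER GROUP (normal approximation).

    n = 2 * ((z_(1 - alpha/2) + z_(power)) / d) ** 2
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Conventional "small", "medium" and "large" values of Cohen's d.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} respondents per group")
```

The pattern to take away is that the smaller the effect you expect, the more respondents you need: halving the effect size roughly quadruples the required sample.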

A possible “rule of thumb” (applied by the webmasters) is that most analyses require at least 250 complete cases (i.e., cases without missing values). For a simple cross-sectional design, for example, the target sample should therefore consist of some 500-600 respondents, because the response rate in survey research is generally below 50%. For a repeated-measurement design (e.g., a daily diary study), a large number of persons and measurement occasions is likewise desirable. Previous studies in high-ranking journals have sampled at least 100 persons on at least five measurement occasions (Ohly, Sonnentag, Niessen, & Zapf, 2010).
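The arithmetic behind the target sample is simply the number of complete cases you need divided by the response rate you expect. A small sketch follows; the function name is our own, and the response rates are illustrative assumptions, not figures from a particular study.

```python
from math import ceil


def invitations_needed(complete_cases, response_rate):
    """People to invite so that, on average, enough of them respond."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in the interval (0, 1]")
    return ceil(complete_cases / response_rate)


# 250 complete cases under a few assumed response rates.
for rate in (0.50, 0.45):
    print(f"response rate {rate:.0%}: invite {invitations_needed(250, rate)}")
```

Keep in mind that not every returned questionnaire is complete, so it is wise to aim somewhat higher than the computed number.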


  • Erdfelder, E., Faul, F., & Buchner, A. (1996). GPOWER: A general power analysis program. Behavior Research Methods, Instruments & Computers, 28, 1-11.
  • Ohly, S., Sonnentag, S., Niessen, C., & Zapf, D. (2010). Diary studies in organizational research: An introduction and some practical recommendations. Journal of Personnel Psychology, 9, 79-93.

Question: “My supervisor wants me to conduct multilevel analysis, but I have no idea what this is about. Where can I find more information about multilevel analysis?”

The term multilevel refers to a hierarchical or nested data structure, usually individuals within organizational groups (e.g., employees within teams), but the nesting may also consist of repeated measures from individuals over time (e.g., daily or weekly diary studies). The expression multilevel model or multilevel analysis is used as a generic term for all models for hierarchical or nested data.
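To illustrate what nested data look like, the sketch below stores hypothetical scores of employees within three teams and computes ICC(1), the proportion of variance that lies between groups, from a one-way ANOVA. Computing this intraclass correlation is a common first check of whether the nesting matters; it is our choice of illustration, not a complete multilevel analysis. The data are invented, and the function assumes equally sized groups.

```python
from statistics import mean


def icc1(groups):
    """ICC(1) from a one-way ANOVA; assumes equal group sizes.

    ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW), with k = group size.
    """
    k = len(groups[0])
    if k < 2 or any(len(group) != k for group in groups):
        raise ValueError("this sketch needs equally sized groups of 2+")
    n_groups = len(groups)
    grand_mean = mean(score for group in groups for score in group)
    ss_between = sum(k * (mean(group) - grand_mean) ** 2 for group in groups)
    ss_within = sum(
        (score - mean(group)) ** 2 for group in groups for score in group
    )
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented scores: employees (level 1) nested within teams (level 2).
teams = {
    "team_a": [4, 5, 6],
    "team_b": [7, 8, 9],
    "team_c": [1, 2, 3],
}
icc = icc1(list(teams.values()))
print(f"ICC(1) = {icc:.2f}")
```

A value close to zero suggests that team membership explains little of the variance; a substantial value indicates that observations within a team are not independent, which is exactly the situation multilevel models are designed for.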



  • Bliese, P.D., Chan, D., & Ployhart, R.E. (2007). Multilevel methods: Future directions in measurement, longitudinal analyses, and nonnormal outcomes. Organizational Research Methods, 10, 551-563.
  • Enders, C.K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12, 121-138.
  • Hayes, A.F. (2006). A primer on multilevel modeling. Human Communication Research, 32, 385-410.
  • Hox, J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York, NY: Routledge.
  • LeBreton, J.M., & Senter, J.L. (2008). Answers to 20 questions about interrater reliability and interrater agreement. Organizational Research Methods, 11, 815-852.
