13 Introduction to regression analysis

Regression analysis is a powerful statistical tool for investigating the relationships between variables. At its core, the technique is used to understand and quantify how one variable (the criterion variable) changes in response to one or more others (the predictor variables). More than merely identifying a single correlation, regression analysis offers nuance, enabling researchers to predict outcomes and uncover sophisticated patterns embedded within datasets.

13.1 Data-driven decision making

[Image: Moneyball]

The 2011 film Moneyball is based on the true story of the Oakland Athletics baseball team's 2002 season. Their general manager, Billy Beane (played by Brad Pitt in the film), faced a problem: he had a limited budget to put together a winning team. Rather than relying on traditional baseball scouting methods, which often depended heavily on scouts' intuitions and were prone to various biases, Beane employed the skills of a young Yale economics graduate named Peter Brand (played by Jonah Hill). Brand used statistical analysis to evaluate each player's value.

The traditional method of valuing players was subjective. Scouts often looked at the physique of players, their style, how they moved, or even things like the attractiveness of their girlfriends as an indicator of their confidence. Beane and Brand shifted the focus to objective evidence, including statistics.

At its core, the analytics used in "Moneyball" is about predicting runs and, more importantly, wins. Using regression analysis, they could determine which statistics were most strongly correlated with creating runs. Once they understood what leads to runs, they could build a model to aid their player acquisition decisions.

The film (and also the far more nerdy book by Michael Lewis) highlights the tensions that arise when data-driven metrics clash with entrenched traditional norms. Yet, as time progresses, the efficacy of these analytical techniques becomes increasingly evident. Today, it's a rarity to encounter a professional sports team that doesn't incorporate some form of statistical analysis into its strategy and decision-making processes, showcasing the undeniable impact and relevance of regression in our modern world.

13.2 Further real world examples

Of course, it’s not just the world of sports that has taken advantage of regression analysis to inform decision making. Here are some examples from everyday life where regression analysis plays a role.

  • Streaming Services and Recommendations: Most of you probably use platforms like Netflix, Spotify, or YouTube. These platforms utilise regression analysis to predict what shows, songs, or videos you might like based on your past behaviour and the behaviour of others with similar tastes.

  • Predicting Box-office Income from Different Forms of Advertising: Film producers often use regression analyses to gauge which adverts, from TV spots to social media campaigns, most influence cinema ticket sales, optimising advertising budgets for upcoming films based on these insights.

13.3 Use of regression analysis in psychology

Regression analysis is also a cornerstone of psychological research, as it allows psychologists to simultaneously explore and dissect the influence of numerous variables on a single outcome variable. Such a comprehensive approach is indispensable in a field like psychology, where behaviours and mental processes are often the outcome of a web of interconnected variables.

For instance:

  • Predictors of Job Satisfaction: Organisational psychologists could use regression to determine which factors (e.g., salary, working hours, team dynamics, or leadership style) are the most significant predictors of job satisfaction among employees.

  • Influence on Social Behaviours: Social psychologists might employ regression analysis to understand how various factors like media exposure, peer influence, and past experiences predict certain social behaviours or attitudes, such as aggression or altruism.

Personally, I'm interested in public health and climate change. Regression is indispensable for studying areas like these:

  • Determinants of Vaccine Hesitancy: I've used regression analysis to predict the likelihood of individuals being hesitant to take vaccines. Predictors in such an analysis include demographic factors like age and education level, psychological factors such as risk perception and trust in healthcare, and societal variables like exposure to misinformation on social media.

  • Predictors of Pro-Environmental Behaviours: I am currently using a regression model to predict pro-environmental behaviour from varying levels of capability, motivation, and opportunity (i.e. the COM-B model).

Reflect on a psychological phenomenon or behaviour that has piqued your interest or that you've recently studied.

Jot down potential factors or variables that you believe might influence this phenomenon or behaviour.

Consider how these factors might be integrated into a regression model. Which variable would you choose as the dependent variable (the variable you will be predicting)? What variables might serve as the predictors of the dependent variable?
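If it helps to see the idea written down, one common way to express such a model is sketched below. It assumes a continuous dependent variable and k predictors; the symbols are illustrative notation chosen for this sketch and may differ from the notation used elsewhere in this handbook.

```latex
Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i
```

Here Y_i is the dependent variable for person i, X_{1i} to X_{ki} are that person's scores on the predictors, b_0 is the intercept, b_1 to b_k are the weights given to each predictor, and e_i is the part of Y_i that the predictors do not account for.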

13.4 Overview of regression techniques covered in this module

In this handbook we will cover three main regression techniques:

  1. Simple linear regression (basically just a fancy correlation)
  2. Multiple linear regression (the technique that is required for your assignment)
  3. Binary logistic regression (multiple regression with a binary, instead of continuous, criterion variable)

Continue with statistics in psychology and you’ll learn further forms of regression, and regression-like analysis techniques, such as:

  1. Ordinal regression
  2. Mediation and moderation
  3. Multi-level modelling
  4. Structural equation modelling