Survey Weighting Explained: How To Ensure A Poll Is Truly Representative
In the world of polling and online surveys, weighting data is a common practice. However, it can be a tricky concept to wrap your head around if you're new to market research or statistics.
So, in this article, I'll explain what weighting is, the various types of weighting, when it's needed and why. I will also provide a brief case study that shows an example of how you can weight data in software like Excel.
So, let's jump straight in.
What is weighting?
Weighting allows you to correct for under- or over-representation in a sample, and it can also help you address potential bias from your sampling method or selection process.
For example, let's say you're running a poll and want to ensure that the results represent both males and females equally. After collecting n=100 responses, you find that you've collected 70 responses from males but only 30 responses from females. This sample has a strong male skew, and if you look at the results as a whole (i.e. all n=100 samples), it won't be representative of both males and females.
In this example, weighting the dataset can help fix the under-representation of females without the need to collect more samples.
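As a minimal sketch of the arithmetic behind this example, the weight for each group can be computed as its target share divided by its share of the sample. The variable names below are illustrative, and the 50/50 target is the assumption stated in the example above.

```python
# Target share of each group (assumed 50/50, as in the example above).
target_share = {"male": 0.5, "female": 0.5}
# Responses actually collected.
sample_counts = {"male": 70, "female": 30}

n = sum(sample_counts.values())

# Weight = target share / observed share in the sample.
weights = {
    group: target_share[group] / (sample_counts[group] / n)
    for group in sample_counts
}

for group, w in weights.items():
    weighted_n = w * sample_counts[group]
    print(f"{group}: weight={w:.3f}, weighted count={weighted_n:.1f}")
# male: weight=0.714, weighted count=50.0
# female: weight=1.667, weighted count=50.0
```

Each male response counts for less than one response and each female response for more than one, so the weighted sample behaves like a 50/50 split without any new data being collected.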
Achieving 'representative' data
In my definition above, you may have noticed a common word: representative. This is a common term in a market researcher's lexicon and is sometimes abbreviated as just "rep" (e.g. "I need a rep sample of the general population").
The business of online surveys and polling is all about building representative datasets. It's easy to put a poll on a website and get 1,000 responses. What's hard is ensuring that a dataset meaningfully and accurately represents the target population you want to study.
Market research starts with the target population
Before we get into the details of weighting data, you need to understand how research firms think about the audience, or target population, that they aim to study. Choosing and defining the target population is one of the first things you should always do in market research.
Are you planning to run a nationally representative poll that reflects the voting intention of an entire country? Or are you looking to understand the attitudes and interests of console gamers? These are very different target populations (i.e. a nation versus gamers), and each requires a different approach to ensure the population is properly represented.
A nationally representative election poll, for example, will usually involve setting demographic quotas for variables like gender, age, income, educational level or even location (e.g. region, district, etc). This means that the survey software will track a target sample size for each demographic group and stop accepting responses from a group once its quota is filled.
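To make the quota mechanism concrete, here is a rough sketch of how quota enforcement might work. This is not how any particular survey platform implements it; the `accept` function, the quota numbers and the simulated arrivals are all hypothetical.

```python
from collections import Counter

# Hypothetical quota targets for an n=100 poll (illustrative numbers only).
quotas = {"male": 50, "female": 50}
filled = Counter()


def accept(gender: str) -> bool:
    """Return True and count the respondent if their quota cell is still open."""
    if filled[gender] >= quotas.get(gender, 0):
        return False  # quota full (or no quota defined): turn the respondent away
    filled[gender] += 1
    return True


# Simulate 70 male and 40 female respondents arriving at the survey.
arrivals = ["male"] * 70 + ["female"] * 40
accepted = [g for g in arrivals if accept(g)]

print(dict(Counter(accepted)))
```

In this simulation the male quota fills and the extra 20 males are turned away, while the female quota ends up 10 short. A shortfall like this is one of the situations where weighting becomes useful.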
For a study about console gamers, you would typically create screening questions at the beginning of a survey to 'screen' for the population you want to study. This could involve including a few questions that ask if the respondent plays video games and, if so, on which platform (i.e. mobile, console, PC, etc.).
In summary, when doing survey-based research, you should always start by defining the target population you want to study. Then, you'll configure your survey using features like quotas and screen-outs to ensure that your target population is represented.
Types of weighting
Before I get into when and why you should weight data, let me briefly cover the different methods of weighting (but just a heads up that this article will focus on the first method: demographic weighting).
Demographic Weighting - This is one of the most basic and commonly used methods. It involves adjusting the survey data to match known demographic characteristics of the target population, such as age, gender, income, or education levels. For example, if 50% of the target population is male but 70% of the survey respondents are male, female responses would need to be given more weight.
Probability Weighting - This method is used when the probability of selection for each respondent in the survey is not equal. For example, in a telephone survey, people with landlines may have a higher chance of being selected than those with only cell phones. Probability weighting adjusts for these differences by weighting each response in inverse proportion to the respondent's likelihood of being chosen, so people who were less likely to be selected count for more. In summary, probability weighting is more concerned with addressing sampling or selection bias than an issue with demographic representation.
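A common convention for probability weighting is to give each response a weight equal to the inverse of its selection probability. The sketch below assumes you already know (or have estimated) those probabilities; the respondent records and the `p_select` values are made up for illustration.

```python
# Hypothetical selection probabilities per respondent (illustrative only).
respondents = [
    {"id": 1, "phone": "landline", "p_select": 0.02},
    {"id": 2, "phone": "landline", "p_select": 0.02},
    {"id": 3, "phone": "cell_only", "p_select": 0.01},
]

# Design weight = 1 / probability of selection.
for r in respondents:
    r["weight"] = 1.0 / r["p_select"]

# Optionally rescale so the weights average 1 across the sample.
mean_w = sum(r["weight"] for r in respondents) / len(respondents)
for r in respondents:
    r["norm_weight"] = r["weight"] / mean_w

for r in respondents:
    print(r["id"], r["phone"], round(r["norm_weight"], 2))
```

The cell-only respondent was half as likely to be selected, so their response ends up with twice the weight of each landline respondent.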
Post-Stratification Weighting - After collecting data, researchers can use post-stratification to adjust their sample to match the population in terms of various groups or categories (like age groups, education levels, etc.). This method is similar to demographic weighting but can be more complex and involve multiple variables. The polling firm YouGov (whom I used to work for) used a variant of this method, known as multi-level regression and post-stratification (MRP), to call the 2017 UK General Election.
Raking - Raking is an advanced form of post-stratification. It adjusts weights on multiple dimensions iteratively. For example, it first adjusts weights for age distribution, then for gender, and then for other variables, cycling through these adjustments several times until the sample closely matches the population across all key dimensions.
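The iterative cycle described for raking can be sketched in a few lines. This is a bare-bones illustration (often called iterative proportional fitting), not a production implementation: the sample, the target shares and the `rake` function are all hypothetical, and it assumes every category in the targets appears at least once in the sample.

```python
# Hypothetical sample of 100 respondents as (age group, gender) pairs.
sample = (
    [("18-34", "male")] * 30
    + [("18-34", "female")] * 10
    + [("35+", "male")] * 40
    + [("35+", "female")] * 20
)

# Assumed population shares for each dimension (illustrative only).
targets = [
    {"18-34": 0.4, "35+": 0.6},    # dimension 0: age
    {"male": 0.5, "female": 0.5},  # dimension 1: gender
]


def rake(sample, targets, max_iter=100, tol=1e-9):
    """Adjust weights one dimension at a time until all margins match."""
    weights = [1.0] * len(sample)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, target in enumerate(targets):
            total = sum(weights)
            # Current weighted total for each category of this dimension.
            current = {cat: 0.0 for cat in target}
            for row, w in zip(sample, weights):
                current[row[dim]] += w
            # Scale every respondent so this dimension hits its target.
            for i, row in enumerate(sample):
                factor = target[row[dim]] * total / current[row[dim]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break  # no dimension needed a meaningful adjustment
    return weights


weights = rake(sample, targets)
total = sum(weights)
male_share = sum(w for (_, g), w in zip(sample, weights) if g == "male") / total
young_share = sum(w for (a, _), w in zip(sample, weights) if a == "18-34") / total
print(f"male share: {male_share:.3f}, 18-34 share: {young_share:.3f}")
```

Fixing the gender margin disturbs the age margin slightly and vice versa, which is why the adjustments are repeated until the changes become negligible.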
Propensity Score Weighting - This method is used when there is a concern about nonresponse bias, that is, when the people who respond to the survey are systematically different from those who do not (e.g. older respondents being less likely to respond to an online poll compared to younger respondents). This approach uses a propensity score to estimate the likelihood of each individual in the sample responding. The survey data is then weighted by the inverse of these propensity scores, so respondents who resemble likely non-responders count for more.
Regression Weighting - This method is sometimes used in more sophisticated analyses, as it involves creating a statistical model (usually a regression model) that predicts responses based on key variables. The model's predictions are then used to adjust the weights of the survey responses.
So now you know about different types of weighting. For the rest of this article, I'll focus on the most common and accessible method, demographic weighting.
When demographic weighting is needed
There are a few scenarios where demographic weighting is needed, and this usually comes down to when the profile of the overall sample (e.g. proportions of gender, age, income, etc) doesn't match the target population. This can happen for a few reasons, which include:
Quotas weren't set in the survey
Quotas were set but not achieved
Quotas were set and achieved, but data cleaning resulted in over/under-representation
Allow me to go through each of these scenarios.
Scenario 1 - Quotas weren't set in the survey
In this scenario, the researcher doesn't configure any quotas in their survey and just collects as many samples as possible. This approach is generally not advisable because it will often result in significant skews in your dataset, meaning your data may not meaningfully represent any target population. Weighting can help fix this, but it also has limitations. Depending on your sample size and how skewed your sample is, weighting may not be enough to fix your data.
For example, if you have a very small sample size (e.g. 50 samples) and a very strong skew (say 80% female), weighting may be ineffective.
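One way to put a number on this limitation is Kish's approximate effective sample size, (sum of weights)² / (sum of squared weights). This measure is not part of the original example; it is added here as an illustration, assuming a 50/50 gender target for the n=50, 80% female sample described above.

```python
# The small, skewed sample from the example: 50 responses, 80% female.
sample_counts = {"female": 40, "male": 10}
# Assumed target split (not stated in the example; 50/50 for illustration).
target_share = {"female": 0.5, "male": 0.5}

n = sum(sample_counts.values())

# Build one weight per respondent: target share / observed share.
weights = []
for group, count in sample_counts.items():
    w = target_share[group] / (count / n)
    weights.extend([w] * count)

# Kish's approximation of the effective sample size.
n_eff = sum(weights) ** 2 / sum(w * w for w in weights)

print(f"nominal n = {n}, effective n = {n_eff:.1f}")
# nominal n = 50, effective n = 32.0
```

Under these assumptions, each of the 10 male responses carries a weight of 2.5, and the weighted results are only about as precise as an unweighted sample of 32. Half of the weighted result rests on just 10 people, which is why weighting a small, heavily skewed sample can be ineffective.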
Scenario 2 - Quotas were set but not achieved
In this scenario, quotas were configured in the survey, but some or all were not achieved. This can be a common occurrence in the world of survey-based research, especially with hard-to-reach demographics, like older generations (65+) or high-net-worth individuals. Some groups are more difficult to reach through online or telephone surveys, and you may find that you can only achieve some of the quotas you set.
When this happens, you can use weighting to ensure adequate representation for any demographics you may have under-sampled.
Scenario 3 - Quotas were set and achieved, but data cleaning resulted in over/under-representation
With this scenario, you achieved all quotas during fieldwork, but data cleaning resulted in either the over or under-representation of a certain group.
Why would this happen? Once fieldwork is complete, a researcher will typically clean the data to remove outliers or respondents who failed quality control (QC). This process involves physically removing respondents from the dataset, meaning you lose samples. As a result, you may see the proportional representation of demographics start to change.
For example, say you collected n=100 samples and achieved your target quotas for a 50% / 50% split on gender. But in the process of cleaning your data, you had to remove n=10 male respondents who failed QC. You will now have a gender split of roughly 44% / 56% (i.e. 40 males divided by 90 net samples is about 44%). Despite achieving your quotas, data cleaning created an imbalance, and you can use weighting to fix it.
Based on these three scenarios, I've created the graphic below to help illustrate why weighting is needed.
Demographic weighting, a live example
Let's go through a real-world example of demographic weighting. Let's say you ran a poll and collected n=50 samples. One of the questions in this poll asked respondents about their favourite cuisine, and the unweighted results looked like this:
Q: What is your favourite cuisine? (unweighted)
You may look at these results and say, ok, slightly more than 1/3 of the population prefer Italian food. However, upon looking at the raw data, you notice a heavy gender skew: 70% male and 30% female. The problem is that you want the data to be equally representative of both males and females (i.e. a 50/50 split).
So, you use demographic weighting to ensure males and females both have equal representation in the data. After weighting, the result for the same question looks like this:
Q: What is your favourite cuisine? (unweighted vs weighted)
As you can see, the selection rate for favourite cuisines shifts after weighting. The unweighted data significantly over-stated the population's preference for Italian food and significantly under-stated preference for Mexican food.
If we had just gone ahead and reported on the unweighted data, we would have misinformed our audience about how much or little the target population prefers specific cuisines.
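For readers who prefer code to spreadsheets, here is a rough sketch of the same kind of calculation. The respondent-level data below is invented purely for illustration and is NOT the dataset behind the charts in this article; only the n=50 sample size, the 70/30 gender skew and the 50/50 target are taken from the example.

```python
from collections import defaultdict

# Invented respondent-level data (NOT the article's dataset): (gender, cuisine).
responses = (
    [("male", "Italian")] * 16
    + [("male", "Mexican")] * 7
    + [("male", "Japanese")] * 12
    + [("female", "Italian")] * 3
    + [("female", "Mexican")] * 9
    + [("female", "Japanese")] * 3
)
target_share = {"male": 0.5, "female": 0.5}

n = len(responses)

# Count respondents per gender to get the observed shares (35 male, 15 female).
gender_counts = defaultdict(int)
for gender, _ in responses:
    gender_counts[gender] += 1

# Weight = target share / observed share.
weight = {g: target_share[g] / (gender_counts[g] / n) for g in gender_counts}

# Tally each cuisine once unweighted and once using the respondent's weight.
unweighted = defaultdict(float)
weighted = defaultdict(float)
for gender, cuisine in responses:
    unweighted[cuisine] += 1
    weighted[cuisine] += weight[gender]

total_w = sum(weighted.values())
for cuisine in unweighted:
    print(
        f"{cuisine}: unweighted {unweighted[cuisine] / n:.1%}, "
        f"weighted {weighted[cuisine] / total_w:.1%}"
    )
```

With these made-up numbers, Italian drops from 38% to about 33% after weighting and Mexican rises from 32% to 40%, because the cuisine preferences of the under-sampled group now count for their full share.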
If you'd like to see the raw data and calculations used for demographic weighting in this example, you can download the XLS dataset below.
Or, if you'd like to learn more about online survey methodology, check out my best-selling Udemy course, An Introduction to Online Quantitative Market Research.
Conclusion
Weighting is an essential step in online surveys and polling because it helps make the underlying data more representative of the target population. It's a way to ensure that all voices are heard in the right proportion, correcting for the over- or under-representation of groups that may be skewing your data.