Published on 5 May 2022 by Pritha Bhandari. Revised on 5 December 2022.
A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them.
A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.
Type of correlation | What it means | Example |
---|---|---|
Positive correlation | Both variables change in the same direction | As height increases, weight also increases |
Negative correlation | The variables change in opposite directions | As coffee consumption increases, tiredness decreases |
Zero correlation | There is no relationship between the variables | Coffee consumption is not correlated with height |
Correlational and experimental research both use quantitative methods to investigate relationships between variables. But there are important differences in how data is collected and the types of conclusions you can draw.
| | Correlational research | Experimental research |
---|---|---|
Purpose | Used to test strength of association between variables | Used to test cause-and-effect relationships between variables |
Variables | Variables are only observed with no manipulation or intervention by researchers | An independent variable is manipulated and a dependent variable is observed |
Control | Limited control is used, so other variables may play a role in the relationship | Extraneous variables are controlled so that they can’t impact your variables of interest |
Validity | High external validity: you can confidently generalise your conclusions to other populations or settings | High internal validity: you can confidently draw conclusions about causation |
Correlational research is ideal for gathering data quickly from natural settings. That helps you generalise your findings to real-life situations in an externally valid way.
There are a few situations where correlational research is an appropriate choice.
You want to find out if there is an association between two variables, but you don’t expect to find a causal relationship between them.
Correlational research can provide insights into complex real-world relationships, helping researchers develop theories and make predictions.
Example: Investigating a non-causal association
You want to know if there is any correlation between the number of children people have and which political party they vote for.
You don’t think having more children causes people to vote differently. It’s more likely that both are influenced by other variables such as age, religion, ideology, and socioeconomic status. But a strong correlation could be useful for making predictions about voting patterns.
You think there is a causal relationship between two variables, but it is impractical, unethical, or too costly to conduct experimental research that manipulates one of the variables.
Correlational research can provide initial indications or additional support for theories about causal relationships.
Example: Investigating causal relationships with limitations
You want to investigate whether greenhouse gas emissions cause global warming. It is not practically possible to do an experiment that controls global emissions over time, but through observation and analysis you can show a strong correlation that supports the theory.
You have developed a new instrument for measuring your variable, and you need to test its reliability or validity.
Correlational research can be used to assess whether a tool consistently or accurately captures the concept it aims to measure.
Example: Investigating new measurement tools
You develop a new scale to measure loneliness in young children based on anecdotal evidence during lockdowns.
To validate this scale, you need to test whether it’s actually measuring loneliness. You collect data on loneliness using three different measures, including the new scale, and test the degree of correlation between the different measurements. High correlations with the other measures support the validity of your scale.
There are many different methods you can use in correlational research. In the social and behavioural sciences, the most common data collection methods for this type of research include surveys, observations, and secondary data.
It’s important to carefully choose and plan your methods to ensure the reliability and validity of your results. You should carefully select a representative sample so that your data reflects the population you’re interested in without bias.
In survey research, you can use questionnaires to measure your variables of interest. You can conduct surveys online, by post, by phone, or in person.
Surveys are a quick, flexible way to collect standardised data from many participants, but it’s important to ensure that your questions are worded in an unbiased way and capture relevant insights.
Example: Surveys
To find out if there is a relationship between vegetarianism and income, you send out a questionnaire about diet to a sample of people from different income brackets. You statistically analyse the responses to determine whether vegetarians generally have higher incomes.
Naturalistic observation is a type of field research where you gather data about a behaviour or phenomenon in its natural environment.
This method often involves recording, counting, describing, and categorising actions and events. Naturalistic observation can include both qualitative and quantitative elements, but to assess correlation, you collect data that can be analysed quantitatively (e.g., frequencies, durations, scales, and amounts).
Naturalistic observation lets you easily generalise your results to real-world contexts, and you can study experiences that aren’t replicable in lab settings. But data analysis can be time-consuming and unpredictable, and researcher bias may skew the interpretations.
Example: Naturalistic observation
To find out if there is a correlation between gender and class participation, you observe college seminars, note the frequency and duration of students’ contributions, and categorise them based on gender. You statistically analyse the data to determine whether men are more likely to speak up in class than women.
Instead of collecting original data, you can also use data that has already been collected for a different purpose, such as official records, polls, or previous studies.
Using secondary data is inexpensive and fast, because data collection is complete. However, the data may be unreliable, incomplete, or not entirely relevant, and you have no control over the reliability or validity of the data collection procedures.
Example: Secondary data
To find out if working hours are related to mental health, you use official national statistics and scientific studies from several different countries to combine data on average working hours and rates of mental illness. You statistically analyse the data to see if countries that work fewer hours have better mental health outcomes.
After collecting data, you can statistically analyse the relationship between variables using correlation or regression analyses, or both. You can also visualise the relationships between variables with a scatterplot.
Different types of correlation coefficients and regression analyses are appropriate for your data based on their levels of measurement and distributions.
Using a correlation analysis, you can summarise the relationship between variables into a correlation coefficient: a single number, typically ranging from -1 to 1, that describes the strength and direction of the relationship. The sign indicates the direction, and the absolute value indicates the strength.
The Pearson product-moment correlation coefficient, also known as Pearson’s r, is commonly used for assessing a linear relationship between two quantitative variables.
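As a rough illustration of how Pearson's r is calculated, the sketch below divides the sum of cross-products of deviations by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the height and weight values are made up for this example.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: height (cm) and weight (kg) for five people
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 77, 88]
r = pearson_r(heights, weights)
print(round(r, 3))
```

With these invented values, r comes out close to 1, which matches the positive correlation between height and weight used as an example earlier. In practice you would normally rely on a statistics package rather than a hand-written function.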
Correlation coefficients are usually found for two variables at a time, but you can use a multiple correlation coefficient for three or more variables.
With a regression analysis, you can predict how much a change in one variable will be associated with a change in the other variable. The result is a regression equation that describes the line on a graph of your variables.
You can use this equation to predict the value of one variable based on the given value(s) of the other variable(s). It’s best to perform a regression analysis after testing for a correlation between your variables.
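For two variables, the regression equation can be sketched with ordinary least squares: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of the predictor, and the intercept follows from the two means. The function name `fit_line` and the working-hours data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squared deviations of the predictor
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: weekly working hours and a stress score (0-100)
hours = [20, 30, 40, 50, 60]
stress = [30, 38, 51, 59, 72]

slope, intercept = fit_line(hours, stress)

# Predict the stress score for someone working 45 hours a week
predicted = intercept + slope * 45
print(f"stress = {intercept:.2f} + {slope:.2f} * hours")
print(f"predicted stress at 45 hours: {predicted:.2f}")
```

For these invented values the fitted line is approximately stress = 8 + 1.05 × hours, which gives a predicted score of about 55.25 at 45 hours. The equation describes an association that is useful for prediction; it does not show that working hours cause stress.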
It’s important to remember that correlation does not imply causation. Just because you find a correlation between two things doesn’t mean you can conclude one of them causes the other, for a few reasons.
If two variables are correlated, it could be because one of them is a cause and the other is an effect. But the correlational research design doesn’t allow you to infer which is which. To err on the side of caution, researchers don’t conclude causality from correlational studies.
Example: Directionality problem
You find a negative correlation between vitamin D levels and depression: people with low vitamin D levels are more likely to have depression.
But you can’t be certain about whether having low vitamin D levels causes depression, or whether having depression causes reduced intakes of vitamin D through lifestyle or appetite changes. Therefore, you can only conclude that there is a relationship between these two variables.
A confounding variable is a third variable that influences other variables to make them seem causally related even though they are not. Instead, there are separate causal links between the confounder and each variable.
In correlational research, there’s limited or no researcher control over extraneous variables. Even if you statistically control for some potential confounders, there may still be other hidden variables that disguise the relationship between your study variables.
Example: Third variable
You find a strong positive correlation between working hours and work-related stress: people with lower working hours report lower levels of work-related stress. However, this doesn’t prove that lower working hours cause a reduction in stress.
There are many other variables that may influence both variables, such as average income, working conditions, and job insecurity. You might statistically control for these variables, but you can’t say for certain that lower working hours reduce stress because other variables may complicate the relationship.
Although a correlational study can’t demonstrate causation on its own, it can help you develop a causal hypothesis that’s tested in controlled experiments.
A correlation reflects the strength and/or direction of the association between two or more variables.
A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.
Controlled experiments establish causality, whereas correlational studies only show associations between variables.
In general, correlational research is high in external validity while experimental research is high in internal validity.
A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.
A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.
Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.