Understanding Regression Analysis

Nick Baker, Chief Research Officer, 1 February 2019

When measuring the health of customer relationships, three metrics count the most: customer satisfaction, customer loyalty, and customer advocacy (otherwise known as ‘will you tell your friends about us’).

Together, these metrics give you a good overall picture of where you stand in the eyes of your customers. What they don’t tell you is how to make that relationship better. For that, you need to understand what drives each of them.

One option is to ask customers directly why they are or aren’t satisfied, loyal or advocates. This can be revealing, but people often struggle to articulate what really motivates them:

: They may never have thought about it too deeply, and give superficial responses as a result

: They may find their reasons hard to describe and end up being misleading

: They may give undue weight to ‘rational’ factors such as price, especially in B2B markets

Rather than asking customers directly, there is an alternative approach that applies a statistical method. It’s called Regression Analysis and it allows you to deduce what really matters and use that information to help you make better decisions.

Regression Analysis explained

Regression Analysis comes in a variety of ‘flavours’, including Linear Regression, Stepwise Regression and Ridge Regression. Each is designed to work in different situations, but all have the same two kinds of variable at heart:

: Dependent variable – the thing we’re interested in changing, e.g. a customer satisfaction score or Net Promoter Score (NPS).

: Independent variables – the things we think might drive a change in the dependent variable, e.g. the quality of customer service, which we might expect to improve overall satisfaction.

Regression Analysis looks for relationships between these variables. It ‘freezes’ all the independent variables except for one, and then identifies the impact changing this variable has on the dependent variable. Do this for every variable in turn and we’re able to identify the power of each in moving the dependent variable.
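To make this concrete, here is a minimal sketch of a linear regression fitted by ordinary least squares in plain Python. The ratings, the function names (`solve`, `fit_ols`) and the built-in effects of 0.5 and 0.1 are all invented for illustration; in practice you would use Excel, SPSS or a statistics library rather than hand-written code.

```python
# Minimal ordinary least squares via the normal equations (X'X) b = X'y.
# All data below is invented purely for illustration.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] for y regressed on the columns of rows."""
    x = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical (product, support) ratings and an overall score built from
# known effects with no noise, so we can check what the fit recovers.
ratings = [(8, 3), (6, 7), (9, 5), (4, 4), (7, 8), (5, 2), (3, 6), (10, 9)]
overall = [2.0 + 0.5 * p + 0.1 * s for p, s in ratings]

intercept, b_product, b_support = fit_ols(ratings, overall)
print(round(intercept, 3), round(b_product, 3), round(b_support, 3))
```

Because the invented overall score contains no noise, the fitted coefficients should match the effects it was built with, up to floating-point rounding. Each coefficient is the ‘frozen’ effect described above: the change in the dependent variable for a one-point change in that variable while the other stays fixed.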

Interpreting the Regression Analysis output

It’s possible to run a Regression Analysis yourself, using Excel or SPSS, or you can use a professional statistician. Either way, four numbers are especially important when interpreting the output.

The first two numbers relate to the regression model itself:

Is the model really telling us anything?

The F-value measures the statistical significance of the model as a whole. Typically, an F-value with a significance of less than 0.05 is considered statistically meaningful, so we can be confident that the relationships the model has found are unlikely to be due to chance alone.

How accurate is the model?

The R-Squared (or the Adjusted R-Squared, which corrects for the number of independent variables in the model) shows how much of the movement in the dependent variable is explained by the independent variables. For example, an R-Squared value of 0.8 means that 80% of the movement in the dependent variable can be explained by the independent variables tested. A model like that fits the data closely and can reasonably be described as accurate.
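For readers who want to see where the two model-level numbers come from, the sketch below computes R-Squared, Adjusted R-Squared and the F-value using the standard textbook formulas. The actual and predicted scores are invented, and the model is assumed to have two independent variables plus an intercept.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean) ** 2 for a in actual)                  # total
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Penalise R-Squared for the number of independent variables k (n = sample size)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """F-value of a linear model with an intercept, n observations and k variables."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Hypothetical satisfaction scores and the values a model predicted for them
actual = [7, 5, 8, 4, 6, 9]
predicted = [6.5, 5.5, 7.5, 4.5, 6.0, 8.5]

r2 = r_squared(actual, predicted)
print(round(r2, 3))                               # 0.929
print(round(adjusted_r_squared(r2, n=6, k=2), 3)) # 0.881
print(round(f_statistic(r2, n=6, k=2), 1))        # 19.5
```

The significance figure that gets compared with 0.05 is the probability of seeing an F-value at least this large by chance. Statistical packages report it alongside the F-value; this sketch stops short of computing it because it requires the F distribution.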

The other two critical numbers relate to each of the independent variables:

Does the variable really matter?

Like the F-value, the P-value is a measure of statistical significance, but this time it indicates if the effect of the independent variable (rather than the model as a whole) is statistically significant. Again, a value lower than 0.05 is what you’re looking for.

How much impact does the variable have?

If multiple independent variables have been tested (as is often the case), the coefficient tells you how much the dependent variable is expected to increase when the independent variable under consideration increases by one unit and all other independent variables are held at the same value. Sometimes the coefficient is replaced with a standardised coefficient, which puts every variable on a common scale and so shows the relative contribution of each independent variable in moving the dependent variable.
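As a rough illustration of the difference between the two, the sketch below converts a raw coefficient into a standardised one using the usual rescaling by standard deviations. The function name and the data are assumptions made for the example, not output from any particular package.

```python
from statistics import stdev

def standardised_coefficient(coef, x_values, y_values):
    """Rescale a raw coefficient so it is expressed in standard deviations.

    A result of 0.5 means: one standard deviation more of x goes with
    half a standard deviation more of y, other variables held constant.
    """
    return coef * stdev(x_values) / stdev(y_values)

# Hypothetical data in which y is exactly twice x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# The raw coefficient is 2 (y rises 2 units per unit of x) ...
raw = 2.0
# ... but on a standardised scale the relationship has strength 1.0
beta = standardised_coefficient(raw, x, y)
print(beta)
```

Standardised coefficients are most useful when the independent variables are measured on different scales. When every question uses the same 1-10 scale, the raw coefficients are already broadly comparable.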

Regression Analysis in market research – an example

Now we know the theory, let’s see how Regression Analysis looks in practice by using a real example.

Our goal was to advise a business software supplier on how they could improve levels of customer satisfaction.

We started by conducting in-depth interviews with delighted, content and dissatisfied customers to identify areas that could potentially influence levels of satisfaction. We complemented this with some internal workshops with customer-facing staff to uncover their beliefs about what makes customers happy.

Using these insights as a basis, we then created a structured survey which, amongst other things, asked 350 customers to rate their satisfaction from 1 to 10 in 3 key areas:

: Overall satisfaction with the supplier

: Satisfaction in regard to four high-level factors: product quality, consultancy on product use, technical support and quality of the relationship

: Satisfaction in regard to various sub-areas within these high-level factors, e.g. we broke technical support down into subsets including speed of response, expertise of the call handler, attitude of the call handler and ease of solving the issue

We wanted to test a critical assumption: does customer satisfaction actually matter? After all, in many markets, customers will remain loyal even if they’re unhappy, because the cost or effort of change is too high relative to the benefit.

To establish this, we ran a simple correlation analysis between overall satisfaction and claimed loyalty. This resulted in a correlation coefficient (R) of 0.79, which suggests that there is indeed a very strong positive relationship between the two (as a rule of thumb, a correlation of between 0.5 and 0.7 suggests a strong relationship and anything above 0.7 suggests a very strong relationship).
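For reference, a correlation coefficient of this kind can be computed as in the sketch below. The `pearson_r` and `strength` helpers are hypothetical names, the sample scores are invented, and the labels simply encode the rule of thumb quoted above (applying it to the size of the correlation regardless of sign is our assumption).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strength(r):
    """Label a correlation using the rule of thumb from the text."""
    size = abs(r)
    if size > 0.7:
        return "very strong"
    if size >= 0.5:
        return "strong"
    return "below the 'strong' threshold"

# Invented satisfaction and loyalty scores for six customers
satisfaction = [9, 7, 8, 4, 6, 3]
loyalty = [8, 8, 9, 5, 5, 4]

r = pearson_r(satisfaction, loyalty)
print(round(r, 2), strength(r))
print(strength(0.79))  # the value found in the study: "very strong"
```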

Confident that improving overall levels of customer satisfaction would most likely yield commercial benefits, we then needed to understand how to achieve it. This is where we brought in Regression Analysis. Using ‘overall satisfaction’ as the dependent variable, and the four high-level factors as the independent variables, we sought to identify where the broad focus should be.

Before interpreting the output of our analysis, we needed to establish if the model was reliable and accurate:

: The significance of the F-value was 0.00000000004. Anything under 0.05 is significant, so this result shows that the model is highly reliable.

: The Adjusted R-Squared was 0.87. Again, that gives confidence as it means that the model explains 87% of the movement in overall satisfaction.

It passed with flying colours on both counts. Happy, then, that the model was reliable and accurate, we looked at what the four high-level factors could tell us:

We could see that all of the factors had some impact on overall satisfaction and the P-values (all under 0.05) showed that this was significant in a statistical sense.

It was also clear that ensuring satisfaction with the product itself is absolutely critical: for every 1-point increase in satisfaction with the product on our 1-10 scale, overall satisfaction increased by almost half of one point (0.46).

Contrast this with technical support, where the same 1-point increase only delivered a 0.09 boost in overall satisfaction. That is around a fifth of what a 1-point increase in product satisfaction would deliver.
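The comparison between the two coefficients is simple arithmetic, sketched here with the two figures reported above (the variable names are ours):

```python
# Coefficients reported in the study for a 1-point rise on the 1-10 scale
product_coef = 0.46   # satisfaction with the product
support_coef = 0.09   # satisfaction with technical support

# How large is the technical support effect relative to the product effect?
ratio = support_coef / product_coef
print(f"{ratio:.2f}")  # 0.20, i.e. roughly a fifth
```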

We then ran a second Regression Analysis to identify how specifically to realise this gain. What areas would increase overall satisfaction?

Once again, the first check was to make sure the generated model was accurate and reliable. With the significance of the F-value well under 0.05, and an Adjusted R-Squared of 0.9, it was. The outputs for the six product factors tested were as follows:

By this point, we knew a great deal:

: The more satisfied a customer is, the more likely they are to remain a customer

: Satisfaction with the product itself is most powerful in driving overall satisfaction

: Satisfaction with the product is in turn driven by its reliability, functionality and value

We now needed to look at one more thing: are there actually low levels of satisfaction in these areas and, if so, what action could be taken to fix it?

To establish this, we plotted the importance – as measured by the coefficient – of the high-level factors and the sub-factors against the satisfaction of customers in these areas.
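One simple way to read such a plot is to split it into quadrants, as in the sketch below. The two coefficients are those reported earlier, but the mean satisfaction scores, both cut-off values and the quadrant labels are hypothetical, chosen only to show the mechanics.

```python
def priority(importance, satisfaction, importance_cutoff, satisfaction_cutoff):
    """Place a factor in one of four quadrants of an importance/satisfaction plot."""
    important = importance >= importance_cutoff
    weak = satisfaction < satisfaction_cutoff
    if important and weak:
        return "fix first"              # matters a lot, performing poorly
    if important:
        return "protect"                # matters a lot, performing well
    if weak:
        return "low-priority weakness"  # performing poorly, but matters little
    return "maintain"

# factor: (coefficient from the regression, HYPOTHETICAL mean satisfaction score)
factors = {
    "product quality": (0.46, 6.8),
    "technical support": (0.09, 5.9),
}

# Cut-offs are assumptions for this example, not values from the study
for name, (coef, score) in factors.items():
    label = priority(coef, score, importance_cutoff=0.25, satisfaction_cutoff=7.0)
    print(f"{name}: {label}")
```

With these invented scores, technical support shows the lowest satisfaction but falls in the low-priority quadrant because its coefficient is small. Where the cut-offs sit is a judgement call that depends on the spread of scores in your own data.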

This exercise showed the value of looking at customer satisfaction in the context of what matters most. Obviously, it would be ideal to excel in every single area, but in the real world, limited budgets and resources mean that investment needs to be prioritised.

If we’d simply measured satisfaction in the four high-level areas, the conclusion would be to focus on technical support, as this was a clear area of weakness.

However, having complemented this understanding with a Regression Analysis, we could see that the investment would be better spent on improving product quality. This would be far more influential in driving customer satisfaction and is strongly linked with loyalty and, therefore, commercial success.

Likewise, investments in improving product quality should focus on enhancing reliability, even though satisfaction with ease of integration is poor.
