What is multivariable regression with an example

A multivariable regression is a type of statistical analysis that is used to model the relationship between multiple independent variables and a single dependent variable. In other words, it is a way to predict the value of a dependent variable based on the values of several independent variables.

Example

For example, suppose you want to predict the price of a house based on its size, number of bedrooms, and location. You could use multivariate regression to analyze the data and build a model that predicts the price of a house based on these variables.

Multivariate regression can be used to analyze both continuous and categorical variables. It is a useful tool for understanding the relationships between variables and for making predictions based on those relationships.

Let’s say we are given data on TV, radio, and newspaper advertising spending for a list of companies, and our goal is to predict sales in terms of units sold.

CompanyRatio($)SalesNewsUnits
Netflix230.137.869.122.1
Microsoft44.539.323.110.4
Twitter17.245.934.718.3
Instagram151.541.313.218.5

Growing complexity

Linear regression can become challenging when dealing with multiple features as it makes it difficult to visualize and comprehend the data. One solution is to break down the data and compare one or two features at a time. In this example, we explore how Radio and TV investments impact sales. The following image displays a 3D plane that shows the relationship between Radio, TV, and Sales.

Normalization

Normalizing input data is a technique used to speed up gradient calculation as the number of features grows. It’s important for datasets with high standard deviations or differences in the ranges of the attributes. The goal is to ensure all values are within the same range, typically -1 to 1.

Code

For each feature column {
Normalize your features to achieve faster gradient computation by following these two simple steps:

1. Subtract the mean of the column (mean normalization)
2. Divide by the range of the column (feature scaling)
}

Our input is a matrix with 200 rows and 3 columns, containing data on TV, Radio, and Newspaper advertising. To speed up gradient calculation and ensure fair treatment of all features, we normalize the input by subtracting each feature’s mean and dividing it by its range. The resulting output is another matrix with the same dimensions, where all values are now within the range of -1 to 1, making it easier to train our model.

def normalize(features):
    **
    features     -   (200, 3)
    features.T   -   (3, 200)
    We transpose the input matrix, swapping
    cols and rows to make vector math easier
    **
    for feature in features.T:
        fmean = np.mean(feature)
        frange = np.amax(feature) - np.amin(feature)
        #Vector Subtraction
        feature -= fmean
        #Vector Division
        feature /= frange
    return features

Making predictions

Our prediction function estimates sales based on our current weights and a company’s spending on TV, radio, and newspaper. The model seeks to identify weight values that minimize the cost function.

Reference links :

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3049417/

One thought on “What is multivariable regression with an example

Leave a Reply