# F-statistic: Understanding model significance using Python

In statistics, a test of significance is a method for deciding, based on the data, whether to reject or fail to reject a claim. In regression analysis, it is used to determine whether an independent variable is significant in explaining the variance of the dependent variable. So suppose we have our regression equation:

In this case,

• The null hypothesis H0 would be: β = 0, i.e. predictor x is not able to explain the variance of the dependent variable y.
• Alternative hypothesis H1 would…

# Akaike Information Criterion: Model Selection

The Akaike Information Criterion (AIC) is a statistical method used for model selection. It helps you compare candidate models and select the best among them.

Candidate models can be models each containing a different subset or combination of independent/predictor variables.

AIC aims to select the model which best explains the variance in the dependent variable with the fewest number of independent variables (parameters). So it helps select a simpler model (fewer parameters) over a complex model (more parameters).
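As a rough illustration of this trade-off, here is a minimal sketch that compares two candidate models by AIC. It assumes least-squares fits with Gaussian errors, for which AIC can be written (up to an additive constant that does not affect the comparison) as n·ln(RSS/n) + 2k, where k is the number of fitted coefficients. The data and the helper names (`aic_least_squares`, `fit_intercept_only`, `fit_simple_linear`) are made up for this example.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC of a least-squares fit, assuming Gaussian errors.

    Uses n * ln(RSS / n) + 2k, which equals 2k - 2 ln(L) up to a constant
    that is the same for every model fitted to the same data.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


def fit_intercept_only(y):
    """Residuals of the model y = b0 (one parameter)."""
    mean_y = sum(y) / len(y)
    return [yi - mean_y for yi in y]


def fit_simple_linear(x, y):
    """Residuals of the model y = b0 + b1 * x (two parameters)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: y grows roughly linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

aic_null = aic_least_squares(fit_intercept_only(y), n_params=1)
aic_linear = aic_least_squares(fit_simple_linear(x, y), n_params=2)

# The candidate with the lower AIC is preferred.
best = "linear" if aic_linear < aic_null else "intercept-only"
print(f"AIC intercept-only: {aic_null:.2f}, AIC linear: {aic_linear:.2f}, best: {best}")
```

The `2 * n_params` term is the penalty for complexity: an extra predictor only lowers the AIC if it reduces the residual sum of squares enough to outweigh that penalty.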

But why select a simpler model over a complex one?

• To reduce overfitting:

We know that the more complex the model, the…

# Covariance vs Correlation: Which should you use?

You have probably come across the terms covariance and correlation in probability theory and statistics. Both describe a very similar thing, i.e. the type of linear relationship between random variables/features. But what are the differences between them, and which one should you use?

To answer these questions, I’ll start by giving a brief overview of these topics. I’ll be using car specifications as an analogy for better intuition.

Covariance

Covariance signifies the direction of the linear relationship between random variables, i.e. whether the variables are directly proportional or inversely proportional…