# Fuel consumption in cars depending on transmission

The study focuses on fuel consumption (MPG, miles per gallon): which variables are most significant and, specifically, whether manual or automatic transmission is better for MPG. We built a multivariable linear model that explains the relationship with MPG. This model shows that cars with manual transmission get more MPG than those with automatic transmission, which is also supported by a t-test and a boxplot. This was the course project for the Regression Models course, part of the Data Science Specialization by Johns Hopkins University on Coursera.

## Exploratory Data Analysis

First, we explore the mtcars dataset:

```
library(datasets)  # mtcars ships with R's datasets package
data(mtcars)
mtcars[1:3, ]      # first three rows
```

```
##                mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
```

In the Appendix there is a matrix of plots for the dataset (Figure 1) and a boxplot of MPG vs. transmission (Figure 2). To perform the analysis, the variables *cyl* (number of cylinders), *vs* (engine shape: V-shaped or straight), *gear* (number of forward gears), *carb* (number of carburetors) and *am* (transmission: automatic or manual) were converted from numeric to factor.
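The conversion itself is not shown in the report. A minimal sketch of how it could be done, assuming the `Automatic`/`Manual` labels that appear in the t-test output further down (the vector name `factor_cols` is only illustrative):

```r
library(datasets)
data(mtcars)

# Convert the categorical columns from numeric to factor.
factor_cols <- c("cyl", "vs", "gear", "carb")
mtcars[factor_cols] <- lapply(mtcars[factor_cols], factor)

# am is coded 0 = automatic, 1 = manual (see ?mtcars); the labels are
# assumed from the group names printed by t.test().
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

str(mtcars[c(factor_cols, "am")])
```

With *am* stored as a labelled factor, model summaries report the coefficient as `amManual`, with automatic as the reference level.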

## Analysis (inference and regression model)

The boxplot in Figure 2 (Appendix) shows a difference in MPG by transmission type. With a t-test, assuming that MPG is approximately normally distributed within each transmission group, we can reject the null hypothesis that there is no difference in mean MPG between transmission types:

```
t.test(mpg ~ am, data = mtcars)
```

```
##
## Welch Two Sample t-test
##
## data: mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group Automatic    mean in group Manual
##                17.14737                24.39231
```
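To make the test above more concrete, the Welch statistic and its degrees of freedom can be recomputed by hand. This is only a sketch: it works on the raw 0/1 coding of `am`, the variable names are illustrative, and the Shapiro-Wilk calls at the end are one possible way to probe the normality assumption, not something the original analysis reports.

```r
library(datasets)
data(mtcars)

# am is coded 0 = automatic, 1 = manual in the raw dataset.
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# Welch statistic: difference in means over the unpooled standard error.
v_auto   <- var(auto) / length(auto)
v_manual <- var(manual) / length(manual)
t_stat   <- (mean(auto) - mean(manual)) / sqrt(v_auto + v_manual)

# Welch-Satterthwaite approximation for the degrees of freedom.
dof <- (v_auto + v_manual)^2 /
  (v_auto^2 / (length(auto) - 1) + v_manual^2 / (length(manual) - 1))

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), dof)
round(c(t = t_stat, df = dof, p = p_value), 4)

# Optional check of the normality assumption within each group.
shapiro.test(auto)
shapiro.test(manual)
```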

However, as Figure 1 (Appendix) shows, several other variables correlate with MPG, so we need to build a multivariable model for MPG. To select the model, we used the *step* function, which chooses a model by the Akaike Information Criterion (AIC) in a stepwise algorithm. Our initial model includes all the variables.

```
lm.initial <- lm(mpg ~ ., data = mtcars)
lm.aic <- step(lm.initial, direction = "both")
summary(lm.aic)
```

With AIC, the best model for predicting fuel consumption (MPG) includes the variables *cyl*, *hp*, *wt* and *am*. We also used the Bayesian Information Criterion (BIC) to choose a model:

```
lm.bayesian <- step(lm.initial, k = log(nrow(mtcars)))
```
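The only difference between the two selection runs is the penalty `k`. For a linear model, `step` scores each candidate through `extractAIC`, which computes n·log(RSS/n) + k·edf, where edf is the number of estimated coefficients. The sketch below reproduces that score by hand; the model `mpg ~ wt + am` and the helper `score` are chosen purely for illustration.

```r
library(datasets)
data(mtcars)

fit <- lm(mpg ~ wt + am, data = mtcars)  # illustrative model only

n   <- nrow(mtcars)
rss <- sum(residuals(fit)^2)
edf <- n - df.residual(fit)   # number of estimated coefficients

# Criterion minimised by step(): n * log(RSS / n) + k * edf
score <- function(k) n * log(rss / n) + k * edf

aic_by_hand <- score(2)        # k = 2 is the AIC penalty
bic_by_hand <- score(log(n))   # k = log(n) is the BIC penalty

# Each pair should match.
c(aic_by_hand, extractAIC(fit, k = 2)[2])
c(bic_by_hand, extractAIC(fit, k = log(n))[2])
```

Because log(32) is larger than 2, the BIC run penalises each extra coefficient more heavily, which is why it tends to settle on a smaller model.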

With BIC, the best model includes the variables *wt*, *qsec* and *am*:

```
summary(lm.bayesian)
```

```
##
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4811 -1.5555 -0.7257  1.4110  4.6610
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   9.6178     6.9596   1.382 0.177915
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## amManual      2.9358     1.4109   2.081 0.046716 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
## F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11
```

We selected the BIC model because it is simpler (it has fewer variables) while explaining nearly the same variability (adjusted R-squared: BIC = 0.834, AIC = 0.84), with a similar p-value near 0. Figure 3 (Appendix) shows the diagnostic plots for this model: the residuals are approximately normally distributed (*Normal Q-Q*), show no pattern against the fitted values, which supports the independence condition (*Residuals vs Fitted*), have roughly constant variance (*Scale-Location*), and there are no disruptive outliers (*Residuals vs Leverage*).
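The comparison between the two candidates can be sketched by fitting both formulas directly rather than re-running `step`. The preprocessing (treating *cyl* and *am* as factors) follows the description earlier in the report; the names `mt`, `fit_aic`, `fit_bic` and `comparison` are illustrative.

```r
library(datasets)
data(mtcars)

# Assumed preprocessing: cyl and am as factors, as described in the report.
mt <- transform(mtcars,
                cyl = factor(cyl),
                am  = factor(am, labels = c("Automatic", "Manual")))

fit_aic <- lm(mpg ~ cyl + hp + wt + am, data = mt)  # AIC selection
fit_bic <- lm(mpg ~ wt + qsec + am, data = mt)      # BIC selection

# Model size and adjusted R-squared side by side.
comparison <- data.frame(
  model         = c("AIC selection", "BIC selection"),
  coefficients  = c(length(coef(fit_aic)), length(coef(fit_bic))),
  adj_r_squared = c(summary(fit_aic)$adj.r.squared,
                    summary(fit_bic)$adj.r.squared)
)
comparison
```

The two models are not nested, so a side-by-side table of size and adjusted R-squared is used here instead of an `anova` comparison.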

According to this model, holding *wt* and *qsec* constant, cars with manual transmission (*amManual*) get on average 2.936 more miles per gallon (MPG) than cars with automatic transmission, which is the reference level absorbed into the *Intercept*.
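One way to check this interpretation is to predict MPG for two hypothetical cars that are identical except for the transmission. In the sketch below, the data frame `new_cars` and the choice of average weight and quarter-mile time are assumptions made for illustration; any fixed values would give the same difference.

```r
library(datasets)
data(mtcars)

mt  <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))
fit <- lm(mpg ~ wt + qsec + am, data = mt)

# Two hypothetical cars, identical except for the transmission.
new_cars <- data.frame(
  wt   = mean(mt$wt),
  qsec = mean(mt$qsec),
  am   = factor(c("Automatic", "Manual"), levels = levels(mt$am))
)
pred <- predict(fit, newdata = new_cars)

# Predicted gain from manual transmission; equals the amManual coefficient.
gain <- unname(pred[2] - pred[1])
gain

# 95% confidence interval for the adjusted manual-vs-automatic difference.
confint(fit, "amManual")
```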

So the **conclusion** is that **cars with manual transmission are more fuel-efficient than those with automatic transmission**: holding *wt* and *qsec* constant, they gain about 2.9 MPG, which is 30.5% of the model intercept (2.936 / 9.618).

# Appendix

## Figure 1
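The original figure is not reproduced here. A sketch that would draw a comparable matrix of plots for the dataset, assuming base graphics (the smoother panel and the title are optional extras, not taken from the report):

```r
library(datasets)
data(mtcars)

# Scatterplot matrix of every pair of variables in mtcars.
pairs(mtcars, panel = panel.smooth,
      main = "mtcars: pairwise relationships")
```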

## Figure 2
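The original figure is not reproduced here. A sketch of a boxplot of MPG vs. transmission, assuming the `Automatic`/`Manual` labels used elsewhere in the report (axis labels and title are illustrative):

```r
library(datasets)
data(mtcars)

mt <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))

# boxplot() invisibly returns the summary statistics it draws.
bp <- boxplot(mpg ~ am, data = mt,
              xlab = "Transmission",
              ylab = "Miles per gallon (MPG)",
              main = "MPG by transmission type")
```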

## Figure 3
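The original figure is not reproduced here. A sketch of the four diagnostic plots for the selected model, assuming the default panels of `plot` for `lm` objects (*Residuals vs Fitted*, *Normal Q-Q*, *Scale-Location*, *Residuals vs Leverage*):

```r
library(datasets)
data(mtcars)

mt  <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))
fit <- lm(mpg ~ wt + qsec + am, data = mt)

# Arrange the four default diagnostic plots in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous graphics settings
```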
