D.1 Linear Regression Model

Consider the linear regression model $y = X\beta + \epsilon$. Define each term of the equation and explain why it is called a linear model.

$y := (y_1,y_2,\ldots,y_T)'$ is the $T \times 1$ vector of observations of the dependent variable

$X:= \begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{T1} & \cdots & x_{Tp} \end{pmatrix}$ is the $T \times (p+1)$ design matrix: the first column of ones corresponds to the intercept, and each of the remaining $p$ columns holds the observations of one regressor, so $p$ is the number of explanatory variables in the model

$\beta := (\beta_0, \beta_1, \ldots,\beta_p)'$ is the $(p+1) \times 1$ vector of coefficients: the intercept $\beta_0$ and the slope coefficients $\beta_1, \ldots, \beta_p$

$\epsilon := (\epsilon_1,\epsilon_2,\ldots, \epsilon_T)'$ is the $T \times 1$ vector of error terms, one per observation (the unobserved errors, not the estimated residuals)

The model is called linear because $y$ depends linearly on the parameters: the right-hand side is the linear combination $X\beta$ plus the error term $\epsilon$, so each coefficient enters only multiplicatively with one regressor. (The columns of $X$ may themselves be nonlinear transformations of other variables; linearity refers to the coefficients.)
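A minimal numerical sketch (not part of the original notes; sample size, coefficients and error scale are made up) that simulates data from $y = X\beta + \epsilon$ and recovers $\beta$ with the closed-form OLS estimator $\hat{\beta} = (X'X)^{-1}X'y$:

```python
# Sketch: simulate y = X*beta + eps and recover beta via the OLS normal equations.
import numpy as np

rng = np.random.default_rng(0)
T, p = 200, 2                              # T observations, p regressors (made up)
beta_true = np.array([1.0, 0.5, -0.3])     # intercept beta_0 plus p slope coefficients

X = np.column_stack([np.ones(T), rng.normal(size=(T, p))])  # first column of ones
eps = rng.normal(scale=0.1, size=T)                          # error vector
y = X @ beta_true + eps

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # solves (X'X) beta = X'y
print(beta_hat)                                # close to beta_true
```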


D.2 MA(1) as LRM

Write an MA(1) model as a linear regression problem.

The MA(1) data generating process is $y_t = \mu + \theta\epsilon_{t - 1} +\epsilon_t$, where $\mu$ is the mean of the process and $\epsilon_t$ is a white-noise error.

For $t = 2, \ldots, T$ (the first observation is dropped because $\epsilon_0$ is not available), the terms of the linear regression model can be written as:

$y = (y_2,\ldots,y_T)'$

$X = \begin{pmatrix} 1 & \epsilon_1 \\ \vdots & \vdots \\ 1 & \epsilon_{T-1} \end{pmatrix}$

$\epsilon = (\epsilon_2,\ldots, \epsilon_{T})'$ with $\epsilon_t \overset{iid}{\sim} (0,\sigma^2_\epsilon)$

$\beta = (\mu,\theta)'$
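A minimal sketch of this representation (illustrative only and not from the original notes: in practice the lagged errors $\epsilon_{t-1}$ are unobserved, so an MA(1) cannot simply be estimated by OLS on observed data; here the simulated errors are kept so the regression form above can be written out explicitly, and the parameter values are made up):

```python
# Sketch: simulate an MA(1) process y_t = mu + theta*eps_{t-1} + eps_t and
# stack it into the regression form y = X*beta + eps with beta = (mu, theta)'.
import numpy as np

rng = np.random.default_rng(1)
T, mu, theta = 500, 0.2, 0.7               # made-up sample size and parameters
eps = rng.normal(size=T)                   # eps_1, ..., eps_T
y = mu + theta * np.r_[0.0, eps[:-1]] + eps  # eps_0 set to 0 for the first observation

# Regression form for t = 2, ..., T: rows (1, eps_{t-1}), target y_t
y_reg = y[1:]                                    # (y_2, ..., y_T)'
X = np.column_stack([np.ones(T - 1), eps[:-1]])  # columns: constant, eps_{t-1}
beta_hat = np.linalg.solve(X.T @ X, X.T @ y_reg)
print(beta_hat)                                  # close to (mu, theta)'
```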


D.3 Linear Regression Model

Assume a return factor of interest $f_t$ follows a conditional Gaussian distribution with one-step-ahead mean $c + \phi f_{t-1} + \theta \epsilon_{t-1}$ and one-step-ahead variance $\sigma^2_\epsilon$, where $c, \phi, \theta, \sigma^2_\epsilon$ are constants and $\epsilon_t$ is the forecast error. Write down the data generating process for $f_t$.
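One way to write the DGP implied by the stated conditional mean and variance (a sketch using the constants named in the question):

$$f_t = c + \phi f_{t-1} + \theta \epsilon_{t-1} + \epsilon_t, \qquad \epsilon_t \overset{iid}{\sim} N(0, \sigma^2_\epsilon),$$

i.e. an ARMA(1,1) process with Gaussian innovations; conditioning on the information available at $t-1$ reproduces the one-step-ahead mean and variance given above.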


D.4 Estimating a LRM

Write down a linear regression model. Highlight which parameters are to be estimated and which information is used to pin down those parameters.
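As a sketch (not part of the original notes), one concrete way to write this down, reusing the notation of D.1:

$$y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_p x_{tp} + \epsilon_t, \qquad t = 1, \ldots, T.$$

The parameters to be estimated are the coefficients $\beta_0, \beta_1, \ldots, \beta_p$ (and, for inference, the error variance $\sigma^2_\epsilon$); the information used to pin them down is the observed sample $\{(y_t, x_{t1}, \ldots, x_{tp})\}_{t=1}^{T}$.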


D.5 OLS as Method of Choice

Name the data characteristics that are necessary for OLS to be the best estimation method.


D.6 Weak-Exogeneity

Explain the concept of 'weak exogeneity'.

A regressor is weakly exogenous for the parameters of interest (here $\beta$) if estimation and inference can be carried out conditionally on that regressor without loss of information, i.e. the process that generates the regressor itself does not have to be modelled. In the regression setting this amounts to the error term being mean-independent of the contemporaneous regressors, $E[\epsilon_t \mid x_t] = 0$; unlike strict exogeneity, feedback from past values of $y$ onto current and future regressors is still allowed.
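A standard illustration (added here as a sketch, not from the original notes) is the AR(1) regression

$$y_t = \phi y_{t-1} + \epsilon_t, \qquad \epsilon_t \overset{iid}{\sim} (0, \sigma^2_\epsilon):$$

the regressor $y_{t-1}$ is weakly exogenous, since $E[\epsilon_t \mid y_{t-1}] = 0$, but it is not strictly exogenous, because $\epsilon_t$ is correlated with the future regressor $y_t$.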