I have been reviewing linear algebra concepts to get a better grasp of machine learning through a MOOC called “Mathematics for Machine Learning: Linear Algebra” offered by Imperial College London on Coursera. This blog post summarizes some of the basic concepts in linear algebra.
Vectors
What are vectors?
Geometrically, a vector is an object with both a magnitude (length) and a direction. How is the direction of a vector represented? This is done in terms of the angles it forms with a set of standard bases, or axes as you would know them. In 2 dimensions, these would be the $x$ and $y$ axes. In 3 dimensions, these would be the $x$, $y$, and $z$ axes. You can extend this idea to any number of higher dimensions.
What are standard bases?
They are vectors of unit length (magnitude 1) that are orthogonal (perpendicular) to each other, one pointing along each axis. In 2 dimensions, the unit vectors along the $x$ and $y$ axes span a plane, and in 3 dimensions, the unit vectors along the $x$, $y$, and $z$ axes span the whole of 3-dimensional space.
These are called standard bases because they are the ones most often used to describe vectors. Denoting the unit vector along the $x$ axis as $\hat{i}$, the one along the $y$ axis as $\hat{j}$, and the one along the $z$ axis as $\hat{k}$, we can describe any 3-dimensional vector $\vec{v}$ as $\vec{v} = a\hat{i} + b\hat{j} + c\hat{k}$ where $a$, $b$, and $c$ are real numbers.
However, we could use other sets of bases to describe a vector space. These could be of unit length and orthogonal to each other, just like the standard bases, but they do not have to be: any set of vectors can form a basis as long as the vectors are linearly independent and there are as many of them as the space has dimensions.
What is linear dependence?
If you can describe one of the vectors in a set as a linear combination of the other vectors, then the vectors in that set are linearly dependent. If you cannot do this for any of them, they are linearly independent.
Consider the case of 2 or 3 vectors in a regular 3-dimensional space. If you can describe one vector as a scalar multiple of another ($\vec{a} = n\vec{b}$), then it points in the same direction as $\vec{b}$ (or in the opposite direction if $n$ is negative), with its magnitude scaled by a factor of $|n|$. This means that it still lies along the same line as the other vector.
On the other hand, if one vector can be described as a linear combination of two other vectors ($\vec{a} = m\vec{b} + n\vec{c}$) that do not themselves lie along the same line, then it lies in the plane spanned by those vectors, because scaling and adding $\vec{b}$ and $\vec{c}$ can never take you out of that plane.
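To make the two cases above concrete, here is a minimal Python sketch for 3-dimensional vectors stored as tuples. It uses the cross product and the scalar triple product to test for dependence; neither is covered in this post, so treat them as a convenient shortcut rather than the method from the course. The function names and the tolerance `tol` are my own choices.

```python
def cross(u, v):
    """Cross product of two 3-dimensional vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Sum of the products of matching components."""
    return sum(x * y for x, y in zip(u, v))


def on_same_line(a, b, tol=1e-9):
    """True if a and b are linearly dependent (their cross product is zero)."""
    return all(abs(x) <= tol for x in cross(a, b))


def in_same_plane(a, b, c, tol=1e-9):
    """True if a, b, and c are linearly dependent (zero scalar triple product)."""
    return abs(dot(a, cross(b, c))) <= tol


b = (1, 0, 0)
c = (0, 1, 0)

print(on_same_line((2, 4, 6), (1, 2, 3)))   # (2, 4, 6) is 2 * (1, 2, 3)
print(in_same_plane((2, 3, 0), b, c))       # (2, 3, 0) is 2b + 3c
print(in_same_plane((0, 0, 1), b, c))       # points out of the plane of b and c
```

The last call is the interesting one: $(0 \quad 0 \quad 1)$ cannot be built from `b` and `c`, so the three vectors are linearly independent and could serve as a basis.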
How can you mathematically represent vectors?
You can operate with vectors in mathematics as lists of numbers, formed by writing down the values of the components along the different axes, wrapped in round or square brackets.
For example, $(3 \quad 4 \quad 5)$ is a 3-dimensional vector that has 3 units of length in the direction of the $x$ axis, 4 in the direction of the $y$ axis, and 5 in the direction of the $z$ axis.
To make things more standardized for discussion, instead of axes, most linear algebra texts use the term standard bases, but know that these essentially refer to the same concept. You can represent these as $(1 \quad 0)$ and $(0 \quad 1)$ in the 2-dimensional case and $(1 \quad 0 \quad 0)$, $(0 \quad 1 \quad 0)$, and $(0 \quad 0 \quad 1)$ in the 3-dimensional case.
Hence, you can represent any vector $\vec{v} = a\hat{i} + b\hat{j} + c\hat{k}$ as $(a \quad b \quad c)$ which is actually $a(1 \quad 0 \quad 0) + b(0 \quad 1 \quad 0) + c(0 \quad 0 \quad 1)$.
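As a quick check of that identity, the sketch below rebuilds a vector from its coefficients and a set of basis vectors, component by component. The name `combine` and the tuple representation are just illustrative choices of mine.

```python
# Standard basis vectors for the 3-dimensional case.
STANDARD_BASIS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def combine(coefficients, basis=STANDARD_BASIS):
    """Return the sum of coefficient * basis vector, one component at a time."""
    dimension = len(basis[0])
    return tuple(
        sum(coeff * vec[k] for coeff, vec in zip(coefficients, basis))
        for k in range(dimension)
    )


# 3 * (1, 0, 0) + 4 * (0, 1, 0) + 5 * (0, 0, 1)
print(combine((3, 4, 5)))
```

With the standard basis, the coefficients and the resulting components are the same numbers, which is why the bracket notation can list them directly.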
Is there an alternate way to represent vectors?
You can also represent them in terms of $r$, the magnitude of the vector, and the angles it forms with some of the bases. In the 2-dimensional case, for example, you can do so with variables $r$ and $\theta$, while for the 3-dimensional case, you can use $r$, $\theta$, and $\alpha$.
Using the bracket notation, you can write these as $(r \quad \theta)$ and $(r \quad \theta \quad \alpha)$ respectively.
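The pair $(r \quad \theta)$ that locates the point at the tip of a 2-dimensional vector is known as its polar coordinates; the 3-dimensional version, $(r \quad \theta \quad \alpha)$, is known as spherical coordinates. By holding $r$ fixed and varying the angles, you can describe objects like circles in the 2-dimensional case and spheres in the 3-dimensional case.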
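Here is a small sketch of moving between the two representations in the 2-dimensional case. It assumes $\theta$ is measured in radians, counterclockwise from the $x$ axis, which is a common convention but not something stated above; the function names are my own.

```python
import math


def polar_to_cartesian(r, theta):
    """Convert (r, theta) to (x, y), with theta in radians from the x axis."""
    return (r * math.cos(theta), r * math.sin(theta))


def cartesian_to_polar(x, y):
    """Convert (x, y) to (r, theta), with theta in radians from the x axis."""
    return (math.hypot(x, y), math.atan2(y, x))


def circle_points(r, count):
    """Points on a circle of radius r: keep r fixed and vary only the angle."""
    return [
        polar_to_cartesian(r, 2 * math.pi * k / count)
        for k in range(count)
    ]


print(cartesian_to_polar(3, 4))
print(circle_points(1, 4))
```

Every point returned by `circle_points` is the same distance `r` from the origin, which is exactly the "fix the magnitude, vary the angle" idea.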
How can vectors be used in the real world?
When you store numerical data in tables in terms of rows and columns, you can think of each of these rows/columns as a vector. For example, in a table consisting of the heights and weights of different people, each person’s height and weight is a row vector whereas the list of heights or the list of weights is a column vector.
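The height and weight table can be sketched in a few lines of Python. The numbers below are made up purely for illustration.

```python
# Each row is one person: (height in cm, weight in kg). Illustrative values only.
table = [
    (170.0, 65.0),
    (182.5, 80.2),
    (158.0, 52.3),
]

# A row vector: everything we know about one person.
first_person = table[0]

# Column vectors: one measurement across all people.
heights = tuple(row[0] for row in table)
weights = tuple(row[1] for row in table)

print(first_person)
print(heights)
print(weights)
```

Here each row vector is 2-dimensional (one entry per measurement), while each column vector has as many dimensions as there are people in the table.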
What are some operations you can do on vectors?
You can add two vectors following the parallelogram law or the triangle law. In order to add two vectors, you place the tail of the second vector on the head of the first; the sum is the vector that runs from the tail of the first to the head of the second. You can easily extend this to the general case of adding $n$ vectors to get a polygon.
You can scale a vector by a factor of $n$. This essentially means that you multiply its magnitude by $|n|$. If $n$ is positive, then the vector is stretched in the same direction. If $n$ is zero, then the vector is reduced to a point. If $n$ is negative, then the vector is stretched in the opposite direction.
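In terms of components, both operations are simple: you add vectors component by component, and you scale a vector by multiplying every component by the same factor. A minimal sketch, with vectors as tuples and helper names of my own choosing:

```python
import math


def add(u, v):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(u, v))


def scale(n, v):
    """Multiply every component of v by the factor n."""
    return tuple(n * x for x in v)


def magnitude(v):
    """Length of v: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


v = (3, 4)

print(add((1, 2), (3, 1)))
print(scale(-2, v), magnitude(scale(-2, v)))
print(scale(0, v))
```

Scaling by a negative factor flips the sign of every component, which is the component view of the vector pointing in the opposite direction, while the magnitude changes by the absolute value of the factor.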
You can take the dot product of two vectors, $\vec{a} \cdot \vec{b}$, which is a scalar that is:
- The projection of vector $\vec{a}$ in the direction of $\vec{b}$ multiplied by the magnitude of $\vec{b}$, or
- The projection of $\vec{b}$ in the direction of $\vec{a}$ multiplied by the magnitude of $\vec{a}$.
Mathematically, it works out to be $\vec{a} \cdot \vec{b} = |\vec{a}| |\vec{b}| \cos(\theta)$, where $\theta$ is the angle between the two vectors.
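In practice you rarely know $\theta$ up front. The sketch below relies on the component form of the dot product (multiply matching components and add them up), which this post does not derive, and then uses the formula above to recover the angle and the projection. The helper names are my own.

```python
import math


def dot(a, b):
    """Component form of the dot product: sum of products of matching components."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Length of v, which is the square root of v dotted with itself."""
    return math.sqrt(dot(v, v))


def angle_between(a, b):
    """Angle in radians between two non-zero vectors, from a.b = |a||b|cos(theta)."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def scalar_projection(a, b):
    """Length of the projection of a in the direction of a non-zero vector b."""
    return dot(a, b) / magnitude(b)


a = (3, 4)
b = (2, 0)

theta = angle_between(a, b)
print(dot(a, b))
print(magnitude(a) * magnitude(b) * math.cos(theta))
print(scalar_projection(a, b) * magnitude(b))
```

All three printed values should agree up to floating-point rounding, since they are the three descriptions of the dot product given above.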