
Vectors in Machine Learning

As data scientists, we work with data in many formats, such as text, images, and numerical values. We often use vectors to represent this data in a structured and efficient manner, especially in machine learning applications. In this blog post we will explore what vectors are in the context of machine learning, why they matter, and how they are used.

What is a Vector?

In mathematics, a vector is a mathematical object that has both magnitude and direction. In machine learning, a vector is a mathematical representation of a set of numerical values. Vectors are usually represented as arrays or lists of numbers, and each number in the list represents a specific feature or attribute of the data.

For example, suppose we have a dataset of houses, and we want to predict their prices based on their features such as the number of bedrooms, the size of the house, and the location. We can represent each house as a vector, where each element of the vector represents a specific feature of the house, such as the number of bedrooms, the size of the house, and the location. The resulting vector would look something like this:

house=[3, 1500, 1]

Here, the first element represents the number of bedrooms, the second element represents the size of the house in square feet, and the third element represents the location, encoded as a number (for example, 1 for an urban neighborhood).

Why are Vectors Important in Machine Learning?

Vectors are important in machine learning because they allow us to represent complex data in a structured and efficient manner. In machine learning, we often work with large datasets that contain millions of data points, and representing each data point as a vector lets algorithms process the data with fast, well-optimized linear-algebra operations. Vectors also allow us to perform mathematical operations such as addition, subtraction, and multiplication on the data, which are essential in many machine learning algorithms.

Another reason vectors are important in machine learning is that they allow us to compare and measure the similarity between different data points. For example, suppose we have a dataset of images of cats and dogs, and we want to classify new images as either a cat or a dog. We can represent each image as a vector of pixel values and use a distance metric such as Euclidean distance to measure the similarity between images. A new image can then be assigned the category of the training image whose vector is closest to it.
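As a rough illustration, here is a minimal nearest-neighbour sketch using Euclidean distance; the tiny four-pixel "images" and their labels below are made up purely for demonstration.

import numpy as np

# Hypothetical training data: each row is a flattened 4-pixel image vector.
train_images = np.array([
    [0.9, 0.8, 0.1, 0.2],   # cat
    [0.2, 0.1, 0.9, 0.8],   # dog
])
train_labels = ["cat", "dog"]

# A new image represented as a vector of pixel values.
new_image = np.array([0.85, 0.75, 0.2, 0.1])

# Euclidean distance from the new image to every training image.
distances = np.linalg.norm(train_images - new_image, axis=1)

# The new image takes the label of the closest training vector.
print(train_labels[np.argmin(distances)])   # -> cat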

How are Vectors Used in Machine Learning?

Vectors are used in various machine learning algorithms, including regression, classification, clustering, and dimensionality reduction. In regression, we use vectors to represent the input and output variables, where the input vector represents the features of the data, and the output vector represents the target variable we want to predict. In classification, we use vectors to represent the input data and the class labels. In clustering, we use vectors to group similar data points together. In dimensionality reduction, we use vectors to represent high-dimensional data in a lower-dimensional space, which makes the data easier to visualize and analyze.

Feature Vector
In data science and machine learning, a feature vector is a numerical representation of an object or an instance in a dataset. It is a fundamental concept used to convert raw data into a format suitable for machine learning algorithms.

Each object or instance in a dataset is described by a set of features, which are measurable characteristics or attributes of that object. These features can be quantitative (e.g., numerical values) or qualitative (e.g., categories, labels, or binary values).

A feature vector is essentially a one-dimensional array or list that combines all the features of an object into a single vector. The order of the features in the vector is important, as it corresponds to the specific order in which the features are defined for that object. This vector format is necessary because most machine learning algorithms require the data to be represented in a numerical form.

Here's a more detailed explanation of a feature vector:
Dataset: A dataset contains multiple instances, and each instance represents an object, observation, or data point. For example, in a dataset of houses, each instance could represent a single house.

Features: Each instance in the dataset is described by a set of features. These features are the variables or attributes that characterize the object. For a house, features could include square footage, number of bedrooms, neighborhood, and age of the house.

Feature Vector: To perform machine learning tasks on this data, we create a feature vector for each instance. The feature vector is a representation of the object's features in a single list or array. It contains numerical values that encode the information about the object.

For example, let's say we have a dataset of houses with the following features:

Square footage (numerical)
Number of bedrooms (numerical)
Neighborhood (categorical: e.g., "Suburban," "Urban," "Rural")
Age of the house (numerical)

For a specific house in the dataset, the feature vector might look like this:

House A: Square footage = 1500, Number of bedrooms = 3, Neighborhood = "Urban," Age of the house = 20 years.

The feature vector for House A would be: [1500, 3, "Urban", 20]

Note that categorical features like "Urban" are often encoded using techniques like one-hot encoding or label encoding to represent them as numerical values in the feature vector.
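As a minimal sketch of this idea, the snippet below one-hot encodes the "Neighborhood" feature by hand so that House A becomes a purely numerical vector; real projects would typically use a library encoder (for example, scikit-learn's OneHotEncoder), and the category list here is just the one assumed in this example.

import numpy as np

# Hypothetical categories for the Neighborhood feature.
neighborhoods = ["Suburban", "Urban", "Rural"]

def one_hot(category, categories):
    # Return a vector with a 1 in the position of the given category.
    vec = np.zeros(len(categories))
    vec[categories.index(category)] = 1.0
    return vec

# House A: 1500 sq ft, 3 bedrooms, "Urban", 20 years old.
numeric_part = np.array([1500.0, 3.0, 20.0])
categorical_part = one_hot("Urban", neighborhoods)       # [0., 1., 0.]

# Final, fully numerical feature vector for House A.
feature_vector = np.concatenate([numeric_part, categorical_part])
print(feature_vector)   # [1500.    3.   20.    0.    1.    0.]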

Once the data is represented in the form of feature vectors, machine learning algorithms can be applied to learn patterns, make predictions, or perform other tasks based on the relationships between the features and the target variable. The quality and relevance of the feature vectors play a crucial role in the success of machine learning models and data analysis tasks. Feature engineering, which involves selecting, transforming, or creating relevant features, is an important aspect of machine learning and data science to improve model performance and accuracy.

Vector Operations

Vectors can be manipulated using various operations such as addition, subtraction, scalar multiplication, dot product, and cross product.

Addition and Subtraction: Vectors can be added or subtracted element-wise. For example, if we have two vectors a and b, the sum of the two vectors a + b would be a vector where the i-th element is the sum of the i-th element in a and b.

Example: Let vector A = [3, 5] and vector B = [-1, 2]. 
The sum of A and B is: A + B = [3 + (-1), 5 + 2] = [2, 7].
The difference between A and B is: A - B = [3 - (-1), 5 - 2] = [4, 3].

Scalar Multiplication: Vectors can be multiplied by a scalar value, which multiplies each element of the vector by the scalar value. For example, if we have a vector a and a scalar value c, the product of the vector and scalar c * a would be a vector where the i-th element is the i-th element in a multiplied by c.

Example: Let vector C = [2, 4, 6] and scalar k = 3. 
The scalar multiplication of C by k is: k * C = 3 * [2, 4, 6] = [6, 12, 18].

Dot Product: The dot product of two vectors is defined as the sum of the products of their corresponding elements. The dot product is a scalar value that measures the similarity between the two vectors. For example, if we have two vectors a and b, the dot product a · b is the scalar obtained by multiplying the i-th element of a by the i-th element of b and summing over all elements.

Example: Let vector D = [1, 3, -5] and vector E = [4, -2, -1].
The dot product of D and E is: D · E = (1 * 4) + (3 * -2) + (-5 * -1) = 4 - 6 + 5 = 3.

Cross Product: The cross product of two 3-dimensional vectors is a vector that is perpendicular (orthogonal) to both vectors. The magnitude and direction of the cross product depend on the original vectors. The cross product is used in certain applications of machine learning, such as computer vision and 3D geometry.

To compute the cross product of two vectors A = [A1, A2, A3] and B = [B1, B2, B3], you can use the following formula:

A × B = [ (A2 * B3) - (A3 * B2), (A3 * B1) - (A1 * B3), (A1 * B2) - (A2 * B1) ]

Here's an example to illustrate how to compute the cross product:
Example: Let vector A = [2, -3, 4] and vector B = [5, 1, 2].

The cross product of A and B is: A × B = [(-3 * 2) - (4 * 1), (4 * 5) - (2 * 2), (2 * 1) - ((-3) * 5)] 
= [-10, 16, 17].

Vector Projection: The vector projection of one vector onto another represents the component of the first vector that lies in the direction of the second vector.

Example: Let vector A = [4, 6] and vector B = [2, 1]. 
The projection of A onto B  is: 
proj_B(A) = ((A · B) / ||B||^2) * B, where ||B|| is the magnitude of vector B.
proj_B(A) = ((4 * 2 + 6 * 1) / (2^2 + 1^2)) * [2, 1]
= (14 / 5) * [2, 1]
= [28/5, 14/5] = [5.6, 2.8].

Angle Between Vectors
To find the angle between two vectors, you can use the dot product formula and trigonometric functions. The dot product of two vectors can be used to calculate the angle between them using the following formula:

θ = cos^(-1) [(A · B) / (||A|| * ||B||)]

where:
θ is the angle between the vectors A and B.
A · B is the dot product of vectors A and B.
||A|| and ||B|| are the magnitudes (or lengths) of vectors A and B, respectively.

Let's illustrate this with an example:

Example:
Consider two 2-dimensional vectors:
A = [3, 1]
B = [2, 4]

Step 1: Calculate the dot product of A and B.
A · B = (3 * 2) + (1 * 4) = 6 + 4 = 10

Step 2: Calculate the magnitudes of vectors A and B.
||A|| = √(3^2 + 1^2) = √(9 + 1) = √10 ≈ 3.16
||B|| = √(2^2 + 4^2) = √(4 + 16) = √20 ≈ 4.47

Step 3: Plug the values into the angle formula.
θ = cos^(-1) [(A · B) / (||A|| * ||B||)]
θ = cos^(-1) [10 / (3.16 * 4.47)]
θ = cos^(-1) [10 / 14.14]
θ ≈ cos^(-1) [0.7071]

Step 4: Calculate the angle in degrees.
θ ≈ 45 degrees

So, the angle between vectors A and B is approximately 45 degrees (exactly 45°, since 10 / √(10 · 20) = 1/√2).

Python Code:
import numpy as np
from math import degrees, acos, sqrt

A = np.array([2, -3, 4])
B = np.array([5, 1, 2])
print("Vector A")
print(A)
print("Vector B")
print(B)
print("Vector Addition")
print(A + B)
print("Vector Subtraction")
print(A - B)
print("Scalar Multiplication")
print(2 * A)
print("Dot product")
print(A.dot(B))
print("Cross Product")
print(np.cross(A, B))
print("Magnitude of vector A")
print(np.linalg.norm(A))

# Projection example (the 2-D vectors from the projection section above)
print("Projection of a onto b")
a = np.array([4, 6])
b = np.array([2, 1])
print((np.dot(a, b) / np.dot(b, b)) * b)

# Angle example (the 2-D vectors from the angle section above)
print("Angle between a and b in degrees")
a = np.array([3, 1])
b = np.array([2, 4])
print(degrees(acos(np.dot(a, b) / (sqrt(np.dot(a, a)) * sqrt(np.dot(b, b))))))

Linear Combination of vectors

In linear algebra, a linear combination of vectors refers to the vector obtained by scaling each vector by some scalar factor and then adding them together. This operation is fundamental in vector spaces and plays a crucial role in various mathematical applications, including solving systems of linear equations and understanding vector spaces and subspaces.

Mathematically, the linear combination of vectors is expressed as follows:

Let's say we have 'n' vectors, v1, v2, v3, ..., vn, and 'n' corresponding scalar coefficients, c1, c2, c3, ..., cn. The linear combination of these vectors is given by:

c1 * v1 + c2 * v2 + c3 * v3 + ... + cn * vn

The coefficients c1, c2, c3, ..., cn represent how much each vector is scaled or weighted in the combination.

Example: Consider two 2-dimensional vectors:
v₁ = [2, 4]
v₂ = [-1, 3]
Let's find a linear combination of these vectors using arbitrary coefficients:
c₁ = 3, c₂ = -2
The linear combination of v₁ and v₂ with the given coefficients would be:
Linear combination = 3 * v₁ + (-2) * v₂
To calculate the result, we perform scalar multiplication on each vector and then add them:
3 * v₁ = 3 * [2, 4] = [6, 12]
(-2) * v₂ = (-2) * [-1, 3] = [2, -6]

Now, add the scaled vectors:
[6, 12] + [2, -6] = [8, 6]

So, the linear combination of the vectors v₁ = [2, 4] and v₂ = [-1, 3] with coefficients c₁ = 3 and c₂ = -2 is the vector [8, 6]. This means that if we take three times the vector v₁ and subtract two times the vector v₂, we get the resulting vector [8, 6].
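A quick numerical check of this example, sketched with NumPy (the vectors and coefficients are the ones used above):

import numpy as np

v1 = np.array([2, 4])
v2 = np.array([-1, 3])
c1, c2 = 3, -2

# Linear combination: scale each vector by its coefficient, then add.
result = c1 * v1 + c2 * v2
print(result)   # [8 6]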

Linearly Dependent and Independent Vectors

In linear algebra, a set of vectors is said to be linearly independent if none of the vectors in the set can be expressed as a linear combination of the others. In other words, no vector in the set can be formed by scaling and adding other vectors from the same set.

Conversely, a set of vectors is linearly dependent if at least one vector in the set can be expressed as a linear combination of the others. Equivalently, there exist coefficients, not all zero, such that the corresponding linear combination of the vectors equals the zero vector.

Let's look at examples to illustrate both cases:

Example of Linearly Independent Vectors:
Consider two 2-dimensional vectors:

v₁ = [2, 1]
v₂ = [-3, 5]

To check whether these vectors are linearly independent, we ask whether there exist coefficients c₁ and c₂ (not both zero) such that:

c₁ * v₁ + c₂ * v₂ = [0, 0]

Writing this out component-wise gives 2c₁ - 3c₂ = 0 and c₁ + 5c₂ = 0, whose only solution is c₁ = 0 and c₂ = 0 (equivalently, the determinant 2 * 5 - 1 * (-3) = 13 is non-zero). Trying an arbitrary pair of coefficients, say c₁ = 5 and c₂ = 2, illustrates this:

5 * v₁ + 2 * v₂ = 5 * [2, 1] + 2 * [-3, 5] = [10, 5] + [-6, 10] = [4, 15] ≠ [0, 0]

Since no coefficients other than c₁ = c₂ = 0 make the combination equal to [0, 0], the vectors v₁ and v₂ are linearly independent.

Example of Linearly Dependent Vectors:
Consider two 3-dimensional vectors:

u₁ = [1, 2, 3]
u₂ = [2, 4, 6]

To check if these vectors are linearly dependent, we need to find coefficients c₁ and c₂ (not both zero) such that:

c₁ * u₁ + c₂ * u₂ = [0, 0, 0]

In this case, we can set c₁ = 2 and c₂ = -1. Let's verify:

2 * u₁ + (-1) * u₂ = 2 * [1, 2, 3] + (-1) * [2, 4, 6] = [2, 4, 6] + [-2, -4, -6] = [0, 0, 0]

Since we can find non-zero coefficients that make the sum equal to [0, 0, 0], the vectors u₁ and u₂ are linearly dependent.

In summary, linearly independent vectors cannot be expressed as a linear combination of each other, while linearly dependent vectors can be expressed as a linear combination of at least one vector from the set.
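A practical numerical test is the matrix rank: stack the vectors as rows of a matrix and compare its rank with the number of vectors. A minimal sketch with NumPy, using the vectors from the two examples above:

import numpy as np

# Vectors from the linearly independent example.
V = np.array([[2, 1],
              [-3, 5]])
print(np.linalg.matrix_rank(V))   # 2 -> rank equals the number of vectors: independent

# Vectors from the linearly dependent example (u2 = 2 * u1).
U = np.array([[1, 2, 3],
              [2, 4, 6]])
print(np.linalg.matrix_rank(U))   # 1 -> rank is smaller: dependent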

Orthogonal Vectors
Orthogonal vectors are vectors whose dot product is equal to zero. Geometrically, orthogonal vectors are perpendicular to each other, forming a right angle between them.

Mathematically, two vectors A and B are orthogonal if their dot product (also known as the scalar product) is zero:

A · B = 0

This means that the angle between the two vectors is 90 degrees (or π/2 radians).

Let's look at an example of orthogonal vectors:

Example:
Consider two 2-dimensional vectors:

A = [2, 1]
B = [-1, 2]

To check if these vectors are orthogonal, we need to compute their dot product:

A · B = (2 * -1) + (1 * 2) = -2 + 2 = 0

Since the dot product is zero, the vectors A and B are orthogonal to each other. Visually, if you were to plot these vectors on a Cartesian coordinate system, they would intersect at a right angle (90 degrees).
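A one-line check of this example with NumPy (the vectors are the ones above):

import numpy as np

A = np.array([2, 1])
B = np.array([-1, 2])

# The dot product of orthogonal vectors is zero.
print(np.dot(A, B))   # 0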

Orthonormal vectors
Orthonormal vectors are vectors that are both orthogonal (perpendicular to each other) and normalized (each having a length of 1). In other words, orthonormal vectors are unit vectors that are also mutually perpendicular.

Mathematically, two vectors A and B are orthonormal if:

They are orthogonal: A · B = 0
They are normalized: ||A|| = ||B|| = 1
Here's an example of orthonormal vectors:

Example:
Consider two 3-dimensional vectors:

A = [1/√2, 1/√2, 0]
B = [-1/√2, 1/√2, 0]

Step 1: Verify orthogonality.
A · B = (1/√2 * -1/√2) + (1/√2 * 1/√2) + (0 * 0) = -1/2 + 1/2 + 0 = 0

Since the dot product is zero, vectors A and B are orthogonal.

Step 2: Verify normalization.
||A|| = √((1/√2)^2 + (1/√2)^2 + 0^2) = √(1/2 + 1/2 + 0) = √1 = 1
||B|| = √((-1/√2)^2 + (1/√2)^2 + 0^2) = √(1/2 + 1/2 + 0) = √1 = 1

Since both vectors have a length of 1, they are normalized.

Thus, vectors A and B are orthonormal vectors since they satisfy both the conditions of orthogonality and normalization.

Visually, if you were to plot these vectors in a 3-dimensional Cartesian coordinate system, they would form a right angle (90 degrees) at the origin, and each vector's length would be one unit.
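Both conditions can be verified numerically; a small sketch with NumPy, using the vectors above:

import numpy as np

A = np.array([1/np.sqrt(2), 1/np.sqrt(2), 0])
B = np.array([-1/np.sqrt(2), 1/np.sqrt(2), 0])

# Orthogonality: the dot product should be (numerically) zero.
print(np.dot(A, B))        # ~0.0

# Normalization: each vector should have length 1.
print(np.linalg.norm(A))   # 1.0
print(np.linalg.norm(B))   # 1.0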

Collinear vectors
Collinear vectors are vectors that lie on the same straight line or are parallel to each other. In other words, two or more vectors are collinear if they point in the same or opposite directions and are proportional to each other.

Mathematically, two vectors A and B are collinear if there exists a scalar value (k) such that:

A = k * B

Here's an example of collinear vectors:

Example:
Consider two 2-dimensional vectors:

A = [3, 6]
B = [-1, -2]

To check if these vectors are collinear, we need to find a scalar k such that:

A = k * B

Let's calculate the scalar value (k):

k = A₁ / B₁ = 3 / (-1) = -3

Now, let's verify if A = k * B:

A = -3 * B = -3 * [-1, -2] = [3, 6]

As we can see, vector A and vector B are proportional to each other, with a scalar value of -3. This means they are collinear vectors, and they lie on the same straight line. In this example, vector A is three times the magnitude of vector B and, because the scalar is negative, points in the opposite direction to B.
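A quick numerical check of collinearity for 2-D vectors: the determinant A₁B₂ - A₂B₁ is zero exactly when the vectors are collinear. A sketch with NumPy for the example above:

import numpy as np

A = np.array([3, 6])
B = np.array([-1, -2])

# 2-D collinearity test: the determinant A1*B2 - A2*B1 is zero
# if and only if the vectors are collinear.
det = A[0] * B[1] - A[1] * B[0]
print(det)   # 0 -> collinear

# The scalar relating them: A = k * B.
k = A[0] / B[0]
print(k, np.allclose(A, k * B))   # -3.0 True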

Euclidean space

Euclidean space, also known as Euclidean n-space, is a fundamental concept in mathematics and refers to a geometric space of a specific dimension where points are represented by n-tuples of real numbers. The space is named after the ancient Greek mathematician Euclid, who laid the foundations for geometry in his work "Elements."

In Euclidean space, any point is described using coordinates corresponding to its location in n-dimensional space. For example:

In 1-dimensional Euclidean space (a straight line), points are represented as real numbers on the number line.

In 2-dimensional Euclidean space (a plane), points are represented as ordered pairs (x, y) corresponding to their location in the x-axis and y-axis.

In 3-dimensional Euclidean space (ordinary 3D space), points are represented as ordered triples (x, y, z) corresponding to their location in the x-axis, y-axis, and z-axis.

In general, for n-dimensional Euclidean space (often denoted as Rⁿ), points are represented as ordered n-tuples (x1, x2, ..., xn), where each xᵢ is a real number, and there are n coordinates to specify the point's location in n-dimensional space.

Key characteristics of Euclidean space include:

Distance: Euclidean space uses the Euclidean distance formula to measure the distance between two points. In 2D space, the distance between points (x₁, y₁) and (x₂, y₂) is given by √((x₂ - x₁)² + (y₂ - y₁)²), and in 3D space, the distance between points (x₁, y₁, z₁) and (x₂, y₂, z₂) is given by √((x₂ - x₁)² + (y₂ - y₁)² + (z₂ - z₁)²).

Orthogonality: In 2D and 3D Euclidean space, orthogonal vectors are vectors that are perpendicular to each other, forming right angles.

Geometric Transformations: Euclidean space allows for various geometric transformations, such as translations, rotations, reflections, and scaling.

Euclidean space serves as a fundamental framework for many branches of mathematics, including geometry, linear algebra, calculus, and various areas of physics and engineering. It provides a natural and intuitive way to model and understand spatial relationships and is a foundational concept in mathematics and its applications.
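As a small illustration of the Euclidean distance formula, here is a sketch with NumPy; the two 3-dimensional points are made up for the example:

import numpy as np

# Two hypothetical points in 3-dimensional Euclidean space.
p = np.array([1.0, 2.0, 3.0])
q = np.array([4.0, 6.0, 3.0])

# Euclidean distance: sqrt((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2).
print(np.sqrt(np.sum((q - p) ** 2)))   # 5.0
print(np.linalg.norm(q - p))           # same result via the norm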
