
4.1b Probability Distributions for Continuous Random Variables

For a discrete random variable $X$ the probability that $X$ assumes one of its possible values on a single trial of the experiment makes good sense. This is not the case for a continuous random variable. 

For example, suppose $X$ denotes the length of time a commuter just arriving at a bus stop has to wait for the next bus. If buses run every 30 minutes without fail, then the set of possible values of X is the interval denoted [0,30], the set of all decimal numbers between 0 and 30. But although the number 7.211916 is a possible value of $X$, there is little or no meaning to the concept of the probability that the commuter will wait precisely 7.211916 minutes for the next bus. If anything the probability should be zero, since if we could meaningfully measure the waiting time to the nearest millionth of a minute it is practically inconceivable that we would ever get exactly 7.211916 minutes. More meaningful questions are those of the form: What is the probability that the commuter's waiting time is less than 10 minutes, or is between 5 and 10 minutes? In other words, with continuous random variables one is concerned not with the event that the variable assumes a single particular value, but with the event that the random variable assumes a value in a particular interval.

Definition

The probability distribution of a continuous random variable $X$ is an assignment of probabilities to intervals of decimal numbers using a function $f(x)$, called a density function, in the following way: the probability that $X$ assumes a value in the interval $[a,b]$ is equal to the area of the region that is bounded above by the graph of the equation $y=f(x)$, bounded below by the x-axis, and bounded on the left and right by the vertical lines through $a$ and $b$. The total area under the curve is 1.


Every density function $f(x)$ must satisfy the following two conditions:


  1. For all numbers $x$, $f(x)≥0$, so that the graph of $y=f(x)$ never drops below the x-axis.
  2. The area of the region under the graph of $y=f(x)$ and above the x-axis is 1.

Because the area of a line segment is 0, the definition of the probability distribution of a continuous random variable implies that for any particular decimal number $a$, the probability that $X$ assumes the exact value $a$ is 0. Consequently, whether or not the endpoints of an interval are included makes no difference to the probability of the interval.

For any continuous random variable X:
$P(a≤X≤b)=P(a<X≤b)=P(a≤X<b)=P(a<X<b)$

Example
A random variable $X$ has the uniform distribution on the interval $[0,1]$: the density function is $f(x)=1$ if $x$ is between 0 and 1, and $f(x)=0$ for all other values of $x$.

  1. Find $P(X > 0.75)$, the probability that X assumes a value greater than 0.75.
  2. Find $P(X ≤ 0.2)$, the probability that X assumes a value less than or equal to 0.2.
  3. Find $P(0.4 < X < 0.7)$, the probability that X assumes a value between 0.4 and 0.7.
1. $P(X > 0.75)$ is the area of the rectangle of height 1 and base length $1−0.75=0.25$, hence is $\text{base}×\text{height}=(0.25)(1)=0.25$.

2. $P(X ≤ 0.2)$ is the area of the rectangle of height 1 and base length $0.2−0=0.2$, hence is $\text{base}×\text{height}=(0.2)(1)=0.2$.

3. $P(0.4 < X < 0.7)$ is the area of the rectangle of height 1 and base length $0.7−0.4=0.3$, hence is $\text{base}×\text{height}=(0.3)(1)=0.3$.
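
The rectangle-area reasoning above can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_prob` and its parameters `low` and `high` are assumptions made for this sketch, not part of the original notes.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= X <= b) for X uniform on [low, high], computed as base * height."""
    if high <= low:
        raise ValueError("high must be greater than low")
    height = 1.0 / (high - low)        # constant density on [low, high]
    left = max(a, low)                 # clip the interval to where f(x) > 0
    right = min(b, high)
    base = max(0.0, right - left)      # no overlap means zero area
    return base * height


# The three parts of the example (values are subject to floating-point rounding).
print(round(uniform_prob(0.75, 1.0), 4))   # P(X > 0.75)
print(round(uniform_prob(0.0, 0.2), 4))    # P(X <= 0.2)
print(round(uniform_prob(0.4, 0.7), 4))    # P(0.4 < X < 0.7)
```

Because single points carry zero probability, the same function serves for strict and non-strict inequalities.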

A man arrives at a bus stop at a random time (that is, with no regard for the scheduled service) to catch the next bus. Buses run every 30 minutes without fail, hence the next bus will come any time during the next 30 minutes with evenly distributed probability (a uniform distribution). Find the probability that a bus will come within the next 10 minutes.

The graph of the density function is a horizontal line above the interval from 0 to 30 and lies on the x-axis everywhere else. Since the total area under the curve must be 1, the height of the horizontal line is $1/30$. The probability sought is $P(0≤X≤10)$. By definition, this probability is the area of the rectangular region bounded above by the horizontal line $f(x)=1/30$, bounded below by the x-axis, bounded on the left by the vertical line at 0 (the y-axis), and bounded on the right by the vertical line at 10. This is the shaded region. Its area is the base of the rectangle times its height, $10⋅(1/30)=1/3$.

Thus $P(0≤X≤10)=1/3$.
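
As a sanity check on this result, the waiting time can be simulated. The sketch below assumes the commuter's wait is drawn uniformly from $[0, 30]$ as described; the function name, the number of trials, and the fixed seed are illustrative choices, and the printed estimate is only approximate.

```python
import random


def simulate_wait_probability(threshold=10.0, period=30.0, trials=200_000, seed=0):
    """Estimate P(X <= threshold) for a waiting time X uniform on [0, period]."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.uniform(0.0, period) <= threshold)
    return hits / trials               # relative frequency of short waits


estimate = simulate_wait_probability()
print(f"simulated P(X <= 10) = {estimate:.4f}, exact = {10 / 30:.4f}")
```

With a large number of trials the relative frequency should land close to $1/3$, which is the area computed above.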



Normal Distribution
Most people have heard of the “bell curve.” It is the graph of a specific density function $f(x)$ that describes the behavior of continuous random variables as different as the heights of human beings, the amount of a product in a container that was filled by a high-speed packing machine, or the velocities of molecules in a gas. The formula for $f(x)$ contains two parameters $\mu$ and $\sigma$ that can be assigned any specific numerical values, so long as $\sigma$ is positive. We will not need to know the formula for $f(x)$, but for those who are interested it is

$f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

The probability distribution corresponding to the density function for the bell curve with parameters $\mu$ and $\sigma$ is called the normal distribution with mean $\mu$ and standard deviation $\sigma$.
A continuous random variable whose probabilities are described by the normal distribution with mean $\mu$ and standard deviation $\sigma$ is called a normally distributed random variable, or a normal random variable for short, with mean $\mu$ and standard deviation $\sigma$.

The density curve for the normal distribution is symmetric about the mean.
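
The two density conditions and the symmetry property can be checked numerically for the bell curve. The following is a rough sketch: `normal_pdf` and `area_under` are hypothetical helper names, the values $\mu = 10$ and $\sigma = 2.5$ are chosen only for illustration, and the area is approximated with a simple midpoint sum over $\mu \pm 10\sigma$ rather than computed exactly.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under(f, lo, hi, n=20_000):
    """Approximate the area under f on [lo, hi] with a midpoint Riemann sum."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


mu, sigma = 10.0, 2.5   # illustrative parameters

# Condition 2: total area under the curve should be (very nearly) 1.
total_area = area_under(lambda x: normal_pdf(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)

# Symmetry about the mean: f(mu - d) equals f(mu + d).
symmetric = math.isclose(normal_pdf(mu - 3.0, mu, sigma), normal_pdf(mu + 3.0, mu, sigma))

print(f"area = {total_area:.6f}, symmetric about the mean: {symmetric}")
```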




Standard normal distribution
A standard normal random variable is a normally distributed random variable with mean $\mu = 0$ and standard deviation $\sigma = 1$. It will always be denoted by the letter $Z$.


The cumulative probabilities $P(Z<z)$ can be obtained from a standard normal table.
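
A table lookup can also be reproduced numerically. The sketch below uses the known identity $P(Z<z)=\frac{1}{2}\left(1+\operatorname{erf}(z/\sqrt{2})\right)$ together with `math.erf` from the Python standard library; the function name `standard_normal_cdf` is an assumption made for this illustration.

```python
import math


def standard_normal_cdf(z):
    """Cumulative probability P(Z < z) for a standard normal random variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Print a few entries the way a cumulative table would list them.
for z in (-0.80, 0.0, 1.60):
    print(f"P(Z < {z:5.2f}) = {standard_normal_cdf(z):.4f}")
```

Rounded to four decimal places, these agree with the usual table entries 0.2119, 0.5000, and 0.9452.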

Probability Computations for General Normal Random Variables

If $X$ is a normally distributed random variable with mean $\mu$ and standard deviation $\sigma$, then
$P(a<X<b)=P\left(\frac{a-\mu}{\sigma}<Z<\frac{b-\mu}{\sigma}\right)$
where $Z$ denotes a standard normal random variable. Here $a$ can be any decimal number or $-\infty$; $b$ can be any decimal number or $\infty$.

The new endpoints $\frac{a-\mu}{\sigma}$ and $\frac{b-\mu}{\sigma}$ are the $z$-scores of $a$ and $b$.
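
The standardization step can be sketched as follows. The helper names `z_score`, `phi`, and `normal_interval_prob` are hypothetical, and the parameters $\mu = 100$, $\sigma = 15$ are illustrative values that do not come from the notes. Infinite endpoints are passed as `math.inf` and `-math.inf`.

```python
import math


def z_score(x, mu, sigma):
    """Convert a value x of X into the corresponding value of Z."""
    return (x - mu) / sigma


def phi(z):
    """P(Z < z) for a standard normal Z; also handles z = +inf and z = -inf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interval_prob(a, b, mu, sigma):
    """P(a < X < b) for a normal X, computed through the z-scores of a and b."""
    return phi(z_score(b, mu, sigma)) - phi(z_score(a, mu, sigma))


mu, sigma = 100.0, 15.0   # illustrative parameters only

within_one_sd = normal_interval_prob(85.0, 115.0, mu, sigma)   # P(-1 < Z < 1)
lower_half = normal_interval_prob(-math.inf, mu, mu, sigma)    # P(X < mu)

print(f"P(85 < X < 115) = {within_one_sd:.4f}")
print(f"P(X < 100)      = {lower_half:.4f}")
```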


Example

Let $X$ be a normal random variable with mean $\mu = 10$ and standard deviation $\sigma = 2.5$. Compute the following probabilities.
1.$P(X < 14)$.
2.$P(8<X<14)$.

1.$P(X<14)=P(Z<\frac{14-\mu}{\sigma})$
$\quad =P(Z<\frac{14-10}{2.5})$
$=P(Z<1.60)$
$=0.9452$


2.$P(8<X<14)$
 $= P(\frac{8-10}{2.5} < Z < \frac{14-10}{2.5})$
$=P(-0.80 < Z < 1.60)$
$=0.9452-0.2119$
$=0.7333$
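
Both answers can be cross-checked without a table. This sketch uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

# X is normal with mean 10 and standard deviation 2.5, as in the example.
X = NormalDist(mu=10, sigma=2.5)

p_less_14 = X.cdf(14)                 # part 1: P(X < 14)
p_between = X.cdf(14) - X.cdf(8)      # part 2: P(8 < X < 14)

print(f"P(X < 14)     = {p_less_14:.4f}")
print(f"P(8 < X < 14) = {p_between:.4f}")
```

Rounded to four decimal places, the results match the table-based values 0.9452 and 0.7333 obtained above.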

