\[ r^{th}\ moment = \frac{1}{n} \Sigma(x_i - A)^r \]
*** About Mean (Central Moment)
When $A$ is the mean, the moment is called a central moment.
\[ \mu_r = \frac{1}{n} \Sigma(x_i - Mean)^r \]
*** About Zero (Raw Moment)
When $A = 0$, the moment is called a raw moment.
\[ \mu_r^{'} = \frac{1}{n} \Sigma x_i^r \]
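The definitions above differ only in the choice of $A$. A minimal Python sketch, assuming the observations are held in a plain list (the function name ~moment_about~ and the sample data are illustrative, not part of these notes):

```python
def moment_about(xs, r, a=0.0):
    """r-th moment of the observations xs about the point a."""
    n = len(xs)
    return sum((x - a) ** r for x in xs) / n

xs = [2, 4, 4, 6]                        # hypothetical sample
mean = moment_about(xs, 1)               # first raw moment (A = 0) is the mean
raw_2 = moment_about(xs, 2)              # second raw moment
central_2 = moment_about(xs, 2, a=mean)  # second central moment
```

For this sample the second central moment equals the second raw moment minus the square of the mean, which is a quick way to check the two definitions against each other.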
* Grouped Data
Grouped data is data that is grouped by the frequency with which each value occurs. For example, if 9 appears 5 times in our observations, we record it as $x$ (observation) = 9 and $f$ (frequency) = 5.
#+attr_latex: :align |c|c|
|------------------+---------------|
| x (observations) | f (frequency) |
|------------------+---------------|
| 2 | 5 |
| 1 | 3 |
| 4 | 5 |
| 8 | 9 |
|------------------+---------------|
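As a rough illustration of the grouping step, the sketch below builds the same frequency table from raw observations with Python's ~collections.Counter~. The raw list is an assumption, reconstructed so that it matches the table, and the frequency-weighted mean is included only as one typical use of the grouped form.

```python
from collections import Counter

# Raw observations reconstructed to match the table above (assumed data)
observations = [2] * 5 + [1] * 3 + [4] * 5 + [8] * 9

freq = Counter(observations)  # maps each observation x to its frequency f

n = sum(freq.values())                          # total number of observations
mean = sum(x * f for x, f in freq.items()) / n  # frequency-weighted mean
```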
If the data is stored as class intervals, i.e. the observations are of the form 10-20, 20-30, 30-40, ..., then we get $x_i$ by doing
Now we select one bag at random (i.e., each of the two bags is chosen with probability 0.5). If we draw a marble, what is the probability that it is a green marble?
*Sol.* The green marbles are split between bag 1 and bag 2. \\
Now, we can use the law of total probability to get
\[ P(G) = P(G|B_1)P(B_1) + P(G|B_2)P(B_2) \]
*Example* 2, Suppose there are 3 forests in a park.
+ Forest A occupies 50% of land and 20% plants in it are poisonous
+ Forest B occupies 30% of land and 40% plants in it are poisonous
+ Forest C occupies 20% of land and 70% plants in it are poisonous
What is the probability of a random plant from the park being poisonous?
*Sol.* Assume a plant is equally likely to come from any part of the park, so the probability of a plant being from a forest equals the share of land that forest occupies. Let event A be the plant being from Forest A, event B the plant being from Forest B and event C the plant being from Forest C. If event P is the plant being poisonous, then using the law of total probability,
\[ P(P) = P(P|A)P(A) + P(P|B)P(B) + P(P|C)P(C) \]
And we know P(A) = 0.5, P(B) = 0.3 and P(C) = 0.2. Also P(P|A) = 0.20, P(P|B) = 0.40 and P(P|C) = 0.70
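To finish the arithmetic for this example, here is a small Python sketch of the law of total probability. The helper name ~total_probability~ is hypothetical, and exact fractions are used only to avoid floating point rounding.

```python
from fractions import Fraction

def total_probability(priors, conditionals):
    """Law of total probability: sum of P(E | H_i) * P(H_i) over a partition H_i."""
    assert sum(priors) == 1, "the events must partition the sample space"
    return sum(c * p for c, p in zip(conditionals, priors))

# Values from Example 2, written as exact fractions
land = [Fraction(50, 100), Fraction(30, 100), Fraction(20, 100)]       # P(A), P(B), P(C)
poisonous = [Fraction(20, 100), Fraction(40, 100), Fraction(70, 100)]  # P(P|A), P(P|B), P(P|C)

p_poisonous = total_probability(land, poisonous)
print(float(p_poisonous))  # 0.36
```

So, under the stated values, $P(P) = 0.10 + 0.12 + 0.14 = 0.36$.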
** Some basic identities
+ Probabilities follow the law of inclusion and exclusion
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
+ De Morgan's Theorem
\[ P(\overline{A \cap B }) = P(\overline{A} \cup \overline{B}) \]
\[ P(\overline{A \cup B }) = P(\overline{A} \cap \overline{B}) \]
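These identities can be checked on a small finite sample space. The sketch below assumes one fair die with equally likely outcomes; the events ~A~ and ~B~ are hypothetical choices made only for illustration.

```python
from fractions import Fraction

omega = set(range(1, 7))   # sample space: one fair die
A = {2, 4, 6}              # hypothetical event: even number
B = {4, 5, 6}              # hypothetical event: greater than 3

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def complement(event):
    """Complement of an event within the sample space."""
    return omega - event

# Inclusion and exclusion: both sides should agree
lhs_union = prob(A | B)
rhs_union = prob(A) + prob(B) - prob(A & B)

# De Morgan's theorem, checked on the sets themselves
de_morgan_1 = complement(A & B) == complement(A) | complement(B)
de_morgan_2 = complement(A | B) == complement(A) & complement(B)
```

Because De Morgan's theorem holds for the events themselves, the equality of the probabilities follows immediately.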
It is a mathematical function that gives the probability of occurrence of each possible outcome. We use variables called *random variables* to represent these possible outcomes. They are denoted by capital letters, for example $X$, $Y$, etc. We use these random variables as follows:
\\
Suppose $X$ represents the outcome of flipping two coins.
\[ X = \{HH, HT, TT, TH\} \]
We can represent it as,
\[ X = \{0, 1, 2, 3\} \]
Now we can write a probability function $P(X=x)$ for flipping two coins as:
#+attr_latex: :align |c|c|
|-----+----------|
| $x$ | $P(X=x)$ |
|-----+----------|
| 0 | 0.25 |
| 1 | 0.25 |
| 2 | 0.25 |
| 3 | 0.25 |
|-----+----------|
Another example is throwing two dice and our random variable $X$ is sum of those two dice.
#+attr_latex: :align |c|c|
|-----+----------------|
| $x$ | $P(X=x)$ |
|-----+----------------|
| 2 | $1/36$ |
| 3 | $2/36$ |
| 4 | $3/36$ |
| 5 | $4/36$ |
| 6 | $5/36$ |
| 7 | $6/36$ |
| 8 | $5/36$ |
| 9 | $4/36$ |
| 10 | $3/36$ |
| 11 | $2/36$ |
| 12 | $1/36$ |
|-----+----------------|
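The table above can be reproduced by enumerating all 36 equally likely outcomes of the two dice. A minimal sketch, assuming fair six-sided dice (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) outcomes give each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability function P(X = x) as exact fractions, keyed by the sum x
pmf = {x: Fraction(c, 36) for x, c in sorted(counts.items())}
```

Note that ~Fraction~ reduces its values, so for instance $6/36$ is stored as $1/6$.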
** Types of probability functions (Continuous and Discrete random variables)
Based on the range of the random variable, the probability function has two different names.
+ For discrete random variables it is called a Probability Mass Function.
+ For continuous random variables it is called a Probability Density Function.
* Probability Mass Function
If we can get a function such that,
\[ f(x) = P(X=x) \]
then $f(x)$ is called a *Probability Mass Function* (PMF).
** Properties of Probability Mass Function
Suppose a PMF
\[ f(x) = P(X=x) \]
Then,
*** For discrete variables
\[ \Sigma f(x) = 1 \]
\[ E(X^n) = \Sigma x^n f(x) \]
For $E(X)$, the summation is over all possible values of x.
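Both properties are easy to check numerically. The sketch below assumes the PMF of a single fair die, $f(x) = 1/6$ for $x = 1, \ldots, 6$; the helper name ~expectation~ is hypothetical.

```python
from fractions import Fraction

# PMF of a single fair die: f(x) = 1/6 for x = 1, ..., 6
f = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, n=1):
    """E(X^n) = sum of x^n * f(x) over all possible values x."""
    return sum(x ** n * p for x, p in pmf.items())

total = sum(f.values())      # should equal 1
e_x = expectation(f)         # E(X)
e_x2 = expectation(f, n=2)   # E(X^2)
```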
A binomial distribution is used to calculate the probability of a given number of successes when a trial with a known probability of success is repeated *n* times independently.
A binomial distribution deals with discrete random variables.
\[ X = \{ 0, 1, 2, \ldots, n \} \]
where *n* is the number of trials.
\[ P(X=x) = \ ^nC_x\ (p)^x(q)^{n-x} \]
Here
\[ n \rightarrow number\ of\ trials \]
\[ x \rightarrow number\ of\ successes \]
\[ p \rightarrow probability\ of\ success \]
\[ q \rightarrow probability\ of\ failure \]
\[ p = 1 - q \]
+ Mean
\[ Mean = np \]
+ Variance
\[ Variance = npq \]
+ Moment Generating Function
\[ M(t) = (q + pe^t)^n \]
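The formulas for $P(X=x)$, the mean and the variance can be cross-checked numerically. This is a minimal sketch using only the standard library; the parameters $n = 10$ and $p = 0.3$ are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 10, 0.3   # hypothetical parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the PMF
mean = sum(x * fx for x, fx in enumerate(pmf))
variance = sum((x - mean) ** 2 * fx for x, fx in enumerate(pmf))
```

Up to floating point error, ~mean~ should match $np$ and ~variance~ should match $npq$.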
** Additive Property of Binomial Distribution
For a random variable $X$ that follows a binomial distribution, we write
\[ X \sim B(n,p) \]
Here,
\[ n \rightarrow number\ of\ trials \]
\[ p \rightarrow probability\ of\ success \]
+ Property
If $X_1$ and $X_2$ are independent and,
\[ X_1 \sim B(n_1, p) \]
\[ X_2 \sim B(n_2, p) \]
Then,
\[ X_1 + X_2 \sim B(n_1 + n_2, p) \]
+ *NOTE*
If
\[ X_1 \sim B(n_1, p_1) \]
\[ X_2 \sim B(n_2, p_2) \]
Then $X_1 + X_2$ does not, in general, follow a binomial distribution when $p_1 \neq p_2$.
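Both statements can be checked on small cases by computing the distribution of the sum directly. The sketch below assumes $X_1$ and $X_2$ are independent, so the PMF of the sum is the discrete convolution of the two PMFs; all parameter values and helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P(X = x) for x = 0, ..., n when X ~ B(n, p)."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

def pmf_of_sum(f1, f2):
    """PMF of X1 + X2 for independent X1, X2 (discrete convolution)."""
    out = [0] * (len(f1) + len(f2) - 1)
    for i, a in enumerate(f1):
        for j, b in enumerate(f2):
            out[i + j] += a * b
    return out

# Same p: B(2, p) + B(3, p) should match B(5, p)
p = Fraction(1, 3)
same_p = pmf_of_sum(binomial_pmf(2, p), binomial_pmf(3, p)) == binomial_pmf(5, p)

# Different p: B(1, 1/2) + B(1, 1/4) takes values 0, 1, 2, so the only
# binomial candidate is B(2, p) with the same mean, i.e. p = 3/8
mixed = pmf_of_sum(binomial_pmf(1, Fraction(1, 2)), binomial_pmf(1, Fraction(1, 4)))
different_p = mixed == binomial_pmf(2, Fraction(3, 8))
```

Here ~same_p~ comes out true and ~different_p~ comes out false, in line with the property and the note.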
** Using a binomial distribution
We can use the binomial distribution to easily calculate probabilities over multiple trials if the probability for one trial is known. For example, the probability of a duplet (both dice show the same number) when two dice are thrown is $\frac{6}{36}$. \\
Suppose now we want to know the probability of getting 3 duplets if a pair of dice is thrown 5 times. So in this case:
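As a worked sketch, assume each throw of the pair is one trial, a duplet counts as a success, and we want exactly 3 duplets. Then $n = 5$, $x = 3$, $p = \frac{6}{36} = \frac{1}{6}$ and $q = \frac{5}{6}$, and the binomial formula for $P(X=x)$ gives
\[ P(X=3) = \ ^5C_3 \left(\frac{1}{6}\right)^3 \left(\frac{5}{6}\right)^2 = 10 \cdot \frac{1}{216} \cdot \frac{25}{36} = \frac{250}{7776} \approx 0.0322 \]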
The normal distribution with Mean 0 and Variance 1 is called the standard normal distribution.
\[ Z \sim N(0,1) \]
To calculate the area under a given normal distribution, we can use the standard normal distribution. To do so, we convert values from our given distribution to the corresponding values in the standard normal distribution using the formula
\[ For\ X \sim N(\mu, \sigma) \]
\[ z = \frac{x - \mu}{\sigma} \]
\[ x \rightarrow value\ in\ our\ normal\ distribution \]
\[ \mu \rightarrow mean\ of\ our\ distribution \]
\[ \sigma \rightarrow standard\ deviation\ of\ our\ distribution \]
\[ z \rightarrow corresponding\ value\ in\ standard\ normal\ distribution \]
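The conversion can be sketched in Python using only the standard library. The cumulative probability of the standard normal distribution is computed here from ~math.erf~ instead of a printed table, which is a substitution made for this sketch; the function names and the example distribution (mean 100, standard deviation 15) are hypothetical.

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Map a value x of X ~ N(mu, sigma) to the standard normal scale."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    """P(a < X < b), computed on the standard normal scale."""
    return std_normal_cdf(z_score(b, mu, sigma)) - std_normal_cdf(z_score(a, mu, sigma))

# Hypothetical distribution: mean 100, standard deviation 15
p = prob_between(85, 115, mu=100, sigma=15)  # within one standard deviation, about 0.68
```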
Example,
Suppose $X \sim N(\mu, \sigma)$ and we want to calculate the probability $P(a < X < b)$. Then the range for the same probability in the standard normal distribution $Z$ will be,