@@ -462,6 +462,43 @@ The conditional probability is the probability that some event(s) occur given th
**Example:** The probability that a card is a four given that we have drawn a red card is P(4|red) = 2/26 = 1/13. There are 52 cards in the pack: 26 red and 26 black. Because we already know the card is red, there are only 26 cards to choose from, which is why the denominator is 26. Two of those red cards are fours (the four of hearts and the four of diamonds), which gives the numerator of 2.
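The card example above can be checked by brute-force enumeration. The snippet below is a minimal sketch, not part of the original notes: the deck representation and the variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs (illustrative representation).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Conditioning on "red" shrinks the sample space to the 26 red cards.
red_cards = [card for card in deck if card[1] in ("hearts", "diamonds")]

# Within that reduced sample space, count the fours.
red_fours = [card for card in red_cards if card[0] == "4"]

p_four_given_red = Fraction(len(red_fours), len(red_cards))
print(p_four_given_red)  # 1/13
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.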
#### Central Limit Theorem (CLT)
The central limit theorem (CLT) is simple to state: with a large enough sample size, sample means are approximately normally distributed, largely regardless of the shape of the population they are drawn from.

The CLT is at the heart of hypothesis testing, a critical component of the data science lifecycle.

#### Formally Defining the Central Limit Theorem
Given a population with an unknown distribution (it could be uniform, binomial, or something else entirely) that has a finite mean and variance, the distribution of the means of sufficiently large random samples drawn from it will approximate a normal distribution.
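A small simulation can make this statement concrete. The sketch below is an illustration under stated assumptions, not a proof: it assumes an exponential population (clearly non-normal), a sample size of 50, and 5000 repeated samples, all chosen arbitrarily. It then checks how closely the sample means follow the familiar normal rule of thumb (roughly 68% of values within one standard deviation and roughly 95% within two).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Mean of n independent draws from an exponential population (rate 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


n = 50              # size of each sample (assumed for illustration)
num_samples = 5000  # number of repeated samples (assumed for illustration)
means = [sample_mean(n) for _ in range(num_samples)]

center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, about 68% of values lie within one standard
# deviation of the mean and about 95% lie within two.
within_1 = sum(abs(m - center) <= spread for m in means) / num_samples
within_2 = sum(abs(m - center) <= 2 * spread for m in means) / num_samples

print(f"within 1 sd: {within_1:.3f}, within 2 sd: {within_2:.3f}")
```

The individual exponential draws are strongly right-skewed, yet the fractions printed for the sample means should land close to the normal benchmarks.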
472
+
473
+
#### Assumptions Behind the Central Limit Theorem
474
+
Before we dive into the implementation of the central limit theorem, it’s important to understand the assumptions behind this technique:
475
+
476
+
* The data must follow the **randomization condition**. It must be sampled randomly
477
+
* Samples should be **independent of each other**. One sample should not influence the other samples
478
+
***Sample size** should be not more than 10% of the population when sampling is done without replacement
479
+
* The sample size should be sufficiently large. Now, how we will figure out how large this size should be? Well, it depends on the population. When the population is skewed or asymmetric, the sample size should be large. If the population is symmetric, then we can draw small samples as well
480
+
In general, a **sample size of 30 is considered sufficient when the population is symmetric**.
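To see why skewed populations call for larger samples, the sketch below measures how skewed the distribution of sample means still is at different sample sizes. Everything here is an assumption made for illustration: the exponential population, the sample sizes 2, 30 and 500, the 4000 repetitions, and the helper names `skewness` and `sample_means`.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for symmetric data)."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mu) / sd) ** 3 for v in values)


def sample_means(n, num_samples=4000):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


# Skewness of the sample means for a small, a moderate and a large sample size.
skew_by_n = {n: skewness(sample_means(n)) for n in (2, 30, 500)}

for n, skew in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {skew:.2f}")
```

The skewness should shrink toward zero as `n` grows, which is the sense in which a larger sample compensates for an asymmetric population.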

The mean of the sample means equals the population mean:

$$\mu_{\bar{X}} = \mu$$

where $\mu_{\bar{X}}$ is the mean of the sample means and $\mu$ is the population mean.

The standard deviation of the sample mean (also known as the standard error) is:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

where $\sigma_{\bar{X}}$ is the standard deviation of the sample mean, $\sigma$ is the population standard deviation, and $n$ is the sample size.
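Both formulas can be checked numerically. The following is a minimal sketch under assumed settings: a uniform population on [0, 10] (so the population mean is 5 and the population standard deviation is 10/sqrt(12)), a sample size of 36, and 10000 repeated samples. None of these values come from the original text.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Assumed population: uniform on [0, 10], so mu = 5 and sigma = 10 / sqrt(12).
mu = 5.0
sigma = 10 / math.sqrt(12)

n = 36               # sample size (assumed for illustration)
num_samples = 10000  # number of repeated samples (assumed for illustration)

# Draw many samples of size n and record the mean of each one.
means = [
    statistics.fmean(random.uniform(0, 10) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(means)  # should be close to mu
sd_of_means = statistics.stdev(means)    # should be close to sigma / sqrt(n)
predicted_sd = sigma / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma / sqrt(n) = {predicted_sd:.3f})")
```

Increasing `n` in this sketch should shrink the spread of the sample means in proportion to 1/sqrt(n), while their mean stays near the population mean.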

# 9. Bayes' Theorem (aka Bayes' Rule)
Before understanding Bayes' Theorem, we first need to learn about **Conditional Probability**: