Commit 6c724e5

Update README.md
1 parent 2473380 commit 6c724e5

1 file changed

+54
-1
lines changed

‎README.md‎

Lines changed: 54 additions & 1 deletion
@@ -680,7 +680,46 @@ A test of a statistical hypothesis, where the region of rejection is on both sid
680680
**For example**, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a range of numbers located on both sides of the sampling distribution; that is, the region of rejection would consist partly of numbers that were less than 10 and partly of numbers that were greater than 10.
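
The two-sided rejection region described above can be sketched in Python. This is a minimal illustration, assuming a known population standard deviation (`sigma = 2`), a sample size of 100, and a significance level of 0.05; none of these values come from the text, and the helper names are hypothetical.

```python
from statistics import NormalDist


def rejection_bounds(null_mean, sigma, n, alpha=0.05):
    """Return the (lower, upper) cut-offs of a two-tailed rejection region.

    Sample means below `lower` or above `upper` fall in the rejection region.
    """
    standard_error = sigma / n ** 0.5
    # Split alpha evenly between the two tails of the sampling distribution.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_critical * standard_error
    return null_mean - margin, null_mean + margin


def in_rejection_region(sample_mean, lower, upper):
    """True when the sample mean lies on either side of the acceptance range."""
    return sample_mean < lower or sample_mean > upper


# Assumed values: H0 mean = 10, sigma = 2, n = 100.
lower, upper = rejection_bounds(null_mean=10, sigma=2, n=100)
print(round(lower, 2), round(upper, 2))  # roughly 9.61 and 10.39
print(in_rejection_region(10.5, lower, upper))  # True: above the upper cut-off
print(in_rejection_region(10.1, lower, upper))  # False: inside the acceptance range
```

With these assumed inputs, part of the rejection region lies below 10 and part lies above it, which is what makes the test two-tailed.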
681681

682682

683-
# 12. Testing the Data
683+
# 12. Statistical Testing
684+
685+
Statistical tests are used to decide whether a hypothesis about the distribution of one or more populations should be accepted or rejected.
686+
687+
There are two types of statistical tests:
688+
#### (1) Parametric Tests
689+
#### (2) Non-Parametric Tests
690+
691+
#### Why Use Statistical Testing?
692+
* To test the difference between a sample mean and the population mean
693+
* To test the difference between two sample means
694+
* To test the significance of association between two variables
695+
* To compare several population means
696+
* To test the difference in proportions between two independent populations
697+
* To test the difference in proportion between a sample and the population
698+
699+
#### What are parameters?
700+
* Parameters are numbers that summarize the data for the entire population, while statistics are numbers that summarize the data from a sample
701+
* Parametric testing is used for quantitative data and continuous variables
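
The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population is synthetic (drawn from a normal distribution with an assumed mean of 50 and standard deviation of 10), and the sample size of 100 is an arbitrary choice.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic "population": in practice the full population is rarely observed.
population = [random.gauss(50, 10) for _ in range(10_000)]

# A simple random sample drawn without replacement from that population.
sample = random.sample(population, 100)

population_mean = mean(population)  # a parameter: summarizes the whole population
sample_mean = mean(sample)          # a statistic: summarizes only the sample

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean is used as an estimate of the population mean; the two values are usually close but not identical.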
702+
703+
#### (1) Parametric Tests: A parametric test makes assumptions about the population parameters and the distribution of the data
704+
##### (a) Z Testing
705+
##### (b) Student's T-Testing
706+
##### (c) P Testing
707+
##### (d) ANOVA Testing
708+
709+
#### (a) Z Testing:
710+
The Z Test is used to test whether the difference between two point estimates is statistically significant
711+
##### Assumptions for Z Test
712+
* The sample must be randomly selected and data must be quantitative
713+
* The sample size should be large (a common rule of thumb is at least 30 observations)
714+
* Data should follow a normal distribution
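
A two-sample Z Test on the difference between two means might look like the sketch below. It assumes both samples are large enough for the sample standard deviations to stand in for the population values, and that the test is two-tailed; the function name and the example numbers are hypothetical.

```python
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2, alpha=0.05):
    """Two-tailed Z Test for the difference between two sample means.

    Returns (z, p_value, reject), where `reject` is True when the
    difference is significant at level `alpha`.
    """
    # Standard error of the difference between two independent means.
    standard_error = (sd1 ** 2 / n1 + sd2 ** 2 / n2) ** 0.5
    z = (mean1 - mean2) / standard_error
    # Probability of a difference at least this extreme in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Assumed summary statistics for two independent samples of 100 each.
z, p_value, reject = two_sample_z_test(
    mean1=52, sd1=10, n1=100,
    mean2=50, sd2=10, n2=100,
)
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject}")
```

With these assumed inputs the z-score is about 1.41 and the P-value is about 0.157, so the difference would not be judged significant at the 0.05 level.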
715+
716+
#### (2) Non-Parametric Tests:
717+
718+
### A/B Testing:
719+
720+
721+
722+
684723

685724
##### Problem 1: Two-Tailed Test
686725

@@ -748,6 +787,20 @@ Since we have a one-tailed test, the P-value is the probability that the z-score
748787
Interpret results. Since the P-value (0.04) is less than the significance level (0.05), we reject the null hypothesis.
749788
Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the sample included at least 10 successes and 10 failures, and the population size was at least 10 times the sample size.
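
The conditions listed in the note, together with a one-tailed test on a proportion, can be sketched as follows. The inputs (73 successes out of 100, a hypothesized proportion of 0.80, a population of 1,000,000) are hypothetical values chosen so that the P-value comes out near 0.04; the helper names are illustrative only, and a lower-tailed alternative is assumed.

```python
from statistics import NormalDist


def conditions_met(successes, n, population_size):
    """Check the rule-of-thumb conditions for a one-sample proportion z-test."""
    failures = n - successes
    return (
        successes >= 10                  # at least 10 successes
        and failures >= 10               # at least 10 failures
        and population_size >= 10 * n    # population at least 10x the sample
    )


def lower_tail_proportion_test(successes, n, p0, alpha=0.05):
    """One-tailed z-test of H0: p >= p0 against Ha: p < p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized proportion p0.
    standard_error = (p0 * (1 - p0) / n) ** 0.5
    z = (p_hat - p0) / standard_error
    # Lower tail: probability of a z-score at or below the observed one.
    p_value = NormalDist().cdf(z)
    return z, p_value, p_value < alpha


ok = conditions_met(successes=73, n=100, population_size=1_000_000)
z, p_value, reject = lower_tail_proportion_test(successes=73, n=100, p0=0.80)
print(f"conditions met: {ok}")
print(f"z = {z:.2f}, p = {p_value:.2f}, reject H0: {reject}")
```

Because the P-value (about 0.04) is below 0.05, this sketch reports that the null hypothesis is rejected.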
750789

790+
751791
# 13. Data Clustering
752792

793+
#### Introduction to Data Clustering
794+
A cluster is a group of objects that belong to the same class. In other words, similar objects are grouped in one cluster and dissimilar objects are grouped in other clusters.
795+
796+
#### What is Clustering?
797+
798+
Clustering is the process of grouping a set of abstract objects into classes of similar objects.
799+
800+
#### Points to Remember
801+
* A cluster of data objects can be treated as one group.
802+
* While doing cluster analysis, we first partition the set of data into groups based on data similarity and then assign the labels to the groups.
803+
* The main advantage of clustering over classification is that it is adaptable to changes and helps single out useful features that distinguish different groups.
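
The "partition first, then label" idea can be sketched with k-means, one common clustering technique. The text does not name a specific algorithm, so k-means is an assumption here, as are the toy 2-D points, `k = 2`, and the helper names.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Partition `points` into k groups; return (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Step 1: partition the data by similarity (nearest centroid).
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    # Finally, assign a label (the cluster index) to every point.
    labels = [nearest_centroid(p, centroids) for p in points]
    return labels, centroids


# Toy data: two visibly separate groups of 2-D points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The first three points end up with one label and the last three with the other; which group receives label 0 depends on the random starting centroids.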
804+
805+
753806
# 14. Regression Modelling
