🔷 Hypergeometric Distribution
✅ Definition
The Hypergeometric Distribution is a discrete probability distribution. It gives the probability of obtaining exactly \( k \) target items when drawing \( n \) samples without replacement from a finite population of size \( N \) that contains \( K \) such target items.
The notation is as follows; \( N \), \( K \), and \( n \) are the parameters, and \( X \) is the random variable:
- \( N \): Population size
- \( K \): Number of target (or "success") items in the population
- \( n \): Sample size
- \( X \): Number of target items in the sample (random variable)
The probability mass function (PMF) of the Hypergeometric Distribution is:
\[ P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \]
This formula is the fraction of all possible samples of size \( n \) that contain exactly \( k \) target items and \( n-k \) non-target items. It is nonzero only for \( \max(0,\, n-(N-K)) \le k \le \min(n, K) \).
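As a minimal sketch, the PMF can be evaluated directly with `math.comb` from Python's standard library. The function name `hypergeom_pmf` and its argument order are illustrative choices, not something defined in this post.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): exactly k target items in n draws without replacement.

    N = population size, K = target items in the population, n = sample size.
    (Illustrative helper; the name and argument order are assumptions.)
    """
    # Outside the feasible range the event is impossible.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Sanity check: the probabilities over all possible k should add up to 1
# (up to floating-point rounding).
total = sum(hypergeom_pmf(k, N=10, K=6, n=3) for k in range(0, 4))
print(total)
```

The range check comes first because `math.comb` raises an error for negative arguments, so impossible values of \( k \) are mapped to probability 0 before any coefficient is computed.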
✅ Intuitive Explanation
This distribution models situations where selected items are not replaced after being drawn. Because the population shrinks with every draw, each draw changes the probabilities for the next one, so the draws are not independent.
For example, imagine a box full of colored balls, and you want to know how many red ones you’ll get if you draw a few. If you do not put the ball back after each draw, the chance of drawing another red ball changes each time.
The Hypergeometric Distribution gives the probability of each possible number of target items (like red balls) in a sample like this. In contrast, the Binomial Distribution assumes independent trials with a constant success probability, which corresponds to drawing with replacement. That is not the case here.
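The changing chance from one draw to the next can be made concrete with a small sketch. The box contents below (10 balls, 4 of them red) are hypothetical numbers chosen for illustration; the post does not specify them.

```python
from fractions import Fraction

def p_red_next(red_left: int, total_left: int) -> Fraction:
    """Chance that the next ball drawn is red, given what is left in the box."""
    return Fraction(red_left, total_left)

total, red = 10, 4  # hypothetical box: 10 balls, 4 of them red

first = p_red_next(red, total)                   # before any draw
second_after_red = p_red_next(red - 1, total - 1)  # a red ball was removed
second_after_other = p_red_next(red, total - 1)    # a non-red ball was removed

print(first, second_after_red, second_after_other)  # 2/5 1/3 4/9
```

With replacement, all three values would stay at 2/5; without replacement, the second draw depends on what the first one removed.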
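This model also assumes a finite population. When the population is very large relative to the sample, removing a few items barely changes the proportions, so the draws are nearly independent and the Binomial Distribution becomes a good approximation. The Hypergeometric Distribution is the appropriate model when both conditions hold: a finite population and sampling without replacement.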
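The role of population size can be checked numerically. The sketch below keeps the share of target items at 10% and compares the Hypergeometric probability with the Binomial one as \( N \) grows. The population sizes and the helper names are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Draws without replacement from a finite population.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Independent draws with constant success probability p.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k, p = 5, 2, 0.10
target = binom_pmf(k, n, p)

gaps = []
for N in (100, 1_000, 100_000):  # hypothetical population sizes
    K = N // 10                  # keep 10% target items
    gap = abs(hypergeom_pmf(k, N, K, n) - target)
    gaps.append(gap)
    print(N, gap)
```

The gap shrinks as \( N \) grows, which is why the distinction matters most when the sample is a noticeable fraction of the population.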
✅ Real-World Examples
✅ Scenario Descriptions
1. Candy Sharing
A bag contains 20 candies: 5 are strawberry flavor and the rest are lemon.
Suppose a friend randomly picks 4 candies from the bag.
The number of strawberry candies among those 4 can vary every time.
The key point is that once a candy is taken, it doesn’t go back. So each later pick is made from fewer candies, and the chance of picking a strawberry candy changes.
2. Pencils in a Case
A student has 10 pencils: 6 blue and 4 black.
If the student randomly picks 3 pencils without looking,
the number of blue pencils picked will not always be the same.
Once one pencil is picked, both the total number and the number of blue pencils decrease. So the probability of drawing a blue pencil next changes each time. The Hypergeometric Distribution fits this case well.
3. Defective Products Inspection
A factory produces 100 products, 10 of which are defective.
If 5 products are randomly selected for inspection,
how many defective items appear is uncertain.
That’s when the Hypergeometric Distribution becomes useful.
Since selected items are not returned, each draw influences the next. This violates independence and requires a model like the Hypergeometric Distribution.
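As a rough check of the inspection scenario, a short Monte Carlo sketch can draw 5 products without replacement many times and count how often exactly 2 are defective. The number of trials, the seed, and the choice of "exactly 2" are assumptions made for illustration.

```python
import random
from math import comb

random.seed(0)  # fixed seed so the run is repeatable

# 100 products: 1 marks a defective item, 0 a good one.
population = [1] * 10 + [0] * 90

trials = 20_000  # hypothetical number of repetitions
hits = 0
for _ in range(trials):
    sample = random.sample(population, 5)  # 5 draws without replacement
    if sum(sample) == 2:
        hits += 1

estimate = hits / trials
# Exact value from the Hypergeometric PMF, for comparison.
exact = comb(10, 2) * comb(90, 3) / comb(100, 5)
print(estimate, exact)
```

`random.sample` never returns the same list position twice, so it mirrors the "not returned" condition; the simulated frequency should land close to the exact probability.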
✅ Formula Explanation
✅ Meaning of Combination Symbols
The Hypergeometric PMF is given by:
\[ P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \]
Here, the binomial coefficient \(\binom{a}{b}\) represents the number of ways to choose \( b \) items from \( a \) items without regard to order. This is called a combination and is defined as:
\[ \binom{a}{b} = \frac{a!}{b!(a-b)!} \]
For example, suppose we have two 'a's and one 'b' and want to list all the distinct 3-letter arrangements:
- aab
- aba
- baa
These represent different positions for the two 'a's. The number of ways to choose two positions for 'a' from three is \(\binom{3}{2} = 3\), exactly matching our cases.
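The same count can be reproduced by brute force. This sketch enumerates every ordering of the letters, removes duplicates, and compares the result with \(\binom{3}{2}\).

```python
from itertools import permutations
from math import comb

# All orderings of the letters, de-duplicated because the two 'a's are identical.
arrangements = sorted({"".join(p) for p in permutations("aab")})

print(arrangements)  # ['aab', 'aba', 'baa']
print(comb(3, 2))    # 3
```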
In the Hypergeometric formula:
- \(\binom{K}{k}\): ways to choose \( k \) target items from \( K \)
- \(\binom{N-K}{n-k}\): ways to choose \( n-k \) non-target items from the rest
- \(\binom{N}{n}\): total ways to draw \( n \) items from \( N \)
✅ Numerical Examples
1. Pencil Example (\( N = 10 \), \( K = 6 \), \( n = 3 \)): probability of picking exactly 2 blue pencils
\[ P(X = 2) = \frac{\binom{6}{2} \binom{4}{1}}{\binom{10}{3}} = \frac{15 \cdot 4}{120} = \frac{60}{120} = 0.5 \]
2. Candy Example (\( N = 20 \), \( K = 5 \), \( n = 4 \)): probability of picking exactly 1 strawberry candy
\[ P(X = 1) = \frac{\binom{5}{1} \binom{15}{3}}{\binom{20}{4}} = \frac{5 \cdot 455}{4845} = \frac{2275}{4845} \approx 0.4696 \]
3. Defective Products Example (\( N = 100 \), \( K = 10 \), \( n = 5 \)): probability of finding exactly 2 defective products
\[ P(X = 2) = \frac{\binom{10}{2} \binom{90}{3}}{\binom{100}{5}} = \frac{45 \cdot 117480}{75287520} = \frac{5286600}{75287520} \approx 0.0702 \]
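The three results can be double-checked with a few lines of Python. The helper `hypergeom_pmf` is an illustrative name, redefined here so the snippet runs on its own.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Illustrative helper: ways to pick k targets and n-k non-targets,
    # divided by all ways to pick n items from N.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

examples = {
    "pencils": hypergeom_pmf(2, N=10, K=6, n=3),
    "candies": hypergeom_pmf(1, N=20, K=5, n=4),
    "defects": hypergeom_pmf(2, N=100, K=10, n=5),
}

for name, p in examples.items():
    print(f"{name}: {p:.4f}")
# pencils: 0.5000
# candies: 0.4696
# defects: 0.0702
```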
Thus, the Hypergeometric Distribution gives the exact probability of how many of a specific item will appear in a sample whenever items are drawn without replacement from a finite population, so that earlier selections affect later ones.