Hypergeometric Distribution

🔷 Hypergeometric Distribution

✅ Definition

The Hypergeometric Distribution is a discrete probability distribution that gives the probability of obtaining exactly \( k \) target items when drawing \( n \) samples without replacement from a finite population of size \( N \) that contains \( K \) target items.


The distribution uses the following notation:

  • \( N \): Population size
  • \( K \): Number of target (or "success") items in the population
  • \( n \): Sample size
  • \( X \): Number of target items in the sample (random variable)

The probability mass function (PMF) of the Hypergeometric Distribution is:

\[ P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \]

This formula is the fraction of all equally likely \( n \)-item samples that contain exactly \( k \) target items and \( n-k \) non-target items. It is nonzero only when \( \max(0, n-(N-K)) \le k \le \min(n, K) \).
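As a minimal sketch, the PMF can be written directly with Python's `math.comb`. The function name `hypergeom_pmf` and the values \( N = 10 \), \( K = 6 \), \( n = 3 \) are chosen here only for illustration.

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for n draws without replacement from N items, K of them targets."""
    # Outside the feasible range of k the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Illustrative parameters (assumed for this sketch).
N, K, n = 10, 6, 3
probs = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]
print(probs)       # one probability per possible value of k
print(sum(probs))  # the probabilities over all k should add up to 1
```

Summing over every possible \( k \) is a quick sanity check that the function really describes a probability distribution.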




✅ Intuitive Explanation

This distribution models situations where drawn items are not put back. Each draw changes the makeup of the remaining population, so the probabilities for the next draw depend on what was drawn before.


For example, imagine a box full of colored balls, and you want to know how many red ones you’ll get if you draw a few. If you do not put the ball back after each draw, the chance of drawing another red ball changes each time.
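To make the changing chances concrete, here is a small sketch with exact fractions. The box contents (3 red balls out of 10) are hypothetical numbers picked for this illustration only.

```python
from fractions import Fraction

# Hypothetical box: 3 red balls and 7 balls of other colors.
red, total = 3, 10

# Chance that the first ball drawn is red.
p_first_red = Fraction(red, total)
# If the first ball was red, 2 red balls remain among 9.
p_second_red_given_red = Fraction(red - 1, total - 1)
# If the first ball was not red, 3 red balls remain among 9.
p_second_red_given_other = Fraction(red, total - 1)

print(p_first_red, p_second_red_given_red, p_second_red_given_other)
# → 3/10 2/9 1/3
```

The second-draw probability depends on the first result, which is exactly the dependence that drawing without replacement introduces.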


The Hypergeometric Distribution gives the probability of each possible count of target items (like red balls) in such a sample. In contrast, the Bernoulli and Binomial distributions assume independent trials with a fixed success probability, as when each item is put back before the next draw.


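This model also assumes a finite population. If the population is very large relative to the sample, removing a few items barely changes the proportions, the draws become nearly independent, and the Binomial distribution is a good approximation. The Hypergeometric Distribution matters when both conditions hold: a finite population and no replacement.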
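The effect of population size can be checked numerically. The sketch below compares the Hypergeometric PMF with a Binomial PMF that uses the same target proportion, for growing populations. The 10% target share, the sample size of 5, and the helper names are assumptions made for this illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Draws without replacement from a finite population.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Independent trials with a fixed success probability p.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 5, 2  # assumed sample size and target count
gaps = []
for N in (20, 200, 2000, 20000):
    K = N // 10  # keep the share of target items at 10%
    gap = abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
    gaps.append(gap)
    print(N, gap)
```

The gap shrinks as \( N \) grows, which is why the finite-population correction only matters when the sample is a noticeable fraction of the population.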






✅ Real-World Examples

✅ Scenario Descriptions

1. Candy Sharing
A bag contains 20 candies: 5 are strawberry flavor and the rest are lemon. Suppose a friend randomly picks 4 of them. The number of strawberry candies among those 4 varies from one pick to the next.


The key point is that once a candy is taken, it doesn’t go back. So each later draw has fewer candies to choose from, and the chance of picking a strawberry candy changes.


2. Pencils in a Case
A student has 10 pencils: 6 blue and 4 black. If the student randomly picks 3 pencils without looking, the number of blue pencils picked will not always be the same.


Once one pencil is picked, the total number of pencils decreases, and so does the number of blue pencils if the one picked was blue. So the probability of drawing a blue pencil next changes each time. The Hypergeometric Distribution fits this case well.


3. Defective Products Inspection
A factory produces 100 products, 10 of which are defective. If 5 products are randomly selected for inspection, the number of defective items in the sample is random. The Hypergeometric Distribution gives the probability of each possible count.


Since selected items are not returned, each draw influences the next. This violates independence and requires a model like the Hypergeometric Distribution.
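The inspection scenario can also be simulated. This is a rough sketch, assuming `random.sample` as the model for drawing without replacement; the trial count and the seed are arbitrary choices for this illustration.

```python
import random
from math import comb

random.seed(0)  # fixed seed so repeated runs give the same numbers

# 100 products: 10 defective (True) and 90 good (False).
population = [True] * 10 + [False] * 90
trials = 50_000  # arbitrary number of simulated inspections

counts = [0] * 6  # counts[k] = inspections that found k defective items
for _ in range(trials):
    sample = random.sample(population, 5)  # 5 draws without replacement
    counts[sum(sample)] += 1

for k, c in enumerate(counts):
    exact = comb(10, k) * comb(90, 5 - k) / comb(100, 5)
    print(k, c / trials, round(exact, 4))  # simulated share vs. exact PMF
```

The simulated shares should sit close to the exact probabilities, with small differences that come from random sampling noise.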




✅ Formula Explanation

✅ Meaning of Combination Symbols

The Hypergeometric PMF is given by:

\[ P(X = k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}} \]

Here, the binomial coefficient \(\binom{a}{b}\) represents the number of ways to choose \( b \) items from \( a \) items without regard to order. This is called a combination and is defined as:

\[ \binom{a}{b} = \frac{a!}{b!(a-b)!} \]

For example, suppose we have two 'a's and one 'b' and want to list all the distinct 3-letter arrangements:

  • aab
  • aba
  • baa

Each arrangement corresponds to a different choice of positions for the two 'a's. The number of ways to choose two positions for 'a' from three is \(\binom{3}{2} = 3\), exactly matching our cases.
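The same count can be checked by brute force. This short sketch lists the distinct arrangements with `itertools.permutations` and compares their number with `math.comb`.

```python
from itertools import permutations
from math import comb

# A set removes the duplicates that come from swapping the two identical 'a's.
arrangements = sorted({"".join(p) for p in permutations("aab")})

print(arrangements)  # → ['aab', 'aba', 'baa']
print(comb(3, 2))    # → 3
```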


In the Hypergeometric formula:

  • \(\binom{K}{k}\): ways to choose \( k \) target items from \( K \)
  • \(\binom{N-K}{n-k}\): ways to choose \( n-k \) non-target items from the rest
  • \(\binom{N}{n}\): total ways to draw \( n \) items from \( N \)






✅ Numerical Examples

1. Pencil Example (exactly 2 blue pencils among the 3 picked)

\[ P(X = 2) = \frac{\binom{6}{2} \binom{4}{1}}{\binom{10}{3}} = \frac{15 \cdot 4}{120} = \frac{60}{120} = 0.5 \]

2. Candy Example (exactly 1 strawberry candy among the 4 picked)

\[ P(X = 1) = \frac{\binom{5}{1} \binom{15}{3}}{\binom{20}{4}} = \frac{5 \cdot 455}{4845} = \frac{2275}{4845} \approx 0.4696 \]

3. Defective Products Example (exactly 2 defective items among the 5 inspected)

\[ P(X = 2) = \frac{\binom{10}{2} \binom{90}{3}}{\binom{100}{5}} = \frac{45 \cdot 117480}{75287520} = \frac{5286600}{75287520} \approx 0.0702 \]
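The three results above can be reproduced with exact arithmetic. This is a small sketch; the helper name `hypergeom_pmf` and the dictionary layout are illustrative choices, while the numbers come from the three scenarios.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Exact probability as a fraction, so no rounding happens until printing.
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# name: (k, N, K, n), taken from the three scenarios
examples = {
    "pencils": (2, 10, 6, 3),
    "candies": (1, 20, 5, 4),
    "defects": (2, 100, 10, 5),
}

results = {}
for name, (k, N, K, n) in examples.items():
    p = hypergeom_pmf(k, N, K, n)
    results[name] = round(float(p), 4)
    print(name, p, results[name])
```

Using `Fraction` keeps each intermediate value exact, so the rounded decimals can be compared directly with the hand calculations.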




Thus, the Hypergeometric Distribution gives the exact probability of each possible count of a specific item in a sample, in realistic settings where earlier selections affect later ones.
