The Game That Seems Impossible
Think of any object. A cat, a skyscraper, a slice of pizza. Now imagine an AI asks you just 4-6 yes/no questions and correctly guesses what you were thinking.
Sounds like magic? It's actually math. Specifically, it's Shannon entropy and information theory — the same principles that power compression algorithms, cryptography, and modern machine learning.
Let's break down exactly how it works.
Shannon Entropy: Measuring Uncertainty
In 1948, Claude Shannon published "A Mathematical Theory of Communication" and forever changed how we think about information. His key insight: information is surprise.
If I tell you "the sun rose today," that carries almost zero information — you already expected it. But if I tell you "it snowed in the Sahara," that's highly informative because it's unexpected.
Shannon formalized this with entropy, a measure of uncertainty:

H = -Σ p(x) × log2(p(x)), summed over every possible outcome x
The higher the entropy, the more uncertain you are, and the more information you need to resolve that uncertainty.
A Simple Example
Imagine a bag with 8 equally likely items. The entropy is:
H = -8 × (1/8 × log2(1/8))
H = -8 × (1/8 × -3)
H = 3 bits
This means you need exactly 3 yes/no questions to identify the item. Each perfect question cuts the remaining possibilities in half.
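The 8-item calculation above can be checked directly; a minimal sketch in Python:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A bag of 8 equally likely items:
H = entropy([1/8] * 8)
print(H)  # 3.0 bits -> exactly 3 perfect yes/no questions
```

The `if p > 0` guard skips impossible outcomes, which contribute zero entropy by convention.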
Binary Search on the Property Space
Here's how the AI actually works under the hood:
- Property database — Every object has a vector of properties: is it alive? is it bigger than a breadbox? can you eat it?
- Entropy calculation — For each possible question, the AI calculates how much entropy it would eliminate.
- Greedy selection — It picks the question that maximizes information gain — the question that most evenly splits the remaining candidates.
- Update beliefs — After your answer, the AI eliminates all objects inconsistent with it and recalculates.
This is essentially binary search on a property space rather than a sorted list.
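The four steps above can be sketched in a few lines of Python. The property names and toy objects here are illustrative assumptions, not any real system's database:

```python
import math

def entropy(n):
    """Entropy in bits of n equally likely candidates."""
    return math.log2(n) if n > 0 else 0.0

def best_question(candidates, questions):
    """Greedy selection: pick the question whose yes/no split of the
    remaining candidates gives the highest expected entropy reduction."""
    n = len(candidates)
    best, best_gain = None, -1.0
    for q in questions:
        yes = sum(1 for obj in candidates if obj[q])
        no = n - yes
        # Expected entropy after asking q, weighted by answer probability.
        after = (yes / n) * entropy(yes) + (no / n) * entropy(no)
        gain = entropy(n) - after
        if gain > best_gain:
            best, best_gain = q, gain
    return best, best_gain

# Hypothetical toy database: object -> property vector.
candidates = [
    {"alive": True,  "edible": False},   # cat
    {"alive": False, "edible": True},    # pizza
    {"alive": False, "edible": False},   # skyscraper
    {"alive": True,  "edible": True},    # chicken
]
q, gain = best_question(candidates, ["alive", "edible"])
print(q, gain)  # either property splits 2/2, so the gain is a full 1.0 bit
```

After each answer, you would filter `candidates` to those consistent with it and call `best_question` again on the survivors.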
Why 4 Questions, Not 20?
Humans playing 20 Questions waste questions. They ask things like "Is it a type of food?" when only 12% of objects are food. A "yes" eliminates 88% of the possibilities, but a "no" eliminates just 12%: a very uneven split.
The AI always asks questions that split possibilities close to 50/50, which is mathematically optimal.
| Strategy | Avg. Questions | Efficiency |
|---|---|---|
| Random human | 15-20 | ~30% |
| Experienced human | 8-12 | ~55% |
| Entropy-optimal AI | 4-6 | ~90% |
| Theoretical minimum | log2(N) | 100% |
With a database of ~30 common objects, log2(30) ≈ 4.9 bits. The AI typically needs 4-6 questions because real-world properties aren't perfectly binary and some objects share many properties.
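The theoretical floor for any database size follows from the same arithmetic; a quick sketch:

```python
import math

def questions_needed(n_objects):
    """Theoretical minimum number of perfect yes/no questions
    to pin down one of n equally likely objects: ceil(log2(n))."""
    return math.ceil(math.log2(n_objects))

print(round(math.log2(30), 2))   # ~4.91 bits of uncertainty
print(questions_needed(30))      # 5 perfect questions
```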
Information Gain: Picking the Best Question
The secret sauce is information gain, the expected reduction in entropy after asking a question:

IG(Q) = H(before) - [p(yes) × H(after yes) + p(no) × H(after no)]
A perfect question has information gain of exactly 1 bit — it cuts uncertainty in half. The AI greedily picks the highest-gain question at each step.
Example Walkthrough
Say we have 16 candidate objects. Entropy = 4 bits.
- Q1: "Is it alive?" — Splits 8/8. Gain: 1 bit. Remaining: 3 bits.
- Q2: "Can you hold it in your hand?" — Splits 4/4. Gain: 1 bit. Remaining: 2 bits.
- Q3: "Is it found indoors?" — Splits 2/2. Gain: 1 bit. Remaining: 1 bit.
- Q4: "Is it used daily?" — Splits 1/1. Gain: 1 bit. Solved!
4 questions, 16 objects, zero wasted bits.
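The walkthrough above can be replayed as a tiny simulation, assuming every question splits the field exactly in half:

```python
# 16 candidates; each perfect 50/50 question keeps exactly half.
candidates = 16
questions = 0
while candidates > 1:
    candidates //= 2   # halve the remaining field
    questions += 1
print(questions)  # 4 questions for 16 objects: log2(16) = 4, zero wasted bits
```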
Interactive: Entropy Calculator
Adjust the number of objects and see how many questions the AI needs. For a default database of ~30 objects, the answer is about 5.
Try It Yourself!
Think of something and see if the AI can guess it in under 6 questions.
Beyond Games: Where This Math Lives
Shannon entropy isn't just a party trick. It's the backbone of:
- Data compression (ZIP, MP3, JPEG) — remove bits that carry no information
- Decision trees in machine learning — split on features with highest information gain
- Cryptography — maximize entropy to make messages unpredictable
- Natural language processing — predict next words by measuring surprise
- Medical diagnosis — ask the most discriminating tests first
The 20 Questions game is, in a way, the purest expression of how information theory works. Every question is a measurement. Every answer reduces uncertainty. Optimality means asking the right questions.
Key Takeaways
- Entropy measures uncertainty — more possible outcomes = more bits needed.
- Optimal questions split 50/50 — anything else wastes information capacity.
- log2(N) is the floor — you can't beat math, but you can get close.
- Real-world noise adds 1-2 extra questions — property overlap and ambiguity cost bits.
- This is how all AI "thinks" — from ChatGPT to medical AI, reducing uncertainty is the game.