Commit 83a4032

Merge pull request #3 from AISmithLab/dev
Dev
2 parents b6b0516 + 2876c09 commit 83a4032

14 files changed, +1811 -0 lines changed

studies/study_013/README.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# Study 013: Opportunity Evaluation under Risky Conditions

**Authors:** Hean Tat Keh, Maw Der Foo, Boon Chong Lim

**Year:** 2002

**Journal:** *Entrepreneurship Theory and Practice*, 27(2), 125-148

## Description

This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures four cognitive biases (overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy) and tests how they influence risk perception and opportunity evaluation of a standardized business case vignette. The benchmark implementation focuses on the paper's calibration test and its reported regression findings.

## Participants

- **N = 77** founders and owners of the top 500 SMEs in Singapore
- 97% male, mean age 46.6 years
- 92.4% Chinese, 79% founded their business
- Business revenue: 48.6% between S$1M-S$25M, 44.4% between S$25M-S$50M

## Key Findings Tested

| Finding | Hypothesis | Human Result |
|---------|-----------|--------------|
| F1 | Entrepreneurs are overconfident (mean items outside 90% CI > 1) | Mean = 5.17, SD = 2.64 |
| F2 | Risk perception negatively predicts opportunity evaluation (H1) | beta = -0.50, t = -5.98, p < .001 |
| F3 | Illusion of control negatively predicts risk perception in Model 1 (H5) | beta = -0.76, t = -3.34, p < .01 |
| F4 | Illusion of control positively predicts opportunity evaluation in Model 2 | beta = 0.40, t = 2.23, p < .05 |
| F5 | Belief in the law of small numbers positively predicts opportunity evaluation in Model 2 | beta = 1.17, t = 1.91, p < .06 |
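
F1 counts, for each respondent, how many of the 10 confidence-interval items have a true value outside the stated 90% range; a well-calibrated respondent would miss about 1 of 10. The sketch below illustrates that scoring and one plausible way to compare the reported mean against 1. It is an illustration only: `count_misses` and `one_sample_t` are hypothetical names, the sample intervals are made up, and treating the calibration test as a one-sample t-test against 1 is an assumption rather than something this README states.

```python
import math


def count_misses(intervals, truths):
    """Count items whose true value lies outside the stated [lower, upper] range."""
    if len(intervals) != len(truths):
        raise ValueError("intervals and truths must have the same length")
    misses = 0
    for (lower, upper), truth in zip(intervals, truths):
        if not (lower <= truth <= upper):
            misses += 1
    return misses


def one_sample_t(mean, sd, n, mu0):
    """t statistic for a sample mean against a fixed reference value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


# Made-up responses for three items (not the study's answer key).
intervals = [(10, 20), (100, 300), (5, 6)]
truths = [15, 350, 4]
print(count_misses(intervals, truths))

# Reported human summary statistics from F1 (mean 5.17, SD 2.64, N 77),
# compared with the 1 miss expected under perfect calibration (assumed test).
print(round(one_sample_t(5.17, 2.64, 77, 1.0), 2))
```

Under these assumptions the reported mean sits far above the calibrated benchmark of 1, which is the direction F1 describes.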
29+
30+
## Questionnaire Structure
31+
32+
- **Section A:** 5 forced-choice gamble items (risk propensity)
33+
- **Section B:** 7 Likert items (2 filler, 2 planning fallacy, 3 illusion of control)
34+
- **Section C:** 10 confidence-interval estimation items (overconfidence)
35+
- **Section D:** Business case vignette + 4 risk perception items + 3 opportunity evaluation items + 1 optional open-ended item coded for belief in the law of small numbers
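
In the prompt built by this commit's config module, every answer is a line of the form `Qk=<value>`, and each Section C item uses two consecutive question numbers (lower and upper limit). That gives 5 + 7 + 2×10 + 7 + 1 = 40 numbered answers, of which the open-ended one is optional. The sketch below shows one way such a response could be parsed and the expected line counts derived. `parse_answers` and `expected_line_counts` are hypothetical names, not the functions in `study_utils.py`, and the parser assumes one answer per line.

```python
import re

# One answer per line, e.g. "Q13=1200000" (assumed format, see lead-in).
ANSWER_LINE = re.compile(r"^\s*Q(\d+)\s*=\s*(.+?)\s*$")


def parse_answers(response_text):
    """Map question number -> raw answer string, ignoring non-answer lines."""
    answers = {}
    for line in response_text.splitlines():
        match = ANSWER_LINE.match(line)
        if match:
            answers[int(match.group(1))] = match.group(2)
    return answers


def expected_line_counts(n_a=5, n_b=7, n_c=10, n_d_likert=7, n_optional=1):
    """Return (required, total) answer lines; Section C items take two lines each."""
    total = n_a + n_b + 2 * n_c + n_d_likert + n_optional
    return total - n_optional, total


sample = (
    "Q1=a\n"
    "Q13=1200000\n"
    "Q14=3500000\n"
    "not an answer\n"
    "Q40=No additional information needed"
)
print(parse_answers(sample))
print(expected_line_counts())
```

Keeping values as raw strings leaves type conversion (letters for Section A, integers for Likert items, numbers for interval limits) to a later, section-aware step.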

## File Structure

```
study_013/
├── index.json
├── README.md
├── source/
│   ├── Keh-Foo-Lim-2002-Opportunity-Evaluation.pdf
│   ├── metadata.json
│   ├── specification.json
│   ├── ground_truth.json
│   └── materials/
│       ├── section_a_risk_propensity.json
│       ├── section_b_cognitive_biases.json
│       ├── section_c_overconfidence.json
│       └── section_d_case_vignette.json
└── scripts/
    ├── config.py
    ├── evaluator.py
    ├── study_utils.py
    └── stats_lib.py
```

## Overconfidence Answer Key

The 10 confidence-interval items reference Singapore statistics circa 1999-2000. Correct answers have been verified against:
- Yearbook of Statistics Singapore 2000 (Department of Statistics)
- Changi Airport Group corporate history
- LTA Vehicle Quota Tender Results 2000-2004
- SingStat residential dwelling datasets

## Contributor

Guankai Zhai ([@zgk2003](https://github.com/zgk2003))

studies/study_013/index.json

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
{
  "title": "Opportunity Evaluation under Risky Conditions: The Cognitive Processes of Entrepreneurs",
  "authors": [
    "Hean Tat Keh",
    "Maw Der Foo",
    "Boon Chong Lim"
  ],
  "year": 2002,
  "description": "This study examines how cognitive biases affect entrepreneurs' opportunity evaluation under risky conditions. Using a survey of 77 founders of top SMEs in Singapore, the study measures overconfidence, illusion of control, belief in the law of small numbers, and planning fallacy, then relates those constructs to risk perception and opportunity evaluation for a standardized business vignette. The benchmark reproduces the paper's calibration test and its reported regression findings: entrepreneurs are overconfident, risk perception negatively predicts opportunity evaluation, illusion of control lowers risk perception and increases opportunity evaluation before mediation, and belief in the law of small numbers increases opportunity evaluation in the pre-mediation model.",
  "contributors": [
    {
      "name": "Guankai Zhai",
      "github": "https://github.com/zgk2003"
    }
  ]
}
Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
import numpy as np

import sys
sys.path.insert(0, str(__import__("pathlib").Path(__file__).resolve().parent))
from study_utils import BaseStudyConfig, PromptBuilder, compute_construct_scores, iter_response_records

import random


AGE_DISTRIBUTION = [
    (range(30, 40), 0.222),  # Less than 40
    (range(40, 61), 0.715),  # 40 to 60
    (range(61, 70), 0.063),  # More than 60
]

SEX_OPTIONS = ["male", "female"]
SEX_WEIGHTS = [0.97, 0.03]

RACE_OPTIONS = ["Chinese", "Indian", "Other"]
RACE_WEIGHTS = [0.924, 0.045, 0.031]

EDUCATION_OPTIONS = ["secondary", "postsecondary", "primary/other"]
EDUCATION_WEIGHTS = [0.061, 0.864, 0.075]

BUSINESS_SIZE_OPTIONS = [
    "Less than S$1m",
    "Between S$1m and S$25m",
    "Between S$25m and S$50m",
    "More than S$50m",
]

BUSINESS_SIZE_WEIGHTS = [0.028, 0.486, 0.444, 0.042]


def weighted_age_sample():
    """Sample an age from the Table 2 age distribution."""
    r = random.random()
    cumulative = 0
    for age_range, prob in AGE_DISTRIBUTION:
        cumulative += prob
        if r < cumulative:
            return random.choice(list(age_range))
    return random.randint(40, 60)


def weighted_choice(options, weights):
    """Draw one option according to the reported sample proportions."""
    return random.choices(options, weights=weights, k=1)[0]


class CustomPromptBuilder(PromptBuilder):
    """Builds the full Keh, Foo & Lim (2002) questionnaire prompt."""

    def build_trial_prompt(self, trial_metadata):
        profile = trial_metadata.get("profile") or trial_metadata.get("participant_profile", {})
        items_a = trial_metadata.get("items_a", [])
        items_b = trial_metadata.get("items_b", [])
        items_c = trial_metadata.get("items_c", [])
        items_d = trial_metadata.get("items_d", [])
        vignette_text = trial_metadata.get("vignette_text", "")

        lines = []
        optional_question_numbers = []

        # --- Persona Introduction ---
        age = profile.get("age", 47)
        sex = profile.get("sex", "male")
        race = profile.get("race", "Chinese")
        education = profile.get("education", "postsecondary")
        business_size = profile.get("business_size", "Between S$1m and S$25m")
        founder = profile.get("is_founder", True)

        lines.append("You are participating in a research study on entrepreneurial decision-making.")
        lines.append(
            "Answer as one of the Singapore SME founders/owners described in the original paper."
        )
        lines.append(
            f"Imagine you are a {age}-year-old {sex} entrepreneur in Singapore, "
            f"{race}, with {education} education, who {'founded' if founder else 'bought over'} "
            f"the business you run (annual revenue: {business_size})."
        )
        lines.append("Please answer all questions honestly from that participant's perspective.\n")

        q_counter = 1

        # --- Section A: Risk Propensity (5 forced-choice items) ---
        lines.append("=" * 60)
        lines.append("SECTION A: RISK PREFERENCES")
        lines.append("=" * 60)
        lines.append("Please answer the following five items by choosing the alternative (\"a\" or \"b\") you would feel most comfortable with.\n")

        for item in items_a:
            options = item.get("options", [])
            lines.append(f"Q{q_counter}: Which would you prefer?")
            lines.append(f" a) {options[0]}")
            lines.append(f" b) {options[1]}")
            lines.append(f" (Answer Q{q_counter}=a or Q{q_counter}=b)\n")
            item["q_idx"] = q_counter
            q_counter += 1

        # --- Section B: Cognitive Biases (7 Likert items) ---
        lines.append("=" * 60)
        lines.append("SECTION B: BUSINESS ATTITUDES")
        lines.append("=" * 60)
        lines.append("Please indicate how much you agree with each statement.")
        lines.append("Scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Slightly Disagree, 4 = Neutral, 5 = Slightly Agree, 6 = Agree, 7 = Strongly Agree\n")

        for item in items_b:
            lines.append(f"Q{q_counter}: {item['question']}")
            lines.append(f" (Answer Q{q_counter}=1 to Q{q_counter}=7)\n")
            item["q_idx"] = q_counter
            q_counter += 1

        # --- Section C: Overconfidence (10 confidence-interval items) ---
        lines.append("=" * 60)
        lines.append("SECTION C: GENERAL KNOWLEDGE")
        lines.append("=" * 60)
        lines.append("For each question below, provide a LOWER LIMIT and UPPER LIMIT such that you are 90% confident the correct answer falls within your range.")
        lines.append("If you have absolutely no idea, provide the widest reasonable range.\n")

        for item in items_c:
            unit = item.get("unit", "")
            lines.append(f"Q{q_counter} (Lower Limit) and Q{q_counter + 1} (Upper Limit): {item['question']}")
            lines.append(f" Unit: {unit}")
            lines.append(f" (Answer Q{q_counter}=<lower> Q{q_counter + 1}=<upper>)\n")
            item["q_idx_lower"] = q_counter
            item["q_idx_upper"] = q_counter + 1
            q_counter += 2

        # --- Section D: Case Vignette + Risk Perception + Opportunity Evaluation ---
        lines.append("=" * 60)
        lines.append("SECTION D: BUSINESS CASE EVALUATION")
        lines.append("=" * 60)
        lines.append("Please read the following case study carefully, then answer the questions.\n")
        lines.append(vignette_text)
        lines.append("")
        lines.append("Based on the case above, please indicate how much you agree with each statement.")
        lines.append("Scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Slightly Disagree, 4 = Neutral, 5 = Slightly Agree, 6 = Agree, 7 = Strongly Agree\n")

        for item in items_d:
            if item["type"] == "likert_7":
                lines.append(f"Q{q_counter}: {item['question']}")
                lines.append(f" (Answer Q{q_counter}=1 to Q{q_counter}=7)\n")
                item["q_idx"] = q_counter
                q_counter += 1
            elif item["type"] == "open_ended":
                lines.append(f"Q{q_counter}: {item['question']}")
                lines.append(" Focus on the issues that actually drive your judgment from the case as written.")
                lines.append(" Mention extra information only if you genuinely need it.")
                lines.append(f" (Optional. Answer Q{q_counter}=<brief response>, write Q{q_counter}=No additional information needed, or omit Q{q_counter} to skip.)\n")
                item["q_idx"] = q_counter
                optional_question_numbers.append(q_counter)
                q_counter += 1

        # --- Response format ---
        lines.append("=" * 60)
        lines.append("RESPONSE FORMAT (MANDATORY)")
        lines.append("=" * 60)
        lines.append("Output ONLY answer lines in the format: Qk=<value>")
        lines.append("One answer per line. Do not include explanations.")
        if optional_question_numbers:
            optional_labels = ", ".join(f"Q{idx}" for idx in optional_question_numbers)
            required_answers = (q_counter - 1) - len(optional_question_numbers)
            lines.append(f"All numbered items except {optional_labels} are required.")
            lines.append(
                f"For {optional_labels}, respond with the issues influencing your judgment, "
                "or state that no additional information is needed."
            )
            lines.append(f"Expected number of answer lines: {required_answers} to {q_counter - 1}")
        else:
            lines.append(f"Expected number of answer lines: {q_counter - 1}")

        return "\n".join(lines)


class StudyStudy013Config(BaseStudyConfig):
    """Study config for Keh, Foo & Lim (2002): Opportunity Evaluation under Risky Conditions."""

    prompt_builder_class = CustomPromptBuilder
    PROMPT_VARIANT = "v1"

    def create_trials(self, n_trials=None):
        spec = self.load_specification()
        n = n_trials if n_trials is not None else spec["participants"]["n"]

        # Load all materials
        mat_a = self.load_material("section_a_risk_propensity")
        mat_b = self.load_material("section_b_cognitive_biases")
        mat_c = self.load_material("section_c_overconfidence")
        mat_d = self.load_material("section_d_case_vignette")

        vignette_text = mat_d.get("vignette_text", "")

        trials = []
        for i in range(n):
            # Generate entrepreneur profiles only from demographics reported in Table 2.
            age = weighted_age_sample()
            sex = weighted_choice(SEX_OPTIONS, SEX_WEIGHTS)
            race = weighted_choice(RACE_OPTIONS, RACE_WEIGHTS)
            education = weighted_choice(EDUCATION_OPTIONS, EDUCATION_WEIGHTS)
            business_size = random.choices(BUSINESS_SIZE_OPTIONS, weights=BUSINESS_SIZE_WEIGHTS, k=1)[0]
            is_founder = random.random() < 0.79

            profile = {
                "age": age,
                "sex": sex,
                "race": race,
                "education": education,
                "business_size": business_size,
                "is_founder": is_founder,
            }

            # Deep copy items to avoid mutation across trials
            import copy
            trial = {
                "sub_study_id": "keh_foo_lim_opportunity_evaluation",
                "scenario_id": "mr_tan_vignette",
                "scenario": "mr_tan_vignette",
                "profile": profile,
                "items_a": copy.deepcopy(mat_a["items"]),
                "items_b": copy.deepcopy(mat_b["items"]),
                "items_c": copy.deepcopy(mat_c["items"]),
                "items_d": copy.deepcopy(mat_d["items"]),
                "vignette_text": vignette_text,
                "variant": self.PROMPT_VARIANT,
            }
            trials.append(trial)

        return trials

    def aggregate_results(self, raw_results):
        """Parse Qk=value responses and compute per-participant construct scores."""
        participants = []

        for record in iter_response_records(raw_results):
            participant_scores = compute_construct_scores(
                record.get("response_text", ""),
                record.get("trial_info", {}),
            )
            if participant_scores is not None:
                participants.append(participant_scores)

        # Compute descriptive statistics
        if not participants:
            return {"participants": [], "descriptive_statistics": {}, "n_valid": 0}

        constructs = [
            "risk_propensity",
            "planning_fallacy",
            "illusion_of_control",
            "overconfidence",
            "risk_perception",
            "opportunity_evaluation",
            "small_numbers",
            "age",
        ]

        desc_stats = {}
        for c in constructs:
            values = [p[c] for p in participants if p.get(c) is not None]
            if not values:
                continue
            desc_stats[c] = {
                "mean": float(np.mean(values)),
                "sd": float(np.std(values, ddof=1)) if len(values) > 1 else 0.0,
                "n": len(values),
            }

        return {
            "participants": participants,
            "descriptive_statistics": desc_stats,
            "n_valid": len(participants),
        }
