## Summary

When using Gemini 3 models (`gemini-3-flash-preview`, `gemini-3-pro-preview`) with LangExtract, API calls time out because Gemini 3 defaults to `thinking_level: high`, which adds significant latency. There is currently no way to pass `thinking_config` to reduce it.
## Problem

The `_API_CONFIG_KEYS` allowlist in `langextract/providers/gemini.py` (lines 40-48) does not include `thinking_config`:
```python
_API_CONFIG_KEYS: Final[set[str]] = {
    'response_mime_type',
    'response_schema',
    'safety_settings',
    'system_instruction',
    'tools',
    'stop_sequences',
    'candidate_count',
}
```
This means any `thinking_config` passed via `language_model_params` gets filtered out (lines 186-188):
```python
self._extra_kwargs = {
    k: v for k, v in (kwargs or {}).items() if k in _API_CONFIG_KEYS
}
```
## Impact

- Gemini 3 Flash (designed for speed) times out on simple extraction tasks
- Users cannot set `thinking_level: "minimal"` to reduce latency
- Users are forced back to older models such as `gemini-2.5-flash` instead of the newer Gemini 3 models
## Proposed Solution

Add `thinking_config` to `_API_CONFIG_KEYS`:
```python
_API_CONFIG_KEYS: Final[set[str]] = {
    'response_mime_type',
    'response_schema',
    'safety_settings',
    'system_instruction',
    'tools',
    'stop_sequences',
    'candidate_count',
    'thinking_config',  # Add this for Gemini 3 support
}
```
This would allow users to pass:
```python
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-3-flash-preview",
    language_model_params={
        "thinking_config": {"thinking_level": "minimal"}
    },
)
```
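With the one-line addition, the existing filter passes the setting through unchanged; a quick sketch (key set mirrors the snippets above, the params are hypothetical):

```python
# Allowlist with the proposed 'thinking_config' entry added.
_API_CONFIG_KEYS = {
    'response_mime_type', 'response_schema', 'safety_settings',
    'system_instruction', 'tools', 'stop_sequences', 'candidate_count',
    'thinking_config',  # proposed addition
}

language_model_params = {'thinking_config': {'thinking_level': 'minimal'}}

# The unchanged filter now retains the setting.
extra_kwargs = {
    k: v for k, v in language_model_params.items() if k in _API_CONFIG_KEYS
}
print(extra_kwargs)  # {'thinking_config': {'thinking_level': 'minimal'}}
```

No other code changes should be needed, since `_extra_kwargs` is already forwarded to the request config.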
## Environment

- LangExtract version: 1.1.1
- Python: 3.12
- Models tested: `gemini-3-flash-preview`, `gemini-3-pro-preview`