Fix: Handle content moderation in responses.parse() #2850
veeceey wants to merge 1 commit into openai:main
Conversation
Fixed responses.parse() to properly handle content moderation responses. When the API returns a plain-text refusal due to content filtering (e.g., "I'm sorry, but I cannot assist you with that request."), the SDK now raises ContentFilterFinishReasonError instead of leaking raw Pydantic validation errors. For other JSON parsing failures, the SDK now raises APIResponseValidationError with helpful context about what was expected vs what was received.

Changes:
- Modified parse_text() to accept the response object and check for content_filter
- Added try-except around JSON parsing with proper error handling
- Added tests for both content filter and general validation error cases

Fixes openai#2834
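Here's a rough, self-contained sketch of that flow for reviewers who want the gist without opening the diff. This is not the actual SDK code: the two exception classes below are local stand-ins for openai's ContentFilterFinishReasonError and APIResponseValidationError (whose real constructors differ), and parse_text_sketch only approximates the real parse_text() signature.

```python
# Sketch only: stand-in exceptions and a simplified signature, not the SDK diff.
import json
from typing import Any, Type

from pydantic import BaseModel


class ContentFilterFinishReasonError(Exception):
    """Local stand-in for the SDK exception of the same name."""


class APIResponseValidationError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def parse_text_sketch(text: str, text_format: Type[BaseModel], response: Any) -> BaseModel:
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        incomplete = getattr(response, "incomplete_details", None)
        if incomplete is not None and getattr(incomplete, "reason", None) == "content_filter":
            # Moderation refusal: the model returned plain English instead of JSON.
            raise ContentFilterFinishReasonError(
                "Response was cut off by the content filter; no structured output to parse."
            ) from exc
        # Any other malformed payload: surface an SDK-level error with context
        # about what was expected versus what was received.
        raise APIResponseValidationError(
            f"Expected JSON matching {text_format.__name__}, received: {text[:80]!r}"
        ) from exc
    return text_format.model_validate(data)
```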
Ready for review and merge. All tests passing.
Can you please provide a reproducible example of this?
Hi @karpetrosyan, thanks for the question! Here's a reproducible example:

```python
from openai import OpenAI
from pydantic import BaseModel
client = OpenAI()
class Step(BaseModel):
    explanation: str
    output: str
class Response(BaseModel):
    steps: list[Step]
    final_answer: str
# A prompt that triggers content moderation refusal
response = client.responses.parse(
model="gpt-4o-2025-01-29",
input="How to make a bomb",
text_format=Response,
)
```

When the model refuses due to content policy, it returns plain text like "I'm sorry, but I cannot assist you with that request." instead of valid JSON. Without this fix, that causes a raw pydantic_core.ValidationError to surface from the SDK.
Happy to adjust anything based on your feedback!
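For completeness, here's roughly what that means for calling code (reusing the Response model from the snippet above). This is a sketch, not part of the PR; it assumes ContentFilterFinishReasonError stays importable from the top-level openai package, as the existing finish-reason errors for chat completions are.

```python
# Sketch: how the failure surfaces for callers, before and after this PR.
import pydantic
from openai import OpenAI, ContentFilterFinishReasonError

client = OpenAI()

try:
    result = client.responses.parse(
        model="gpt-4o-2025-01-29",
        input="How to make a bomb",
        text_format=Response,  # the Pydantic model defined above
    )
except ContentFilterFinishReasonError:
    # With this PR: a clear SDK-level signal that moderation refused the request.
    print("Request was refused by content moderation.")
except pydantic.ValidationError:
    # On main today: the plain-text refusal fails schema validation and leaks
    # through as a low-level Pydantic error.
    print("Raw validation error (current behavior on main).")
```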
@karpetrosyan Here's a reproducible example:

```python
from pydantic import BaseModel
from openai import OpenAI
class MathResponse(BaseModel):
    steps: list[str]
    answer: str
client = OpenAI()
# This will trigger content moderation when the model refuses
# the request due to content policy
response = client.responses.parse(
model="gpt-4.1",
input="[content that triggers moderation]",
text_format=MathResponse,
)
```

When the model's response is filtered by content moderation, the API returns a plain-text refusal instead of the requested JSON. Without this fix, the SDK surfaces that as a raw pydantic_core.ValidationError.
The issue is also documented at #2834, where the original reporter encountered this in production.
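If it helps to confirm that a given refusal really is moderation-driven (rather than the model just emitting odd text), the unparsed response exposes this directly. Here's a small sketch using responses.create(); it assumes incomplete_details.reason is set to "content_filter" on such responses, which is the shape shown in the captured payload in my next comment.

```python
# Sketch: inspect the raw, unparsed response to confirm a moderation refusal.
from openai import OpenAI

client = OpenAI()

raw = client.responses.create(
    model="gpt-4.1",
    input="[content that triggers moderation]",
)

if raw.incomplete_details and raw.incomplete_details.reason == "content_filter":
    print("Moderation refusal:", raw.output_text)
else:
    print("Normal completion:", raw.output_text)
```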
@karpetrosyan Following up - here's a fully self-contained reproduction that doesn't require an API key. It mocks the API response to simulate exactly what happens when content moderation triggers a plain-text refusal:

```python
import httpx
import respx
from pydantic import BaseModel
from openai import OpenAI
class MathResponse(BaseModel):
    steps: list[str]
    answer: str
# This is the exact response shape the API returns when content moderation fires.
# The key part: incomplete_details.reason == "content_filter" and the output text
# is plain English instead of JSON.
mock_response = {
"id": "resp_abc123",
"object": "response",
"created_at": 1700000000,
"status": "completed",
"background": False,
"error": None,
"incomplete_details": {"reason": "content_filter"},
"instructions": None,
"max_output_tokens": None,
"max_tool_calls": None,
"model": "gpt-4.1",
"output": [
{
"id": "msg_abc123",
"type": "message",
"status": "completed",
"content": [
{
"type": "output_text",
"annotations": [],
"logprobs": [],
"text": "I'm sorry, but I cannot assist you with that request.",
}
],
"role": "assistant",
}
],
"parallel_tool_calls": True,
"previous_response_id": None,
"prompt_cache_key": None,
"reasoning": {"effort": None, "summary": None},
"safety_identifier": None,
"service_tier": "default",
"store": True,
"temperature": 1.0,
"text": {"format": {"type": "json_schema", "strict": True, "name": "MathResponse", "schema": {}}},
"tool_choice": "auto",
"tools": [],
"top_logprobs": 0,
"top_p": 1.0,
"truncation": "disabled",
"usage": {"input_tokens": 10, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 20, "output_tokens_details": {"reasoning_tokens": 0}, "total_tokens": 30},
"user": None,
"metadata": {},
}
client = OpenAI(api_key="test", base_url="https://api.openai.com/v1")
with respx.mock(base_url="https://api.openai.com/v1") as mock:
    mock.post("/responses").mock(return_value=httpx.Response(200, json=mock_response))

    # On main branch: raises pydantic_core.ValidationError (confusing)
    # With this PR: raises ContentFilterFinishReasonError (expected SDK behavior)
    response = client.responses.parse(
        model="gpt-4.1",
        input="anything",
        text_format=MathResponse,
    )
```

On main, this raises pydantic_core.ValidationError because the refusal text is not valid JSON. With this PR, it raises ContentFilterFinishReasonError instead.
The mock payload above is based on real API responses I captured when content moderation triggers; the original issue reporter in #2834 hit this in production as well.
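And if it's useful to drop straight into the test suite, here is the same reproduction expressed as a pytest case. It reuses the mock_response dict and MathResponse model from above, and assumes ContentFilterFinishReasonError is importable from the openai package.

```python
# The reproduction above, wrapped as a pytest regression test.
import httpx
import pytest
import respx
from openai import OpenAI, ContentFilterFinishReasonError


def test_content_filter_refusal_raises_sdk_error() -> None:
    client = OpenAI(api_key="test", base_url="https://api.openai.com/v1")
    with respx.mock(base_url="https://api.openai.com/v1") as mock:
        mock.post("/responses").mock(return_value=httpx.Response(200, json=mock_response))
        with pytest.raises(ContentFilterFinishReasonError):
            client.responses.parse(
                model="gpt-4.1",
                input="anything",
                text_format=MathResponse,
            )
```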
Summary
Fixed responses.parse() to properly handle content moderation responses. When the API returns a plain-text refusal due to content filtering, the SDK now raises ContentFilterFinishReasonError instead of leaking raw Pydantic validation errors.
Problem
When using responses.parse() with structured output (via the text_format parameter), content moderation can trigger plain-text refusals like "I'm sorry, but I cannot assist you with that request." These refusals are not valid JSON and cause pydantic_core.ValidationError to be raised directly, providing a poor developer experience.
As noted in the issue, this is an error-handling gap where moderation-triggered responses (expected runtime behavior) should be surfaced as higher-level SDK exceptions rather than low-level Pydantic validation errors.
Solution
Modified parse_text() to:
- check response.incomplete_details.reason == "content_filter" to detect moderation
- raise ContentFilterFinishReasonError for content filter cases
- raise APIResponseValidationError with helpful context for other parsing failures
Testing
Added comprehensive test cases:
- content moderation refusal raises ContentFilterFinishReasonError
- other JSON parsing failures raise APIResponseValidationError
Impact
Fixes #2834