[LangChain v1.0] Structured output

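Structured output lets an agent return data in a specific, predictable format. Instead of parsing natural-language responses, you get structured data as a JSON object, Pydantic model, or dataclass that your application can use directly.

LangChain's create_agent handles structured output automatically. You set the desired structured output schema; when the model generates structured data, it is captured, validated, and returned under the 'structured_response' key of the agent state.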

def create_agent(
    ...
    response_format: Union[
        ToolStrategy[StructuredResponseT],
        ProviderStrategy[StructuredResponseT],
        type[StructuredResponseT],
    ]
)

Response Format

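Controls how the agent returns structured data:

ToolStrategy[StructuredResponseT]: uses tool calling for structured output
ProviderStrategy[StructuredResponseT]: uses the provider's native structured output
type[StructuredResponseT]: a schema type; the best strategy is selected automatically based on the model's capabilities
None: no structured output

When a schema type is passed directly, LangChain automatically selects:

ProviderStrategy for models that support native structured output (e.g. OpenAI, Grok)
ToolStrategy for all other models

The structured response is returned under the structured_response key of the agent's final state.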
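
The selection rule above can be pictured as a small dispatch function. This is only an illustrative sketch in plain Python: pick_strategy and the supports_native flag are hypothetical names, not part of LangChain, and the real library works out the model's capabilities itself.

```python
def pick_strategy(response_format, supports_native: bool) -> str:
    """Return the name of the strategy that would apply (illustrative only)."""
    if response_format is None:
        return "none"  # no structured output requested
    # A bare schema type: prefer the provider-native path when it is available.
    return "ProviderStrategy" if supports_native else "ToolStrategy"


print(pick_strategy(dict, supports_native=True))   # ProviderStrategy
print(pick_strategy(dict, supports_native=False))  # ToolStrategy
print(pick_strategy(None, supports_native=True))   # none
```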

Provider strategy

Some model providers support structured output natively through their API (currently only OpenAI and Grok). When available, this is the most reliable method.

To use this strategy, configure a ProviderStrategy:

class ProviderStrategy(Generic[SchemaT]):
    schema: type[SchemaT]

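schema (required)

The schema that defines the structured output format. Supported forms:

Pydantic models: BaseModel subclasses with field validation
Dataclasses: Python dataclasses with type annotations
TypedDict: typed dictionary classes
JSON Schema: a dictionary containing a JSON Schema specification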
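
These forms describe the same shape in different ways. As a rough illustration of how they relate, the sketch below derives a flat JSON Schema dictionary from a dataclass using only the standard library. The to_json_schema helper and the _JSON_TYPES mapping are hypothetical and handle only simple scalar fields; this is not how LangChain performs the conversion internally.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str
    phone: str


def to_json_schema(cls) -> dict:
    """Build a flat JSON Schema dict from a dataclass with simple field types."""
    properties = {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(cls)}
    return {
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(cls)],  # every field is required here
    }


schema = to_json_schema(ContactInfo)
print(schema["required"])  # ['name', 'email', 'phone']
```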

When you pass a schema type directly to create_agent.response_format and the model supports native structured output, LangChain automatically uses ProviderStrategy:

Pydantic Model

from pydantic import BaseModel, Field
from langchain.agents import create_agent

class ContactInfo(BaseModel):
    """Contact information for a person."""
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"
    }]
})

result["structured_response"]
# ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

Dataclass

from dataclasses import dataclass
from langchain.agents import create_agent

@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str
    phone: str

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"
    }]
})

result["structured_response"]
# ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

TypedDict

from typing import TypedDict
from langchain.agents import create_agent

class ContactInfo(TypedDict):
    """Contact information for a person."""
    name: str
    email: str
    phone: str

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"
    }]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

JSON Schema

from langchain.agents import create_agent

contact_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "The name of the person"},
        "email": {"type": "string", "description": "The email address of the person"},
        "phone": {"type": "string", "description": "The phone number of the person"}
    },
    "required": ["name", "email", "phone"]
}

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=contact_schema  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"
    }]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}
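
As the example results above show, the type of structured_response follows the schema form: Pydantic models and dataclasses come back as objects, while TypedDict and JSON Schema come back as plain dictionaries. The sketch below shows one way calling code might read a field in either case. The get_email helper is hypothetical, and the two values are stand-ins for result["structured_response"], not real agent output.

```python
from dataclasses import dataclass


@dataclass
class ContactInfo:
    name: str
    email: str
    phone: str


def get_email(structured_response) -> str:
    """Read the email field regardless of which schema form produced it."""
    if isinstance(structured_response, dict):
        return structured_response["email"]  # TypedDict / JSON Schema result
    return structured_response.email         # Pydantic model / dataclass result


# Stand-ins for result["structured_response"].
as_object = ContactInfo("John Doe", "john@example.com", "(555) 123-4567")
as_dict = {"name": "John Doe", "email": "john@example.com", "phone": "(555) 123-4567"}

print(get_email(as_object))  # john@example.com
print(get_email(as_dict))    # john@example.com
```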

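Provider-native structured output offers high reliability and strict validation because the model provider enforces the schema. Use it when available.

If the provider natively supports structured output for the chosen model, writing response_format=ProductReview is functionally equivalent to writing response_format=ProviderStrategy(ProductReview). In either case, if structured output is not supported, the agent falls back to the tool calling strategy.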

Tool calling strategy

For models that do not support native structured output, LangChain uses tool calling to achieve the same result. This works with any model that supports tool calling, which covers most modern models.

To use this strategy, configure a ToolStrategy:

class ToolStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    tool_message_content: str | None
    handle_errors: Union[
        bool,
        str,
        type[Exception],
        tuple[type[Exception], ...],
        Callable[[Exception], str],
    ]

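schema (required)

The schema that defines the structured output format. Supported forms:

Pydantic models: BaseModel subclasses with field validation
Dataclasses: Python dataclasses with type annotations
TypedDict: typed dictionary classes
JSON Schema: a dictionary containing a JSON Schema specification
Union types: multiple schema options. The model picks the most appropriate schema based on context.

tool_message_content

Custom content for the tool message returned when structured output is generated.

If not provided, it defaults to a message showing the structured response data.

handle_errors

The error handling strategy for structured output validation failures. Defaults to True.

True: catch all errors with the default error template
str: catch all errors with this custom message
type[Exception]: catch only this exception type, with the default message
tuple[type[Exception], ...]: catch only these exception types, with the default message
Callable[[Exception], str]: a custom function that returns the error message
False: no retry; the exception propagates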
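
The options above amount to a dispatch on the type of handle_errors. The sketch below shows one way that dispatch could look in plain Python. The retry_message function is hypothetical, and DEFAULT_TEMPLATE is modelled on the feedback text in this post's sample output rather than copied from the library.

```python
from typing import Optional

# Modelled on the default feedback text shown in this post's sample output.
DEFAULT_TEMPLATE = "Error: {error}\nPlease fix your mistakes."


def retry_message(handle_errors, error: Exception) -> Optional[str]:
    """Return the feedback to send back to the model, or None to re-raise."""
    default = DEFAULT_TEMPLATE.format(error=error)
    if isinstance(handle_errors, bool):
        return default if handle_errors else None
    if isinstance(handle_errors, str):
        return handle_errors  # fixed custom message for every error
    if isinstance(handle_errors, (type, tuple)):
        # A single exception type or a tuple of types: retry only on a match.
        return default if isinstance(error, handle_errors) else None
    if callable(handle_errors):
        return handle_errors(error)  # custom handler builds the message
    raise TypeError(f"unsupported handle_errors value: {handle_errors!r}")


err = ValueError("rating out of range")
print(retry_message(True, err))
print(retry_message("Please provide a valid rating.", err))
print(retry_message(TypeError, err))  # None: a ValueError is not a TypeError
```

The type and tuple checks come before the callable check because exception classes are themselves callable.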

Pydantic Model

from pydantic import BaseModel, Field
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(BaseModel):
    """Analysis of a product review."""
    rating: int | None = Field(description="The rating of the product", ge=1, le=5)
    sentiment: Literal["positive", "negative"] = Field(description="The sentiment of the review")
    key_points: list[str] = Field(description="The key points of the review. Lowercase, 1-3 words each.")

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"
    }]
})

result["structured_response"]
# ProductReview(rating=5, sentiment='positive', key_points=['fast shipping', 'expensive'])

Dataclass

from dataclasses import dataclass
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

@dataclass
class ProductReview:
    """Analysis of a product review."""
    rating: int | None
    sentiment: Literal["positive", "negative"]
    key_points: list[str]

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"
    }]
})

result["structured_response"]
# ProductReview(rating=5, sentiment='positive', key_points=['fast shipping', 'expensive'])

TypedDict

from typing import TypedDict, Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(TypedDict):
    """Analysis of a product review."""
    rating: int | None
    sentiment: Literal["positive", "negative"]
    key_points: list[str]

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"
    }]
})

result["structured_response"]
# {'rating': 5, 'sentiment': 'positive', 'key_points': ['fast shipping', 'expensive']}

JSON Schema

from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

review_schema = {
    "type": "object",
    "properties": {
        "rating": {"type": "integer", "minimum": 1, "maximum": 5},
        "sentiment": {"type": "string", "enum": ["positive", "negative"]},
        "key_points": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["rating", "sentiment", "key_points"]
}

agent = create_agent(
    model="gpt-5",
    tools=tools,
    response_format=ToolStrategy(review_schema)
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"
    }]
})

result["structured_response"]
# {'rating': 5, 'sentiment': 'positive', 'key_points': ['fast shipping', 'expensive']}

Union Types

from pydantic import BaseModel, Field
from typing import Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ContactInfo(BaseModel):
    name: str = Field(description="Person's name")
    email: str = Field(description="Email address")

class EventDetails(BaseModel):
    event_name: str = Field(description="Name of the event")
    date: str = Field(description="Event date")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(Union[ContactInfo, EventDetails])
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract info: John Doe (john@email.com)"
    }]
})

result["structured_response"]
# ContactInfo(name='John Doe', email='john@email.com')

Custom tool message content

The tool_message_content parameter lets you customize the message that appears in the conversation history when structured output is generated:

from pydantic import BaseModel, Field
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class MeetingAction(BaseModel):
    """Action items extracted from a meeting transcript."""
    task: str = Field(description="The specific task to be completed")
    assignee: str = Field(description="Person responsible for the task")
    priority: Literal["low", "medium", "high"] = Field(description="Priority level")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(
        schema=MeetingAction,
        tool_message_content="Action item captured and added to meeting notes!"
    )
)

agent.invoke({
    "messages": [{
        "role": "user",
        "content": "From our meeting: Sarah needs to update the project timeline as soon as possible"
    }]
})

Output:

================================ Human Message =================================
From our meeting: Sarah needs to update the project timeline as soon as possible

================================== Ai Message ==================================
Tool Calls:
  MeetingAction (call_1)
 Call ID: call_1
  Args:
    task: Update the project timeline
    assignee: Sarah
    priority: high

================================= Tool Message =================================
Name: MeetingAction

Action item captured and added to meeting notes!

Without tool_message_content, the final ToolMessage would be:

================================= Tool Message =================================
Name: MeetingAction

Returning structured response: {'task': 'update the project timeline', 'assignee': 'Sarah', 'priority': 'high'}
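
The two tool messages above differ only in where their text comes from. A minimal sketch of that choice follows, assuming the default text has the "Returning structured response: ..." form seen in the output above. The tool_message_text function is a hypothetical helper, not a LangChain API.

```python
from typing import Optional


def tool_message_text(data: dict, tool_message_content: Optional[str] = None) -> str:
    """Pick the ToolMessage text: the custom content if given, else a default."""
    if tool_message_content is not None:
        return tool_message_content
    return f"Returning structured response: {data}"


action = {"task": "update the project timeline", "assignee": "Sarah", "priority": "high"}
print(tool_message_text(action))
print(tool_message_text(action, "Action item captured and added to meeting notes!"))
```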

Error handling

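Models can make mistakes when generating structured output via tool calling. LangChain provides an intelligent retry mechanism that handles these errors automatically.

Multiple structured outputs error

When the model incorrectly calls multiple structured output tools, the agent provides error feedback in a ToolMessage and asks the model to retry: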

from pydantic import BaseModel, Field
from typing import Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ContactInfo(BaseModel):
    name: str = Field(description="Person's name")
    email: str = Field(description="Email address")

class EventDetails(BaseModel):
    event_name: str = Field(description="Name of the event")
    date: str = Field(description="Event date")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(Union[ContactInfo, EventDetails])
    # Default: handle_errors=True
)

agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th"
    }]
})

Output:

================================ Human Message =================================
Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th

================================== Ai Message ==================================
Tool Calls:
  ContactInfo (call_1)
 Call ID: call_1
  Args:
    name: John Doe
    email: john@email.com
  EventDetails (call_2)
 Call ID: call_2
  Args:
    event_name: Tech Conference
    date: March 15th

================================= Tool Message =================================
Name: ContactInfo

Error: Model incorrectly returned multiple structured responses (ContactInfo, EventDetails) when only one is expected.
Please fix your mistakes.

================================= Tool Message =================================
Name: EventDetails

Error: Model incorrectly returned multiple structured responses (ContactInfo, EventDetails) when only one is expected.
Please fix your mistakes.

================================== Ai Message ==================================
Tool Calls:
  ContactInfo (call_3)
 Call ID: call_3
  Args:
    name: John Doe
    email: john@email.com

================================= Tool Message =================================
Name: ContactInfo

Returning structured response: {'name': 'John Doe', 'email': 'john@email.com'}
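
The trace above can be read as a simple check: if more than one of the AI message's tool calls targets a structured output schema, every such call is answered with the same error. The sketch below reproduces that check on plain dictionaries. The check_single_output function and the shape of the call dictionaries are assumptions made for illustration, not LangChain internals.

```python
def check_single_output(tool_calls: list, schema_names: set) -> dict:
    """Return {call_id: error_text} when more than one structured tool was called."""
    structured = [call for call in tool_calls if call["name"] in schema_names]
    if len(structured) <= 1:
        return {}  # zero or one structured call: nothing to correct
    names = ", ".join(call["name"] for call in structured)
    error = (
        f"Error: Model incorrectly returned multiple structured responses ({names}) "
        "when only one is expected.\nPlease fix your mistakes."
    )
    # Every offending call gets the same feedback so each tool call is answered.
    return {call["id"]: error for call in structured}


calls = [
    {"id": "call_1", "name": "ContactInfo",
     "args": {"name": "John Doe", "email": "john@email.com"}},
    {"id": "call_2", "name": "EventDetails",
     "args": {"event_name": "Tech Conference", "date": "March 15th"}},
]
errors = check_single_output(calls, {"ContactInfo", "EventDetails"})
print(sorted(errors))  # ['call_1', 'call_2']
```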

Schema validation error

When the structured output does not match the expected schema, the agent provides specific error feedback:

from pydantic import BaseModel, Field
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductRating(BaseModel):
    rating: int | None = Field(description="Rating from 1-5", ge=1, le=5)
    comment: str = Field(description="Review comment")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(ProductRating),
    # Default: handle_errors=True
    system_prompt="You are a helpful assistant that parses product reviews. Do not make any field or value up."
)

agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Parse this: Amazing product, 10/10!"
    }]
})

Output:

================================ Human Message =================================
Parse this: Amazing product, 10/10!

================================== Ai Message ==================================
Tool Calls:
  ProductRating (call_1)
 Call ID: call_1
  Args:
    rating: 10
    comment: Amazing product

================================= Tool Message =================================
Name: ProductRating

Error: Failed to parse structured output for tool 'ProductRating': 1 validation error for ProductRating.rating
  Input should be less than or equal to 5 [type=less_than_equal, input_value=10, input_type=int].
Please fix your mistakes.

================================== Ai Message ==================================
Tool Calls:
  ProductRating (call_2)
 Call ID: call_2
  Args:
    rating: 5
    comment: Amazing product

================================= Tool Message =================================
Name: ProductRating

Returning structured response: {'rating': 5, 'comment': 'Amazing product'}
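
The validate, feed back, retry cycle shown in this trace can be sketched without any model at all. In the sketch below, a scripted list of attempts stands in for successive model outputs, and validate_rating mirrors the ge=1, le=5 constraint by hand. Both functions and the max_retries parameter are hypothetical; the real agent loop is more involved.

```python
def validate_rating(args: dict) -> dict:
    """Mirror the ge=1, le=5 constraint: rating must be None or an int in 1..5."""
    rating = args.get("rating")
    if rating is not None and not (isinstance(rating, int) and 1 <= rating <= 5):
        raise ValueError(f"rating should be between 1 and 5, got {rating!r}")
    return args


def run_with_retries(attempts: list, max_retries: int = 2):
    """Validate scripted model outputs in order, collecting feedback on failures."""
    feedback = []
    for attempt_number, args in enumerate(attempts):
        try:
            return validate_rating(args), feedback
        except ValueError as exc:
            if attempt_number >= max_retries:
                raise  # retry budget exhausted: let the error propagate
            feedback.append(f"Error: {exc}\nPlease fix your mistakes.")
    raise RuntimeError("ran out of scripted attempts before a valid output")


attempts = [
    {"rating": 10, "comment": "Amazing product"},  # first try violates le=5
    {"rating": 5, "comment": "Amazing product"},   # corrected after feedback
]
result, feedback = run_with_retries(attempts)
print(result)         # {'rating': 5, 'comment': 'Amazing product'}
print(len(feedback))  # 1
```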

Error handling strategies

You can customize how errors are handled with the handle_errors parameter:

Custom error message:

ToolStrategy(
    schema=ProductRating,
    handle_errors="Please provide a valid rating between 1-5 and include a comment."
)

When handle_errors is a string, the agent always asks the model to retry with that fixed tool message:

================================= Tool Message =================================
Name: ProductRating

Please provide a valid rating between 1-5 and include a comment.

Handle specific exceptions only:

ToolStrategy(
    schema=ProductRating,
    handle_errors=ValueError  # Only retry on ValueError, raise others
)

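When handle_errors is an exception type, the agent retries (using the default error message) only if the raised exception is of the specified type. In all other cases, the exception is raised.

Handle multiple exception types: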

ToolStrategy(
    schema=ProductRating,
    handle_errors=(ValueError, TypeError)  # Retry on ValueError and TypeError
)

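When handle_errors is a tuple of exception types, the agent retries (using the default error message) only if the raised exception is one of the specified types. In all other cases, the exception is raised.

Custom error handler function: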

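from typing import Union
from langchain.agents.structured_output import (
    MultipleStructuredOutputsError,
    StructuredOutputValidationError,
    ToolStrategy,
)

def custom_error_handler(error: Exception) -> str:
    """Map each structured-output error to a retry message for the model."""
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"

# ContactInfo and EventDetails are the models from the Union Types example.
ToolStrategy(
    schema=Union[ContactInfo, EventDetails],  # pass the union itself, not a nested ToolStrategy
    handle_errors=custom_error_handler
)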

On StructuredOutputValidationError:

================================= Tool Message =================================
Name: ToolStrategy

There was an issue with the format. Try again.

On MultipleStructuredOutputsError:

================================= Tool Message =================================
Name: ToolStrategy

Multiple structured outputs were returned. Pick the most relevant one.

On other errors:

================================= Tool Message =================================
Name: ToolStrategy

Error: <error message>

No error handling:

response_format=ToolStrategy(
    schema=ProductRating,
    handle_errors=False  # All errors raised
)

Source: https://docs.langchain.com/oss/python/langchain/structured-output
