Structured Outputs by Example Star on GitHub
Stay updated when new content is added and get tips from the Instructor team
Getting Started

Getting Started with Structured Outputs

Edit

Learn the basics of structured LLM outputs with Instructor. This guide demonstrates how to extract consistent, validated data from language models.
Large language models are powerful, but extracting structured data can be challenging.
Structured outputs solve this by having LLMs return data in consistent, machine-readable formats.

Using a standard OpenAI client for unstructured output
from openai import OpenAI

client = OpenAI()

Ask for customer information in free text
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "Extract customer: John Doe, age 35, email: john@example.com",
        }
    ],
)

print(response.choices[0].message.content)

Instructor solves these problems by combining LLMs with Pydantic validation:
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, EmailStr

Define a structured data model
class Customer(BaseModel):
    name: str = Field(description="Customer's full name")
    age: int = Field(description="Customer's age in years", ge=0, le=120)
    email: EmailStr = Field(description="Customer's email address")

Patch the OpenAI client with Instructor
client = instructor.from_openai(OpenAI())

Extract structured customer data
customer = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "Extract customer: John Doe, age 35, email: john@example.com",
        }
    ],
    response_model=Customer,  # This is the key part
)

print(customer)  # Customer(name='John Doe', age=35, email='john@example.com')
print(f"Name: {customer.name}, Age: {customer.age}, Email: {customer.email}")



Complex data example:
from typing import List, Optional
from pydantic import BaseModel, Field
import instructor
from openai import OpenAI

Initialize the client with instructor
client = instructor.from_openai(OpenAI())

Define a complex, nested data structure
class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str


class Contact(BaseModel):
    email: Optional[str] = None
    phone: Optional[str] = None


class Person(BaseModel):
    name: str
    age: int
    occupation: str
    address: Address
    contact: Contact
    skills: List[str] = Field(description="List of professional skills")

Extract structured data
person = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": """
        Extract detailed information for this person:
        John Smith is a 42-year-old software engineer living at 123 Main St, San Francisco, CA 94105.
        His email is john.smith@example.com and phone is 555-123-4567.
        John is skilled in Python, JavaScript, and cloud architecture.
        """,
        }
    ],
    response_model=Person,
)

Now you have a fully structured object
print(f"Name: {person.name}")
print(f"Location: {person.address.city}, {person.address.state}")
print(f"Skills: {', '.join(person.skills)}")

Running the Example

First, install Instructor and any dependencies
$ pip install instructor pydantic
Run the Python script
$ python getting-started.py

Further Information