
Getting Started

Get up and running with PromptAssert in under five minutes: install, configure, write your first test, and run it in CI.

01. Installation

PromptAssert is a Python library and requires Python 3.9 or newer.

terminal
pip install promptassert

This installs the promptassert Python package along with the promptassert command-line tool.

02. Configuration

Create a promptassert.config.yaml in the root of your project. This tells PromptAssert which providers to use and where to store snapshots.

promptassert.config.yaml
# promptassert.config.yaml
version: 1
default_model: gpt-4o

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}

output_dir: .promptassert/snapshots
💡 Tip

Never hardcode API keys. Use environment variables. PromptAssert resolves ${VAR} syntax in config values automatically.
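PromptAssert's exact resolution logic is internal to the library, but conceptually ${VAR} substitution behaves like this minimal sketch (resolve_env_vars is an illustrative helper, not part of the library's API):

```python
import os
import re

def resolve_env_vars(value: str) -> str:
    """Replace ${VAR} references in a config value with environment variables.
    Raises KeyError if a referenced variable is not set."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", lookup, value)

os.environ["OPENAI_API_KEY"] = "sk-example"
print(resolve_env_vars("${OPENAI_API_KEY}"))  # sk-example
```

Failing loudly on a missing variable (rather than substituting an empty string) is the safer design, since a blank API key produces confusing provider errors later.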

03. Write your first test

Create a tests/ directory in your project. Any file matching test_*.py is automatically discovered.

tests/test_support.py
from promptassert import PromptAssert

pa = PromptAssert(model="gpt-4o")

@pa.test
def test_customer_support_tone():
    response = pa.call(
        system="You are a helpful customer support agent.",
        prompt="A customer is asking about their order status."
    )
    # Tone checks
    pa.assert_sentiment(response, "positive")
    pa.assert_contains(response, ["order", "help", "assist"])
    pa.assert_not_contains(response, ["sorry", "unfortunately", "can't"])

    # Length constraints
    pa.assert_max_tokens(response, 300)

@pa.test
def test_refund_policy_accuracy():
    response = pa.call(
        system="You are a support agent. Our refund window is 30 days.",
        prompt="What is your refund policy?"
    )
    # Factual accuracy
    pa.assert_contains(response, ["30"])
    pa.assert_semantic_similarity(response, "30-day refund window", threshold=0.85)

@pa.test
def test_no_hallucination_on_pricing():
    response = pa.call(
        system="You are a product assistant. Do not invent pricing.",
        prompt="How much does the enterprise plan cost?"
    )
    # Should not make up numbers
    pa.assert_not_regex(response, r"\$[0-9]+")
    pa.assert_contains(response, ["contact", "sales", "pricing"])

04. Running your tests

The promptassert CLI discovers and runs all tests in the specified path.

terminal
# Run all tests
promptassert run tests/

# Run a single file
promptassert run tests/test_support.py

# Run against a different model (comparison mode)
promptassert run tests/ --model claude-3-5-sonnet-20241022

# Generate a snapshot baseline
promptassert snapshot --create tests/

# Check drift against baseline
promptassert snapshot --check tests/
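The snapshot file format is internal to PromptAssert, but the idea behind --check can be sketched as a plain text diff between the stored baseline response and the response the model produces today (drift_report is a hypothetical helper, not the library's API):

```python
import difflib

def drift_report(baseline: str, current: str) -> str:
    """Sketch of a drift check: a unified diff between a stored baseline
    response and a freshly generated one. Empty string means no drift."""
    diff = difflib.unified_diff(
        baseline.splitlines(),
        current.splitlines(),
        fromfile="baseline",
        tofile="current",
        lineterm="",
    )
    return "\n".join(diff)

baseline = "Our refund window is 30 days."
current = "Our refund window is 14 days."
print(drift_report(baseline, current))
```

A real drift check for LLM output would typically combine an exact diff like this with a semantic-similarity comparison, since benign wording changes are common between runs.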

05. Add to CI/CD

PromptAssert exits with code 1 when any assertion fails, just like pytest. Drop this GitHub Actions workflow into your repo and your prompts will be tested on every pull request.

.github/workflows/prompt-tests.yml
# .github/workflows/prompt-tests.yml
name: Prompt Regression Tests

on:
  push:
    branches: [main]
  pull_request:

jobs:
  prompt-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install PromptAssert
        run: pip install promptassert

      - name: Run prompt test suite
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: promptassert run tests/

      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: promptassert-results
          path: .promptassert/results/

Assertion reference

All assertions raise PromptAssertionError on failure with a detailed diff message.

assert_contains(response, keywords)

Fails if none of the given keywords appear in the response.

pa.assert_contains(response, ["help", "assist"])
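A minimal sketch of the any-match behavior described above, assuming case-insensitive substring matching (contains_any is illustrative only, not the library's implementation):

```python
def contains_any(response: str, keywords: list[str]) -> bool:
    """Passes if at least one keyword appears in the response
    (case-insensitive substring match assumed)."""
    text = response.lower()
    return any(keyword.lower() in text for keyword in keywords)

print(contains_any("Happy to help with your order!", ["help", "assist"]))  # True
print(contains_any("Please hold the line.", ["help", "assist"]))           # False
```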

assert_not_contains(response, keywords)

Fails if any of the given keywords appear in the response.

pa.assert_not_contains(response, ["sorry", "unfortunately"])

assert_sentiment(response, expected)

Checks overall sentiment. Expected: "positive", "neutral", or "negative".

pa.assert_sentiment(response, "positive")

assert_max_tokens(response, max)

Fails if token count exceeds the specified maximum.

pa.assert_max_tokens(response, 200)

assert_semantic_similarity(response, expected, threshold)

Embeds both strings and checks cosine similarity. Threshold 0.0–1.0.

pa.assert_semantic_similarity(response, "30-day refund", threshold=0.85)
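The comparison itself is standard cosine similarity over embedding vectors; here is a minimal sketch on toy vectors (real embeddings come from an embedding model, which is not shown here):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
```

A threshold like 0.85 therefore means "the two texts embed in nearly the same direction", not that they are word-for-word identical.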

assert_regex(response, pattern)

Fails if the regex pattern does not match the response.

pa.assert_regex(response, r"order #[0-9]+")

assert_not_regex(response, pattern)

Fails if the regex pattern matches the response.

pa.assert_not_regex(response, r"\$[0-9]+")
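Both regex assertions presumably behave like Python's re.search (match anywhere in the string); a quick sketch of the two patterns used above:

```python
import re

# assert_regex passes when the pattern matches somewhere in the response.
print(bool(re.search(r"order #[0-9]+", "Your order #12345 has shipped.")))  # True

# assert_not_regex passes when the pattern matches nowhere in the response.
print(bool(re.search(r"\$[0-9]+", "Please contact sales for pricing.")))    # False
```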

assert_json_schema(response, schema)

Parses response as JSON and validates against a JSON Schema dict.

pa.assert_json_schema(response, {"type": "object", "required": ["status"]})
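assert_json_schema validates against full JSON Schema; as a rough illustration of the structural idea only, here is a toy required-keys check in plain Python (check_required_keys is not the library's validator, which also handles types, nesting, and formats):

```python
import json

def check_required_keys(response: str, schema: dict) -> bool:
    """Toy sketch: parse the response as JSON and verify that all top-level
    required keys are present. A real JSON Schema validator does far more."""
    data = json.loads(response)
    if schema.get("type") == "object" and not isinstance(data, dict):
        return False
    return all(key in data for key in schema.get("required", []))

print(check_required_keys('{"status": "ok"}', {"type": "object", "required": ["status"]}))  # True
print(check_required_keys('{"other": 1}', {"type": "object", "required": ["status"]}))      # False
```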

✓ You're set up

Your prompt test suite is running in CI. Here's what to do next:

  • Expand your test suite — add tests for every prompt in your app
  • Create snapshots with promptassert snapshot --create to baseline current behavior
  • Run cross-model comparisons before upgrading from gpt-4o to a new model
  • Upgrade to Pro for Slack alerts when tests fail in CI