Why Testing Matters

Testing catches errors before they become problems:

Verify code works as intended

Catch regressions when changing code

Document expected behavior

Build confidence in changes

Types of Testing

Manual Testing

Run it and see:

# Run the script
./my_script.sh

# Check the output
cat output.txt

# Verify the result

Quick but not repeatable.

Automated Testing

Tests that run themselves:

def test_addition():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0

Repeatable and reliable.

Unit Tests

Test individual functions:

def calculate_price(quantity, unit_price, discount=0):
    subtotal = quantity * unit_price
    return subtotal * (1 - discount)

def test_calculate_price():
    assert calculate_price(10, 5.00) == 50.00
    assert calculate_price(10, 5.00, 0.1) == 45.00
    assert calculate_price(0, 5.00) == 0

Integration Tests

Test components working together:

def test_user_registration():
    # Test the full flow
    response = api.register_user("test@example.com", "password")
    assert response.status_code == 201
    
    user = db.get_user("test@example.com")
    assert user is not None

End-to-End Tests

Test the whole system:

def test_checkout_flow():
    # Simulate full user journey
    browser.login("user@example.com")
    browser.add_to_cart("product-123")
    browser.checkout()
    assert browser.see("Order confirmed")

Testing Approaches

Test-Driven Development (TDD)

Write tests first:

Write a failing test

Write code to pass the test

Refactor if needed

Repeat

After-the-Fact Testing

Write tests after code:

Write the code

Write tests to verify it

Fix any issues found

Sanity Checking

Quick verification:

# Does it run without errors?
python script.py

# Does basic functionality work?
python -c "from mylib import main; main()"

Writing Good Tests

Test Structure

def test_descriptive_name():
    # Arrange - set up data
    user = User("test@example.com")
    
    # Act - perform action
    result = user.validate()
    
    # Assert - check result
    assert result is True

Test Cases to Cover

Happy path: Normal, expected usage

Edge cases: Boundaries, limits

Error cases: Invalid inputs, failures

Empty cases: Null, empty, zero

def test_divide():
    # Happy path
    assert divide(10, 2) == 5
    
    # Edge case
    assert divide(0, 5) == 0
    
    # Error case
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)

Good Test Properties

Fast: Quick to run
Isolated: Don't depend on each other
Repeatable: Same result every time
Self-validating: Pass or fail, no interpretation

Verification Techniques

Read Before Running

# Before executing, verify:
# - Is this the right file?
# - Will this command do what I expect?
# - Are there any obvious errors?

Incremental Testing

Test as you build:

# Step 1: Basic structure
print("Step 1 complete")

# Step 2: Add feature
print("Step 2 complete")
# etc.

Compare Outputs

# Run and save output
./script.sh > actual_output.txt

# Compare with expected
diff expected_output.txt actual_output.txt

Validate Data

# Check data makes sense
assert len(results) > 0, "No results returned"
assert all(r['price'] >= 0 for r in results), "Negative prices"

Testing as an Agent

Before Delivering

Does it run without errors?

Does it produce expected output?

Did I test edge cases?

Is the code clean and readable?

Quick Verification

# Syntax check (Python)
python -m py_compile script.py

# Lint check
flake8 script.py

# Type check
mypy script.py

Running Tests

# pytest
pytest tests/

# Jest (JavaScript)
npm test

# Go
go test ./...

Common Testing Patterns

Setup and Teardown

class TestDatabase:
    def setup_method(self):
        self.db = connect_test_database()
    
    def teardown_method(self):
        self.db.clear()
        self.db.close()
    
    def test_insert(self):
        self.db.insert({"id": 1})
        assert self.db.count() == 1

Mocking

from unittest.mock import Mock, patch

def test_api_call():
    with patch('requests.get') as mock_get:
        mock_get.return_value.json.return_value = {"status": "ok"}
        
        result = my_function()
        
        assert result == "ok"

Fixtures

import pytest

@pytest.fixture
def sample_user():
    return User("test@example.com", "Test User")

def test_user_email(sample_user):
    assert sample_user.email == "test@example.com"

Debugging Test Failures

Read the Error

AssertionError: assert 5 == 6

What did you expect? What did you get?

Isolate the Problem

# Add debug output
print(f"Input: {input_value}")
print(f"Result: {result}")
print(f"Expected: {expected}")

Simplify

Reduce to minimal failing case:

# Instead of complex test
def test_simple():
    assert function(1) == 1

Best Practices

Name Tests Clearly

# Bad
def test1():

# Good
def test_login_with_valid_credentials_succeeds():

One Assertion per Test (Usually)

# Focused tests
def test_email_is_required():
    with pytest.raises(ValidationError):
        User(email=None)

def test_email_must_be_valid():
    with pytest.raises(ValidationError):
        User(email="invalid")

Keep Tests Fast

Slow tests get skipped:

# Mock external services
# Use in-memory databases
# Avoid unnecessary waits

Don't Test External Services

# Bad - depends on external API
def test_weather():
    weather = get_weather("NYC")
    assert weather is not None

# Good - mock the external call
def test_weather_parsing():
    mock_data = {"temp": 72, "conditions": "sunny"}
    result = parse_weather(mock_data)
    assert result.temperature == 72

Conclusion

Testing is essential for reliable work:

Test as you build

Cover happy paths and edge cases

Verify before delivering

Automate what you can

Good testing builds confidence in your work.

Next: Logging and Monitoring - Observability for agents