Testing

Strategy and tooling for three test layers: pgTAP (DB), pytest (Python), Vitest (Frontend).

Coverage summary

| Layer | Framework | Tests | How to run |
|---|---|---|---|
| DB | pgTAP | 134 | inline runner / Docker / psql |
| Backend | pytest + pytest-asyncio | ~12 | uv run pytest |
| Frontend | Vitest + jsdom | 7 | npm test |
| E2E | (manual) | | curl smoke + browser |

pgTAP (Database)

134 tests in tests/db/:

| File | Tests | What it tests |
|---|---|---|
| 01_rls_tenants.sql | 22 | Cross-tenant deny pen tests across all tables |
| 02_storage_rls.sql | 6 | Bucket path-based RLS |
| 03_migration.sql | 14 | Claim flow (v1 → v2) |
| 04_grants.sql | 38 | Defense-in-depth + sweep test |
| 05_ingestion.sql | 28 | New Phase B tables |
| 06_chunks.sql | 26 | Phase C chunks + HNSW |

Four ways to run

1. Inline runner (Supabase MCP)

# From Python via Supabase MCP:
mcp__claude_ai_Supabase__execute_sql(
    project_id="cubdrgjdkatyecrgckwp",
    query=open("tests/db/04_grants.sql").read(),
)

Fast, and no Docker is needed. Trade-off: running this from CI requires a running Supabase MCP server.

2. Docker (linked project)

cd nemoreport-ai-backend-v2
supabase test db --linked

The most authoritative option (the real Supabase test runner), but slow because it goes through Docker.

3. psql (remote)

# tests/db/run_remote.sh
psql "$DATABASE_URL" -f tests/db/01_rls_tenants.sql
psql "$DATABASE_URL" -f tests/db/04_grants.sql
# ...

Requires DATABASE_URL from Supabase Settings → Database. Skips the Docker overhead.
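pgTAP reports results in TAP format (`ok N - description` or `not ok N - description`, plus a `1..N` plan line), so the psql output can be summarised by a small script. The following is a minimal sketch, not part of the repository: `parse_tap` and `TapSummary` are hypothetical names, and it assumes plain psql output in which every TAP line sits on its own row.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# "ok 3 - description" or "not ok 3 - description"
_TEST_LINE = re.compile(r"^(not ok|ok)\s+(\d+)\s*(?:-\s*)?(.*)$")
# The plan line emitted by plan(N), e.g. "1..38"
_PLAN_LINE = re.compile(r"^1\.\.(\d+)$")


@dataclass
class TapSummary:
    planned: Optional[int] = None
    passed: int = 0
    failed: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        """True only if nothing failed and the plan matches the tests run."""
        ran = self.passed + len(self.failed)
        return not self.failed and self.planned == ran


def parse_tap(output: str) -> TapSummary:
    """Count passing and failing TAP lines; psql headers are ignored."""
    summary = TapSummary()
    for raw in output.splitlines():
        line = raw.strip()  # psql pads cell values with a leading space
        plan = _PLAN_LINE.match(line)
        if plan:
            summary.planned = int(plan.group(1))
            continue
        test = _TEST_LINE.match(line)
        if test:
            status, number, description = test.groups()
            if status == "ok":
                summary.passed += 1
            else:
                summary.failed.append(f"{number}: {description}".strip())
    return summary
```

In practice the input would come from capturing the stdout of the `psql "$DATABASE_URL" -f ...` calls above. TODO and SKIP directives are not handled in this sketch.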

4. CI (GitHub Actions, planned)

Workflow file at .github/workflows/db-tests.yml:

on: [push, pull_request]
jobs:
  pgtap:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: supabase/postgres:17
        ...
    steps:
      - uses: actions/checkout@v4
      # The Supabase CLI is not preinstalled on GitHub-hosted runners
      - uses: supabase/setup-cli@v1
      - run: supabase db reset
      - run: supabase test db --linked

(CI is not set up yet.)

Standard test pattern

begin;
create extension if not exists pgtap;
select plan(N);

-- Setup helpers
create or replace function _as_user(...) returns void ...;
create or replace function _as_postgres() returns void ...;

-- Fixtures
create temporary table _ids as ...;
insert into nemoreport.<table> ...;

-- Tests
select is(...);  -- assert eq
select ok(...);  -- assert true
select throws_ok(...);  -- assert raises
select lives_ok(...);   -- assert does not raise

select * from finish();
rollback;  -- never commit, so fixtures stay isolated

Sweep test pattern

04_grants.sql includes a sweep test that guards future migrations:

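select is(
  (select count(*)::int from (values ...) as t(name)
   -- A comma-separated privilege list is true if ANY privilege is held, so check each one
   where not (
     has_table_privilege('service_role', 'nemoreport.' || name, 'SELECT')
     and has_table_privilege('service_role', 'nemoreport.' || name, 'INSERT')
     and has_table_privilege('service_role', 'nemoreport.' || name, 'UPDATE')
     and has_table_privilege('service_role', 'nemoreport.' || name, 'DELETE')
   )),
  0,
  'service_role has full CRUD on all nemoreport tables (sweep)'
);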

If someone adds a new table without a service_role grant, the sweep test fails and blocks the merge.

Pytest (Backend)

tests/api/ — Python unit + integration tests.

cd nemoreport-ai-backend-v2

# Run all
uv run pytest

# Run specific
uv run pytest tests/api/test_auth.py -v

# Coverage
uv run pytest --cov=app --cov-report=html

Existing tests

  • tests/api/test_auth.py: JWT verification, JWKS cache, AuthUser construction
  • tests/api/test_health.py: health endpoint smoke
  • (more tests TBD post-Phase D)

Pattern

import pytest
from fastapi.testclient import TestClient
from app.main import app

@pytest.fixture
def client():
    return TestClient(app)

def test_health(client):
    r = client.get("/health")
    assert r.status_code == 200
    assert r.json()["status"] == "ok"

@pytest.mark.asyncio
async def test_async_function():
    # some_async_function and expected are placeholders for the coroutine under test
    result = await some_async_function()
    assert result == expected

Integration tests (TBD)

For full-flow integration:

  • Mock the Mistral / Gemini / Cohere SDKs (use respx for httpx mocking)
  • Test Supabase against a test database (separate project or branch)
  • Test Redis against an ephemeral Docker container

Planned pre-Phase D:

  • tests/integration/test_ingestion_pipeline.py: full upload → ready flow
  • tests/integration/test_retrieve.py: query → top-K validation
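To illustrate the provider mocking listed above without any network access, here is a dependency-free sketch. It swaps in `unittest.mock` for respx (respx would intercept httpx at the transport level instead), and everything in it is hypothetical: the `rerank` helper, the placeholder URL, and the response shape are assumptions for illustration, not the project's code or any provider's actual API.

```python
from unittest.mock import Mock


def rerank(http_post, query, docs, top_k):
    """Hypothetical pipeline step: ask a rerank API to reorder docs.

    http_post is any callable (url, json=...) -> dict; in real code it
    would wrap an HTTP call to the provider.
    """
    payload = {"query": query, "documents": docs, "top_n": top_k}
    # Placeholder URL: .invalid never resolves, so a missed mock fails fast.
    body = http_post("https://rerank.example.invalid/v1/rerank", json=payload)
    # Assumed response shape: {"results": [{"index": int, "score": float}, ...]}
    ranked = sorted(body["results"], key=lambda r: r["score"], reverse=True)
    return [docs[r["index"]] for r in ranked[:top_k]]


def test_rerank_orders_by_score():
    # The fake stands in for the provider and returns a canned response.
    fake_post = Mock(return_value={
        "results": [{"index": 0, "score": 0.1}, {"index": 1, "score": 0.9}]
    })

    assert rerank(fake_post, "query", ["a", "b"], top_k=1) == ["b"]

    # The provider was called exactly once, with the expected payload.
    fake_post.assert_called_once()
    assert fake_post.call_args.kwargs["json"]["top_n"] == 1
```

Injecting the HTTP callable keeps the test independent of any SDK version. With respx the production function would stay unchanged and the route would be mocked instead.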

Vitest (Frontend)

tests/lib/ — TS unit tests.

cd nemoreport-ai-frontend-v2

# Run all (CI mode)
npm test

# Watch mode
npm run test:watch

Existing tests

  • tests/lib/api.test.ts (5 tests): apiFetch wrapper, JWT injection, error mapping
  • tests/lib/feedback.test.ts (2 tests): feedback submit logic

Setup

// tests/setup.ts
import { vi } from 'vitest'

// Inject NEXT_PUBLIC_* env vars for tests
process.env.NEXT_PUBLIC_SUPABASE_URL = 'https://test.supabase.co'
process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY = 'test-anon'
process.env.NEXT_PUBLIC_BACKEND_URL = 'http://localhost:8000'

Pattern

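import { describe, it, expect, vi } from 'vitest'
import { apiFetch } from '@/lib/api'
// Assumed import path for the Supabase client; adjust to the project's module
import { supabaseClient } from '@/lib/supabase'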

describe('apiFetch', () => {
  it('injects JWT from session', async () => {
    const mockSession = { access_token: 'token-xyz' }
    vi.spyOn(supabaseClient.auth, 'getSession').mockResolvedValue({ data: { session: mockSession } })

    // Stub fetch so the test never touches the network
    const fetchSpy = vi.spyOn(global, 'fetch').mockResolvedValue(new Response('{}'))
    await apiFetch('/me')

    expect(fetchSpy).toHaveBeenCalledWith(
      expect.any(String),
      expect.objectContaining({
        headers: expect.objectContaining({ Authorization: 'Bearer token-xyz' })
      })
    )
  })
})

Component tests (TBD)

React components are not tested yet. Planned additions:

  • RTL (React Testing Library) for UploadZone, IngestionProgress, ChatWindow
  • E2E via Playwright (TBD)

Manual E2E smoke tests

Auth flow

# 1. Open browser → /login
# 2. Email → magic link
# 3. Click link → /auth/confirm → cookies set → /
# 4. Reload → still authenticated (session persistence)

# Backend smoke
curl https://nemoreport-ai-backend-v2.sliplane.app/me \
  -H "Authorization: Bearer <jwt>"

Upload + ingestion

curl -X POST https://nemoreport-ai-backend-v2.sliplane.app/ingest \
  -H "Authorization: Bearer <jwt>" \
  -F "file=@test.pdf" \
  -F "title=Test report"

# → 202 Accepted with report_id
# → Watch Realtime channel `report-{id}` for status updates
# → Eventually status='ready'

Retrieval

curl -X POST https://nemoreport-ai-backend-v2.sliplane.app/admin/retrieve/<report_id> \
  -H "X-Admin-Hash: <hash>" \
  -H "Content-Type: application/json" \
  -d '{"query": "občanská vybavenost", "top_k": 5}'

Cohere rerank presence

# Look for `reranked: true` in the response
# + `rerank_score` per chunk
# + `rerank_ms` in top-level metrics
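The three checks above can also be automated once the response JSON is saved. This is a minimal sketch under stated assumptions: the field names `reranked`, `rerank_score` and `rerank_ms` come from the notes above, while the `chunks` and `metrics` keys and the `check_rerank_response` helper are hypothetical and may not match the real response layout.

```python
from typing import Any, Dict, List


def check_rerank_response(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means rerank evidence is present."""
    problems = []

    if body.get("reranked") is not True:
        problems.append("reranked is not true")

    # Assumed key: the list of returned chunks lives under "chunks".
    chunks = body.get("chunks", [])
    if not chunks:
        problems.append("no chunks returned")
    for i, chunk in enumerate(chunks):
        if not isinstance(chunk.get("rerank_score"), (int, float)):
            problems.append(f"chunk {i} has no numeric rerank_score")

    # Assumed key: timing metrics live under a top-level "metrics" object.
    if not isinstance(body.get("metrics", {}).get("rerank_ms"), (int, float)):
        problems.append("metrics.rerank_ms missing")

    return problems
```

Returning all problems at once, instead of raising on the first, makes a failed smoke run easier to read.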

CI/CD strategy (TBD)

There is no CI yet. Recommended for production:

GitHub Actions workflow

name: CI
on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # uv is not preinstalled on GitHub-hosted runners
      - uses: astral-sh/setup-uv@v5
      - run: uv run ruff check .

  typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv run mypy app/

  pytest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv sync
      - run: uv run pytest

  pgtap:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: supabase/postgres:17
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      # The Supabase CLI is not preinstalled on GitHub-hosted runners
      - uses: supabase/setup-cli@v1
      - run: supabase db reset
      - run: supabase test db --linked

  frontend-test:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: nemoreport-ai-frontend-v2
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  frontend-build:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: nemoreport-ai-frontend-v2
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build

Quality gates

Current (manual):

  • pgTAP 134/134 ✓
  • Backend smoke (curl /health, /me, /ingest)
  • Frontend Vitest 7/7

For Phase D (planned):

  • Golden set eval (recall@10 ≥ 0.85, MRR ≥ 0.5, faithfulness ≥ 0.9, p95 ≤ 700 ms)
  • Component tests with RTL
  • E2E Playwright tests for key flows
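The retrieval part of the golden set gate can be computed with a few lines of standard-library Python. The sketch below assumes each golden case is a pair of (retrieved chunk IDs in rank order, set of relevant chunk IDs); the function names and that data shape are hypothetical. Faithfulness is left out because it needs a separate judging step, and p95 uses the nearest-rank definition, which is one of several common conventions.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Case = Tuple[List[str], Set[str]]  # (retrieved IDs in rank order, relevant IDs)


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant IDs that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate(cases: Sequence[Case], latencies_ms: Sequence[float]) -> Dict[str, object]:
    """Aggregate the metrics and apply the thresholds from the list above."""
    n = len(cases)
    recall = sum(recall_at_k(r, rel, 10) for r, rel in cases) / n
    mrr = sum(reciprocal_rank(r, rel) for r, rel in cases) / n
    p95 = percentile(latencies_ms, 95)
    return {
        "recall@10": recall,
        "mrr": mrr,
        "p95_ms": p95,
        "passed": recall >= 0.85 and mrr >= 0.5 and p95 <= 700,
    }
```

A CI job could call `evaluate` on the golden set and fail the build when `passed` is false.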

Test data

Test users

Setup.sql creates two test users:

  • test-a@nemoreport.test (UUID aaaaaaaa-0000-0000-0000-000000000001)
  • test-b@nemoreport.test (UUID bbbbbbbb-0000-0000-0000-000000000002)

Each has a personal_tenants entry and one report:

  • Report A: 11111111-...
  • Report B: 22222222-...

The pgTAP test fixtures use these IDs.

Production test data

Production currently holds 21 reports with chunks (after the Phase C C.13 backfill). They can be used for retrieval eval pre-Phase D.

Performance testing (TBD)

There are no load tests yet. For pre-pilot:

  • k6 or locust scenarios:
    • 100 concurrent uploads (burst)
    • 1000 retrievals/min sustained
    • Cohere RPM limit (10) handling during bursts
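One way to make the Cohere RPM scenario above concrete before any k6 or locust scripts exist is a deterministic simulation. The sketch below is an illustration under assumptions, not the backend's implementation: `SlidingWindowLimiter` and `simulate_burst` are hypothetical names, time is passed in explicitly instead of being read from a clock, and only the limit of 10 requests per minute comes from the list above.

```python
from collections import deque
from typing import Tuple


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` calls in any `window_s` seconds."""

    def __init__(self, limit: int = 10, window_s: float = 60.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of admitted calls, oldest first

    def try_acquire(self, now: float) -> bool:
        # Forget calls that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


def simulate_burst(limiter: SlidingWindowLimiter, n_requests: int,
                   spacing_s: float) -> Tuple[int, int]:
    """Send n_requests spaced spacing_s apart; return (admitted, rejected)."""
    admitted = sum(
        1 for i in range(n_requests) if limiter.try_acquire(i * spacing_s)
    )
    return admitted, n_requests - admitted
```

A load scenario would then assert that rejected calls are queued or retried by the backend instead of surfacing as errors. How the real service reacts is not specified here.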

Security testing (TBD)

  • pgTAP covers the RLS pen tests
  • Recommended pre-pilot: dedicated security audit (third-party)
  • OWASP ZAP or Burp Suite scan of public endpoints