Databáze¶
Detail Postgres schema, migrace a RLS policies.
Connection info¶
Project ref: cubdrgjdkatyecrgckwp
URL: https://cubdrgjdkatyecrgckwp.supabase.co
Postgres: 17 (Supabase managed)
Schema: nemoreport
Migrace¶
Všechny migrace jsou v supabase/migrations/ v backend repu, applied via:
NEBO inline přes Supabase MCP apply_migration s project_id="cubdrgjdkatyecrgckwp".
| # | Soubor | Co dělá |
|---|---|---|
| 0001 | 0001_auth_tenants.sql |
tenants + tenant_members + user_profiles + handle_new_user trigger + RLS |
| 0002 | 0002_reports_conversations.sql |
reports + conversations + messages + feedback |
| 0003 | 0003_llm_providers.sql |
llm_providers admin tabulka |
| 0004 | 0004_storage_buckets.sql |
3 R2 buckety + path-based RLS |
| 0005 | 0005_migration_tables.sql |
handle_claims + migration_events + user_events + import_manifest |
| 0006 | 0006_import_v1_function.sql |
nemoreport.import_v1(handle, user_id, tenant_id) PL/pgSQL |
| 0007 | 0007_jwt_hook.sql |
private.custom_access_token_hook(event) injektuje claims |
| 0008 | (skipnuto) | (původně schema cleanup, později merged jinam) |
| 0009 | 0009_grant_private_usage_to_auth_admin.sql |
GRANT USAGE ON SCHEMA private TO supabase_auth_admin |
| 0010 | 0010_grant_service_role_nemoreport.sql |
service_role full CRUD na celém schema |
| 0011 | 0011_revoke_authenticated_from_admin_tables.sql |
Defense-in-depth REVOKE od authenticated |
| 0012 | 0012_grant_messages_seq_to_authenticated.sql |
USAGE on messages_id_seq |
| Phase B | ||
| 0013 | 0013_ingestion_tables.sql |
attachments + parsed_sections + figures + parsed_tables + ingestion_jobs + reports rozšíření |
| Phase C | ||
| 0014 | 0014_chunks_setup.sql |
pgvector 0.8 + unaccent + czech_unaccent ts config + chunks tabulka + HNSW partial index |
| 0015 | 0015_search_chunks_rpc.sql |
search_chunks_by_folder() RPC (vector-only, MVP) |
| 0016 | 0016_hybrid_search_chunks.sql |
hybrid_search_chunks_by_folder() RPC (BM25 + vector + RRF) |
| 0017 | 0017_hybrid_search_prefix_match.sql |
_build_prefix_tsquery() helper, fix declension matching |
| 0018 | 0018_retrieval_log.sql |
retrieval_log table |
RLS policies¶
Všechny nemoreport.* tabulky mají RLS enabled.
Helper function¶
Kontroluje JWT claims personal_tenant_id + active_tenant_id NEBO tenant_members row pro current auth.uid().
Standard pattern per tabulka¶
ALTER TABLE nemoreport.<table> ENABLE ROW LEVEL SECURITY;
CREATE POLICY "<table>_select" ON nemoreport.<table> FOR SELECT
TO authenticated USING (private.user_has_tenant_access(tenant_id));
-- Pro user-writable tabulky (např. attachments user_upload):
CREATE POLICY "<table>_insert" ON nemoreport.<table> FOR INSERT
TO authenticated WITH CHECK (
private.user_has_tenant_access(tenant_id)
AND <conditions>
);
CREATE POLICY "<table>_update" ON nemoreport.<table> FOR UPDATE
TO authenticated USING (...) WITH CHECK (...);
CREATE POLICY "<table>_delete" ON nemoreport.<table> FOR DELETE
TO authenticated USING (...);
-- Pro worker-managed tabulky: žádná INSERT/UPDATE/DELETE policy (default deny).
-- + defense-in-depth REVOKE:
REVOKE INSERT, UPDATE, DELETE ON nemoreport.<table> FROM authenticated;
GRANT SELECT, INSERT, UPDATE, DELETE ON nemoreport.<table> TO service_role;
Per-tabulka summary¶
| Tabulka | SELECT pro authenticated | INSERT/UPDATE/DELETE |
|---|---|---|
tenants |
own tenant | service_role only |
tenant_members |
own | service_role |
user_profiles |
own | service_role |
reports |
tenant scoped | tenant scoped (user write) + service_role |
attachments |
tenant scoped | tenant + WITH CHECK source='user_upload' (Nette = service_role) |
parsed_sections |
tenant scoped | service_role only |
figures |
tenant scoped | service_role only |
parsed_tables |
tenant scoped | service_role only |
ingestion_jobs |
tenant scoped | service_role only |
chunks |
tenant scoped | service_role only |
retrieval_log |
tenant scoped | service_role only |
conversations |
tenant scoped | tenant + service_role |
messages |
tenant scoped | tenant + service_role |
feedback |
tenant scoped | tenant + service_role |
llm_providers |
(RLS deny vše + REVOKE) | service_role only |
handle_claims_preauth |
(RLS deny + REVOKE) | service_role only |
migration_events |
(RLS deny + REVOKE) | service_role only |
user_events |
(RLS deny + REVOKE) | service_role only |
import_manifest |
(RLS deny + REVOKE) | service_role only |
Indexy¶
chunks (Phase C — vector RAG)¶
chunks_pkey -- (id) PRIMARY KEY btree
idx_chunks_tsv_gin -- gin(tsv) — BM25 leg
idx_chunks_report -- (report_id, order_in_doc) — folder retrieval ordering
idx_chunks_tenant -- (tenant_id)
idx_chunks_section_filter -- (report_id, section_slug, content_type) PARTIAL where section_slug NOT NULL
idx_chunks_attachment -- (attachment_id) PARTIAL where attachment_id NOT NULL
idx_chunks_embedding_hnsw -- hnsw(embedding halfvec_cosine_ops) WITH (m=16, ef_construction=64) PARTIAL where embedding NOT NULL
reports¶
idx_reports_tenant_created -- (tenant_id, created_at DESC) PARTIAL where deleted_at IS NULL
idx_reports_status_pending -- (status, created_at) PARTIAL where status IN (active states)
idx_reports_nette_external -- UNIQUE (tenant_id, external_id) PARTIAL where source='nette' AND external_id IS NOT NULL
attachments¶
idx_attachments_report
idx_attachments_tenant
idx_attachments_status_pending -- (status, created_at) PARTIAL
idx_attachments_nette_unique -- UNIQUE (report_id, nette_id) PARTIAL where source='nette' AND nette_id IS NOT NULL
parsed_sections / figures / parsed_tables / ingestion_jobs / retrieval_log¶
Standard btree indexy: _report, _tenant, partial _pending, _section/_attachment partial.
Postgres extensions¶
| Extension | Verze | Použití |
|---|---|---|
| vector | 0.8.0 | pgvector — halfvec(1536), HNSW index, cosine similarity |
| unaccent | 1.1 | strip diakritiku v czech_unaccent ts config |
| pgcrypto | 1.3 | gen_random_uuid() |
| uuid-ossp | 1.1 | UUID legacy support |
| pgtap | 1.3.3 | Unit testing (test only) |
| pg_stat_statements | 1.11 | Query performance monitoring |
| supabase_vault | 0.3.1 | Secret storage (Supabase managed) |
| postgis | (avail) | NEPOUŽÍVÁME (GIS coordinates jsou v jsonb, ne PostGIS) |
RPC functions (Postgres SECURITY DEFINER)¶
nemoreport.import_v1(p_handle text, p_user_id uuid, p_tenant_id uuid) returns jsonb¶
Phase A.1 — claim flow pro v1 testery. Importuje reporty z public.* schema do nemoreport.*.
nemoreport.search_chunks_by_folder(p_query_vec halfvec(1536), p_report_id uuid, p_top_k int, p_ef_search int) returns table¶
Phase C.5 — vector-only retrieval (legacy MVP, Phase C.6 přepsala na hybrid).
nemoreport.hybrid_search_chunks_by_folder(p_query_text text, p_query_vec halfvec(1536), p_report_id uuid, p_top_k int, p_pre_fusion_n int, p_ef_search int, p_rrf_k int) returns table¶
Phase C.6 — hlavní retrieval RPC. Vrací top-K chunks s per-leg ranks + RRF score.
nemoreport._build_prefix_tsquery(p_query text) returns tsquery¶
Phase C.6 — interní helper, builduje prefix tsquery z user input (split tokens, OR, :* wildcard).
private.custom_access_token_hook(event jsonb) returns jsonb¶
Phase A — JWT custom hook. Injektuje personal_tenant_id + active_tenant_id claims.
nemoreport.set_updated_at() returns trigger¶
Helper pro BEFORE UPDATE triggers na tabulkách s updated_at.
nemoreport.handle_new_user() returns trigger¶
Phase A — AFTER INSERT ON auth.users trigger. Auto-creates personal tenant + user_profiles + tenant_members row.
nemoreport.handle_delete_user() returns trigger¶
Phase A — GDPR BEFORE DELETE ON auth.users. Hard-delete personal tenant + cascade na všechno.
Realtime publication¶
ALTER PUBLICATION supabase_realtime ADD TABLE nemoreport.reports;
ALTER PUBLICATION supabase_realtime ADD TABLE nemoreport.attachments;
Frontend subscribe na postgres_changes UPDATE events. NIKDY neemituje:
- chunks (frontend pulluje on-demand)
- figures (pulluje při status='ready')
- parsed_sections (read po finalize)
- retrieval_log (admin only)
pgTAP testy¶
Test files v tests/db/:
setup.sql— pre-test fixtures (User A/B + jejich reporty)01_rls_tenants.sql— RLS isolation pen tests (22 tests)02_storage_rls.sql— bucket path scoping (6 tests)03_migration.sql— claim flow (14 tests)04_grants.sql— defense-in-depth + sweep test (38 tests)05_ingestion.sql— Phase B (28 tests)06_chunks.sql— Phase C (26 tests)
Total: 134 testů.
Spuštění:
- Inline: přes mcp__claude_ai_Supabase__execute_sql (no Docker)
- Docker: supabase test db --linked
- psql: tests/db/run_remote.sh (přes connection string)
Detail viz Testování.
Backup + recovery¶
Supabase managed: - PITR (Point-in-time recovery) na Pro tier (default 7 dní retention) - Daily backups — 1 týden retention - Read replica — aktuálně NEPOUŽÍVÁME (1 cluster master pro write + read)
Pro produkci doporučeno upgrade na Pro plán + 14d PITR.
Performance¶
Indexes coverage¶
Všechny user-facing queries jsou indexed (verifikované přes pg_stat_statements):
SELECT FROM reports WHERE tenant_id = X→ idx_reports_tenant_createdSELECT FROM attachments WHERE report_id = X→ idx_attachments_reportSELECT FROM chunks WHERE report_id = X ORDER BY embedding <=> Y LIMIT K→ idx_chunks_embedding_hnsw + idx_chunks_report
HNSW tuning¶
SET LOCAL hnsw.iterative_scan = 'relaxed_order'; -- partial index recovery
SET LOCAL hnsw.ef_search = 100; -- folder scope default
ef_search per scope:
- section: 80 (small candidate pool)
- folder: 100 (default)
- multi_report: 150 (larger pool needed)
Vacuum + autovacuum¶
Supabase auto-managed. Per-table tuning není potřeba pro current scale (290 chunks).
Pro produkci 100K+ chunks: monitorovat pg_stat_user_tables a tweak autovacuum_vacuum_scale_factor per chunks tabulku.