# Knowledge Base System - Integration Documentation

**Feature ID:** PF-61\
**Status:** 📝 Planned\
**Created:** 2026-01-28\
**Version:** v1.0.0\
**Spec Reference:** `specs/pf/specs/PF-61-knowledge-base-system.md` (v1.0.0)

***

## Version History

| Version | Date       | Changes                           |
| ------- | ---------- | --------------------------------- |
| v1.0.0  | 2026-01-28 | Initial integration documentation |

**See:** `docs/VERSIONS.md` for complete version history.

***

## File Naming

This document follows the naming convention: `{FEATURE_ID}-{FEATURE_NAME}-INTEGRATION.md`

***

## Document Structure

This integration document follows the standard structure:

1. Overview
2. Integration Patterns
3. Event Contracts
4. API Contracts
5. Platform Integration Layer
6. Error Handling
7. Security Considerations
8. Performance Considerations
9. Testing Requirements
10. Future Enhancements
11. References

***

## Cross-References

* **Spec:** `specs/pf/specs/PF-61-knowledge-base-system.md` (v1.0.0)
* **PF-60 Spec:** `specs/pf/specs/PF-60-rag-infrastructure.md` (v1.0.0)
* **PF-11 Spec:** `specs/pf/specs/PF-11-document-management.md` (v1.0.0)
* **Event Contracts:** `docs/architecture/integrations/EVENT_CONTRACTS.md` (v1.0.0)
* **API Contracts:** `docs/architecture/integrations/API_CONTRACTS.md` (v1.0.0)
* **Integration Matrix:** `docs/architecture/integrations/CROSS_CORE_INTEGRATIONS.md` (v1.0.0)

***

## Quick Reference

| Integration Type    | Pattern                    | Location                   | Status     |
| ------------------- | -------------------------- | -------------------------- | ---------- |
| PF-60 (RAG)         | Event-Based                | `domain_events` channel    | 📝 Planned |
| PF-11 (Documents)   | Platform Integration Layer | `/src/platform/documents/` | 📝 Planned |
| PF-30 (Permissions) | Direct Dependency          | Permission keys            | 📝 Planned |

***

## Decision Trees

### Should I use PF-61 Knowledge Base?

| Condition                          | Decision                |
| ---------------------------------- | ----------------------- |
| Need to curate AI training content | ✅ Use PF-61             |
| Need document embeddings           | ✅ Use PF-60 (RAG)       |
| Need document storage              | ✅ Use PF-11 (Documents) |

***

## Pattern Library

### Event Publishing Pattern

```typescript theme={null}
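// UUIDs and timestamps arrive as strings in the JSON notification
type uuid = string;
type timestamptz = string;

// Event payload contains organization_id and user_id for tenant isolation
interface KnowledgeArticlePublishedPayload {
  event_type: 'knowledge_article_published';
  organization_id: uuid; // Required for tenant isolation
  article_id: uuid;
  timestamp: timestamptz;
  user_id: uuid; // Required for user-initiated events
}

// Consumers fetch details under RLS with organization_id filter.
// `supabase` is an initialized Supabase client; `payload` is a received event.
const { data: article, error } = await supabase
  .from('pf_knowledge_base')
  .select('*')
  .eq('id', payload.article_id)
  .eq('organization_id', payload.organization_id) // Enforce tenant isolation
  .single();

// .single() reports an error when no row is visible (missing or hidden by RLS)
if (error) throw error;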
```

***

## Common Mistakes

| Mistake                             | Fix                                        |
| ----------------------------------- | ------------------------------------------ |
| Including PII/PHI in event payloads | Use only IDs, fetch details under RLS      |
| Missing organization\_id validation | Always validate with `pf_has_org_access()` |
| Direct HTTP calls for embeddings    | Use event-based approach                   |

***

## Pre-Flight Checklist

Before implementing PF-61 integrations:

* [ ] Verify PF-60 (RAG Infrastructure) is available
* [ ] Verify PF-11 (Document Management) is available
* [ ] Review event contracts for correct payload structure
* [ ] Ensure RLS policies are in place
* [ ] Test event publishing and consumption

***

## Overview

PF-61 Knowledge Base System provides an admin UI for curating and managing AI training content. It integrates with PF-60 (RAG Infrastructure) for embedding generation and PF-11 (Document Management) for document import.

***

## Integration Patterns

### Pattern 1: Platform Integration Layer

**PF-11 (Document Management) Integration:**

* **Location:** `/src/platform/documents/`
* **Usage:** Document import feature
* **Public API:** Document selection and text extraction
* **Status:** 📝 Planned

### Pattern 2: Event-Based Integration

**PF-60 (RAG Infrastructure) Integration:**

* **Events Published:** `knowledge_article_published`, `knowledge_article_unpublished`
* **Channel:** `domain_events`
* **Status:** 📝 Planned

### Pattern 3: Direct Dependency

**PF-60 (RAG Infrastructure):**

* **Usage:** Stores embeddings in `pf_document_embeddings` table
* **Edge Function:** `generate-embeddings` for embedding generation
* **Status:** 📝 Planned

**PF-30 (Permissions System):**

* **Usage:** Permission keys `pf.knowledge.view` and `pf.knowledge.manage`
* **Status:** 📝 Planned

***

## Event Contracts

### Event: `knowledge_article_published`

**Event:** `knowledge_article_published`\
**Channel:** `domain_events`\
**Publisher:** PF (PF-61)\
**Subscribers:** PF-60 (RAG Infrastructure)\
**Status:** 📝 Planned

#### Purpose

Notifies RAG infrastructure that a knowledge article has been published and is ready for embedding generation.

#### Trigger Conditions

* Article status changes from `draft` or `in_review` to `published` via UPDATE
* **Note:** The trigger is defined as `AFTER UPDATE` only, so an article inserted directly with status `published` does not emit this event. Consumers must fetch article details (title, tags, category) under RLS using `article_id`.

#### Payload Schema

```typescript theme={null}
interface KnowledgeArticlePublishedPayload {
  event_type: 'knowledge_article_published';
  organization_id: uuid; // Required for tenant isolation
  article_id: uuid;
  timestamp: timestamptz;
  user_id: uuid; // Required for user-initiated events
}
```

**Note:** Event payload includes `organization_id` and `user_id` for tenant isolation and audit purposes. Consumers must still fetch article details (title, tags, category) from `pf_knowledge_base` table using `article_id` and filter by `organization_id` under RLS policies.
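Because `pg_notify` delivers the payload as a plain string, a consumer may want to validate it before trusting any field. The following is a minimal sketch of such a check, not part of the PF-61 spec: `parsePublishedPayload`, `isUuid`, and the `Uuid` alias are hypothetical names, and the timestamp is assumed to be a string that `Date.parse` can read.

```typescript
// Hypothetical consumer-side validation; names below are illustrative only.
type Uuid = string;

interface KnowledgeArticlePublishedPayload {
  event_type: 'knowledge_article_published';
  organization_id: Uuid;
  article_id: Uuid;
  timestamp: string; // assumed to be a parseable date-time string
  user_id: Uuid;
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value: unknown): value is Uuid {
  return typeof value === 'string' && UUID_RE.test(value);
}

// Returns the typed payload, or null when the notification is malformed.
function parsePublishedPayload(raw: string): KnowledgeArticlePublishedPayload | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== 'object' || data === null) return null;

  const { event_type, organization_id, article_id, timestamp, user_id } =
    data as Record<string, unknown>;

  if (event_type !== 'knowledge_article_published') return null;
  if (!isUuid(organization_id) || !isUuid(article_id) || !isUuid(user_id)) return null;
  if (typeof timestamp !== 'string' || Number.isNaN(Date.parse(timestamp))) return null;

  return { event_type, organization_id, article_id, timestamp, user_id };
}
```

Returning `null` instead of throwing lets a listener skip a bad notification and keep processing the channel; whether to log or dead-letter such messages is left open here.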

#### Implementation

```sql theme={null}
CREATE OR REPLACE FUNCTION pf_publish_knowledge_article_published()
RETURNS TRIGGER AS $$
BEGIN
  IF NEW.status = 'published' AND (OLD.status IS NULL OR OLD.status != 'published') THEN
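    PERFORM pg_notify('domain_events', json_build_object(
      'event_type', 'knowledge_article_published',
      'organization_id', NEW.organization_id, -- Required by the payload schema
      'article_id', NEW.id,
      'timestamp', now(),
      'user_id', auth.uid() -- Assumes the UPDATE runs in an authenticated Supabase session
    )::text);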
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = public;

CREATE TRIGGER trigger_knowledge_article_published
AFTER UPDATE ON pf_knowledge_base
FOR EACH ROW
WHEN (OLD.status IS DISTINCT FROM NEW.status AND NEW.status = 'published')
EXECUTE FUNCTION pf_publish_knowledge_article_published();
```

#### Consumer Actions

| Consumer                   | Action                                    | Status        |
| -------------------------- | ----------------------------------------- | ------------- |
| PF-60 (RAG Infrastructure) | Generate embeddings for published article | 📝 Planned    |
| Event Consumer             | Log to pf\_audit\_logs                    | ✅ Implemented |

#### Testing Requirements

* [ ] Event payload structure validation
* [ ] Event fires on publish
* [ ] Correct organization\_id included
* [ ] PF-60 handles event correctly

***

### Event: `knowledge_article_unpublished`

**Event:** `knowledge_article_unpublished`\
**Channel:** `domain_events`\
**Publisher:** PF (PF-61)\
**Subscribers:** PF-60 (RAG Infrastructure)\
**Status:** 📝 Planned

#### Purpose

Notifies RAG infrastructure to remove embeddings for an unpublished article.

#### Trigger Conditions

* Article status changes from `published` to `draft`, `in_review`, or `archived` via UPDATE
* **Note:** The trigger is defined as `AFTER UPDATE` only, so it fires only when an UPDATE moves the status away from `published`; deleting a published article does not emit this event. Consumers must fetch article details under RLS using `article_id`.

#### Payload Schema

```typescript theme={null}
interface KnowledgeArticleUnpublishedPayload {
  event_type: 'knowledge_article_unpublished';
  organization_id: uuid; // Required for tenant isolation
  article_id: uuid;
  timestamp: timestamptz;
  user_id: uuid; // Required for user-initiated events
}
```

**Note:** Event payload includes `organization_id` and `user_id` for tenant isolation and audit purposes. Consumers must filter by `organization_id` when processing events to enforce tenant boundaries.
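To illustrate what "filter by `organization_id`" means for the unpublish path, here is a small in-memory sketch. It assumes each embedding row records both its organization and the article it came from; the `EmbeddingRow` shape (including the `source_id` field) and the `removeEmbeddingsForEvent` helper are hypothetical and do not describe the real `pf_document_embeddings` schema.

```typescript
// In-memory illustration only; the row shape is an assumption.
interface EmbeddingRow {
  id: string;
  organization_id: string;
  source_id: string; // assumed link from an embedding back to its article
}

interface UnpublishedEvent {
  event_type: 'knowledge_article_unpublished';
  organization_id: string;
  article_id: string;
}

// Keeps every row except those matching BOTH the article and the organization,
// so an event can never remove another tenant's embeddings.
function removeEmbeddingsForEvent(
  rows: EmbeddingRow[],
  event: UnpublishedEvent,
): EmbeddingRow[] {
  return rows.filter(
    (row) =>
      !(row.source_id === event.article_id &&
        row.organization_id === event.organization_id),
  );
}
```

In a database-backed consumer the same two conditions would appear together in the delete filter, rather than matching on `article_id` alone.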

#### Implementation

```sql theme={null}
CREATE OR REPLACE FUNCTION pf_publish_knowledge_article_unpublished()
RETURNS TRIGGER AS $$
BEGIN
  IF NEW.status != 'published' AND OLD.status = 'published' THEN
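    PERFORM pg_notify('domain_events', json_build_object(
      'event_type', 'knowledge_article_unpublished',
      'organization_id', NEW.organization_id, -- Required by the payload schema
      'article_id', NEW.id,
      'timestamp', now(),
      'user_id', auth.uid() -- Assumes the UPDATE runs in an authenticated Supabase session
    )::text);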
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = public;

CREATE TRIGGER trigger_knowledge_article_unpublished
AFTER UPDATE ON pf_knowledge_base
FOR EACH ROW
WHEN (OLD.status = 'published' AND NEW.status != 'published')
EXECUTE FUNCTION pf_publish_knowledge_article_unpublished();
```

#### Consumer Actions

| Consumer                   | Action                                    | Status        |
| -------------------------- | ----------------------------------------- | ------------- |
| PF-60 (RAG Infrastructure) | Delete embeddings for unpublished article | 📝 Planned    |
| Event Consumer             | Log to pf\_audit\_logs                    | ✅ Implemented |

#### Testing Requirements

* [ ] Event payload structure validation
* [ ] Event fires on unpublish
* [ ] Correct organization\_id included
* [ ] PF-60 handles event correctly

***

## API Contracts

### Database Function: Embedding Generation

**Function:** `pf_knowledge_base_embedding_trigger()`\
**Provider:** PF-61\
**Consumer:** PF-60 (via trigger)\
**Status:** 📝 Planned

#### Function Signature

```sql theme={null}
pf_knowledge_base_embedding_trigger()
RETURNS TRIGGER
```

#### Purpose

Triggers embedding generation when an article is published. The current design calls the edge function directly via `net.http_post` (provided by the `pg_net` extension); it should be replaced with the event-based approach for better reliability.

#### Request Schema

**Trigger Context (No Request Body):**

* Trigger fires automatically on INSERT/UPDATE when article status changes to `published`
* Trigger context provides: `NEW.id`, `NEW.organization_id`, `NEW.title`, `NEW.content`, `NEW.published_at`

**HTTP Request to `generate-embeddings` Edge Function:**

```typescript theme={null}
{
  article_id: uuid;
  title: string;
  content: string;
  publish_timestamp: timestamptz;
  metadata: {
    organization_id: uuid;
    category?: string;
  };
}
```

#### Response Schema

**Success Response (200 OK):**

```typescript theme={null}
{
  success: true;
  embedding_id: uuid;
  status: 'indexed';
  chunks_processed: number;
}
```

**Failure Response (4xx/5xx):**

```typescript theme={null}
{
  success: false;
  error: string;
  status: 'failed';
}
```

#### Fallback Queue Message Schema

If the `pg_net` extension (which provides `net.http_post`) is not available, the message is queued instead:

```typescript theme={null}
{
  source_type: 'knowledge_article';
  source_id: uuid;
  organization_id: uuid;
  content: string;
  metadata: {
    title: string;
    category?: string;
    publish_timestamp: timestamptz;
  };
}
```
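The HTTP request body and the fallback queue message carry the same information in different arrangements. The sketch below shows one way the mapping between them could look; the type names `EmbeddingRequest` and `EmbeddingQueueMessage` and the helper `toQueueMessage` are illustrative names introduced here, with UUIDs and timestamps modelled as strings.

```typescript
// Illustrative mapping between the two documented shapes.
interface EmbeddingRequest {
  article_id: string;
  title: string;
  content: string;
  publish_timestamp: string;
  metadata: { organization_id: string; category?: string };
}

interface EmbeddingQueueMessage {
  source_type: 'knowledge_article';
  source_id: string;
  organization_id: string;
  content: string;
  metadata: { title: string; category?: string; publish_timestamp: string };
}

function toQueueMessage(request: EmbeddingRequest): EmbeddingQueueMessage {
  const { organization_id, category } = request.metadata;
  return {
    source_type: 'knowledge_article',
    source_id: request.article_id,
    // organization_id moves from metadata to the top level of the message
    organization_id,
    content: request.content,
    metadata: {
      title: request.title,
      publish_timestamp: request.publish_timestamp,
      // Only include category when the article has one
      ...(category !== undefined ? { category } : {}),
    },
  };
}
```

Note that `organization_id` is a top-level field in the queue message but nested under `metadata` in the HTTP request, which is easy to overlook when a consumer has to handle both paths.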

#### Failure Contract

If embedding generation fails:

* Article remains `published` but `is_indexed` is set to `false`
* Error details stored in `pf_knowledge_base.indexing_error` (if column exists)
* Retry mechanism available (manual or automatic after 5 minutes)
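The failure contract can be read as a small state mapping plus a timing rule. The sketch below expresses both under stated assumptions: that a successful response sets `is_indexed` to `true`, that the time of the last attempt is tracked somewhere (the `lastAttemptAt` parameter is hypothetical), and that the automatic retry uses exactly the 5-minute delay mentioned above.

```typescript
// Sketch of the failure contract; helper names are illustrative.
type EmbeddingResponse =
  | { success: true; embedding_id: string; status: 'indexed'; chunks_processed: number }
  | { success: false; error: string; status: 'failed' };

interface IndexingState {
  is_indexed: boolean;
  indexing_error: string | null;
}

// Maps an edge-function response onto the article's indexing columns.
// The article's `published` status is deliberately not touched here.
function toIndexingState(response: EmbeddingResponse): IndexingState {
  return response.success
    ? { is_indexed: true, indexing_error: null }
    : { is_indexed: false, indexing_error: response.error };
}

const RETRY_DELAY_MS = 5 * 60 * 1000; // "automatic after 5 minutes"

// True when an unindexed article has waited at least the retry delay.
function isRetryDue(state: IndexingState, lastAttemptAt: Date, now: Date): boolean {
  return !state.is_indexed && now.getTime() - lastAttemptAt.getTime() >= RETRY_DELAY_MS;
}
```

Passing `now` explicitly keeps the timing rule easy to test without faking the system clock.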

#### Implementation Notes

* Uses `net.http_post` (from the `pg_net` extension) to call the `/functions/v1/generate-embeddings` edge function
* Falls back to the queue-based approach if the extension is not available
* Sets `is_indexed: false` if embedding generation fails
* All requests include `organization_id` for tenant isolation

***

## Platform Integration Layer

### PF-11 (Document Management) Integration

**Location:** `/src/platform/documents/`\
**Public API:**

* `useDocuments()` - List documents
* `useDocument(id)` - Get document details
* `extractDocumentText(documentId)` - Extract text content

**Usage Example:**

```typescript theme={null}
import { useDocuments } from '@/platform/documents';

function DocumentImportDialog() {
  const { data: documents } = useDocuments();
  // Select document and extract text
}
```

**Status:** 📝 Planned

***

## Error Handling

### Embedding Generation Failures

**Scenario:** Embedding generation fails during publish

**Handling:**

* Article is marked as `published` but `is_indexed: false`
* Warning shown to admin
* Retry mechanism available (manual or automatic after 5 minutes)
* Error logged for debugging

### Document Import Failures

**Scenario:** Document has no extractable text

**Handling:**

* Show clear error message to user
* Allow manual text entry as fallback
* Log error for debugging
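The import fallback can be modelled as a decision on the extracted text. This is a hedged sketch only: `resolveImportedText` and `ImportResult` are hypothetical names, and it assumes the extraction step hands back a string (or nothing) once it has finished, which the PF-11 API may or may not do in practice.

```typescript
// Hypothetical decision helper for the document import dialog.
type ImportResult =
  | { kind: 'extracted'; text: string }
  | { kind: 'manual_entry_required'; message: string };

// `extracted` stands for whatever text the extraction step produced, if any.
function resolveImportedText(extracted: string | null | undefined): ImportResult {
  const text = (extracted ?? '').trim();
  if (text.length > 0) {
    return { kind: 'extracted', text };
  }
  // Whitespace-only output is treated the same as no extractable text.
  return {
    kind: 'manual_entry_required',
    message: 'This document has no extractable text. Enter the content manually.',
  };
}
```

A dialog could switch on `kind` to either prefill the article editor or show the message alongside an empty text field; logging the failure is left to the caller.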

***

## Security Considerations

### Multi-Tenant Isolation

* All events include `organization_id` for tenant isolation
* RLS policies enforce organization boundaries
* Event payloads validated for organization context

### PHI/PII Handling

* No PHI/PII in event payloads (only IDs)
* Knowledge content may contain sensitive information, so events carry only identifiers and a timestamp, never article content
* Access controls are the same as for the source documents
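One way to keep article content out of events is to check outgoing payloads against an allowlist of keys before publishing. The sketch below assumes the five fields from the payload schemas are the only ones permitted; `ALLOWED_EVENT_KEYS` and `findDisallowedKeys` are illustrative names, not existing platform helpers.

```typescript
// Keys taken from the documented payload schemas; anything else is rejected.
const ALLOWED_EVENT_KEYS: ReadonlySet<string> = new Set([
  'event_type',
  'organization_id',
  'article_id',
  'timestamp',
  'user_id',
]);

// Returns the keys that must be removed before the payload may be published.
function findDisallowedKeys(payload: Record<string, unknown>): string[] {
  return Object.keys(payload).filter((key) => !ALLOWED_EVENT_KEYS.has(key));
}
```

A check like this fits naturally in a unit test for the publishing code path, so that adding a field such as `title` to the payload fails fast.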

***

## Performance Considerations

### Event Processing

* Events are asynchronous (non-blocking)
* Embedding generation happens in background
* Publishing an article does not wait for embedding generation to finish

### API Calls

* Document import uses existing PF-11 APIs
* No additional API overhead
* Caching used where appropriate

***

## Testing Requirements

### Integration Tests

* [ ] Event publishing on article publish
* [ ] Event publishing on article unpublish
* [ ] PF-60 receives and processes events
* [ ] Document import flow
* [ ] Embedding generation flow

### RLS Tests

* [ ] Events include correct organization\_id
* [ ] Cross-organization event isolation
* [ ] Permission checks for event triggers

***

## Future Enhancements

### Queue-Based Embedding Generation

Replace direct HTTP calls with a queue-based approach:

* Create `pf_embedding_queue` table
* Background job processes queue
* Better reliability and retry handling
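The queue idea can be sketched in memory to show where retry handling would live. Everything here is an assumption made for illustration: the `QueueItem` fields, the `MAX_ATTEMPTS` limit of 3, and the synchronous `embed` callback (a real worker would most likely be asynchronous and read from `pf_embedding_queue`).

```typescript
// In-memory sketch of a queue worker pass; field names are hypothetical.
type QueueStatus = 'pending' | 'done' | 'failed';

interface QueueItem {
  source_id: string;
  organization_id: string;
  attempts: number;
  status: QueueStatus;
  last_error: string | null;
}

const MAX_ATTEMPTS = 3; // assumed retry limit

// Runs one pass over the queue. Items that throw stay pending until they
// have used up MAX_ATTEMPTS, after which they are marked failed.
function processQueue(items: QueueItem[], embed: (item: QueueItem) => void): void {
  for (const item of items) {
    if (item.status !== 'pending') continue;
    item.attempts += 1;
    try {
      embed(item);
      item.status = 'done';
      item.last_error = null;
    } catch (err) {
      item.last_error = err instanceof Error ? err.message : String(err);
      if (item.attempts >= MAX_ATTEMPTS) item.status = 'failed';
    }
  }
}
```

Because attempts and the last error are stored on the item, a background job can be restarted at any point without losing retry state, which is the reliability gain over a fire-and-forget HTTP call.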

### Event Versioning

* Support event schema versioning
* Backward compatibility for event consumers
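As a rough sketch of how versioning could stay backward compatible, a consumer might treat a missing version field as version 1 and ignore versions it does not understand. The `schema_version` field, `MAX_SUPPORTED_VERSION`, and `normalizeEvent` are hypothetical; the current payloads carry no version field.

```typescript
// Hypothetical version handling for domain events.
interface VersionedEvent {
  schema_version: number;
  event_type: string;
  [key: string]: unknown;
}

const MAX_SUPPORTED_VERSION = 1; // highest version this consumer understands

// Returns a normalized event, or null when the event cannot be handled.
function normalizeEvent(raw: Record<string, unknown>): VersionedEvent | null {
  const { event_type, schema_version } = raw;
  // Events published before versioning existed are treated as version 1.
  const version = schema_version === undefined ? 1 : schema_version;
  if (typeof version !== 'number' || !Number.isInteger(version) || version < 1) return null;
  if (version > MAX_SUPPORTED_VERSION) return null; // unknown future version
  if (typeof event_type !== 'string') return null;
  return { ...raw, schema_version: version, event_type };
}
```

Defaulting to version 1 means existing publishers need no change when the field is introduced, while consumers can opt into newer versions by raising `MAX_SUPPORTED_VERSION`.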

***

## References

* **Spec:** `specs/pf/specs/PF-61-knowledge-base-system.md`
* **PF-60 Spec:** `specs/pf/specs/PF-60-rag-infrastructure.md`
* **PF-11 Spec:** `specs/pf/specs/PF-11-document-management.md`
* **Event Contracts:** `docs/architecture/integrations/EVENT_CONTRACTS.md`
* **Integration Matrix:** `docs/architecture/integrations/CROSS_CORE_INTEGRATIONS.md`

***

**Last Updated:** 2026-01-28
