llama-farm / rag-skills

Install for your project team

Run this command in your project directory to install the skill for your entire team:

mkdir -p .claude/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d .claude/skills/rag-skills && rm skill.zip

New-Item -Path ".claude/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".claude/skills/rag-skills" -Force; Remove-Item "skill.zip"

Project Skills

This skill will be saved in .claude/skills/rag-skills/ and checked into git. All team members will have access to it automatically.

Important: Please verify the skill by reviewing its instructions before using it.

Install skill for Codex

Run one of these commands to install the skill depending on your needs:

Project Local ($CWD/.codex/skills)

mkdir -p .codex/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d .codex/skills/rag-skills && rm skill.zip

New-Item -Path ".codex/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".codex/skills/rag-skills" -Force; Remove-Item "skill.zip"

User Global (~/.codex/skills)

mkdir -p ~/.codex/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d ~/.codex/skills/rag-skills && rm skill.zip

New-Item -Path "$HOME/.codex/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.codex/skills/rag-skills" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| REPO | `$CWD/.codex/skills` | Project directory. Teams can check in skills most relevant to a working folder here. |
| REPO | `$CWD/../.codex/skills` | A folder above CWD. Organizations can check in skills relevant to a shared area. |
| REPO | `$REPO_ROOT/.codex/skills` | Top-most root folder. Relevant to everyone using the repository. |
| USER | `$CODEX_HOME/skills` | Personal folder (`~/.codex/skills`). Curate skills that apply to any repository. |

Install skill for GitHub Copilot

Run one of these commands to install the skill depending on your needs:

Project (.github/skills)

mkdir -p .github/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d .github/skills/rag-skills && rm skill.zip

New-Item -Path ".github/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".github/skills/rag-skills" -Force; Remove-Item "skill.zip"

Personal (~/.copilot/skills)

mkdir -p ~/.copilot/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d ~/.copilot/skills/rag-skills && rm skill.zip

New-Item -Path "$HOME/.copilot/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.copilot/skills/rag-skills" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| Project | `.github/skills/` | Repository-specific skills. Checked into git for the whole team. |
| Personal | `~/.copilot/skills/` | Personal skills available across all your projects. |

Install skill for Google Antigravity

Run one of these commands to install the skill depending on your needs:

Workspace (.agent/skills)

mkdir -p .agent/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d .agent/skills/rag-skills && rm skill.zip

New-Item -Path ".agent/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath ".agent/skills/rag-skills" -Force; Remove-Item "skill.zip"

Global (~/.gemini/antigravity/skills)

mkdir -p ~/.gemini/antigravity/skills/rag-skills && curl -L -o skill.zip "https://fastmcp.me/Skills/Download/1723" && unzip -o skill.zip -d ~/.gemini/antigravity/skills/rag-skills && rm skill.zip

New-Item -Path "$HOME/.gemini/antigravity/skills/rag-skills" -ItemType Directory -Force; Invoke-WebRequest -Uri "https://fastmcp.me/Skills/Download/1723" -OutFile "skill.zip"; Expand-Archive -Path "skill.zip" -DestinationPath "$HOME/.gemini/antigravity/skills/rag-skills" -Force; Remove-Item "skill.zip"

| Scope | Location | Suggested Use |
|-------|----------|---------------|
| Workspace | `.agent/skills/` | Workspace-specific skills for project workflows and conventions. |
| Global | `~/.gemini/antigravity/skills/` | Personal skills available across all workspaces. |

RAG-specific best practices for LlamaIndex, ChromaDB, and Celery workers. Covers ingestion, retrieval, embeddings, and performance.

Data & Analytics

Source: https://github.com/llama-farm/llamafarm/tree/main/.claude/skills/rag-skills

Skill Content

---
name: rag-skills
description: RAG-specific best practices for LlamaIndex, ChromaDB, and Celery workers. Covers ingestion, retrieval, embeddings, and performance.
allowed-tools: Read, Grep, Glob
user-invocable: false
---

# RAG Skills for LlamaFarm

Framework-specific patterns and code review checklists for the RAG component.

**Extends**: [python-skills](../python-skills/SKILL.md) - All Python best practices apply here.

## Component Overview

| Aspect | Technology | Version |
|--------|------------|---------|
| Python | Python | 3.11+ |
| Document Processing | LlamaIndex | 0.13+ |
| Vector Storage | ChromaDB | 1.0+ |
| Task Queue | Celery | 5.5+ |
| Embeddings | Universal/Ollama/OpenAI | Multiple |

## Directory Structure

```
rag/
├── api.py                 # Search and database APIs
├── celery_app.py          # Celery configuration
├── main.py                # Entry point
├── core/
│   ├── base.py            # Document, Component, Pipeline ABCs
│   ├── factories.py       # Component factories
│   ├── ingest_handler.py  # File ingestion with safety checks
│   ├── blob_processor.py  # Binary file processing
│   ├── settings.py        # Pydantic settings
│   └── logging.py         # RAGStructLogger
├── components/
│   ├── embedders/         # Embedding providers
│   ├── extractors/        # Metadata extractors
│   ├── parsers/           # Document parsers (LlamaIndex)
│   ├── retrievers/        # Retrieval strategies
│   └── stores/            # Vector stores (ChromaDB, FAISS)
├── tasks/                 # Celery tasks
│   ├── ingest_tasks.py    # File ingestion
│   ├── search_tasks.py    # Database search
│   ├── query_tasks.py     # Complex queries
│   ├── health_tasks.py    # Health checks
│   └── stats_tasks.py     # Statistics
└── utils/
    └── embedding_safety.py  # Circuit breaker, validation
```

## Quick Reference

| Topic | File | Key Points |
|-------|------|------------|
| LlamaIndex | [llamaindex.md](llamaindex.md) | Document parsing, chunking, node conversion |
| ChromaDB | [chromadb.md](chromadb.md) | Collections, embeddings, distance metrics |
| Celery | [celery.md](celery.md) | Task routing, error handling, worker config |
| Performance | [performance.md](performance.md) | Batching, caching, deduplication |

## Core Patterns

### Document Dataclass

```python
import uuid
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Document:
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source: str | None = None
    embeddings: list[float] | None = None
```

### Component Abstract Base Class

```python
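from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any

# Document is defined in core/base.py; RAGStructLogger lives in core/logging.py.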

class Component(ABC):
    def __init__(
        self,
        name: str | None = None,
        config: dict[str, Any] | None = None,
        project_dir: Path | None = None,
    ):
        self.name = name or self.__class__.__name__
        self.config = config or {}
        self.logger = RAGStructLogger(__name__).bind(name=self.name)
        self.project_dir = project_dir

    @abstractmethod
    def process(self, documents: list[Document]) -> ProcessingResult:
        pass
```

### Retrieval Strategy Pattern

```python
class RetrievalStrategy(Component, ABC):
    @abstractmethod
    def retrieve(
        self,
        query_embedding: list[float],
        vector_store,
        top_k: int = 5,
        **kwargs
    ) -> RetrievalResult:
        pass

    @abstractmethod
    def supports_vector_store(self, vector_store_type: str) -> bool:
        pass
```
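
As a rough sketch of how a strategy might satisfy this interface, the example below ranks entries of an in-memory store by cosine similarity. Everything beyond the two abstract method names is an assumption: `RetrievalResult` is a hypothetical stand-in with `ids` and `scores` fields, the store is a plain dict rather than a ChromaDB collection, and the base class omits `Component` to stay self-contained.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RetrievalResult:
    # Hypothetical stand-in for the project's RetrievalResult.
    ids: list[str]
    scores: list[float]


class RetrievalStrategy(ABC):
    @abstractmethod
    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs) -> RetrievalResult:
        ...

    @abstractmethod
    def supports_vector_store(self, vector_store_type: str) -> bool:
        ...


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction, so treat it as dissimilar to everything.
    return dot / norm if norm else 0.0


class InMemoryCosineStrategy(RetrievalStrategy):
    """Hypothetical strategy; vector_store is a dict of id -> embedding."""

    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs):
        scored = sorted(
            (
                (cosine_similarity(query_embedding, embedding), doc_id)
                for doc_id, embedding in vector_store.items()
            ),
            reverse=True,
        )[:top_k]
        return RetrievalResult(
            ids=[doc_id for _, doc_id in scored],
            scores=[score for score, _ in scored],
        )

    def supports_vector_store(self, vector_store_type: str) -> bool:
        return vector_store_type == "in_memory"


store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
strategy = InMemoryCosineStrategy()
result = strategy.retrieve([1.0, 0.0], store, top_k=2)
```

Callers can check `supports_vector_store` before wiring a strategy to a store, so an incompatible pairing fails at configuration time instead of at query time.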

### Embedder with Circuit Breaker

```python
class Embedder(Component):
    DEFAULT_FAILURE_THRESHOLD = 5
    DEFAULT_RESET_TIMEOUT = 60.0

    def __init__(self, ...):
        super().__init__(...)
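        # Read from self.config, which the base class sets to {} when no config is given.
        self._circuit_breaker = CircuitBreaker(
            failure_threshold=self.config.get(
                "failure_threshold", self.DEFAULT_FAILURE_THRESHOLD
            ),
            reset_timeout=self.config.get("reset_timeout", self.DEFAULT_RESET_TIMEOUT),
        )
        self._fail_fast = self.config.get("fail_fast", True)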

    def embed_text(self, text: str) -> list[float]:
        self.check_circuit_breaker()
        try:
            embedding = self._call_embedding_api(text)
            self.record_success()
            return embedding
        except Exception as e:
            self.record_failure(e)
            if self._fail_fast:
                raise EmbedderUnavailableError(str(e)) from e
            # Degraded mode: return a zero vector of the expected dimension.
            return [0.0] * self.get_embedding_dimension()
```
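
The `CircuitBreaker` class itself is not shown above. The following is a minimal sketch of one way such a breaker could work, assuming it opens after `failure_threshold` consecutive failures and allows a trial call once `reset_timeout` seconds have passed. The method names, `CircuitOpenError`, and the injectable `clock` parameter are illustrative assumptions, not the API in `utils/embedding_safety.py`.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised while the breaker is refusing calls (hypothetical name)."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests do not need to sleep
        self._failures = 0
        self._opened_at = None

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once reset_timeout has elapsed, let a trial call through (half-open).
        return self._clock() - self._opened_at < self.reset_timeout

    def check(self) -> None:
        if self.is_open:
            raise CircuitOpenError("embedding backend marked unavailable")

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # (Re)start the timeout window from the latest failure.
            self._opened_at = self._clock()


# Drive the breaker with a fake clock to show the state transitions.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
breaker.record_failure()
open_after_one = breaker.is_open       # below threshold, still closed
breaker.record_failure()
open_after_two = breaker.is_open       # threshold reached, now open
now[0] = 11.0
open_after_timeout = breaker.is_open   # timeout elapsed, trial call allowed
```

With a breaker like this, a failing embedding backend is called at most `failure_threshold` times before further calls are rejected immediately, which keeps Celery workers from piling up on timeouts.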

## Review Checklist Summary

When reviewing RAG code:

1. **LlamaIndex** (Medium priority)
   - Proper chunking configuration
   - Metadata preservation during parsing
   - Error handling for unsupported formats

2. **ChromaDB** (High priority)
   - Thread-safe client access
   - Proper distance metric selection
   - Metadata type compatibility

3. **Celery** (High priority)
   - Task routing to correct queue
   - Error logging with context
   - Proper serialization

4. **Performance** (Medium priority)
   - Batch processing for embeddings
   - Deduplication enabled
   - Appropriate caching

See individual topic files for detailed checklists with grep patterns.
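
As an illustration of the Performance items, the sketch below removes duplicate chunks by content hash and then groups the remainder into fixed-size batches for an embedding call. The function names and the batch size are assumptions made for this example; the project's own deduplication and batching code is described in `performance.md`.

```python
import hashlib
from collections.abc import Iterable, Iterator


def dedupe_by_content(texts: Iterable[str]) -> list[str]:
    """Keep the first occurrence of each distinct text, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


chunks = ["alpha", "beta", "alpha", "gamma", "beta"]
unique_chunks = dedupe_by_content(chunks)
batches = list(batched(unique_chunks, batch_size=2))
```

Deduplicating before batching means repeated chunks never reach the embedding provider, so the saving applies to both API cost and vector store size.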
