---
name: "Local Cluster Manager"
description: "Manage local multigres cluster components (multipooler, pgctld, multiorch, multigateway) - start/stop services, view logs, connect with psql, test S3 backups locally"
---

# Local Cluster Manager

Manage a local multigres cluster - both cluster-wide operations and individual components.

## When to Use This Skill

Invoke this skill when the user asks to:

- Start/stop/restart the entire cluster or individual components
- Start cluster with observability (OTel, Grafana, Prometheus)
- Teardown and restart the full stack (cluster + observability)
- View logs for any component
- Connect to multipooler or multigateway with psql
- Check status of cluster components
- Check multipooler topology status (PRIMARY/REPLICA roles)
- Check if PostgreSQL instances are in recovery mode
- Test S3 backups (initialize cluster with S3, create/list/restore backups)
- Configure or troubleshoot S3 backup settings

## Performance Optimization

Parse `./multigres_local/multigres.yaml` once when this skill is first invoked and cache the cluster configuration in memory for the duration of the conversation. Use the cached data for all subsequent commands. Only re-parse if the user explicitly asks to "reload config" or if a command fails due to stale config.

## Cluster-Wide Operations

**Start entire cluster**:

```bash
./bin/multigres cluster start
```

**Stop entire cluster**:

```bash
./bin/multigres cluster stop
```

**Stop entire cluster and delete all cluster data**:

```bash
./bin/multigres cluster stop --clean
```

**Check cluster status**:

```bash
./bin/multigres cluster status
```

**Initialize new cluster**:

```bash
./bin/multigres cluster init
```

**Get all multipoolers from topology**:

```bash
./bin/multigres getpoolers
```

Returns JSON with all multipoolers, their cells, service IDs, ports, and pooler directories.

**Get detailed status for a specific multipooler**:

```bash
./bin/multigres getpoolerstatus --cell <cell-name> --service-id <service-id>
```

Returns detailed status including:

- `pooler_type`: 1 = PRIMARY, 2 = REPLICA
- `postgres_role`: "primary" or "standby"
- `postgres_running`: Whether PostgreSQL is running
- `wal_position`: Current WAL position
- `consensus_term`: Current consensus term
- `primary_status`: (for PRIMARY) connected followers and sync replication config
- `replication_status`: (for REPLICA) replication lag and primary connection info

Example:

```bash
./bin/multigres getpoolerstatus --cell zone1 --service-id thhcdhbp
```
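
To check every pooler in one pass, drive `getpoolerstatus` from the `getpoolers` output. A minimal sketch, assuming `jq` is installed, that the output is a JSON array, and that each entry exposes `cell` and `service_id` fields (illustrative names - inspect the real output first):

```bash
# Iterate over all poolers and print each one's detailed status.
# NOTE: .cell and .service_id are assumed field names; adjust to the actual JSON.
./bin/multigres getpoolers | jq -r '.[] | "\(.cell) \(.service_id)"' |
while read -r cell id; do
  echo "=== $cell/$id ==="
  ./bin/multigres getpoolerstatus --cell "$cell" --service-id "$id"
done
```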

**Check PostgreSQL recovery mode directly**:

```bash
psql -h <pooler-dir>/pg_sockets -p <pg-port> -U postgres -d postgres -c "SELECT pg_is_in_recovery();"
```

Returns `t` (true) if in recovery/standby mode, `f` (false) if primary.
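
To sweep every local instance at once, the port can be derived from the socket file name (`<pooler-dir>/pg_sockets/.s.PGSQL.<pg-port>`, per the socket note under Individual Component Operations). A minimal sketch, assuming the local data layout described below:

```bash
# Report recovery mode for every local pooler; the pg port is parsed
# from the socket file name (.s.PGSQL.<port>).
for dir in ./multigres_local/data/pooler_*; do
  sock=$(ls "$dir"/pg_sockets/.s.PGSQL.* 2>/dev/null | head -1)
  [ -n "$sock" ] || continue          # skip poolers with no live socket
  port=${sock##*.}
  printf '%s: ' "$(basename "$dir")"
  psql -h "$dir/pg_sockets" -p "$port" -U postgres -d postgres -tAc "SELECT pg_is_in_recovery();"
done
```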

## S3 Backup Testing

Test the backup workflow against a real AWS S3 bucket. When the user wants to test S3 backups:

**Configuration Caching**: When S3 configuration values are first provided, cache them in memory for the duration of the conversation. Reuse these cached values for all subsequent S3 operations. Only re-prompt if:

- The user explicitly asks to change the configuration
- A command fails due to invalid/expired credentials
- The values have never been provided in this conversation

1. **Prompt for S3 configuration** using AskUserQuestion (only if not already cached):
   - Path to AWS credentials file (e.g., `./.staging-aws` or `~/.aws/credentials`)
   - S3 backup URL (e.g., `s3://bucket-name/backups/`)
   - AWS region (e.g., `us-east-1`)

2. **Check/source credentials**:

```bash
# Check if AWS credentials are already set
env | grep AWS_

# If not, source the credentials file (path from user)
source <credentials-file-path>

# Verify credentials are now set
env | grep AWS_
```

**IMPORTANT**:

- NEVER commit AWS credentials files to git
- Avoid printing credentials to the terminal
- The credentials file should export: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN (if using temporary credentials)
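
Since the file is sourced into the shell and the variables must show up in `env`, it is expected to be a plain shell script exporting them. A hypothetical layout (values redacted):

```bash
# Hypothetical credentials file - never commit this to git
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...   # only needed for temporary credentials
```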

3. **Initialize cluster with S3**:

```bash
./bin/multigres cluster stop --clean
rm -rf multigres_local
./bin/multigres cluster init \
  --backup-url=<s3-url-from-user> \
  --region=<region-from-user>
```

4. **Start cluster** (use standard cluster start command)

5. **Verify S3 configuration**:

```bash
grep -r "aws_access_key_id\|aws_secret_access_key\|region\|repo1-s3" ./multigres_local/data/pooler_*/pgbackrest.conf
```

You should see AWS credentials and S3 configuration in every pgbackrest.conf file.
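
A quick way to confirm that every pooler picked up the settings (the count should match the number of poolers):

```bash
# Count pooler configs that contain S3 repository settings
grep -l "repo1-s3" ./multigres_local/data/pooler_*/pgbackrest.conf | wc -l
```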

### Backup Commands

**Create backup**:

```bash
./bin/multigres cluster backup
```

**List all backups**:

```bash
./bin/multigres cluster list-backups
```

**Restore from backup**:

```bash
./bin/multigres cluster restore --backup-label <label>
```
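
Together these commands form a simple end-to-end smoke test:

```bash
# Create a backup, confirm it is listed, then restore it
./bin/multigres cluster backup
./bin/multigres cluster list-backups               # note the new backup's label
./bin/multigres cluster restore --backup-label <label>
```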

### Troubleshooting S3 Issues

**Missing/expired credentials**:

```bash
# Re-source credentials file
source <credentials-file-path>

# Verify they're set
env | grep AWS_ | wc -l  # Should show 3+ environment variables

# Reinitialize cluster to pick up new credentials
./bin/multigres cluster stop --clean
rm -rf multigres_local
./bin/multigres cluster init --backup-url=<s3-url> --region=<region>
```

**Check pgbackrest logs for errors**:

```bash
# View recent errors
tail -100 ./multigres_local/data/pooler_*/pg_data/log/pgbackrest-*.log

# Follow logs in real-time
tail -f ./multigres_local/data/pooler_*/pg_data/log/pgbackrest-*.log
```
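
To surface only problems instead of scrolling full logs (assuming the standard pgbackrest log format, whose lines carry `ERROR`/`WARN` markers):

```bash
# Show the 20 most recent errors/warnings across all poolers
grep -hE "ERROR|WARN" ./multigres_local/data/pooler_*/pg_data/log/pgbackrest-*.log | tail -20
```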

**Verify S3 bucket access**:

```bash
# Use AWS CLI to test bucket access (if installed)
aws s3 ls <s3-bucket-path> --region <region>
```

## Observability Stack

Start the observability stack (Grafana + Prometheus + Loki + Tempo) for metrics, traces, and logs visualization.

**Start cluster with observability**:

```bash
# 1. Start observability stack (separate terminal, runs in foreground)
demo/local/run-observability.sh

# 2. Start cluster with OTel export (separate terminal)
demo/local/multigres-with-otel.sh cluster start --config-path <config-path>
```

**Generate traffic with pgbench**:

```bash
PGPASSWORD=postgres pgbench -h localhost -p 15432 -U postgres -i postgres
PGPASSWORD=postgres pgbench -h localhost -p 15432 -U postgres -c 4 -j 2 -T 300 -P 5 postgres
```
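
The first command initializes the pgbench tables; the second runs 4 clients across 2 worker threads for 300 seconds, printing a progress line every 5 seconds.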

**View telemetry**:

- Grafana Dashboard: <http://localhost:3000/d/multigres-overview>
- Grafana Explore (ad-hoc PromQL): <http://localhost:3000/explore>
- Prometheus UI: <http://localhost:9090>

**Teardown** (stop in this order to avoid OTel export errors):

```bash
# 1. Stop the cluster first
./bin/multigres cluster stop --config-path <config-path>

# 2. Stop the observability stack
docker rm -f multigres-observability
```

**Full restart**:

```bash
# Teardown
./bin/multigres cluster stop --config-path <config-path>
docker rm -f multigres-observability

# Start
demo/local/run-observability.sh          # terminal 1
demo/local/multigres-with-otel.sh cluster start --config-path <config-path>  # terminal 2
```

**Observability ports**:

| Service     | Port |
| ----------- | ---- |
| Grafana     | 3000 |
| OTLP (HTTP) | 4318 |
| Prometheus  | 9090 |
| Loki        | 3100 |
| Tempo       | 3200 |

## Individual Component Operations

### Configuration

1. **Parse the config**: Read `./multigres_local/multigres.yaml` to discover available components and their IDs (see the extraction sketch after this list)

2. **Component ID mapping**:
   - multipooler IDs: extracted from `.provisioner-config.cells.<zone>.multipooler.service-id`
   - pgctld uses the same IDs as multipooler
   - multiorch has separate IDs for each zone
   - multigateway has separate IDs for each zone

3. **If no ID provided**: Use AskUserQuestion to let the user select which instance to operate on
   - Show available IDs with their zone names
   - Example: "xf42rpl6 (zone1)", "hm9hmxzm (zone2)", "n6t8hvgl (zone3)"

### Commands

**Stop pgctld**:

```bash
./bin/pgctld stop --pooler-dir <pooler-dir-from-config>
```

**Start pgctld**:

```bash
./bin/pgctld start --pooler-dir <pooler-dir-from-config>
```

**Restart pgctld (as standby)**:

```bash
./bin/pgctld restart --pooler-dir <pooler-dir-from-config> --as-standby
```

**Check pgctld status**:

```bash
./bin/pgctld status --pooler-dir <pooler-dir-from-config>
```

**View logs**:

- multipooler: `./multigres_local/logs/dbs/postgres/multipooler/<id>.log`
- pgctld: `./multigres_local/logs/dbs/postgres/pgctld/<id>.log`
- multiorch: `./multigres_local/logs/dbs/postgres/multiorch/<id>.log`
- multigateway: `./multigres_local/logs/dbs/postgres/multigateway/<id>.log`
- PostgreSQL: `./multigres_local/data/pooler_<id>/pg_data/postgresql.log`

**Tail logs**:

```bash
tail -f <log-path>
```

**Connect to multipooler** (via Unix socket):

```bash
psql -h <pooler-dir>/pg_sockets -p <pg-port> -U postgres -d postgres
```

Where:

- pooler-dir is from `.provisioner-config.cells.<zone>.multipooler.pooler-dir`
- pg-port is from `.provisioner-config.cells.<zone>.pgctld.pg-port`
- PostgreSQL socket is at `<pooler-dir>/pg_sockets/.s.PGSQL.<pg-port>`

Example:

```bash
psql -h ./multigres_local/data/pooler_xf42rpl6/pg_sockets -p 25432 -U postgres -d postgres
```

**Connect to multigateway** (via TCP):

```bash
psql -h localhost -p <pg-port> -U postgres -d postgres
```

Where:

- pg-port is from `.provisioner-config.cells.<zone>.multigateway.pg-port`

Example:

```bash
psql -h localhost -p 15432 -U postgres -d postgres
```

### Config Paths

Extract pooler directories from the YAML config at `.provisioner-config.cells.<zone>.pgctld.pooler-dir`.

## Examples

**Cluster-wide:**

User: "start the cluster"

- Execute: `./bin/multigres cluster start`

User: "stop cluster"

- Execute: `./bin/multigres cluster stop`

User: "cluster status"

- Execute: `./bin/multigres cluster status`

User: "show me all multipoolers" or "get poolers"

- Execute: `./bin/multigres getpoolers`

User: "check if multipoolers are in recovery" or "check multipooler status"

- Parse config to get all zones and service IDs
- Execute: `./bin/multigres getpoolerstatus --cell <zone> --service-id <id>` for each
- Display pooler_type (PRIMARY/REPLICA) and postgres_role (primary/standby)

User: "check zone1 multipooler status"

- Look up service ID for zone1
- Execute: `./bin/multigres getpoolerstatus --cell zone1 --service-id <id>`

**Observability:**

User: "start cluster with otel" or "start cluster with observability"

- Start `demo/local/run-observability.sh` (if not running)
- Start `demo/local/multigres-with-otel.sh cluster start --config-path <path>`

User: "teardown everything" or "stop everything"

- Stop cluster: `./bin/multigres cluster stop --config-path <path>`
- Stop observability: `docker rm -f multigres-observability`

User: "restart everything" or "full restart"

- Teardown, then start observability + cluster

User: "push traffic" or "generate load"

- Run pgbench init + pgbench with `-P 5` for progress

**Individual components:**

User: "stop pgctld"

- Read config to find available pgctld instances
- Ask user which one to stop (zone1, zone2, or zone3)
- Execute stop command with selected pooler-dir

User: "restart pgctld xf42rpl6 as standby"

- Look up pooler-dir for xf42rpl6 in config
- Execute: `./bin/pgctld restart --pooler-dir /path/to/pooler_xf42rpl6 --as-standby`

User: "logs multipooler hm9hmxzm"

- Show: `./multigres_local/logs/dbs/postgres/multipooler/hm9hmxzm.log`

User: "tail pgctld"

- Ask which instance
- Tail the corresponding log file

User: "connect to multipooler zone1" or "psql multipooler xf42rpl6"

- Look up pooler-dir and pg-port from config
- Show: `psql -h <pooler-dir>/pg_sockets -p <pg-port> -U postgres -d postgres`

User: "connect to multigateway" or "psql multigateway"

- Ask which zone
- Show: `psql -h localhost -p <pg-port> -U postgres -d postgres`