benjaminjackson / exa-research

Use when the user mentions Exa research OR when the workflow benefits from complex, multi-step research and other exa-ai approaches are not yielding satisfactory results.

0 views
0 installs

Skill Content

---
name: exa-research
description: Use when the user mentions Exa research OR when the workflow benefits from complex, multi-step research and other exa-ai approaches are not yielding satisfactory results.
---

# Exa Research Tasks

Manage asynchronous research tasks with exa-ai for complex, multi-step research workflows.

**Use `--help` to see available commands and verify usage before running:**
```bash
exa-ai <command> --help
```

## Working with Complex Shell Commands

When using the Bash tool with complex shell syntax, follow these best practices for reliability:

1. **Run commands directly**: Capture JSON output directly rather than nesting command substitutions
2. **Parse in subsequent steps**: Use `jq` to parse output in a follow-up command if needed
3. **Avoid nested substitutions**: Complex nested `$(...)` can be fragile; break into sequential steps

Example:
```bash
# Less reliable: nested command substitution
results=$(exa-ai research-start --instructions "query" | jq -r '.result')

# More reliable: run directly, then parse
exa-ai research-start --instructions "query"
# Then in a follow-up command if needed:
exa-ai research-get research_id | jq -r '.result'
```

## Cost Optimization

### Pricing
Research is the most expensive Exa endpoint:
- **Agent search**: $0.005 per search operation
- **Standard page read**: $0.005 per page
- **Pro page read**: $0.010 per page (2x standard)
- **Reasoning tokens**: $0.000005 per token

**Cost strategy:**
- **Avoid research unless required**: Most expensive option (2-10x cost premium over other endpoints)
- Use only for autonomous, multi-step reasoning tasks that justify the cost
- For simpler queries, use `search`, `answer`, or `get-contents` instead
- Consider using `exa-research` (standard) instead of `exa-research-pro` unless you need the higher quality

## Research Overview

Research tasks are asynchronous operations that allow you to:
- Run complex, multi-step research workflows
- Process large amounts of information over time
- Monitor progress of long-running research
- Get structured output from comprehensive research

### When to Use Research vs Search

**Use research-start** when:
- The research requires multiple steps or complex reasoning
- You need comprehensive analysis of a topic
- The task will take significant time to complete
- You want structured, synthesized output

**Use search** (from exa-core) when:
- You need immediate results
- The query is straightforward
- You want quick factual information

## Commands

### research-start
Initiate a new research task with instructions.

```bash
exa-ai research-start --instructions "Find the top 10 Ruby performance optimization techniques"
```

For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-start).

### research-get
Check status and retrieve results of a research task.

```bash
exa-ai research-get research_abc123
```

For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-get).

### research-list
List all your research tasks with pagination.

```bash
exa-ai research-list --limit 10
```

For detailed options and examples, consult [REFERENCE.md](REFERENCE.md#research-list).

## Research Models

- **exa-research** (default): Balanced speed and quality
- **exa-research-pro**: Higher quality, more comprehensive results
- **exa-research-fast**: Faster results, good for simpler research

## Quick Examples

### Simple Research
```bash
exa-ai research-start \
  --instructions "Find the latest breakthroughs in quantum computing"
```

### Research with Structured Output
```bash
exa-ai research-start \
  --instructions "Compare TypeScript vs Flow for type checking" \
  --output-schema '{
    "type":"object",
    "properties":{
      "typescript":{
        "type":"object",
        "properties":{
          "pros":{"type":"array","items":{"type":"string"}},
          "cons":{"type":"array","items":{"type":"string"}}
        }
      },
      "flow":{
        "type":"object",
        "properties":{
          "pros":{"type":"array","items":{"type":"string"}},
          "cons":{"type":"array","items":{"type":"string"}}
        }
      }
    }
  }'
```

### Background Research Workflow
```bash
# Start research
research_id=$(exa-ai research-start \
  --instructions "Analyze competitor landscape for project management tools" | jq -r '.research_id')

# Check status later
status=$(exa-ai research-get $research_id | jq -r '.status')

# Get results when complete
if [ "$status" = "completed" ]; then
  exa-ai research-get $research_id | jq -r '.result'
fi
```

### Use Pro Model for Comprehensive Research
```bash
exa-ai research-start \
  --instructions "Comprehensive analysis of microservices vs monolithic architecture with case studies" \
  --model exa-research-pro \
  --events
```

### Shared Requirements

<shared-requirements>

## Schema Design

### MUST: Use object wrapper for schemas

**Applies to**: answer, search, find-similar, get-contents

When using schema parameters (`--output-schema` or `--summary-schema`), always wrap properties in an object:

```json
{"type":"object","properties":{"field_name":{"type":"string"}}}
```

**DO NOT** use bare properties without the object wrapper:
```json
{"properties":{"field_name":{"type":"string"}}}  // ❌ Missing "type":"object"
```

**Why**: The Exa API requires a valid JSON Schema with an object type at the root level. Omitting this causes validation errors.

**Examples**:
```bash
# ✅ CORRECT - object wrapper included
exa-ai search "AI news" \
  --summary-schema '{"type":"object","properties":{"headline":{"type":"string"}}}'

# ❌ WRONG - missing object wrapper
exa-ai search "AI news" \
  --summary-schema '{"properties":{"headline":{"type":"string"}}}'
```

---

## Output Format Selection

### MUST NOT: Mix toon format with jq

**Applies to**: answer, context, search, find-similar, get-contents

`toon` format produces YAML-like output, not JSON. DO NOT pipe toon output to jq for parsing:

```bash
# ❌ WRONG - toon is not JSON
exa-ai search "query" --output-format toon | jq -r '.results'

# ✅ CORRECT - use JSON (default) with jq
exa-ai search "query" | jq -r '.results[].title'

# ✅ CORRECT - use toon for direct reading only
exa-ai search "query" --output-format toon
```

**Why**: jq expects valid JSON input. toon format is designed for human readability and produces YAML-like output that jq cannot parse.

### SHOULD: Choose one output approach

**Applies to**: answer, context, search, find-similar, get-contents

Pick one strategy and stick with it throughout your workflow:

1. **Approach 1: toon only** - Compact YAML-like output for direct reading
   - Use when: Reading output directly, no further processing needed
   - Token savings: ~40% reduction vs JSON
   - Example: `exa-ai search "query" --output-format toon`

2. **Approach 2: JSON + jq** - Extract specific fields programmatically
   - Use when: Need to extract specific fields or pipe to other commands
   - Token savings: ~80-90% reduction (extracts only needed fields)
   - Example: `exa-ai search "query" | jq -r '.results[].title'`

3. **Approach 3: Schemas + jq** - Structured data extraction with validation
   - Use when: Need consistent structured output across multiple queries
   - Token savings: ~85% reduction + consistent schema
   - Example: `exa-ai search "query" --summary-schema '{...}' | jq -r '.results[].summary | fromjson'`

**Why**: Mixing approaches increases complexity and token usage. Choosing one approach optimizes for your use case.

---

## Shell Command Best Practices

### MUST: Run commands directly, parse separately

**Applies to**: monitor, search (websets), research, and all skills using complex commands

When using the Bash tool with complex shell syntax, run commands directly and parse output in separate steps:

```bash
# ❌ WRONG - nested command substitution
webset_id=$(exa-ai webset-create --search '{"query":"..."}' | jq -r '.webset_id')

# ✅ CORRECT - run directly, then parse
exa-ai webset-create --search '{"query":"..."}'
# Then in a follow-up command:
webset_id=$(cat output.json | jq -r '.webset_id')
```

**Why**: Complex nested `$(...)` command substitutions can fail unpredictably in shell environments. Running commands directly and parsing separately improves reliability and makes debugging easier.

### MUST NOT: Use nested command substitutions

**Applies to**: All skills when using complex multi-step operations

Avoid nesting multiple levels of command substitution:

```bash
# ❌ WRONG - deeply nested
result=$(exa-ai search "$(cat query.txt | tr '\n' ' ')" --num-results $(cat config.json | jq -r '.count'))

# ✅ CORRECT - sequential steps
query=$(cat query.txt | tr '\n' ' ')
count=$(cat config.json | jq -r '.count')
exa-ai search "$query" --num-results $count
```

**Why**: Nested command substitutions are fragile and hard to debug when they fail. Sequential steps make each operation explicit and easier to troubleshoot.

### SHOULD: Break complex commands into sequential steps

**Applies to**: All skills when working with multi-step workflows

For readability and reliability, break complex operations into clear sequential steps:

```bash
# ❌ Less maintainable - everything in one line
exa-ai webset-create --search '{"query":"startups","count":1}' | jq -r '.webset_id' | xargs -I {} exa-ai webset-search-create {} --query "AI" --behavior override

# ✅ More maintainable - clear steps
exa-ai webset-create --search '{"query":"startups","count":1}'
webset_id=$(jq -r '.webset_id' < output.json)
exa-ai webset-search-create $webset_id --query "AI" --behavior override
```

**Why**: Sequential steps are easier to understand, debug, and modify. Each step can be verified independently.

</shared-requirements>