benchflow-ai / trivy-offline-vulnerability-scanning

Use Trivy vulnerability scanner in offline mode to discover security vulnerabilities in dependency files. This skill covers setting up offline scanning, executing Trivy against package lock files, and generating JSON vulnerability reports without requiring internet access.

0 views
0 installs

Skill Content

---
name: trivy-offline-vulnerability-scanning
description: Use Trivy vulnerability scanner in offline mode to discover security vulnerabilities in dependency files. This skill covers setting up offline scanning, executing Trivy against package lock files, and generating JSON vulnerability reports without requiring internet access.
---

# Trivy Offline Vulnerability Scanning

This skill provides guidance on using Trivy, an open-source security scanner, to discover vulnerabilities in software dependencies using offline mode.

## Overview

Trivy is a comprehensive vulnerability scanner that can analyze various targets including container images, filesystems, and dependency lock files. **Offline scanning** is crucial for:
- Air-gapped environments without internet access
- Reproducible security audits with fixed vulnerability databases
- Faster CI/CD pipelines avoiding network latency
- Compliance requirements for controlled environments

## Why Offline Mode?

### Challenges with Online Scanning
- Network dependency introduces failure points
- Database updates can cause inconsistent results across runs
- Slower execution due to download times
- Security policies may restrict external connections

### Benefits of Offline Scanning
- **Reproducibility**: Same database = same results
- **Speed**: No network overhead
- **Reliability**: No external dependencies
- **Compliance**: Works in restricted environments

## Trivy Database Structure

Trivy's vulnerability database consists of:
- **trivy.db**: SQLite database containing CVE information
- **metadata.json**: Database version and update timestamp

Database location: `<cache-dir>/db/trivy.db`

## Offline Scanning Workflow

### Step 1: Verify Database Existence

Before scanning, ensure the offline database is available:

```python
import os
import sys

TRIVY_CACHE_PATH = './trivy-cache'

# Check for database file
db_path = os.path.join(TRIVY_CACHE_PATH, "db", "trivy.db")
if not os.path.exists(db_path):
    print(f"[!] Error: Trivy database not found at {db_path}")
    print("    Download database first with:")
    print(f"    trivy image --download-db-only --cache-dir {TRIVY_CACHE_PATH}")
    sys.exit(1)
```

### Step 2: Construct Trivy Command

Key flags for offline scanning:

| Flag | Purpose |
|------|---------|
| `fs <target>` | Scan filesystem/file (e.g., package-lock.json) |
| `--format json` | Output in JSON format for parsing |
| `--output <file>` | Save results to file |
| `--scanners vuln` | Scan only for vulnerabilities (not misconfigs) |
| `--skip-db-update` | **Critical**: Do not update database |
| `--offline-scan` | Enable offline mode |
| `--cache-dir <path>` | Path to pre-downloaded database |

```python
import subprocess

TARGET_FILE = 'package-lock.json'
OUTPUT_FILE = 'trivy_report.json'
TRIVY_CACHE_PATH = './trivy-cache'

command = [
    "trivy", "fs", TARGET_FILE,
    "--format", "json",
    "--output", OUTPUT_FILE,
    "--scanners", "vuln",
    "--skip-db-update",      # Prevent online updates
    "--offline-scan",         # Enable offline mode
    "--cache-dir", TRIVY_CACHE_PATH
]
```

### Step 3: Execute Scan

```python
try:
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=False  # Don't raise exception on non-zero exit
    )
    
    if result.returncode != 0:
        print("[!] Trivy scan failed:")
        print(result.stderr)
        sys.exit(1)
    
    print("[*] Scan completed successfully")
    print(f"[*] Results saved to: {OUTPUT_FILE}")
    
except FileNotFoundError:
    print("[!] Error: 'trivy' command not found")
    print("    Install Trivy: https://aquasecurity.github.io/trivy/latest/getting-started/installation/")
    sys.exit(1)
```

## Complete Example

```python
import os
import sys
import subprocess

def run_trivy_offline_scan(target_file, output_file, cache_dir='./trivy-cache'):
    """
    Execute Trivy vulnerability scan in offline mode.
    
    Args:
        target_file: Path to file to scan (e.g., package-lock.json)
        output_file: Path to save JSON results
        cache_dir: Path to Trivy offline database
    """
    print(f"[*] Starting Trivy offline scan...")
    print(f"    Target: {target_file}")
    print(f"    Database: {cache_dir}")
    
    # Verify database exists
    db_path = os.path.join(cache_dir, "db", "trivy.db")
    if not os.path.exists(db_path):
        print(f"[!] Error: Database not found at {db_path}")
        sys.exit(1)
    
    # Build command
    command = [
        "trivy", "fs", target_file,
        "--format", "json",
        "--output", output_file,
        "--scanners", "vuln",
        "--skip-db-update",
        "--offline-scan",
        "--cache-dir", cache_dir
    ]
    
    # Execute
    try:
        result = subprocess.run(command, capture_output=True, text=True)
        
        if result.returncode != 0:
            print("[!] Scan failed:")
            print(result.stderr)
            sys.exit(1)
        
        print("[*] Scan completed successfully")
        return output_file
        
    except FileNotFoundError:
        print("[!] Trivy not found. Install from:")
        print("    https://aquasecurity.github.io/trivy/")
        sys.exit(1)

# Usage
if __name__ == "__main__":
    run_trivy_offline_scan(
        target_file='package-lock.json',
        output_file='trivy_report.json'
    )
```

## JSON Output Structure

Trivy outputs vulnerability data in this format:

```json
{
  "Results": [
    {
      "Target": "package-lock.json",
      "Vulnerabilities": [
        {
          "VulnerabilityID": "CVE-2021-44906",
          "PkgName": "minimist",
          "InstalledVersion": "1.2.5",
          "FixedVersion": "1.2.6",
          "Severity": "CRITICAL",
          "Title": "Prototype Pollution in minimist",
          "PrimaryURL": "https://avd.aquasec.com/nvd/cve-2021-44906",
          "CVSS": {
            "nvd": { "V3Score": 9.8 }
          }
        }
      ]
    }
  ]
}
```

## Common Issues

### Issue: "failed to initialize DB"
**Cause**: Database not found or corrupted  
**Solution**: Re-download database or check `--cache-dir` path

### Issue: Scan finds no vulnerabilities when they exist
**Cause**: Database is outdated  
**Solution**: Download newer database (before going offline)

### Issue: "command not found: trivy"
**Cause**: Trivy not installed or not in PATH  
**Solution**: Install Trivy following official documentation

## Dependencies

### Required Tools
- **Trivy**: Version 0.40.0 or later recommended
  ```bash
  # Installation (example for Debian/Ubuntu)
  wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | apt-key add -
  echo "deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | tee -a /etc/apt/sources.list.d/trivy.list
  apt-get update
  apt-get install trivy
  ```

### Python Modules
- `subprocess` (standard library)
- `os` (standard library)
- `sys` (standard library)

## References

- [Trivy Documentation](https://aquasecurity.github.io/trivy/)
- [Trivy Offline Mode Guide](https://aquasecurity.github.io/trivy/latest/docs/advanced/air-gap/)
- [CVE Database](https://cve.mitre.org/)