itsmostafa / bedrock

---
name: bedrock
description: AWS Bedrock foundation models for generative AI. Use when invoking foundation models, building AI applications, creating embeddings, configuring model access, or implementing RAG patterns.
last_updated: "2026-01-07"
doc_source: https://docs.aws.amazon.com/bedrock/latest/userguide/
---

# AWS Bedrock

Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

## Table of Contents

- [Core Concepts](#core-concepts)
- [Common Patterns](#common-patterns)
- [CLI Reference](#cli-reference)
- [Best Practices](#best-practices)
- [Troubleshooting](#troubleshooting)
- [References](#references)

## Core Concepts

### Foundation Models

Pre-trained models available through Bedrock:
- **Claude** (Anthropic): Text generation, analysis, coding
- **Titan** (Amazon): Text, embeddings, image generation
- **Llama** (Meta): Open-weight text generation
- **Mistral**: Efficient text generation
- **Stable Diffusion** (Stability AI): Image generation

### Model Access

Models must be enabled in your account before use:
- Request access in Bedrock console
- Some models require acceptance of EULAs
- Access is region-specific

### Inference Types

| Type | Use Case | Pricing |
|------|----------|---------|
| **On-Demand** | Variable workloads | Per token |
| **Provisioned Throughput** | Consistent high-volume | Hourly commitment |
| **Batch Inference** | Async large-scale | Discounted per token |
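
Comparing on-demand options comes down to simple per-token arithmetic. The sketch below illustrates the calculation only; the prices in the usage example are placeholders, not real Bedrock rates (see the pricing page for current numbers):

```python
def estimate_on_demand_cost(input_tokens, output_tokens,
                            input_price_per_1k, output_price_per_1k):
    """Estimate on-demand cost from token counts and per-1K-token prices."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Usage (prices are hypothetical placeholders)
cost = estimate_on_demand_cost(1000, 500, 0.003, 0.015)
```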

## Common Patterns

### Invoke Model (Text Generation)

**AWS CLI:**

```bash
# Invoke Claude (AWS CLI v2 treats --body as base64 by default,
# so pass --cli-binary-format to send raw JSON)
aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
  --content-type application/json \
  --accept application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
    ]
  }' \
  response.json

jq -r '.content[0].text' response.json
```

**boto3:**

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def invoke_claude(prompt, max_tokens=1024):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': max_tokens,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']

# Usage
response = invoke_claude('What is Amazon S3?')
print(response)
```

### Streaming Response

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def stream_claude(prompt):
    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    for event in response['body']:
        chunk = json.loads(event['chunk']['bytes'])
        if chunk['type'] == 'content_block_delta':
            yield chunk['delta'].get('text', '')

# Usage
for text in stream_claude('Write a haiku about cloud computing.'):
    print(text, end='', flush=True)
```

### Generate Embeddings

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def get_embedding(text):
    response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': text,
            'dimensions': 1024,
            'normalize': True
        })
    )

    result = json.loads(response['body'].read())
    return result['embedding']

# Usage
embedding = get_embedding('AWS Lambda is a serverless compute service.')
print(f'Embedding dimension: {len(embedding)}')
```
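
Embeddings like these are typically compared with cosine similarity, e.g. to rank document chunks against a query in a RAG pipeline. A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Note that Titan embeddings requested with `'normalize': True` are unit-length, so the dot product alone is equivalent; the full formula above works for any vectors.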

### Conversation with History

```python
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

class Conversation:
    def __init__(self, system_prompt=None):
        self.messages = []
        self.system = system_prompt

    def chat(self, user_message):
        self.messages.append({
            'role': 'user',
            'content': user_message
        })

        body = {
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': self.messages
        }

        if self.system:
            body['system'] = self.system

        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps(body)
        )

        result = json.loads(response['body'].read())
        assistant_message = result['content'][0]['text']

        self.messages.append({
            'role': 'assistant',
            'content': assistant_message
        })

        return assistant_message

# Usage
conv = Conversation(system_prompt='You are an AWS solutions architect.')
print(conv.chat('What database should I use for a chat application?'))
print(conv.chat('What about for time-series data?'))
```
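
Bedrock also offers the Converse API, which uses one request shape across providers instead of the Anthropic-specific body above. A sketch (the lazy boto3 import keeps the message builder usable without an AWS session; the model ID is the same one used throughout this document):

```python
def build_converse_messages(history):
    """Convert (role, text) pairs into the Converse API message format."""
    return [
        {'role': role, 'content': [{'text': text}]}
        for role, text in history
    ]

def converse(prompt, model_id='anthropic.claude-3-sonnet-20240229-v1:0'):
    import boto3  # imported lazily; requires AWS credentials and model access
    client = boto3.client('bedrock-runtime')
    response = client.converse(
        modelId=model_id,
        messages=build_converse_messages([('user', prompt)]),
        inferenceConfig={'maxTokens': 1024},
    )
    return response['output']['message']['content'][0]['text']
```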

### List Available Models

```bash
# List all foundation models
aws bedrock list-foundation-models \
  --query 'modelSummaries[*].[modelId,modelName,providerName]' \
  --output table

# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query 'modelSummaries[*].modelId'

# Get model details
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
```

### Request Model Access

```bash
# List agreement offers for a model (this only lists offers;
# access itself is granted via Model access in the Bedrock console)
aws bedrock list-foundation-model-agreement-offers \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0
```

## CLI Reference

### Bedrock (Control Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock list-foundation-models` | List available models |
| `aws bedrock get-foundation-model` | Get model details |
| `aws bedrock list-custom-models` | List fine-tuned models |
| `aws bedrock create-model-customization-job` | Start fine-tuning |
| `aws bedrock list-provisioned-model-throughputs` | List provisioned capacity |

### Bedrock Runtime (Data Plane)

| Command | Description |
|---------|-------------|
| `aws bedrock-runtime invoke-model` | Invoke model synchronously |
| `aws bedrock-runtime invoke-model-with-response-stream` | Invoke with streaming |
| `aws bedrock-runtime converse` | Multi-turn conversation API |
| `aws bedrock-runtime converse-stream` | Streaming conversation |

### Bedrock Agent Runtime

| Command | Description |
|---------|-------------|
| `aws bedrock-agent-runtime invoke-agent` | Invoke a Bedrock agent |
| `aws bedrock-agent-runtime retrieve` | Query knowledge base |
| `aws bedrock-agent-runtime retrieve-and-generate` | RAG query |
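
The `retrieve-and-generate` RAG call can be sketched in boto3 as below. The knowledge base ID and model ARN are placeholders you must supply; the request shape follows the Bedrock Agent Runtime API, and the lazy boto3 import keeps the request builder usable without AWS access:

```python
def build_rag_request(query, knowledge_base_id, model_arn):
    """Build a retrieve-and-generate request body."""
    return {
        'input': {'text': query},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
            },
        },
    }

def rag_query(query, knowledge_base_id, model_arn):
    import boto3  # imported lazily; requires AWS credentials
    client = boto3.client('bedrock-agent-runtime')
    response = client.retrieve_and_generate(
        **build_rag_request(query, knowledge_base_id, model_arn)
    )
    return response['output']['text']
```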

## Best Practices

### Cost Optimization

- **Use appropriate models**: Smaller models for simple tasks
- **Set max_tokens**: Limit output length when possible
- **Cache responses**: For repeated identical queries
- **Batch when possible**: Use batch inference for bulk processing
- **Monitor usage**: Set up CloudWatch alarms for cost
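
The caching bullet above can be sketched as a small in-memory wrapper keyed on a hash of the request; it wraps any invoke function, so a production setup might swap the dict for Redis or DynamoDB with a TTL:

```python
import hashlib
import json

class ResponseCache:
    """Cache identical (model_id, body) invocations in memory."""

    def __init__(self, invoke_fn):
        self._invoke = invoke_fn
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def _key(self, model_id, body):
        # sort_keys makes equivalent bodies hash identically
        payload = json.dumps({'model': model_id, 'body': body}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def invoke(self, model_id, body):
        key = self._key(model_id, body)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self._invoke(model_id, body)
        self._cache[key] = result
        return result
```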

### Performance

- **Use streaming**: For better user experience with long outputs
- **Connection pooling**: Reuse boto3 clients
- **Regional deployment**: Use closest region to reduce latency
- **Provisioned throughput**: For consistent high-volume workloads

### Security

- **Least privilege IAM**: Only grant needed model access
- **VPC endpoints**: Keep traffic private
- **Guardrails**: Implement content filtering
- **Audit with CloudTrail**: Track model invocations

### IAM Permissions

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    }
  ]
}
```

## Troubleshooting

### AccessDeniedException

**Causes:**
- Model access not enabled in console
- IAM policy missing `bedrock:InvokeModel`
- Wrong model ID or region

**Debug:**

```bash
# Check model access status
aws bedrock list-foundation-models \
  --query 'modelSummaries[?modelId==`anthropic.claude-3-sonnet-20240229-v1:0`]'

# Test IAM permissions
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/my-role \
  --action-names bedrock:InvokeModel \
  --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
```

### ModelNotReadyException

**Cause:** Model is still being provisioned or temporarily unavailable.

**Solution:** Implement retry with exponential backoff:

```python
import json
import time
from botocore.exceptions import ClientError

def invoke_with_retry(bedrock, body, max_retries=3):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps(body)
            )
        except ClientError as e:
            if e.response['Error']['Code'] == 'ModelNotReadyException':
                time.sleep(2 ** attempt)
            else:
                raise
    raise Exception('Max retries exceeded')
```

### ThrottlingException

**Causes:**
- Exceeded on-demand quota
- Too many concurrent requests

**Solutions:**
- Request quota increase
- Implement exponential backoff
- Consider provisioned throughput
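
Backoff is usually combined with jitter so that throttled clients don't retry in lockstep. A minimal sketch of the "full jitter" strategy (delay drawn uniformly from zero up to a capped exponential):

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter delay in seconds: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

In practice you would `time.sleep(backoff_delay(attempt))` inside the retry loop shown under ModelNotReadyException below.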

### ValidationException

**Common issues:**
- Invalid model ID
- Malformed request body
- max_tokens exceeds model limit

**Debug:**

```bash
# Check model-specific requirements
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
  --query 'modelDetails.inferenceTypesSupported'
```

## References

- [Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/)
- [Bedrock API Reference](https://docs.aws.amazon.com/bedrock/latest/APIReference/)
- [Bedrock Runtime API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Operations_Amazon_Bedrock_Runtime.html)
- [Model Parameters](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html)
- [Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)