| name | vertex-engine-inspector |
| description | Vertex AI Agent Engine runtime inspector and health monitor. Validates deployed agents, Code Execution Sandbox settings, Memory Bank configuration, A2A protocol compliance, and production readiness. Triggers: "inspect agent engine", "validate agent deployment", "check code execution sandbox", "monitor agent health" |
| allowed-tools | Read, Grep, Glob, Bash |
| version | 1.0.0 |
What This Skill Does
Expert inspector for the Vertex AI Agent Engine managed runtime. Performs comprehensive validation of deployed agents including runtime configuration, security posture, performance settings, A2A protocol compliance, and production readiness scoring.
When This Skill Activates
Trigger Phrases
- "Inspect Vertex AI Engine agent"
- "Validate Agent Engine deployment"
- "Check Code Execution Sandbox configuration"
- "Verify Memory Bank settings"
- "Monitor agent health"
- "Agent Engine production readiness"
- "A2A protocol compliance check"
- "Agent Engine security audit"
Use Cases
- Pre-production deployment validation
- Post-deployment health monitoring
- Security compliance audits
- Performance optimization reviews
- Troubleshooting agent issues
- Configuration drift detection
Inspection Categories
1. Runtime Configuration ✅
- Model selection (Gemini 2.5 Pro/Flash)
- Tools enabled (Code Execution, Memory Bank, custom)
- VPC configuration
- Resource allocation
- Scaling policies
2. Code Execution Sandbox 🔒
- Security: Isolated environment, no external network access
- State Persistence: TTL validation (1-14 days)
- IAM: Least privilege permissions
- Performance: Timeout and resource limits
- Concurrent Executions: Max concurrent code runs
Critical Checks:
✅ State TTL between 7-14 days (optimal for production)
✅ Sandbox type is SECURE_ISOLATED
✅ IAM permissions limited to required GCP services only
✅ Timeout configured appropriately
⚠️ State TTL < 7 days may cause premature session loss
❌ State TTL > 14 days not allowed by Agent Engine
3. Memory Bank Configuration 🧠
- Enabled Status: Persistent memory active
- Retention Policy: Max memories, retention days
- Storage Backend: Firestore encryption & region
- Query Performance: Indexing, caching, latency
- Auto-Cleanup: Quota management
Critical Checks:
✅ Max memories >= 100 (prevents conversation truncation)
✅ Indexing enabled (fast query performance)
✅ Auto-cleanup enabled (prevents quota exhaustion)
✅ Encrypted at rest (Firestore default)
⚠️ Low memory limit may truncate long conversations
4. A2A Protocol Compliance 🔗
- AgentCard: Available at
/.well-known/agent-card - Task API:
POST /v1/tasks:sendresponds correctly - Status API:
GET /v1/tasks/{task_id}accessible - Protocol Version: 1.0 compliance
- Required Fields: name, description, tools, version
Compliance Report:
✅ AgentCard accessible and valid
✅ Task submission API functional
✅ Status polling API functional
✅ Protocol version 1.0
❌ Missing AgentCard fields: [...]
❌ Task API not responding (check IAM/networking)
5. Security Posture 🛡️
- IAM Roles: Least privilege validation
- VPC Service Controls: Perimeter protection
- Model Armor: Prompt injection protection
- Encryption: At-rest and in-transit
- Service Account: Proper configuration
- Secret Management: No hardcoded credentials
Security Score:
🟢 SECURE (90-100%): Production ready
🟡 NEEDS ATTENTION (70-89%): Address issues before prod
🔴 INSECURE (<70%): Do not deploy to production
6. Performance Metrics 📊
- Auto-Scaling: Min/max instances configured
- Resource Limits: CPU, memory appropriate
- Latency: P50, P95, P99 within SLOs
- Throughput: Requests per second
- Token Usage: Cost tracking
- Error Rate: < 5% target
Health Status:
🟢 HEALTHY: Error rate < 5%, latency < 3s (p95)
🟡 DEGRADED: Error rate 5-10% or latency 3-5s
🔴 UNHEALTHY: Error rate > 10% or latency > 5s
7. Monitoring & Observability 📈
- Cloud Monitoring: Dashboards configured
- Alerting: Policies for errors, latency, costs
- Logging: Structured logs aggregated
- Tracing: OpenTelemetry enabled
- Error Tracking: Cloud Error Reporting
Observability Score:
✅ All 5 pillars configured: Metrics, Logs, Traces, Alerts, Dashboards
⚠️ Missing alerts for critical scenarios
❌ No monitoring configured (production blocker)
Production Readiness Scoring
Scoring Matrix
| Category | Weight | Checks |
|---|---|---|
| Security | 30% | 6 checks (IAM, VPC-SC, encryption, etc.) |
| Performance | 25% | 6 checks (scaling, limits, SLOs, etc.) |
| Monitoring | 20% | 6 checks (dashboards, alerts, logs, etc.) |
| Compliance | 15% | 5 checks (audit logs, DR, privacy, etc.) |
| Reliability | 10% | 5 checks (multi-region, failover, etc.) |
Overall Readiness Status
🟢 PRODUCTION READY (85-100%)
- All critical checks passed
- Minor optimizations recommended
- Safe to deploy
🟡 NEEDS IMPROVEMENT (70-84%)
- Some important checks failed
- Address issues before production
- Staging deployment acceptable
🔴 NOT READY (<70%)
- Critical failures present
- Do not deploy to production
- Fix blocking issues first
Inspection Workflow
Phase 1: Configuration Analysis
1. Connect to Agent Engine
2. Retrieve agent metadata
3. Parse runtime configuration
4. Extract Code Execution settings
5. Extract Memory Bank settings
6. Document VPC configuration
Phase 2: Protocol Validation
1. Test AgentCard endpoint
2. Validate AgentCard structure
3. Test Task API (POST /v1/tasks:send)
4. Test Status API (GET /v1/tasks/{id})
5. Verify A2A protocol version
Phase 3: Security Audit
1. Review IAM roles and permissions
2. Check VPC Service Controls
3. Validate encryption settings
4. Scan for hardcoded secrets
5. Verify Model Armor enabled
6. Assess service account security
Phase 4: Performance Analysis
1. Query Cloud Monitoring metrics
2. Calculate error rate (last 24h)
3. Analyze latency percentiles
4. Review token usage and costs
5. Check auto-scaling behavior
6. Validate resource limits
Phase 5: Production Readiness
1. Run all checklist items (28 checks)
2. Calculate category scores
3. Calculate overall score
4. Determine readiness status
5. Generate recommendations
6. Create action plan
Tool Permissions
Read-only inspection - Cannot modify configurations:
- Read: Analyze agent configuration files
- Grep: Search for security issues
- Glob: Find related configuration
- Bash: Query GCP APIs (read-only)
Example Inspection Report
Agent ID: gcp-deployer-agent
Deployment Status: RUNNING
Inspection Date: 2025-12-09
Runtime Configuration:
Model: gemini-2.5-flash
Code Execution: ✅ Enabled (TTL: 14 days)
Memory Bank: ✅ Enabled (retention: 90 days)
VPC: ✅ Configured (private-vpc-prod)
A2A Protocol Compliance:
AgentCard: ✅ Valid
Task API: ✅ Functional
Status API: ✅ Functional
Protocol Version: 1.0
Security Posture:
IAM: ✅ Least privilege (score: 95%)
VPC-SC: ✅ Enabled
Model Armor: ✅ Enabled
Encryption: ✅ At-rest & in-transit
Overall: 🟢 SECURE (92%)
Performance Metrics (24h):
Request Count: 12,450
Error Rate: 2.3% 🟢
Latency (p95): 1,850ms 🟢
Token Usage: 450K tokens
Cost Estimate: $12.50/day
Production Readiness:
Security: 92% (28/30 points)
Performance: 88% (22/25 points)
Monitoring: 95% (19/20 points)
Compliance: 80% (12/15 points)
Reliability: 70% (7/10 points)
Overall Score: 87% 🟢 PRODUCTION READY
Recommendations:
1. Enable multi-region deployment (reliability +10%)
2. Configure automated backups (compliance +5%)
3. Add circuit breaker pattern (reliability +5%)
4. Optimize memory bank indexing (performance +3%)
Integration with Other Plugins
Works with jeremy-adk-orchestrator
- Orchestrator deploys agents
- Inspector validates deployments
- Feedback loop for optimization
Works with jeremy-vertex-validator
- Validator checks code before deployment
- Inspector validates runtime after deployment
- Complementary pre/post checks
Works with jeremy-adk-terraform
- Terraform provisions infrastructure
- Inspector validates provisioned agents
- Ensures IaC matches runtime
Troubleshooting Guide
Issue: Agent not responding
Inspector checks:
- VPC configuration allows traffic
- IAM permissions correct
- Agent Engine status is RUNNING
- No quota limits exceeded
Issue: High error rate
Inspector checks:
- Model configuration appropriate
- Resource limits not exceeded
- Code Execution sandbox not timing out
- Memory Bank not quota-exhausted
Issue: Slow response times
Inspector checks:
- Auto-scaling configured
- Code Execution TTL appropriate
- Memory Bank indexing enabled
- Caching strategy implemented
Version History
- 1.0.0 (2025): Initial release with Agent Engine GA support, Code Execution Sandbox, Memory Bank, A2A protocol validation
References
- Agent Engine: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview
- Code Execution: https://cloud.google.com/agent-builder/agent-engine/code-execution/overview
- Memory Bank: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview
- A2A Protocol: https://google.github.io/adk-docs/a2a/