DevOps
hyperpod-nccl
Diagnose NCCL failures and related training issues on HyperPod GPU clusters (EKS or Slurm): training hangs, AllReduce timeouts, EFA or libfabric errors, rendezvous failures, EFA TC…
DevOps
aws-serverless-production-readiness
Assess AWS Lambda serverless workloads for production readiness covering roles, event sources, retries, DLQs, concurrency, idempotency, observability, deployment safety, performanc…
Engineering
to-be-architecture
Design target AWS architecture covering ECS Fargate/EKS/Serverless selection, VPC topology with public-private-isolated subnets, database choice (RDS/Aurora/DynamoDB), IAM boundari…
Engineering
cloud-architect
Senior Cloud Architect specializing in AWS, Azure, and GCP multi-cloud strategies with expertise in cost optimization, infrastructure design, and enterprise cloud migration. Use wh…
DevOps
hyperpod-node-debugger
Diagnose and remediate per-node issues on HyperPod clusters (EKS or Slurm): unhealthy or unresponsive nodes, on-node EFA, GPU hardware faults (XID, ECC, NVLink), Slurm node states,…
DevOps
oci-multi-cloud-architect
Design and review multi-cloud architectures connecting Oracle Cloud Infrastructure with AWS, Azure, Google Cloud, on-premises, or SaaS using VPN, FastConnect, Direct Connect, Expre…
DevOps
oma-tf-infra
Infrastructure-as-code specialist for multi-cloud provisioning using Terraform across AWS, GCP, Azure, and Oracle Cloud. Use for terraform plan/apply, state management, compute, da…
DevOps
terrashark
Prevent Terraform/OpenTofu hallucinations by diagnosing and fixing failure modes: identity churn, secret exposure, blast-radius mistakes, CI drift, and compliance gate gaps. Use wh…
Engineering
aws-network-architect
Designs, reviews, and troubleshoots AWS network, hybrid, and multi-cloud connectivity across VPCs, Transit Gateway, Direct Connect, VPN, Cloud WAN, Route 53 Resolver, private DNS, …
Design
aws-drawio-architecture-diagrams
Generates professional AWS architecture diagrams in draw.io XML format using official AWS Architecture Icons. Handles VPC layouts, multi-tier architectures, serverless designs, net…
DevOps
hyperpod-slurm-debugger
Diagnose Slurm scheduler and node-daemon issues on HyperPod Slurm clusters: nodes stuck down or drained, unexpected reboots, slurmd failures, jobs stuck PENDING or COMPLETING, inco…
DevOps
aws-observability-and-cost-readiness
Wires AWS observability and cost readiness for workloads or accounts after runtime decisions. Produces CloudWatch resources, ADOT and X-Ray tracing, SLO dashboards, Cost Explorer, …
DevOps
aws-cost-optimization-governor
Reviews AWS cost optimization and FinOps posture across Cost Explorer, Budgets, Cost Optimization Hub, Compute Optimizer, Savings Plans, Reserved Instances, tagging, showback, idle…
DevOps
debugging-lambda-timeouts
Debug AWS Lambda timeout failures by analyzing function configuration, CloudWatch logs and metrics, VPC/networking, cold starts, memory constraints, and downstream dependencies. Us…
DevOps
aws-serverless-rollout-corrector
Update AWS serverless rollout definitions for Lambda, API Gateway, EventBridge, SQS, SNS, event wiring, aliases, versions, and deployment configuration. Apply to repository-level r…
Engineering
aws-transform
Perform code upgrades, migrations, and transformations using the AWS Transform CLI. Handles language upgrades, SDK and framework migrations, performance optimization, x86-to-Gravit…
DevOps
oraclecloud-migration-deep-dive
Migrate workloads from AWS or Azure to OCI — IAM translation, networking mapping, compute image import, and data migration. Use when planning an AWS-to-OCI or Azure-to-OCI migratio…
Security
offensive-cloud
Cloud security methodology covering credential harvesting from IMDS and instance roles, enumeration with cloud-native tools, IAM privilege escalation, persistence via backdoor iden…
DevOps
aws-data-protection-backup-steward
Reviews AWS backup and data protection implementation across AWS Backup, EBS/RDS/EFS/S3 recovery patterns, vaults, vault lock, retention, encryption, cross-account/cross-Region cop…
DevOps
aws-observability
Builds, configures, debugs, and optimizes AWS observability using CloudWatch, X-Ray, CloudTrail, and ADOT. Covers Logs Insights queries, alarm configuration, dashboard design, custo…
DevOps
launching-ec2-instance-with-best-practices
Launches EC2 instances using secure, cost-efficient defaults: AMI selection, burstable sizing, least-privilege IAM, hardened security groups, encrypted EBS, and tagging. Follows AW…
DevOps
aws-workload-runtime-and-deployment
Select AWS compute primitives and design load balancing, autoscaling, and deployment after network and identity foundations exist. Produces compute choices, ALB/NLB posture, autosc…
DevOps
hyperpod-ssm
Execute remote commands and transfer files on HyperPod cluster nodes via AWS Systems Manager. Primary interface for node access when direct SSH is unavailable; supports diagnostics…
DevOps
aws-observability-incident-responder
Investigates broad AWS incidents and observability gaps using CloudWatch metrics, logs, alarms, traces, EventBridge events, service health, runbooks, timelines, blast radius, root-…
DevOps
aws-waf-reliability-review
Review AWS workloads against the Well-Architected Framework Reliability Pillar. Covers quotas, architecture, change management, backup/DR strategy, and failure isolation for availa…
Security
aws-vault-mfa-iam
Diagnose IAM API failures when MFA policies block requests and configure aws-vault. Handles cases where sts:GetCallerIdentity succeeds but iam:* calls return InvalidClientTokenId, …
DevOps
hyperpod-performance-debugger
Diagnose performance issues on HyperPod clusters: uneven NCCL bandwidth, poor filesystem throughput, host-side signals such as Xid, ECC, NVLink, EFA reachability, and FSx saturatio…
Data
aws-rds-aurora-performance-investigator
Investigates Amazon RDS and Aurora incidents involving latency, connection exhaustion, slow queries, lock waits, storage pressure, CPU/I/O saturation, replica lag, failover behavio…
DevOps
aws-waf-cost-optimization-review
Review AWS workloads against the Well-Architected Framework Cost Optimization Pillar. Covers visibility, tagging, commitments, rightsizing, Spot adoption, and idle resources for sp…
DevOps
connecting-lambda-to-api-gateway
Connect an existing AWS Lambda function to Amazon API Gateway by creating REST or HTTP APIs with resource/method setup, proxy integration, permissions, and deployment. Handles CORS…
Security
secrets
Audit codebases for leaked secrets and hardcoded credentials. Generate .env templates and configure management with AWS Secrets Manager, Vault, Doppler, or GCP Secret Manager, incl…
Security
node-aws-security-audit
Perform comprehensive security audits on Node.js, JavaScript, and TypeScript codebases. Scans for OWASP Top 10 vulnerabilities, insecure patterns, dependency risks, and generates a…
DevOps
aws-cloudformation-vpc
AWS CloudFormation patterns for VPC infrastructure including VPCs, subnets, route tables, NAT gateways, and internet gateways. Supports parameters, outputs, mappings, conditions, a…
DevOps
agent-cloud-architect
Expert cloud architect for multi-cloud strategies, scalable architectures, and cost-effective solutions. Masters AWS, Azure, and GCP with focus on security, performance, and resili…
DevOps
aws-sdk-python-usage
AWS SDK for Python (boto3/botocore) patterns. Covers client and resource creation, session configuration, error handling, paginators, waiters, S3 transfers, DynamoDB operations, an…
Security
aws-private-ca-issuer-review
Reviews AWS ACM Private CA issuer configurations for cert-manager, including AWSPCAIssuer, AWSPCAClusterIssuer, IRSA policy, certificate template ARNs, CRL configuration, and cross…
DevOps
hyperpod-version-checker
Check and compare software versions on HyperPod nodes: NVIDIA drivers, CUDA, cuDNN, NCCL, EFA, AWS OFI NCCL, GDRCopy, MPI, Neuron SDK, Python, and PyTorch. Detect mismatches and ve…
DevOps
alchemy-deploy-integration
Deploys Alchemy-powered Web3 applications to Vercel, Cloud Run, and AWS. Use when deploying dApps with server-side Alchemy SDK access, configuring API key secrets, or setting up RP…
DevOps
hyperpod-cluster-debugger
Diagnose and fix HyperPod (EKS or Slurm) cluster issues including deployment failures, EFA health, node replacement, CloudFormation errors, and autoscaler conflicts. Includes pre-f…
DevOps
aws-ticket-triage-escalation-coordinator
Triage AWS tickets and alerts using priority, ownership, evidence, context, escalation paths, OpsCenter, health signals, and safe next steps. Apply for non-destructive coordination…
DevOps
deploy-pipeline
Sets up CI/CD pipelines and deployment workflows across providers including GitHub Actions, Vercel, Railway, Fly.io, AWS, and VPS environments, with secrets management and automate…
DevOps
aws-sdk-java-v2-s3
Amazon S3 development patterns using AWS SDK for Java 2.x. Covers bucket operations, object uploads and downloads, multipart transfers, presigned URLs, Transfer Manager, and client…
DevOps
cloud
Create cloud provider architecture diagrams using PlantUML syntax with official AWS, Azure, GCP, and Alibaba Cloud service icons. Best for multi-service cloud topologies and migrat…
Business
distribution-channels
Plans product distribution via marketplaces, app stores, or third-party platforms including marketplace listings, Figma plugins, Chrome extensions, AWS Marketplace, Shopify apps, a…
DevOps
aws-dr-and-multi-region-readiness
Designs and rehearses AWS disaster-recovery and multi-region posture including multi-AZ baseline, tier-driven topology, cross-region replication, Route 53 failover, and measured RP…
Security
aws-penetration-testing
Delivers techniques for penetration testing AWS environments, covering IAM enumeration, privilege escalation, SSRF to metadata, S3 exploitation, Lambda code extraction, and persist…
DevOps
bedrock-agentcore-deployment
Deploys production AI agents using Amazon Bedrock AgentCore patterns. Includes starter toolkit, direct code deploy, container deployment, CI/CD pipelines, and infrastructure-as-cod…
AI / ML
bedrock-agentcore
Builds, deploys, and operates production AI agents on Amazon Bedrock AgentCore. Covers Runtime, Gateway, Browser, Code Interpreter, and Identity services for agent development and …
Security
aws-s3-data-perimeter-governor
Reviews Amazon S3 data perimeter and exposure posture including Block Public Access, policies, encryption, replication, logging, and cross-account access. Prefer for S3 data exposu…
Security
aws-security-posture-hardening
Reviews AWS security across Security Hub, GuardDuty, Inspector, Macie, Config, CloudTrail, IAM policies, and exposure findings. Includes remediation guidance and compliance evidenc…
Data
ingesting-into-data-lake
Import data into an AWS data lake from S3, JDBC databases, Redshift, Snowflake, BigQuery, DynamoDB, or Glue tables. Supports one-time loads, recurring pipelines, and migrations to …
Engineering
cloud-architect
Expert cloud architect specializing in AWS/Azure/GCP multi-cloud infrastructure design, advanced IaC (Terraform/OpenTofu/CDK), FinOps cost optimization, and modern architectural pa…
Automation
aws-non-destructive-task-automation-advisor
Designs AWS non-destructive task automation using EventBridge, Step Functions, Lambda, Systems Manager Automation, SNS, SQS, approvals, notifications, reporting, and evidence gathe…
Security
aws-waf-security-review
Review AWS workloads against the Well-Architected Framework Security Pillar: identity foundations, detective controls, infrastructure protection, data protection, and incident resp…
Data
exploring-data-catalog
Inventory and audit AWS Glue Data Catalog assets across S3 Tables, Redshift, and Iceberg catalogs. Use for catalog-wide discovery, not for locating specific datasets or running que…
DevOps
github-actions-matrix-orchestrator
Generates GitHub Actions matrix strategies via the REST API and workflow dispatch events. Supports conditional job inclusion and OIDC token federation for cross-account AWS deploym…
DevOps
serverless-patterns-advanced
Applies advanced serverless patterns: Lambda idempotency with persistence layers, cost modeling versus containers, and CloudWatch Logs Insights queries for cold starts, duration, and er…
DevOps
aws-resilience-bcdr-review
Reviews AWS resilience and business continuity strategy across RTO/RPO, dependency maps, multi-AZ, multi-Region, failover/failback, game days, runbooks, drift, and recovery validat…
DevOps
creating-ec2-image-builder-pipeline
Sets up a complete EC2 Image Builder pipeline that creates custom AMIs, distributes to target regions, and generates launch templates with IAM roles and infrastructure configuratio…
DevOps
serverless-patterns
Implements serverless patterns on AWS including cold start optimization, event source mapping, Step Functions, Lambda Powertools, idempotency, cost modeling, and X-Ray observabilit…
Showing the top 60 of 529.