Agent in the Loop: Architecture for Highload Data Pipeline Recovery [ukr]
A real-world-inspired architecture talk about embedding an AI agent into the operational workflow of a highload data pipeline. We walk through a cascade failure scenario: corrupted data enters the pipeline, Kafka queues get stuck, storage pressure grows, thousands of Kubernetes pods start failing and rescheduling, etcd degrades, and PostgreSQL becomes a secondary pressure point.
Then we show how an agent built with AWS Bedrock AgentCore, LangChain, and MCP/Gateway could detect early signals, isolate corrupted messages, suggest human-approved fixes, protect cluster stability, and turn noisy telemetry into actionable recovery steps.
Kyrylo Dubovyk
AI Solutions Architect at EPAM Systems | Founder of “Digital Brain”
- Over 12 years of professional experience at the intersection of business, operations, automation, and enterprise architecture; has focused on practical AI engineering since 2022
- Designs enterprise GenAI solutions: LLM platforms, RAG/GraphRAG, agentic systems, MCP integrations, LLMOps, governance, observability, and Responsible AI
- At EPAM, works on AI reference architectures, reusable blueprints, and production-ready approaches for the secure deployment of AI in Azure/AWS/Kubernetes environments
- Author and founder of the Digital Brain pet project, a personal AI system with memory, a knowledge graph, pattern extraction, and a multi-agent architecture
- Believes that the true value of AI lies not in the “magic of models,” but in systems that combine business context, engineering reliability, and real benefits for people
Maksym Borodin
Systems Architect @ EPAM
- 20+ years in IT: from C++ developer to systems architect, from nuclear power plant safety control systems to building an eGovernment private cloud platform
- Specializes in private and public cloud technologies, with a focus on AWS
- Has developed systems for processing massive amounts of data
- AI enthusiast