Universal Image MCP - One Server, Three AI Image Providers

· 5 min read
Manu Mishra
Distinguished Solutions Architect, Author & Researcher in AI & Cloud

Universal Image MCP Architecture

I use AI-generated images constantly—architecture diagrams, UI mockups, interface concepts, responsive design visualizations. Different models excel at different tasks, and I often compare outputs across providers to pick the best result.

The existing MCP servers I found had a common limitation: hardcoded model IDs. When providers deprecated models, tools broke. When new models launched, they weren't available until someone updated the code.

So I built Universal Image MCP—a server that fetches models dynamically from provider APIs. New models appear automatically. Deprecated ones disappear. No code changes required.
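The core idea can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the response shape and field names below are hypothetical stand-ins for a provider's model-listing endpoint, which a real server would fetch over HTTPS at startup or on a refresh interval.

```python
import json

# Hypothetical provider response, mimicking a /models listing endpoint.
# A real server would fetch this over HTTPS rather than hardcode model IDs.
SAMPLE_RESPONSE = json.dumps({
    "data": [
        {"id": "image-alpha-2", "capabilities": ["image"], "deprecated": False},
        {"id": "image-alpha-1", "capabilities": ["image"], "deprecated": True},
        {"id": "chat-beta", "capabilities": ["text"], "deprecated": False},
    ]
})

def list_image_models(raw: str) -> list[str]:
    """Return non-deprecated, image-capable model IDs from a provider listing."""
    payload = json.loads(raw)
    return [
        m["id"]
        for m in payload["data"]
        if "image" in m["capabilities"] and not m.get("deprecated", False)
    ]

print(list_image_models(SAMPLE_RESPONSE))
```

Because the model list is rebuilt from the live response on every refresh, deprecated IDs drop out and new launches show up without a code change.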

Serverless Streaming Analytics with S3 Tables & Firehose

· 9 min read

S3 Tables Architecture

Introduction

Modern businesses need to analyze streaming data in real-time to make faster decisions. Whether it's monitoring IoT sensors, tracking user behavior, or processing financial transactions, the ability to query fresh data immediately is critical. However, building a streaming analytics pipeline traditionally requires managing complex infrastructure and dealing with data format conversions.

This solution shows how to build a serverless real-time streaming analytics pipeline using Amazon S3 Tables and Amazon Kinesis Data Firehose. By combining streaming ingestion with Apache Iceberg's analytics-optimized format, you can query data within minutes of generation—without managing any servers or data transformation jobs.

GitHub Repository: https://github.com/manu-mishra/s3table-firehose-lambda-terraform-demo
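On the producer side, the pattern is simple: serialize events as newline-delimited JSON and push them to the Firehose stream, which lands them in the Iceberg table. The sketch below shows only the encoding step; the stream name is a hypothetical placeholder, and the boto3 call is left in comments so the snippet stays self-contained.

```python
import json

def encode_records(events: list[dict]) -> list[bytes]:
    """Serialize events as newline-delimited JSON, the record shape
    Firehose forwards to an Apache Iceberg (S3 Tables) destination."""
    return [(json.dumps(e) + "\n").encode("utf-8") for e in events]

events = [
    {"sensor_id": "s-1", "reading": 21.4, "ts": 1700000000},
    {"sensor_id": "s-2", "reading": 19.8, "ts": 1700000001},
]
batch = encode_records(events)

# With boto3 (omitted here so the sketch runs without AWS credentials):
#   firehose = boto3.client("firehose")
#   firehose.put_record_batch(
#       DeliveryStreamName="s3tables-demo-stream",  # hypothetical name
#       Records=[{"Data": b} for b in batch],
#   )
print(len(batch))
```

From there, Firehose handles buffering and Iceberg commits, so the records become queryable within minutes with no transformation jobs to run.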

Recursive Language Models on AWS with Strands Agents

· 12 min read

RLM on AWS Architecture

Introduction

Modern large language models face a fundamental limitation: context windows. While frontier models now reach 1 million tokens (Nova Premier, Claude Sonnet 4.5), workloads analyzing entire codebases, document collections, or multi-hour conversations can easily exceed 10 million tokens—far beyond any single model's capacity.

This post demonstrates Recursive Language Models (RLMs), an inference strategy from MIT CSAIL research that enables scaling to inputs far beyond context windows. What makes this implementation special: Strands Agents and Amazon Bedrock AgentCore reduce what could be weeks of glue code and deployment work to just a few hours of development.
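The recursive idea can be shown with a toy sketch, which is not the MIT or Strands implementation: split an input that exceeds the window, process each chunk, then recurse on the combined results until they fit. `call_model` below is a stand-in for a real Bedrock invocation, and the "limit" is measured in characters rather than tokens to keep the example runnable.

```python
CONTEXT_LIMIT = 100  # characters here; tokens in a real deployment

def call_model(text: str) -> str:
    # Stand-in "summarizer": a real implementation would invoke a Strands
    # agent backed by Bedrock. Truncation here simulates compression.
    return text[: CONTEXT_LIMIT // 2]

def recursive_process(text: str) -> str:
    """Process arbitrarily long input by chunking and recursing until it fits."""
    if len(text) <= CONTEXT_LIMIT:
        return call_model(text)
    chunks = [text[i:i + CONTEXT_LIMIT] for i in range(0, len(text), CONTEXT_LIMIT)]
    combined = " ".join(recursive_process(c) for c in chunks)
    return recursive_process(combined)

result = recursive_process("x" * 10_000)
print(len(result))
```

Each level compresses the input, so the recursion terminates once the working text fits the window, regardless of how large the original input was.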

AWS Re:Invent 2025, Reinvented — Powered by MCP

· 4 min read

Every year, AWS re:Invent brings together thousands of builders, leaders, and innovators to explore the future of cloud. In 2025, the catalog is bigger than ever — 1,843 sessions across 53 areas of interest and 19 industries. Inspiring, yes — but also overwhelming.

That's why I built the re:Invent 2025 MCP Server: a comprehensive Model Context Protocol server that transforms how professionals navigate AWS's flagship conference, providing intelligent access to the complete session catalog with advanced search capabilities and speaker discovery.

Google's EmbeddingGemma on AWS Lambda - A Curiosity-Driven Experiment

· 6 min read

EmbeddingGemma on AWS Lambda

Note: This is a curiosity-driven experiment, not a production recommendation. For real workloads, Amazon SageMaker is the right choice. This project explores what's possible when you push serverless boundaries.

1. The idea

After my BitNet Lambda experiment, I kept thinking about embeddings. I had text generation working on Lambda—what about the other half of modern AI applications?

Google's EmbeddingGemma caught my attention—300M parameters, multilingual, designed for efficiency. Could it work on Lambda? Only one way to find out.

So I fired up Amazon Q Developer and started experimenting.