Microsoft Research Blog

Recent Posts

  1. MMCTAgent: Enabling multimodal reasoning over large video and image collections

    November 12, 2025 | Akshay Nambi, Kavyansh Chourasia, and Tanuja Ganu

    MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.

  2. BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI

    November 11, 2025

    BlueCodeAgent is an end-to-end blue-teaming framework built to boost code security using automated red-teaming processes, data, and safety rules to guide LLMs’ defensive decisions. Dynamic testing reduces false positives in vulnerability detection.

  3. When industry knowledge meets PIKE-RAG: The innovation behind Signify’s customer service boost

    November 6, 2025 | Industry Innovation Center

    A collaboration between Signify and Microsoft Research shows how PIKE-RAG improves enterprise knowledge systems, delivering a 12% increase in accuracy and faster, more reliable answers.

  4. Magentic Marketplace: an open-source simulation environment for studying agentic markets

    November 5, 2025

    AI agents are poised to transform digital marketplaces. To explore what can happen when AI agents interact and transact at scale, we built Magentic Marketplace, an open-source simulation environment for studying agentic market designs.

  5. Tool-space interference in the MCP era: Designing for agent compatibility at scale

    September 11, 2025 | Adam Fourney, Tyler Payne, Maya Murad, and Saleema Amershi

    As agentic AI ushers in a new era marked by tool expansion, systems are converging, and complexity is rising. Microsoft Research explores the Model Context Protocol (MCP) as a new standard for agent collaboration across fragmented tool ecosystems.

  6. RenderFormer: How neural networks are reshaping 3D rendering

    September 10, 2025 | Yue Dong

    RenderFormer, from Microsoft Research, is the first model to show that a neural network can learn a complete graphics rendering pipeline. It’s designed to support full-featured 3D rendering using only machine learning—no traditional graphics computation required.

  7. Breaking the networking wall in AI infrastructure

    September 9, 2025 | Paolo Costa

    Datacenter memory and network limits are restraining AI system performance. MOSAIC uses microLEDs and a wide-and-slow optical architecture to deliver faster, longer, more reliable, and more energy-efficient connections that could transform AI cluster designs.