AI / May 18, 2026
Developer tools

Gemini API File Search adds multimodal RAG with citations developers can show

Google’s File Search tool now handles images and text together, supports custom metadata filters, and returns page-level citations. All three are useful upgrades for document-heavy AI apps.

Source: “Gemini API File Search is now multimodal: build efficient, verifiable RAG”, Google Keyword Blog (3 min read)

Google has expanded the Gemini API’s File Search tool with three features that will sound familiar to anyone who has tried to build retrieval-augmented generation for real clients: multimodal support, custom metadata and page-level citations.

The update means File Search can now process images and text together, powered by Gemini Embedding 2. Google’s example is a creative agency searching an archive for a visual asset by emotional tone or visual style rather than filename. That is a good illustration of why multimodal retrieval matters. Many business knowledge bases are not just PDFs and Markdown files; they are screenshots, brand assets, scanned brochures, slide decks, diagrams and product imagery.
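To make the idea concrete, here is a minimal sketch of retrieval over a shared embedding space, where images and text are ranked against the same query vector. Everything in it is an assumption for illustration: the asset names, the hand-written three-dimensional vectors and the `search` helper are hypothetical, and nothing here calls Gemini Embedding 2 or the File Search API, where the embeddings would come from the managed service.

```python
import math

# Hypothetical toy vectors standing in for real embeddings.
# The point: images and text live in one vector space, so a single
# query can rank both kinds of asset.
ASSETS = [
    {"name": "hero_banner.png", "kind": "image", "vector": [0.9, 0.1, 0.0]},
    {"name": "brand_guidelines.md", "kind": "text", "vector": [0.2, 0.9, 0.1]},
    {"name": "moodboard_autumn.jpg", "kind": "image", "vector": [0.7, 0.2, 0.6]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, assets, top_k=2):
    """Return the names of the top_k assets most similar to the query."""
    ranked = sorted(
        assets,
        key=lambda asset: cosine_similarity(query_vector, asset["vector"]),
        reverse=True,
    )
    return [asset["name"] for asset in ranked[:top_k]]


# Hypothetical embedding of a query such as "warm, nostalgic autumn mood".
query = [0.6, 0.1, 0.7]
print(search(query, ASSETS))
```

With these toy vectors the moodboard image ranks first even though its filename says nothing about mood, which is the behaviour the creative-agency example describes.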

The second addition, custom metadata, is less flashy but probably more important in production. Developers can attach key-value labels to unstructured data (for example department, status, client, product line or document type) and then filter at query time. That helps avoid one of the classic RAG failure modes: the model retrieves a plausible but irrelevant chunk because the corpus is too broad. A support assistant should not answer from draft legal documents; a sales assistant should not pull from an obsolete pitch deck; a client portal should not leak one tenant's documents into another tenant's answers.
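The filtering idea can be sketched in a few lines of plain Python. This is an illustration of the technique only, under stated assumptions: the `Chunk` class, the `filter_by_metadata` helper and the sample corpus are hypothetical, and the real File Search filter syntax is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of indexed text plus its key-value labels (hypothetical shape)."""
    text: str
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(chunks, required):
    """Keep only chunks whose metadata matches every required key-value pair."""
    return [
        chunk
        for chunk in chunks
        if all(chunk.metadata.get(key) == value for key, value in required.items())
    ]


CORPUS = [
    Chunk("Refunds are processed within 14 days.",
          {"department": "support", "status": "published"}),
    Chunk("Draft clause on liability caps, not yet approved.",
          {"department": "legal", "status": "draft"}),
    Chunk("Old refund policy: 30 days.",
          {"department": "support", "status": "archived"}),
]

# Narrow the candidate set before any similarity ranking takes place.
candidates = filter_by_metadata(CORPUS, {"department": "support", "status": "published"})
print([chunk.text for chunk in candidates])
```

Applying the filter before ranking means the draft legal clause and the archived policy never become candidates, however similar their wording is to the question.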

The third update is page-level citations. When File Search pulls an answer from a large PDF, Google says it can now tie the response back to the page number for each piece of indexed information. That is essential for higher-trust use cases. Users are more likely to accept an AI-generated summary if they can click through to the precise page in the source file. Reviewers are more likely to approve an internal tool if they can audit where claims came from.
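On the application side, page numbers still have to be turned into something a user can read or click. The sketch below assumes each retrieved chunk arrives as a dictionary with `source` and `page` keys; that shape, the file names and the `format_citations` helper are hypothetical and do not represent the Gemini API response format.

```python
from collections import defaultdict


def format_citations(retrieved):
    """Group cited pages by source file and return readable citation strings."""
    pages_by_source = defaultdict(set)
    for chunk in retrieved:
        # A set removes duplicates when several chunks come from the same page.
        pages_by_source[chunk["source"]].add(chunk["page"])
    return [
        f"{source}, p. {', '.join(str(page) for page in sorted(pages))}"
        for source, pages in sorted(pages_by_source.items())
    ]


# Hypothetical retrieval results used to ground one answer.
RETRIEVED = [
    {"source": "installation_manual.pdf", "page": 42},
    {"source": "installation_manual.pdf", "page": 7},
    {"source": "warranty.pdf", "page": 3},
    {"source": "installation_manual.pdf", "page": 42},
]

for citation in format_citations(RETRIEVED):
    print(citation)
```

Deduplicating and sorting the pages keeps the citation list short enough to display under an answer, and each entry maps directly to a page a reviewer can open.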

For a Laravel or agency team, this is the kind of platform feature that can reduce build time. Instead of wiring together separate embedding, storage, metadata filtering and citation layers, teams can prototype against a managed API. The trade-off is platform dependence and the need to evaluate retrieval quality carefully. “Has citations” does not automatically mean “is correct”; it means the answer is easier to check.

The practical use cases are immediate: client document portals, planning applications over PDFs and drawings, internal sales libraries, design asset search, policy assistants, onboarding knowledge bases, or support agents that must quote the right manual page.

The most useful RAG feature is often not a smarter answer. It is an answer the user can verify. Google’s update pushes Gemini’s file tooling in that direction.

· · ·