Prakul Agarwal

Introducing Semantic Caching and a Dedicated MongoDB LangChain Package for Gen AI Apps

We are in an unprecedented time in history: developers can build transformative AI applications quickly, without being AI experts themselves. This ability is enabling new classes of applications that better serve customers with conversational AI for assistance and automation, advanced reasoning and analysis using AI-powered retrieval, and recommendation systems. Behind this revolution are large language models (LLMs) that can be prompted to solve a wide range of use cases. However, LLMs have various limitations, such as a knowledge cutoff and a tendency to hallucinate. To overcome these limitations, they must be integrated with proprietary enterprise data sources to build reliable, relevant, and high-quality generative AI applications. That's where MongoDB plays a critical role in the modern generative AI stack.

Developers use MongoDB Atlas Vector Search as a vital part of the generative AI technique known as retrieval-augmented generation (RAG). RAG is the process of feeding LLMs the supplementary data necessary to ground their responses, ensuring they're dependable and precise. LangChain has been a critical part of this journey since the public launch of Atlas Vector Search, enabling developers to build better retriever systems powered by vector search and to store conversation history in the operational database.

Today, we are excited to announce support for two enhancements:

- Semantic cache powered by Atlas Vector Search, which improves the performance of your apps
- A dedicated LangChain-MongoDB package for Python and JS/TS developers, enabling them to build advanced applications even more efficiently

The MongoDB Atlas integration with LangChain can now power all the database requirements for building modern generative AI applications: vector search, semantic caching (currently available only in Python), and conversation history. Earlier, we announced the launch of MongoDB LangChain Templates, which enable developers to quickly deploy RAG applications, and provided a reference implementation of a basic RAG template using MongoDB Atlas Vector Search and OpenAI, as well as a more advanced parent-document retrieval RAG template using MongoDB Atlas Vector Search. We are excited about our partnership with LangChain and will continue innovating.

Improve LLM application performance with semantic cache

Semantic cache improves the performance of LLM applications by caching responses based on the semantic meaning or context within the queries themselves. This differs from a traditional cache, which works on exact keyword matching. In the era of LLMs, the value of semantic caching is increasing tremendously, enabling sophisticated user experiences that closely mimic human interactions. For example, if two different users enter the prompts "give me suggestions for a comedy movie" and "recommend a comedy movie", the semantic cache can understand that the intent behind the queries is the same and return a similar response, even though different keywords are used, whereas a traditional cache would fail.

Figure 1: Semantic cache using MongoDB Atlas Vector Search

Check out this video walkthrough for the semantic cache.
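To make this concrete, here is a minimal sketch of enabling the semantic cache in Python. The connection string, database, collection, and index names are placeholders, the embedding model choice is an assumption, and exact import paths may vary slightly between package versions:

```python
# Minimal sketch: turn on semantic caching for all LangChain LLM calls.
# Connection string and all database/collection/index names are placeholders.
from langchain_core.globals import set_llm_cache
from langchain_mongodb.cache import MongoDBAtlasSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    MongoDBAtlasSemanticCache(
        connection_string="mongodb+srv://<user>:<password>@<cluster>/",
        embedding=OpenAIEmbeddings(),       # embeds prompts so similar queries match
        database_name="langchain_db",       # hypothetical database name
        collection_name="semantic_cache",   # hypothetical collection name
        index_name="semantic_cache_index",  # vector index assumed on the collection
    )
)
# From here on, semantically similar prompts ("recommend a comedy movie",
# "give me suggestions for a comedy movie") can be served from the cache.
```

Once set, LangChain consults the cache before calling the model, so the second of two semantically similar prompts can return without an LLM round trip.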
Accelerate development with a dedicated package

With a dedicated LangChain-MongoDB package, MongoDB is even more deeply integrated with LangChain. The Python and JavaScript packages contain the following LangChain integrations: MongoDBAtlasVectorSearch (Vector Stores) and MongoDBChatMessageHistory (Chat Messages Memory). In addition, the Python package includes MongoDBAtlasSemanticCache (LLM Caching). The new package langchain-mongodb contains all the MongoDB-specific implementations and needs to be installed separately from langchain, which includes all the core abstractions. Previously, everything lived in the same package, making it challenging to version the MongoDB integrations correctly, communicate which version should be used, and flag any breaking changes. A quick-start sketch appears at the end of this post.

Find out more about the langchain-mongodb package:

- Python: Source code, LangChain docs, MongoDB docs
- JavaScript: Source code, LangChain.js docs, MongoDB docs

Get started today

Check out this accompanying tutorial and notebook on building advanced RAG with MongoDB and LangChain, which contains a walkthrough and use cases for using semantic cache, vector search, and chat message history. Check out the "PDFtoChat" app to see langchain-mongodb JS in action. It allows you to have a conversation with your proprietary PDFs using AI and is built with MongoDB Atlas, LangChain.js, and TogetherAI. It's an end-to-end SaaS-in-a-box app and includes user authentication, saving PDFs, and saving chats per PDF. Read the excellent overview of semantic caching using LangChain and MongoDB.
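As a quick start, the sketch below shows the separate install and the package's integrations in use, with chat message history persisting a conversation turn. The session ID, database, and collection names are hypothetical, and import paths may differ slightly between versions:

```python
# pip install -U langchain-mongodb
# Hedged quick-start: import the MongoDB-specific integrations from the
# dedicated package and record one conversation turn. Names are placeholders.
from langchain_mongodb import MongoDBAtlasVectorSearch, MongoDBChatMessageHistory

history = MongoDBChatMessageHistory(
    connection_string="mongodb+srv://<user>:<password>@<cluster>/",
    session_id="user-123",            # hypothetical per-conversation key
    database_name="chat_db",          # hypothetical database name
    collection_name="message_store",  # hypothetical collection name
)
history.add_user_message("Recommend a comedy movie")
history.add_ai_message("How about a screwball classic?")
print(history.messages)  # messages persist in Atlas across restarts
```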

March 20, 2024

Announcing LangChain Templates for MongoDB Atlas

Since announcing the public preview of MongoDB Atlas Vector Search back in June, we've seen tremendous adoption by developers working to build AI-powered applications. The ability to store, index, and query vector embeddings right alongside their operational data in a single, unified platform dramatically boosts engineering velocity while keeping their technology footprint streamlined and efficient.

Atlas Vector Search is used by developers as a key part of the Retrieval-Augmented Generation (RAG) pattern. RAG is used to feed LLMs the additional data they need to ground their responses, providing outputs that are reliable, relevant, and accurate for the business. One of the key enabling technologies for bringing external data into LLMs is LangChain. Just one example is healthcare innovator Inovaare, which is building AI with MongoDB and LangChain for document classification, information extraction and enrichment, and chatbots over medical data.

To make it even easier for developers to build AI-powered apps, we are excited to announce our partnership with LangChain in the launch of LangChain Templates! We have worked with LangChain to create a RAG template using MongoDB Atlas Vector Search and OpenAI. This easy-to-use template can help developers build and deploy a chatbot application over their own proprietary data. LangChain Templates offer a reference architecture that's easily deployable as a REST API using LangServe.

We have also been working with LangChain to bring the latest features of Atlas Vector Search, like the recently announced dedicated vector search aggregation stage $vectorSearch (a PyMongo sketch follows at the end of this post), to both the MongoDB LangChain Python integration and the MongoDB LangChain JavaScript integration. Similarly, we will continue working with LangChain to create more templates that allow developers to bring their ideas to production faster.

If you're building AI-powered apps on MongoDB, we'd love to hear from you. Sign up for our AI Innovators program, where successful applicants receive no-cost MongoDB Atlas credits to develop apps, access to technical resources, and the opportunity to showcase their work to the broader AI community.
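To illustrate the $vectorSearch stage mentioned above, here is a hedged PyMongo sketch; the index name, field paths, and embedding dimensionality are assumptions for illustration:

```python
# Hedged sketch of the dedicated $vectorSearch aggregation stage.
# Index name, field paths, and the query vector are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
collection = client["rag_db"]["documents"]

query_vector = [0.0] * 1536  # placeholder; use your embedding model's output

results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",  # hypothetical Atlas Vector Search index
            "path": "embedding",      # field that stores the document vectors
            "queryVector": query_vector,
            "numCandidates": 100,     # breadth of the approximate search
            "limit": 5,               # top matches to return
        }
    },
    {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
])
for doc in results:
    print(doc)
```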

November 2, 2023

MongoDB Atlas Vector Search Makes Real-Time AI a Reality with Confluent

Today, we’re excited to announce our new integration with Confluent Cloud. MongoDB Atlas Vector Search users now have simple access to data streams across their entire business, enabling them to build cutting-edge generative AI applications that are grounded in a real-time, contextual, and trustworthy knowledge base. Think of an application like ChatGPT, but one that knows everything about your private enterprise data, with constant awareness of what’s happening in the world and in your business right now.

Atlas Vector Search allows you to search intelligently across any unstructured data, using the power of large language models (LLMs). With Confluent’s data streaming platform, you can provide a continuous supply of AI-ready data for the development of sophisticated customer experiences, bridging the gap between legacy data systems and the modern data stack. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

High-value, trusted AI applications require real-time data

Real-time AI needs real-time data from across your organization. The promise of real-time AI is only unlocked when models have all the freshest contextual data they need to respond just in time with the most accurate, relevant, and helpful information. However, building these real-time data connections across on-prem, multi-cloud, public, and private cloud environments for AI use cases is not trivial. Traditional data integration and processing tools are batch-based and inflexible, creating an untenable number of tightly coupled point-to-point connections that are hard to scale and lack governance. As a result, the data made available is stale and of low fidelity. This introduces unavoidable latency into the AI application and may block implementation altogether. The difficulty of gaining access to high-quality, ready-to-use, contextual, and trustworthy data in real time is hindering developer agility and the pace of AI innovation.

Confluent's data streaming platform fuels MongoDB Atlas Vector Search with real-time data

With the MongoDB Kafka Connector, users can easily configure MongoDB Atlas as a destination for customer 360 data from Confluent Cloud. This data is converted into vector embeddings using various machine learning models (OpenAI, Hugging Face, and more), orchestrated by Atlas Triggers. Then, using Atlas Vector Search, this data can be indexed and searched efficiently to power use cases such as semantic search, recommendation engines, Q&A systems, and many others.

We demonstrate an e-commerce chatbot that allows users to ask natural-language questions to discover what they need and then get recommendations on products that suit their preferences. Some of the data required in this scenario includes the currently available inventory, the shipping options, and the user's session browsing history. Users can refine their product recommendations through a conversational interface, all while the recommended products stay rooted in real-time data. Using real-time data effectively is almost critical in this scenario: recommending a product that isn't available, or that can't be delivered to a user's location in the required time frame, would mean a lost sale and a dissatisfied customer. Inventory data changes rapidly, with products going in and out of stock constantly, so the chat assistant must quickly come up with new sets of recommendations, as sketched below.
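As an illustration of grounding recommendations in live inventory, here is a hedged PyMongo sketch. The collection layout, index name, and in_stock filter field are assumptions, and a filter field like this would also need to be declared in the Atlas Vector Search index definition:

```python
# Hedged sketch: recommend only products that are in stock right now.
# Collection layout, index name, and filter field are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
products = client["shop"]["products"]

query_vector = [0.0] * 1536  # placeholder; embed the shopper's request

recommendations = products.aggregate([
    {
        "$vectorSearch": {
            "index": "product_index",  # hypothetical index name
            "path": "embedding",       # field holding product vectors
            "queryVector": query_vector,
            "numCandidates": 200,
            "limit": 5,
            # Pre-filter on a field kept current by the Confluent stream,
            # so out-of-stock items are never recommended.
            "filter": {"in_stock": {"$eq": True}},
        }
    }
])
for product in recommendations:
    print(product.get("name"), product.get("price"))
```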
With Confluent, MongoDB Atlas Vector Search users can break down data silos, promote data reusability, improve engineering agility, and foster greater trust throughout their organization. This allows more teams to securely and confidently unlock the full potential of all their data with MongoDB Atlas Vector Search. Confluent enables organizations to make real-time contextual inferences on an astonishing amount of data by bringing well-curated, trustworthy streaming data to AI systems, vector databases, and AI-powered applications. With easy access to data streams from across their entire business, MongoDB Atlas Vector Search users can now:

- Create a real-time knowledge base: Build a shared source of real-time truth for all your operational and analytical data, no matter where it lives, for sophisticated model building and fine-tuning
- Bring real-time context at query time: Convert raw data into meaningful chunks with real-time enrichment and continually update your embedding databases for your GenAI use cases
- Build governed, secured, and trusted AI: Establish data lineage, quality, and traceability, providing all your teams with a clear understanding of data origin, movement, transformations, and usage
- Experiment, scale, and innovate faster: Reduce innovation friction as new AI apps and models become available. Decouple data from your data science tools and production AI apps to test and build faster

MongoDB Atlas Vector Search and Confluent enable simple development of real-time AI applications

Our new Confluent integration enables all your teams to tap into a continuously enriched real-time knowledge base, so they can quickly scale and build AI-enabled applications using trusted data streams. Here's a demo video showing how this works:

Getting started

Get started by creating a MongoDB Atlas account if you don't already have one; just click "Register." MongoDB offers a free-forever Atlas cluster in the public cloud service of your choice. To learn more about Atlas Vector Search, visit the product page.

Not yet a Confluent customer? Start your free trial of Confluent Cloud today. New sign-ups receive $400 to spend during their first 30 days, no credit card required.

September 26, 2023