
Use Vertex AI Extensions for Natural Language MongoDB Queries

Note

Vertex AI Extensions are in preview and subject to change. Contact your Google Cloud representative to learn how to access this feature.

In addition to using Vertex AI with Atlas Vector Search to implement RAG, you can use Vertex AI Extensions to further customize how Vertex AI models interact with Atlas. In this tutorial, you create a Vertex AI Extension that lets you query your data in Atlas in real time by using natural language.

Diagram of workflow with Vertex AI Extensions and MongoDB Atlas

This tutorial uses the following components to enable natural language querying with Atlas:

  • Google Cloud Vertex AI SDK to manage AI models and enable custom extensions for Vertex AI. This tutorial uses the Gemini 1.5 Pro model.

  • Google Cloud Run to deploy a function that serves as an API endpoint between Vertex AI and Atlas.

  • OpenAPI 3 Specification for MongoDB API to define how natural language queries map to MongoDB operations. To learn more, see OpenAPI Specification.

  • Vertex AI Extensions to enable real-time interaction with Atlas from Vertex AI and configure how natural language queries are processed.

  • Google Cloud Secret Manager to store your MongoDB API keys.

Note

For detailed code and set-up instructions, see the GitHub repository for this example.

Before you start, you must have the following:

  • A MongoDB Atlas account. To sign up, use the Google Cloud Marketplace or register a new account.

  • An Atlas cluster with the sample dataset loaded. To learn more, see Create a Cluster.

  • A Google Cloud project.

  • A Google Cloud Storage bucket for storing the OpenAPI specification.

  • The following APIs enabled for your project:

    • Cloud Build API

    • Cloud Functions API

    • Cloud Logging API

    • Cloud Pub/Sub API

  • A Colab Enterprise environment.
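
The four APIs listed above can also be enabled from a terminal or notebook instead of the console. The following is a minimal sketch that only builds the corresponding `gcloud services enable` command. The helper name `build_enable_command` and the project ID `my-project` are hypothetical; the service identifiers are the standard `googleapis.com` names for these APIs.

```python
import shlex

# Service identifiers for the APIs this tutorial requires.
REQUIRED_SERVICES = {
    "Cloud Build API": "cloudbuild.googleapis.com",
    "Cloud Functions API": "cloudfunctions.googleapis.com",
    "Cloud Logging API": "logging.googleapis.com",
    "Cloud Pub/Sub API": "pubsub.googleapis.com",
}


def build_enable_command(project_id: str) -> list[str]:
    """Return the gcloud argument list that enables every required service."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return [
        "gcloud", "services", "enable",
        *REQUIRED_SERVICES.values(),
        "--project", project_id,
    ]


if __name__ == "__main__":
    # "my-project" is a placeholder. The command is printed, not executed,
    # so you can review it before running it yourself.
    print(shlex.join(build_enable_command("my-project")))
```

Printing the command rather than running it keeps the sketch free of side effects; pass the returned list to `subprocess.run` if you prefer to execute it directly.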

In this section, you create a Google Cloud Run function that serves as an API endpoint between the Vertex AI Extension and your Atlas cluster. The function handles authentication, connects to your Atlas cluster, and performs database operations based on the requests it receives from Vertex AI.

1

In the Google Cloud console, open the Cloud Run page and click Write a function.

2
  1. Specify a function name and Google Cloud region where you want to deploy your function.

  2. For Runtime, select the latest available Python version.

  3. In the Authentication section, select Allow unauthenticated invocations.

  4. Use the default values for the remaining settings, and then click Next.

For detailed configuration steps, refer to the Cloud Run documentation.

3

Paste the following code into their respective files:

4
  1. Rename the Entry Point as mongodb_crud.

  2. Click Deploy to deploy the function.

  3. Copy the HTTPS Endpoint that triggers the function and store it locally.

  4. Navigate to the Details page for the function and copy and store the service account name used by the function.
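
To illustrate the dispatch step that such a function performs, the following is a rough sketch that routes a request payload to a database operation. It is not the tutorial's function code: the payload fields (`operation`, `filter`, `document`, `update`) and the helper name `dispatch_crud` are assumptions made for illustration, and `collection` stands for any object that exposes PyMongo-style `find_one`, `insert_one`, `update_one`, and `delete_one` methods.

```python
from typing import Any


def dispatch_crud(collection: Any, payload: dict) -> dict:
    """Route a request payload to a PyMongo-style collection method.

    The payload shape is hypothetical:
    {"operation": "findOne", "filter": {...}, "document": {...}, "update": {...}}
    """
    operation = payload.get("operation")
    query = payload.get("filter", {})

    if operation == "findOne":
        return {"document": collection.find_one(query)}
    if operation == "insertOne":
        result = collection.insert_one(payload["document"])
        return {"insertedId": str(result.inserted_id)}
    if operation == "updateOne":
        result = collection.update_one(query, payload["update"])
        return {"modifiedCount": result.modified_count}
    if operation == "deleteOne":
        result = collection.delete_one(query)
        return {"deletedCount": result.deleted_count}

    # Reject anything outside the supported set instead of guessing.
    raise ValueError(f"Unsupported operation: {operation!r}")
```

In a deployed function, the `mongodb_crud` entry point would presumably parse the request JSON, obtain a collection from a MongoDB client, and return the result of a dispatcher like this one. Keeping the dispatcher separate from the HTTP handling makes it possible to test with a stand-in collection object.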

In this section, you create a Vertex AI Extension that enables natural language querying on your data in Atlas by using the Gemini 1.5 Pro model. This extension uses an OpenAPI specification and the Cloud Run function you created to map natural language to database operations and query your data in Atlas.

To implement this extension, you use an interactive Python notebook, which allows you to run Python code snippets individually. For this tutorial, you create a notebook named mongodb-vertex-ai-extension.ipynb in a Colab Enterprise environment.

Copy and paste the following code into your notebook.

1
  1. Authenticate your Google Cloud account and set the project ID.

    from google.colab import auth
    PROJECT_ID = "GCP project id"  # Replace with your Google Cloud project ID
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
  2. Install the required dependencies.

    !pip install --force-reinstall --quiet google_cloud_aiplatform
    !pip install --force-reinstall --quiet langchain==0.0.298
    !pip install --upgrade google-auth
    !pip install bigframes==0.26.0
  3. Restart the kernel.

    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
  4. Set the environment variables.

    Replace the sample values with the correct values that correspond to your project.

    import os
    # These are sample values; replace them with the correct values that correspond to your project
    os.environ['PROJECT_ID'] = 'gcp project id' # GCP Project ID
    os.environ['REGION'] = "us-central1" # Project Region
    os.environ['STAGING_BUCKET'] = "gs://vertexai_extensions" # GCS Bucket location
    os.environ['EXTENSION_DISPLAY_HOME'] = "MongoDb Vertex API Interpreter" # Extension Config Display Name
    os.environ['EXTENSION_DESCRIPTION'] = "This extension makes api call to mongodb to do all crud operations" # Extension Config Description
    os.environ['MANIFEST_NAME'] = "mdb_crud_interpreter" # OPEN API Spec Config Name
    os.environ['MANIFEST_DESCRIPTION'] = "This extension makes api call to mongodb to do all crud operations" # OPEN API Spec Config Description
    os.environ['OPENAPI_GCS_URI'] = "gs://vertexai_extensions/mongodbopenapispec.yaml" # OPEN API GCS URI
    os.environ['API_SECRET_LOCATION'] = "projects/787220387490/secrets/mdbapikey/versions/1" # API KEY secret location
    os.environ['LLM_MODEL'] = "gemini-1.5-pro" # LLM Config
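
Because later steps read these variables directly from `os.environ`, a missing or malformed value only surfaces when a later cell fails. The following is a small optional sketch that checks the configuration up front. The helper name `find_config_problems` is hypothetical, and the only rule it assumes beyond "must be set" is that the two Cloud Storage values begin with `gs://`, as the sample values above do.

```python
import os

# Names of the environment variables set above.
REQUIRED_VARS = [
    "PROJECT_ID", "REGION", "STAGING_BUCKET", "EXTENSION_DISPLAY_HOME",
    "EXTENSION_DESCRIPTION", "MANIFEST_NAME", "MANIFEST_DESCRIPTION",
    "OPENAPI_GCS_URI", "API_SECRET_LOCATION", "LLM_MODEL",
]
# Variables that hold Cloud Storage locations.
GCS_VARS = ["STAGING_BUCKET", "OPENAPI_GCS_URI"]


def find_config_problems(env=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    for name in GCS_VARS:
        value = env.get(name, "")
        if value and not value.startswith("gs://"):
            problems.append(f"{name} must start with gs://")
    return problems
```

Calling `find_config_problems()` in the notebook after setting the variables returns an empty list when every value is present and well formed.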
2

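Download the OpenAPI specification from GitHub and upload the YAML file to your Google Cloud Storage bucket, at the location you set in OPENAPI_GCS_URI. Then initialize the Vertex AI SDK with your project, region, and staging bucket: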

from google.cloud import aiplatform
from google.cloud.aiplatform.private_preview import llm_extension
PROJECT_ID = os.environ['PROJECT_ID']
REGION = os.environ['REGION']
STAGING_BUCKET = os.environ['STAGING_BUCKET']
aiplatform.init(
    project=PROJECT_ID,
    location=REGION,
    staging_bucket=STAGING_BUCKET,
)
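
The upload of the specification file to Cloud Storage described in this step can be done in the console or scripted. As a minimal sketch, assuming the `google-cloud-storage` client library is installed and the notebook is already authenticated, the following splits a `gs://` URI into its bucket and object name and uploads a local file to it. The helper names `split_gcs_uri` and `upload_spec` are hypothetical.

```python
def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split gs://bucket/path/to/object into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must include a bucket and an object name: {uri!r}")
    return bucket, object_name


def upload_spec(local_path: str, gcs_uri: str) -> None:
    """Upload a local file to the given gs:// URI."""
    # Imported here so split_gcs_uri works without the client library.
    from google.cloud import storage

    bucket_name, object_name = split_gcs_uri(gcs_uri)
    client = storage.Client()
    client.bucket(bucket_name).blob(object_name).upload_from_filename(local_path)
```

For example, `upload_spec("mongodbopenapispec.yaml", os.environ["OPENAPI_GCS_URI"])` would place the downloaded file at the location that the extension manifest later reads.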
3

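Create the extension. The manifest is a structured JSON object that configures the key components of the extension: its name and description, the location of the OpenAPI specification, and the authentication settings. Replace <service-account> with the service account name used by your Cloud Run function.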

from google.cloud import aiplatform
from vertexai.preview import extensions
mdb_crud = extensions.Extension.create(
    display_name=os.environ['EXTENSION_DISPLAY_HOME'],
    # Optional.
    description=os.environ['EXTENSION_DESCRIPTION'],
    manifest={
        "name": os.environ['MANIFEST_NAME'],
        "description": os.environ['MANIFEST_DESCRIPTION'],
        "api_spec": {
            "open_api_gcs_uri": os.environ['OPENAPI_GCS_URI'],
        },
        "authConfig": {
            "authType": "OAUTH",
            "oauthConfig": {"service_account": "<service-account>"},
        },
    },
)
mdb_crud
4

Validate the extension and print the operation schema and parameters:

print("Name:", mdb_crud.gca_resource.name)
print("Display Name:", mdb_crud.gca_resource.display_name)
print("Description:", mdb_crud.gca_resource.description)
import pprint
pprint.pprint(mdb_crud.operation_schemas())

In Vertex AI, click Extensions in the left navigation menu. Your new extension, named MongoDb Vertex API Interpreter (the display name set in the environment variables), appears in the list of extensions.

The following examples demonstrate two different natural language queries you can use to query your data in Atlas:
