LangChain PlainID Integration Guide

Installation

Based on your environment, you can install the library using pip:

!pip install langchain_plainid

Setup with PlainID

  1. Retrieve your PlainID credentials (client ID and client secret) to access the platform.
  2. Find your PlainID base URL. For production, use https://platform-product.us1.plainid.io (note that the host name starts with platform-product).

Security Note: Do not store credentials in code. Use environment variables or secret managers.
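
One way to follow this note is to read the credentials from environment variables at startup. The sketch below is illustrative only: the variable names PLAINID_CLIENT_ID and PLAINID_CLIENT_SECRET and the helper load_plainid_credentials are assumptions made for this example, not part of langchain_plainid.

```python
import os
from typing import Mapping, Optional


def load_plainid_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read PlainID credentials from environment variables.

    The variable names used here are hypothetical; substitute whatever
    names your deployment defines.
    """
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    required = {
        "client_id": "PLAINID_CLIENT_ID",
        "client_secret": "PLAINID_CLIENT_SECRET",
    }
    # Fail early with a clear message instead of passing empty credentials on.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in required.items()}
```

The returned values can then be passed as the client_id and client_secret arguments wherever this guide uses the placeholders "your_client_id" and "your_client_secret".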

Category Filtering Setup in PlainID

Configure the relevant Ruleset in PlainID. For example, for the categories template:

# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: All
ruleset(asset, identity, requestParams, action) if {
    asset.template == "categories"
}

Also configure the asset types, such as contract and HR, in PlainID. The categorizer can then be set up as follows:

from langchain_plainid import PlainIDCategorizer, PlainIDPermissionsProvider

# Placeholder credentials; load real values from environment variables or a secret manager.
permissions_provider = PlainIDPermissionsProvider(
    client_id="your_client_id",
    client_secret="your_client_secret",
    base_url="https://platform-product.us1.plainid.io",
    plainid_categories_resource_type="categories",
)

# `classifier` stands for one of the providers described under Category Classifiers.
plainid_categorizer = PlainIDCategorizer(
    classifier_provider=classifier,
    permissions_provider=permissions_provider,
)

query = "I'd like to know the weather forecast for today"
result = plainid_categorizer.invoke(query)

Category Classifiers

1. LLMCategoryClassifierProvider

from langchain_ollama import OllamaLLM
from langchain_plainid import LLMCategoryClassifierProvider

llm_classifier = LLMCategoryClassifierProvider(llm=OllamaLLM(model="llama2"))

2. ZeroShotCategoryClassifierProvider

from langchain_plainid import ZeroShotCategoryClassifierProvider
zeroshot_classifier = ZeroShotCategoryClassifierProvider()

Anonymizer Setup in PlainID

Example Ruleset for entities template:

# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: PERSON
ruleset(asset, identity, requestParams, action) if {
    asset.template == "entities"
    asset["path"] == "PERSON"
    action.id in ["MASK"]
}

from langchain_plainid import PlainIDPermissionsProvider, PlainIDAnonymizer

permissions_provider = PlainIDPermissionsProvider(
    client_id="your_client_id",
    client_secret="your_client_secret",
    base_url="https://platform-product.us1.plainid.io",
    plainid_entities_resource_type="entities",
)

plainid_anonymizer = PlainIDAnonymizer(
    permissions_provider=permissions_provider,
    encrypt_key="your_encryption_key",
)

query = "What's the name of the person who is responsible for the contract?"
result = plainid_anonymizer.invoke(query)

Full Chain Example

chain = plainid_categorizer | llm | vector_store | plainid_anonymizer | output_parser

Retriever Setup

Rulesets for customer template:

ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] == "Sweden"
    asset["country"] != "Russia"
    contains(asset["country"], "we")
    startswith(asset["country"], "Sw")
}

ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] in ["aaa", "bbb"]
    asset["age"] <= 11111
    endswith(asset["country"], "wwww")
}

from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_plainid import PlainIDRetriever

documents = [
    Document("Stockholm is the capital of Sweden.", metadata={"country": "Sweden", "age": 5}),
    Document("Oslo is the capital of Norway.", metadata={"country": "Norway", "age": 5}),
    Document("Copenhagen is the capital of Denmark.", metadata={"country": "Denmark", "age": 5}),
    Document("Helsinki is the capital of Finland.", metadata={"country": "Finland", "age": 5}),
    Document("Malmö is a city in Sweden.", metadata={"country": "Sweden", "age": 5}),
]

# `embeddings` (an embedding model) and `filter_provider` are not defined in this
# snippet; create them before running it.
vector_store = Chroma.from_documents(documents, embeddings)
plainid_retriever = PlainIDRetriever(vectorstore=vector_store, filter_provider=filter_provider)
docs = plainid_retriever.invoke("What is the capital of Sweden?")
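
To make the intended effect of the first customer Ruleset concrete, the sketch below re-expresses its conditions as a plain Python predicate and applies it to the metadata of the sample documents. It only illustrates the rule semantics: it is not how PlainIDRetriever evaluates policies, and the function name matches_first_ruleset is hypothetical.

```python
def matches_first_ruleset(metadata: dict) -> bool:
    """Mirror the conditions of the first customer Ruleset in plain Python."""
    country = metadata.get("country", "")
    return (
        country == "Sweden"              # asset["country"] == "Sweden"
        and country != "Russia"          # asset["country"] != "Russia"
        and "we" in country              # contains(asset["country"], "we")
        and country.startswith("Sw")     # startswith(asset["country"], "Sw")
    )


# Metadata copied from the sample documents in this section.
sample_metadata = [
    {"country": "Sweden", "age": 5},
    {"country": "Norway", "age": 5},
    {"country": "Denmark", "age": 5},
    {"country": "Finland", "age": 5},
    {"country": "Sweden", "age": 5},
]

# Keep only the entries that satisfy every condition of the rule.
allowed = [m for m in sample_metadata if matches_first_ruleset(m)]
```

Under these conditions only the two entries whose country is Sweden remain, so a retriever enforcing this Ruleset would be expected to return only the Stockholm and Malmö documents.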

Supported Vector Stores and Limitations

| Vector Store | Unsupported Operators |
|--------------|-----------------------|
| FAISS | STARTSWITH, ENDSWITH, CONTAINS |
| Chroma | IN, NOT_IN, STARTSWITH, ENDSWITH, CONTAINS |
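
Before choosing a vector store, it can help to check the operators your Rulesets rely on against the limitations above. The following is a minimal sketch that encodes the table as a dictionary; the helper unsupported_for is hypothetical and not part of langchain_plainid.

```python
# Unsupported operators per vector store, copied from the table above.
UNSUPPORTED_OPERATORS = {
    "FAISS": {"STARTSWITH", "ENDSWITH", "CONTAINS"},
    "Chroma": {"IN", "NOT_IN", "STARTSWITH", "ENDSWITH", "CONTAINS"},
}


def unsupported_for(store: str, operators) -> set:
    """Return the subset of `operators` that `store` does not support."""
    if store not in UNSUPPORTED_OPERATORS:
        raise KeyError(f"No limitation data for vector store: {store}")
    return set(operators) & UNSUPPORTED_OPERATORS[store]


# Operators that a Ruleset using `in` and `startswith` would need.
needed = {"IN", "STARTSWITH"}
faiss_problems = unsupported_for("FAISS", needed)
chroma_problems = unsupported_for("Chroma", needed)
```

An empty result means none of the listed limitations apply to the chosen store for that set of operators.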