Debugging, testing, and monitoring LLM applications.
## LangSmith MCP Server: LLM Application Observability

The **LangSmith MCP Server** integrates LangChain's observability platform into Google Antigravity. It provides comprehensive tracing, evaluation, and monitoring for LLM applications, helping developers understand and improve their AI systems.

### Why LangSmith MCP?

LangSmith provides essential LLM observability:

- **Full Tracing**: Visualize every step of LLM chains
- **Evaluation**: Systematic testing of LLM outputs
- **Debugging**: Identify issues in complex pipelines
- **Monitoring**: Track production performance
- **Antigravity Native**: AI-assisted debugging

### Key Features

#### 1. Automatic Tracing

```python
import os

from langsmith import traceable
from langchain_openai import ChatOpenAI

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-key"

@traceable
def my_pipeline(question: str) -> str:
    llm = ChatOpenAI()
    # Every call is automatically traced
    response = llm.invoke(question)
    return response.content

# View traces in the LangSmith dashboard
result = my_pipeline("What is the capital of France?")
```

#### 2. Evaluation Datasets

```python
from langsmith import Client

client = Client()

# Create an evaluation dataset
dataset = client.create_dataset("QA Evaluation")
client.create_examples(
    inputs=[{"question": "What is 2+2?"}],
    outputs=[{"answer": "4"}],
    dataset_id=dataset.id,
)

# Run evaluations against the dataset
from langchain.smith import run_on_dataset

results = run_on_dataset(
    client=client,
    dataset_name="QA Evaluation",
    # `create_chain` is a factory returning the chain under test;
    # `evaluators` is a RunEvalConfig describing which evaluators to apply.
    llm_or_chain_factory=create_chain,
    evaluation=evaluators,
)
```

#### 3. Custom Evaluators

```python
from difflib import SequenceMatcher


def check_correctness(run, example):
    """Compare the run's answer against the reference answer from the dataset."""
    prediction = run.outputs["answer"]
    reference = example.outputs["answer"]
    # Custom evaluation logic: a simple string-similarity score in [0, 1]
    score = SequenceMatcher(None, prediction, reference).ratio()
    return {
        "key": "correctness",
        "score": score,
        "comment": f"Similarity: {score:.2f}",
    }
```

### Configuration

```json
{
  "mcpServers": {
    "langsmith": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-langsmith"],
      "env": {
        "LANGCHAIN_API_KEY": "ls__your-key",
        "LANGCHAIN_PROJECT": "your-project"
      }
    }
  }
}
```

### Use Cases

**Debugging**: Trace through complex LLM chains to identify where outputs go wrong.

**Regression Testing**: Build evaluation datasets to catch regressions when prompts change (see the first sketch below).

**Production Monitoring**: Track latency, costs, and success rates of production LLM calls (see the second sketch below).

The LangSmith MCP Server brings comprehensive LLM observability to Antigravity development.
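To make the regression-testing use case concrete, the sketch below wires the `QA Evaluation` dataset and the `check_correctness` evaluator from above into the LangSmith SDK's `evaluate` helper. This is a minimal illustration rather than the server's own workflow: the `answer_question` target and the `prompt-v2` experiment prefix are made-up placeholders, and it assumes a recent `langsmith` SDK exposing `evaluate` with `data`, `evaluators`, and `experiment_prefix` parameters.

```python
from langsmith.evaluation import evaluate


def answer_question(inputs: dict) -> dict:
    # Placeholder target: call your real chain or pipeline here.
    return {"answer": my_pipeline(inputs["question"])}


# Each run in this experiment is scored by check_correctness, so a prompt
# change that degrades answers shows up as a drop in the correctness score.
results = evaluate(
    answer_question,
    data="QA Evaluation",            # dataset created earlier
    evaluators=[check_correctness],  # custom evaluator defined above
    experiment_prefix="prompt-v2",   # hypothetical experiment label
)
```

Running the same experiment before and after a prompt change yields directly comparable scores in the LangSmith UI.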
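For the production-monitoring use case, one lightweight option (shown here as a sketch, not a prescribed pattern) is to pull recent traces with the SDK's `Client.list_runs` and aggregate rough health metrics. The project name is a placeholder, and the `error`, `start_time`, `end_time`, and `total_tokens` fields are assumptions about the returned `Run` objects.

```python
from datetime import datetime, timedelta, timezone

from langsmith import Client

client = Client()

# Fetch the last 24 hours of LLM runs from the production project
# ("your-project" is a placeholder name).
runs = list(client.list_runs(
    project_name="your-project",
    run_type="llm",
    start_time=datetime.now(timezone.utc) - timedelta(days=1),
))

# Aggregate simple health metrics: failure count, average latency, token usage.
failures = sum(1 for r in runs if r.error)
latencies = [
    (r.end_time - r.start_time).total_seconds()
    for r in runs
    if r.start_time and r.end_time
]
total_tokens = sum(r.total_tokens or 0 for r in runs)

print(f"runs={len(runs)} failures={failures} tokens={total_tokens}")
if latencies:
    print(f"avg latency={sum(latencies) / len(latencies):.2f}s")
```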