Data version control and ML pipeline management.
## DVC MCP Server: Data Version Control for ML Projects

The **DVC MCP Server** integrates Data Version Control into Google Antigravity, enabling Git-like version control for machine learning datasets, models, and experiments. This open-source tool tracks ML artifacts alongside your code so that results can be reproduced.

### Why DVC MCP?

DVC solves ML reproducibility challenges:

- **Data Versioning**: Track datasets like code
- **Experiment Tracking**: Compare ML experiments easily
- **Pipeline Management**: Define reproducible workflows
- **Storage Agnostic**: Works with any cloud storage
- **Git Integration**: Seamless workflow with existing repos

### Key Features

#### 1. Data Versioning

```bash
# Initialize DVC in your project
dvc init

# Track large data files (DVC writes the .dvc pointer file and a
# .gitignore entry next to the data file)
dvc add data/training-dataset.csv
git add data/training-dataset.csv.dvc data/.gitignore
git commit -m "Add training dataset v1"

# Push data to remote storage
dvc push
```

#### 2. ML Pipelines

```yaml
# dvc.yaml - Define reproducible pipelines
stages:
  preprocess:
    cmd: python preprocess.py
    deps:
      - data/raw
      - preprocess.py
    outs:
      - data/processed
  train:
    cmd: python train.py
    deps:
      - data/processed
      - train.py
    outs:
      - models/model.pkl
    metrics:
      - metrics.json
```

#### 3. Experiment Tracking

```bash
# Run experiments with different parameters
dvc exp run -S train.learning_rate=0.01
dvc exp run -S train.learning_rate=0.001

# Compare experiments
dvc exp show

# Apply best experiment to workspace
dvc exp apply exp-12345
```

### Configuration

```json
{
  "mcpServers": {
    "dvc": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-dvc"],
      "env": {
        "DVC_REMOTE": "s3://your-bucket/dvc-storage",
        "AWS_ACCESS_KEY_ID": "your-key",
        "AWS_SECRET_ACCESS_KEY": "your-secret"
      }
    }
  }
}
```

### Use Cases

**Dataset Management**: Version large datasets and share them across team members without bloating Git repositories.

**Experiment Comparison**: Track hyperparameters, metrics, and model artifacts for systematic ML experimentation.

**Reproducible Training**: Anyone can rerun your training pipeline with `dvc repro`, which re-executes only the stages whose dependencies have changed.

The DVC MCP Server brings proper version control to ML workflows in Antigravity for reproducible machine learning.
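
As a rough illustration of the `metrics: metrics.json` entry in the `train` stage, the sketch below shows one way a `train.py` script might produce that file. The helper names `accuracy` and `write_metrics` are hypothetical and not part of DVC. The only assumption about DVC itself is that it can read metrics from a JSON file of named numeric values.

```python
import json
from pathlib import Path


def accuracy(predictions: list, labels: list) -> float:
    """Return the fraction of predictions that equal their label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def write_metrics(metrics: dict, path: str = "metrics.json") -> Path:
    """Write metric names and numeric values to a JSON file.

    The default path matches the metrics entry of the train stage in dvc.yaml.
    """
    out = Path(path)
    out.write_text(json.dumps(metrics, indent=2, sort_keys=True) + "\n")
    return out
```

A training script could call `write_metrics({"accuracy": accuracy(preds, labels)})` as its last step. After the pipeline runs, `dvc metrics show` prints the stored values and `dvc exp show` lists them per experiment.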