Google Antigravity for Python Developers: Data Science, ML & Jupyter Workflows
Google Antigravity isn't just for web developers. Python programmers and data scientists can leverage Gemini 3 for powerful AI-assisted workflows. This guide covers everything from environment setup to advanced ML pipelines.
Setting Up Python in Antigravity
Python Extension Installation
Open the Extensions panel (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows/Linux)
Install these essential extensions:
ms-python.python
ms-python.vscode-pylance
ms-python.black-formatter
charliermarsh.ruff
ms-toolsai.jupyter
Configure Python Settings
// .antigravity/settings.json
{
"python.defaultInterpreterPath": ".venv/bin/python",
"python.analysis.typeCheckingMode": "basic",
"python.analysis.autoImportCompletions": true,
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports": true
}
},
"ruff.enable": true,
"ruff.organizeImports": true
}
Virtual Environment Setup
# Create virtual environment
python -m venv .venv
# Activate (macOS/Linux)
source .venv/bin/activate
# Activate (Windows)
.venv\Scripts\activate
# Install data science stack
pip install pandas numpy matplotlib seaborn scikit-learn jupyter
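To confirm Antigravity is picking up the interpreter inside .venv, a quick sanity-check script helps (a minimal sketch; adjust the imports to whatever you actually installed):
# sanity_check.py - run with the virtual environment active
import sys
import matplotlib
import numpy as np
import pandas as pd
import sklearn
print(f"Python executable: {sys.executable}")  # should point inside .venv
print(f"pandas {pd.__version__}, numpy {np.__version__}")
print(f"scikit-learn {sklearn.__version__}, matplotlib {matplotlib.__version__}")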
Jupyter Notebook Integration
Native Jupyter Support
Antigravity has built-in Jupyter support. Create a .ipynb file and start coding.
AI-Assisted Notebook Development
You: "Create a Jupyter notebook for exploratory data analysis of a CSV file"
Gemini 3 generates:
# Cell 1: Imports and Setup
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
# Configure display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
plt.style.use('seaborn-v0_8-whitegrid')
%matplotlib inline
# Cell 2: Load Data
def load_data(filepath: str) -> pd.DataFrame:
"""Load CSV file with automatic type inference."""
df = pd.read_csv(filepath)  # dtype inference is automatic; infer_datetime_format is deprecated in pandas 2.x
print(f"Loaded {len(df):,} rows and {len(df.columns)} columns")
return df
# Load your data
df = load_data("data/dataset.csv")
df.head()
# Cell 3: Data Overview
def data_overview(df: pd.DataFrame) -> None:
"""Print comprehensive data overview."""
print("=" * 50)
print("DATASET OVERVIEW")
print("=" * 50)
print(f"\nShape: {df.shape[0]:,} rows × {df.shape[1]} columns")
print("\nColumn Types:")
print(df.dtypes.value_counts())
print("\nMissing Values:")
missing = df.isnull().sum()
missing = missing[missing > 0].sort_values(ascending=False)
if len(missing) > 0:
print(missing)
else:
print("No missing values!")
print("\nNumeric Summary:")
print(df.describe())
data_overview(df)
# Cell 4: Visualizations
def create_distribution_plots(df: pd.DataFrame, numeric_cols: list | None = None):
"""Create distribution plots for numeric columns."""
if numeric_cols is None:
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
n_cols = min(3, len(numeric_cols))
n_rows = (len(numeric_cols) + n_cols - 1) // n_cols
fig, axes = plt.subplots(n_rows, n_cols, figsize=(5*n_cols, 4*n_rows))
axes = axes.flatten() if n_rows * n_cols > 1 else [axes]
for i, col in enumerate(numeric_cols):
sns.histplot(df[col], kde=True, ax=axes[i])
axes[i].set_title(f'Distribution of {col}')
# Hide empty subplots
for j in range(i + 1, len(axes)):
axes[j].set_visible(False)
plt.tight_layout()
plt.show()
create_distribution_plots(df)
Running Notebooks with Gemini 3
Execute cells with AI context - Gemini understands previous cells
Debug errors inline - Ask "Why is this cell failing?"
Generate visualizations - "Add a correlation heatmap for numeric columns" (see the sketch below)
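For example, the correlation heatmap prompt typically produces a cell along these lines (a sketch that reuses the df, plt, and sns objects defined in the earlier cells; styling choices will vary):
# Cell: Correlation heatmap for numeric columns
numeric_df = df.select_dtypes(include="number")
corr = numeric_df.corr()
plt.figure(figsize=(10, 8))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", center=0)
plt.title("Correlation Heatmap (numeric columns)")
plt.tight_layout()
plt.show()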
Pandas with AI Assistance
Data Manipulation
You: "Clean this dataframe: remove duplicates, fill missing values with median for numeric columns, and standardize column names"
def clean_dataframe(df: pd.DataFrame) -> pd.DataFrame:
"""
Comprehensive dataframe cleaning.
- Removes duplicates
- Fills missing numeric values with median
- Standardizes column names to snake_case
"""
# Create a copy
df_clean = df.copy()
# Standardize column names
df_clean.columns = (
df_clean.columns
.str.lower()
.str.replace(' ', '_')
.str.replace('[^a-z0-9_]', '', regex=True)
)
# Remove duplicates
initial_rows = len(df_clean)
df_clean = df_clean.drop_duplicates()
print(f"Removed {initial_rows - len(df_clean):,} duplicate rows")
# Fill missing numeric values with median
numeric_cols = df_clean.select_dtypes(include=[np.number]).columns
for col in numeric_cols:
missing_count = df_clean[col].isnull().sum()
if missing_count > 0:
median_val = df_clean[col].median()
df_clean[col] = df_clean[col].fillna(median_val)
print(f"Filled {missing_count} missing values in '{col}' with median ({median_val:.2f})")
return df_clean
df_clean = clean_dataframe(df)
Complex Transformations
You: "Create a feature engineering pipeline for this customer dataset"
from datetime import datetime
def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
"""
Feature engineering for customer dataset.
"""
df_eng = df.copy()
# Date-based features
if 'signup_date' in df_eng.columns:
df_eng['signup_date'] = pd.to_datetime(df_eng['signup_date'])
df_eng['days_since_signup'] = (datetime.now() - df_eng['signup_date']).dt.days
df_eng['signup_month'] = df_eng['signup_date'].dt.month
df_eng['signup_dayofweek'] = df_eng['signup_date'].dt.dayofweek
# Aggregate features
if 'transaction_amount' in df_eng.columns:
customer_stats = df_eng.groupby('customer_id').agg({
'transaction_amount': ['sum', 'mean', 'count', 'std']
}).reset_index()
customer_stats.columns = ['customer_id', 'total_spend', 'avg_spend',
'transaction_count', 'spend_std']
df_eng = df_eng.merge(customer_stats, on='customer_id', how='left')
# Recency features
if 'last_purchase_date' in df_eng.columns:
df_eng['last_purchase_date'] = pd.to_datetime(df_eng['last_purchase_date'])
df_eng['days_since_purchase'] = (datetime.now() - df_eng['last_purchase_date']).dt.days
df_eng['is_active'] = (df_eng['days_since_purchase'] < 30).astype(int)
# Binning continuous variables
if 'age' in df_eng.columns:
df_eng['age_group'] = pd.cut(
df_eng['age'],
bins=[0, 25, 35, 50, 65, 100],
labels=['0-25', '26-35', '36-50', '51-65', '65+']
)
return df_eng
df_features = engineer_features(df_clean)
print(f"Created {len(df_features.columns) - len(df_clean.columns)} new features")
Machine Learning Workflows
Building ML Pipelines
You: "Create a complete ML pipeline for predicting customer churn"
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
import joblib
class ChurnPredictor:
"""End-to-end churn prediction pipeline."""
def __init__(self):
self.numeric_features = ['age', 'total_spend', 'days_since_purchase',
'transaction_count']
self.categorical_features = ['region', 'subscription_type']
self.pipeline = None
self.best_model = None
def build_pipeline(self, model):
"""Build preprocessing + model pipeline."""
numeric_transformer = StandardScaler()
categorical_transformer = OneHotEncoder(handle_unknown='ignore')
preprocessor = ColumnTransformer(
transformers=[
('num', numeric_transformer, self.numeric_features),
('cat', categorical_transformer, self.categorical_features)
]
)
return Pipeline([
('preprocessor', preprocessor),
('classifier', model)
])
def train(self, X: pd.DataFrame, y: pd.Series):
"""Train and compare multiple models."""
models = {
'Logistic Regression': LogisticRegression(max_iter=1000),
'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
'Gradient Boosting': GradientBoostingClassifier(random_state=42)
}
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
results = {}
for name, model in models.items():
pipeline = self.build_pipeline(model)
# Cross-validation
cv_scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring='roc_auc')
# Fit and evaluate
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
y_prob = pipeline.predict_proba(X_test)[:, 1]
results[name] = {
'cv_score': cv_scores.mean(),
'cv_std': cv_scores.std(),
'test_auc': roc_auc_score(y_test, y_prob),
'pipeline': pipeline
}
print(f"\n{name}:")
print(f" CV AUC: {cv_scores.mean():.4f} (+/- {cv_scores.std()*2:.4f})")
print(f" Test AUC: {roc_auc_score(y_test, y_prob):.4f}")
# Select best model
best_name = max(results, key=lambda x: results[x]['test_auc'])
self.pipeline = results[best_name]['pipeline']
self.best_model = best_name
print(f"\nBest Model: {best_name}")
print("\nClassification Report:")
print(classification_report(y_test, self.pipeline.predict(X_test)))
return results
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Predict churn probability."""
return self.pipeline.predict_proba(X)[:, 1]
def save(self, path: str):
"""Save trained pipeline."""
joblib.dump(self.pipeline, path)
print(f"Model saved to {path}")
def load(self, path: str):
"""Load trained pipeline."""
self.pipeline = joblib.load(path)
print(f"Model loaded from {path}")
# Usage
predictor = ChurnPredictor()
# Prepare features
X = df_features[predictor.numeric_features + predictor.categorical_features]
y = df_features['churned']
# Train and compare models
results = predictor.train(X, y)
# Save best model
predictor.save('models/churn_predictor.joblib')
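Once saved, the pipeline can be reloaded later for batch scoring. The sketch below is illustrative: the new_customers.csv path is a hypothetical file that must contain the same feature columns used in training.
# Batch scoring with the saved pipeline (file path is hypothetical)
import joblib
import pandas as pd
pipeline = joblib.load("models/churn_predictor.joblib")
new_customers = pd.read_csv("data/new_customers.csv")
feature_cols = ['age', 'total_spend', 'days_since_purchase', 'transaction_count',
                'region', 'subscription_type']
new_customers['churn_probability'] = pipeline.predict_proba(new_customers[feature_cols])[:, 1]
print(new_customers.sort_values('churn_probability', ascending=False).head(10))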
Model Evaluation Visualizations
You: "Add visualization for model evaluation"
from sklearn.metrics import roc_curve, precision_recall_curve
def plot_model_evaluation(y_true, y_prob, model_name: str):
"""Create comprehensive model evaluation plots."""
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
# ROC Curve
fpr, tpr, _ = roc_curve(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)
axes[0].plot(fpr, tpr, label=f'AUC = {auc:.3f}')
axes[0].plot([0, 1], [0, 1], 'k--')
axes[0].set_xlabel('False Positive Rate')
axes[0].set_ylabel('True Positive Rate')
axes[0].set_title('ROC Curve')
axes[0].legend()
# Precision-Recall Curve
precision, recall, _ = precision_recall_curve(y_true, y_prob)
axes[1].plot(recall, precision)
axes[1].set_xlabel('Recall')
axes[1].set_ylabel('Precision')
axes[1].set_title('Precision-Recall Curve')
# Probability Distribution
axes[2].hist(y_prob[y_true == 0], bins=50, alpha=0.5, label='No Churn')
axes[2].hist(y_prob[y_true == 1], bins=50, alpha=0.5, label='Churn')
axes[2].axvline(x=0.5, color='r', linestyle='--', label='Threshold')
axes[2].set_xlabel('Predicted Probability')
axes[2].set_ylabel('Count')
axes[2].set_title('Probability Distribution')
axes[2].legend()
plt.suptitle(f'Model Evaluation: {model_name}')
plt.tight_layout()
plt.show()
# Generate evaluation plots (note: this scores the full dataset, including rows the model saw during training; use a held-out set for unbiased metrics)
y_prob = predictor.predict(X)
plot_model_evaluation(y, y_prob, predictor.best_model)
GEMINI.md for Python Projects
# Python Project Configuration
## Environment
- Python 3.11+
- Virtual environment: .venv
- Package manager: pip
## Code Style
- Formatter: Black (line length 88)
- Linter: Ruff
- Type hints required for functions
## Data Science Conventions
- Use pandas for tabular data
- NumPy for numerical operations
- scikit-learn for ML pipelines
- Matplotlib/Seaborn for visualization
## Jupyter Notebooks
- Clear outputs before committing
- Use descriptive markdown headers
- Include docstrings in functions
## ML Best Practices
- Always use pipelines
- Cross-validate models
- Log experiments with MLflow
- Version data with DVC
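The GEMINI.md above tells Gemini to log experiments with MLflow; here is a minimal sketch of what that looks like wrapped around the ChurnPredictor from earlier (assumes mlflow is installed; the experiment and metric names are illustrative):
# Minimal MLflow experiment logging sketch (requires `pip install mlflow`)
import mlflow
import mlflow.sklearn
mlflow.set_experiment("churn-prediction")
with mlflow.start_run():
    results = predictor.train(X, y)
    best = results[predictor.best_model]
    mlflow.log_param("model", predictor.best_model)
    mlflow.log_metric("cv_auc", best["cv_score"])
    mlflow.log_metric("test_auc", best["test_auc"])
    mlflow.sklearn.log_model(predictor.pipeline, "model")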
MCP Servers for Python
Enhance Python workflows with MCP integrations:
Jupyter MCP Server
{
"mcpServers": {
"jupyter": {
"command": "npx",
"args": ["@modelcontextprotocol/server-jupyter"],
"env": {
"JUPYTER_TOKEN": "${JUPYTER_TOKEN}"
}
}
}
}
Data Sources
{
"mcpServers": {
"postgres": {
"command": "npx",
"args": ["@modelcontextprotocol/server-postgres", "${DATABASE_URL}"]
},
"bigquery": {
"command": "npx",
"args": ["@modelcontextprotocol/server-bigquery"],
"env": {
"GOOGLE_PROJECT_ID": "${GCP_PROJECT}"
}
}
}
}
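The MCP servers give the agent live schema context; the analysis code it writes back into your notebook usually pulls the same data into pandas. A sketch of that pattern with SQLAlchemy (the connection string and table name are placeholders):
# Load a Postgres table into pandas (DATABASE_URL and table name are placeholders)
import os
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine(os.environ["DATABASE_URL"])  # e.g. postgresql+psycopg2://user:pass@host/db
df = pd.read_sql("SELECT * FROM customers LIMIT 10000", engine)
print(df.shape)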
Conclusion
Google Antigravity is a powerful environment for Python and data science work. With Gemini 3, you can:
Generate boilerplate data processing code
Build ML pipelines faster
Debug complex data issues
Create visualizations on demand
Work seamlessly with Jupyter notebooks
To get started:
Set up your Python environment
Install recommended extensions
Configure GEMINI.md for your project
Connect data sources via MCP
Data science made faster with AI assistance. Focus on insights, not boilerplate.