# RSS with SQLite storage and Firecrawl
## RSS Crawler MCP Server: Deep Feed Discovery

The **RSS Crawler MCP Server** integrates RSS/Atom feed discovery and crawling into Google Antigravity, enabling automatic feed detection, website crawling, and content indexing directly from your development environment.

### Why RSS Crawler MCP?

- **Auto-Discovery**: Automatically find RSS/Atom feeds on any website
- **Deep Crawling**: Discover feeds across entire domains and subdomains
- **Link Extraction**: Extract and follow links to discover related content
- **Sitemap Parsing**: Parse sitemaps to discover all available feeds
- **Rate Limiting**: Respectful crawling with configurable rate limits

### Key Features

#### 1. Feed Discovery

```python
# Discover feeds on a website
feeds = await mcp.discover_feeds(
    url="https://example.com",
    follow_links=True,
    max_depth=2
)

for feed in feeds:
    print(f"Found: {feed['url']}")
    print(f"  Type: {feed['type']}")  # rss, atom, json-feed
    print(f"  Title: {feed['title']}")
```

#### 2. Domain Crawling

```python
# Crawl an entire domain for feeds
results = await mcp.crawl_domain(
    domain="example.com",
    include_subdomains=True,
    max_pages=1000,
    respect_robots=True
)

print(f"Pages crawled: {results['pages_crawled']}")
print(f"Feeds found: {len(results['feeds'])}")
for feed in results["feeds"]:
    print(f"  - {feed['url']}")
```

#### 3. Sitemap Processing

```python
# Parse a sitemap for content pages
pages = await mcp.parse_sitemap(
    sitemap_url="https://example.com/sitemap.xml"
)

# Find feeds from sitemap pages
feeds = await mcp.discover_from_sitemap(
    sitemap_url="https://example.com/sitemap.xml",
    check_feed_links=True
)
```

#### 4. Batch Discovery

```python
# Discover feeds from multiple sites
sites = ["https://site1.com", "https://site2.com", "https://site3.com"]

results = await mcp.batch_discover(
    urls=sites,
    concurrent=5
)

for site, feeds in results.items():
    print(f"{site}: {len(feeds)} feeds found")
```

### Configuration

```json
{
  "mcpServers": {
    "rss-crawler": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-rss-crawler"],
      "env": {
        "CRAWLER_USER_AGENT": "RSSCrawler/1.0",
        "CRAWLER_RATE_LIMIT": "1",
        "CRAWLER_MAX_CONCURRENT": "5",
        "CRAWLER_TIMEOUT": "30000"
      }
    }
  }
}
```

### Use Cases

**Feed Database**: Build a comprehensive database of RSS feeds across specific industries.

**Content Discovery**: Find new content sources for aggregation and monitoring.

**Competitive Analysis**: Discover all content channels on competitor websites.

**SEO Research**: Analyze content structure and publishing patterns across domains.

The RSS Crawler MCP enables feed discovery and content crawling within your development environment.
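Under the hood, RSS/Atom auto-discovery conventionally works by scanning a page's `<head>` for `<link rel="alternate">` tags that advertise feed MIME types. The following is a minimal standard-library sketch of that convention; the function and class names are illustrative and not part of the MCP server's API:

```python
# Sketch of feed auto-discovery: scan <link rel="alternate"> tags
# whose type attribute names a known feed MIME type.
# Names here are illustrative, not the MCP server's actual API.
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {
    "application/rss+xml": "rss",
    "application/atom+xml": "atom",
    "application/feed+json": "json-feed",
}

class FeedLinkParser(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower()
        if rel == "alternate" and a.get("type") in FEED_TYPES:
            self.feeds.append({
                # Resolve relative hrefs against the page URL
                "url": urljoin(self.base_url, a.get("href") or ""),
                "type": FEED_TYPES[a["type"]],
                "title": a.get("title") or "",
            })

def discover_feeds_in_html(html, base_url):
    """Return feed descriptors advertised in an HTML document's markup."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

For example, a page whose head contains `<link rel="alternate" type="application/rss+xml" title="Blog" href="/feed.xml">` yields one `rss`-typed entry with the href resolved to an absolute URL.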
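A rate limit expressed as requests per second, like the `CRAWLER_RATE_LIMIT` setting above, is commonly enforced by spacing requests at least `1/rate` seconds apart. A minimal sketch of that idea, assuming one limiter shared per target domain (this is an illustration, not the server's actual implementation):

```python
import time

class RateLimiter:
    """Enforce a minimum delay between requests.

    rate_per_second=1 means at most one request per second,
    i.e. a minimum interval of 1.0 s between calls to wait().
    Illustrative sketch only.
    """
    def __init__(self, rate_per_second):
        self.min_interval = 1.0 / rate_per_second
        self.last_request = 0.0  # monotonic timestamp of the last request

    def wait(self):
        # Sleep just long enough to keep requests min_interval apart
        elapsed = time.monotonic() - self.last_request
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_request = time.monotonic()
```

A crawler would call `limiter.wait()` immediately before each HTTP fetch; with `RateLimiter(1)` this throttles the crawl to roughly one request per second per limiter.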