Hugging Face MCP Server Details

The official Hugging Face MCP Server connects large language models (LLMs) to the Hugging Face Hub and thousands of Gradio AI applications over the Model Context Protocol (MCP). It supports four transports: STDIO, SSE (to be deprecated but still commonly deployed), StreamableHTTP, and StreamableHTTPJson. A Web Application provides dynamic tool management and status updates. The server can run locally or in Docker, and it integrates with Claude Desktop, Claude Code, Gemini CLI (and its extension), VSCode, and Cursor. Tools such as hf_doc_search and hf_doc_fetch can be enabled for document discovery, and an optional Authenticate tool can be included to handle OAuth challenges when called.

Use Case

The MCP Server acts as a bridge between LLM clients and MCP-enabled endpoints, managing tool availability and communication across transports. It can run in STDIO, SSE, Streamable HTTP, or JSON-mode HTTP, which covers deployments from local development to production. The Web UI lets you switch tools on and off, and the server can automatically enable document-related tools when document search is enabled. Typical setups include installing via the Claude or Gemini CLI, or integrating with VSCode or Cursor so the tools are available inside the editor.

Key usage patterns from the documentation include:

  • Running locally with npx to start in different modes:

    npx @llmindset/hf-mcp-server       # Start in STDIO mode
    npx @llmindset/hf-mcp-server-http  # Start in Streamable HTTP mode
    npx @llmindset/hf-mcp-server-json  # Start in Streamable HTTP (JSON RPC) mode

  • Running with Docker:

    docker pull ghcr.io/evalstate/hf-mcp-server:latest
    docker run --rm -p 3000:3000 ghcr.io/evalstate/hf-mcp-server:latest

  • Installing in Claude Desktop / Claude Code / Gemini CLI / VSCode / Cursor, with example commands:

    claude mcp add hf-mcp-server -t http https://huggingface.co/mcp?login

    claude mcp add hf-mcp-server \
      -t http https://huggingface.co/mcp \
      -H "Authorization: Bearer <YOUR_HF_TOKEN>"

    gemini mcp add -t http huggingface https://huggingface.co/mcp?login

    gemini extensions install https://github.com/huggingface/hf-mcp-server

    To configure VSCode manually, add an entry like the following to mcp.json:

    "huggingface": {
      "url": "https://huggingface.co/mcp",
      "headers": {
        "Authorization": "Bearer <YOUR_HF_TOKEN>"
      }
    }

    Similarly, Cursor users can install via a provided link and use a config snippet like:

    "huggingface": {
      "url": "https://huggingface.co/mcp",
      "headers": {
        "Authorization": "Bearer <YOUR_HF_TOKEN>"
      }
    }
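    The two editor snippets above share the same shape, so the entry can be generated rather than pasted by hand. The following is a minimal Python sketch, not part of the server's documentation: the helper name build_hf_server_entry, the HF_TOKEN environment variable, and the enclosing "servers" key are assumptions made for illustration, and the enclosing key in particular may differ between editors.

```python
import json
import os


def build_hf_server_entry(token: str) -> dict:
    """Return the "huggingface" entry in the shape shown in the snippets above."""
    if not token:
        raise ValueError("a Hugging Face token is required")
    return {
        "huggingface": {
            "url": "https://huggingface.co/mcp",
            "headers": {"Authorization": f"Bearer {token}"},
        }
    }


if __name__ == "__main__":
    # HF_TOKEN is an assumed variable name; the placeholder is used if it is unset.
    token = os.environ.get("HF_TOKEN", "<YOUR_HF_TOKEN>")
    # "servers" is an assumed enclosing key; check your editor's mcp.json format.
    print(json.dumps({"servers": build_hf_server_entry(token)}, indent=2))
```

    Reading the token from the environment keeps it out of files that might be committed to version control.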

    Available Tools (3)

    Examples & Tutorials

    Real examples and usage patterns directly from the docs:

  • Docker-based run (build the image locally, then start it with a default token):

    docker build -t hf-mcp-server .
    docker run --rm -p 3000:3000 -e DEFAULT_HF_TOKEN=hf_xxx hf-mcp-server

    Installation Guide

    Follow these steps from the documentation to install and run the MCP Server:

  • Install and run locally with npx (choose mode):

    npx @llmindset/hf-mcp-server       # Start in STDIO mode
    npx @llmindset/hf-mcp-server-http  # Start in Streamable HTTP mode
    npx @llmindset/hf-mcp-server-json  # Start in Streamable HTTP (JSON RPC) mode

  • Run with Docker, using the published image:

    docker pull ghcr.io/evalstate/hf-mcp-server:latest
    docker run --rm -p 3000:3000 ghcr.io/evalstate/hf-mcp-server:latest

  • Or build the image locally and run it with a default token:

    docker build -t hf-mcp-server .
    docker run --rm -p 3000:3000 -e DEFAULT_HF_TOKEN=hf_xxx hf-mcp-server

  • Transport endpoints overview:

    STDIO uses stdin/stdout. SSE is served at /sse, with a companion /message endpoint. Streamable HTTP is served at /mcp, in JSON mode when using streamableHttpJson.
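    To make the /mcp endpoint concrete, here is a hedged Python sketch of the first message an MCP client sends over Streamable HTTP: a JSON-RPC 2.0 "initialize" request. The address http://localhost:3000/mcp is an assumption based on the Docker port mapping above, and the client name, version, and protocol version string are illustrative values rather than anything this server requires.

```python
import json
import urllib.request

# Assumed local address, derived from the "-p 3000:3000" Docker example.
MCP_URL = "http://localhost:3000/mcp"


def build_initialize_request(request_id: int = 1) -> tuple[dict, bytes]:
    """Build headers and body for an MCP "initialize" call over Streamable HTTP."""
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON body or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # illustrative protocol revision
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return headers, json.dumps(body).encode("utf-8")


def post_initialize(url: str = MCP_URL) -> str:
    """Send the request and return the raw response text (needs a running server)."""
    headers, payload = build_initialize_request()
    request = urllib.request.Request(url, data=payload, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

    Calling post_initialize() only makes sense once a server is listening; build_initialize_request() can be inspected on its own to see the message shape.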

    Important Notes

    SSE is marked as to be deprecated but is still commonly deployed. The Web Application can switch tools on and off. With the STDIO, SSE, and StreamableHTTP transports, a ToolListChangedNotification is sent when tools change. In StreamableHTTPJson mode, a tool may not be listed when the client requests the tool list. Two environment variables are relevant: MCP_STRICT_COMPLIANCE (reject GET requests with HTTP 405 in JSON mode) and AUTHENTICATE_TOOL (whether to include an Authenticate tool).
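    The per-transport behaviour described above can be summarised as a small lookup. This Python sketch is one reading of the notes, not code from the server: the function name, the transport identifiers, and the two return labels are all hypothetical, and the "poll" label reflects the assumption that a JSON-mode client only sees a change when it next requests the tool list.

```python
# Transports that, per the notes above, receive a ToolListChangedNotification.
NOTIFYING_TRANSPORTS = {"stdio", "sse", "streamableHttp"}
# JSON mode: the notes only say a tool may be missing from the next tool list.
POLLING_TRANSPORTS = {"streamableHttpJson"}


def tool_change_delivery(transport: str) -> str:
    """Return how a client on the given transport learns that tools changed."""
    if transport in NOTIFYING_TRANSPORTS:
        return "notification"  # the server sends a notification to the client
    if transport in POLLING_TRANSPORTS:
        return "poll"  # assumed: the client re-requests the tool list
    raise ValueError(f"unknown transport: {transport}")
```

    A client written against JSON mode would therefore refresh its tool list itself instead of waiting for a notification.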

    Prerequisites

    pnpm is used for build and development; Corepack is used to ensure everyone uses the same pnpm version (10.12.3).

    Details
    Last Updated: 1/2/2026
    Source: GitHub

    Similar MCP Tools

    9 related tools
    Graphiti MCP Server

    Graphiti MCP Server is an experimental implementation that exposes Graphiti's real-time, temporally-aware knowledge graph capabilities through the MCP (Model Context Protocol) interface. It enables AI agents and MCP clients to interact with Graphiti's knowledge graph for structured extraction, reasoning, and memory across conversations, documents, and enterprise data. The server supports multiple backends (FalkorDB by default and Neo4j), a variety of LLM providers (OpenAI, Anthropic, Gemini, Groq, Azure OpenAI), and multiple embedder options, all accessible via an HTTP MCP endpoint at /mcp/ for broad client compatibility. It also includes queue-based asynchronous episode processing, rich entity types for structured data, and flexible configuration through config.yaml, environment variables, or CLI arguments.

    Context7 MCP Server

    Context7 MCP Server delivers up-to-date, code-first documentation and examples for LLMs and AI code editors by pulling content directly from the source. It supports multiple MCP clients and exposes tools that help you resolve library IDs and retrieve library documentation, ensuring prompts use current APIs and usage patterns. The repository provides installation and integration guides for Cursor, Claude Code, Opencode, and other clients, along with practical configuration samples and OAuth options for remote HTTP connections. This MCP server is designed to keep prompts in sync with the latest library docs, reducing hallucinations and outdated code snippets.

    TrendRadar MCP

    TrendRadar MCP is an AI-driven Model Context Protocol (MCP) based analysis server that exposes a suite of specialized tools for cross-platform news analysis, trend tracking, and intelligent push notifications. It integrates with TrendRadar’s multi-platform data aggregation (RSS and trending topics) and provides advanced AI-powered insights, sentiment analysis, and cross-platform correlation. The MCP server enables developers to query, analyze, and compare news across platforms using a consistent toolset, with ongoing updates that expand capabilities such as RSS querying, date parsing, and multi-date trend analysis. This documentation references the MCP module updates, tool additions, and architecture changes that enhance extensibility, cross-platform data handling, and AI-assisted reporting.

    ChainAware Behavioural Prediction MCP

    The ChainAware Behavioural Prediction MCP is an MCP-based server that provides AI-powered tools for wallet behaviour prediction, fraud detection, and rug pull prediction. Designed for Web3 security and DeFi analytics, it enables developers and platforms to integrate risk assessment, predictive wallet behaviour insights, and rug pull detection through MCP-compatible clients. The server exposes three specialized tools and uses Server-Sent Events (SSE) for real-time responses, helping safeguard DeFi users, monitor liquidity risks, and score wallet or contract trustworthiness. Access to production endpoints is API-key gated, reflecting a private backend architecture that supports secure, scalable risk analytics across wallets, contracts, and pools.

    Playwright MCP

    Playwright MCP is a Model Context Protocol (MCP) server that provides browser automation capabilities using Playwright. It enables large language models (LLMs) to interact with web pages through structured accessibility snapshots, bypassing the need for screenshots or visually-tuned models. The server is designed to be fast, lightweight, and deterministic, offering LLM-friendly tooling and a rich set of browser automation capabilities via MCP tools. It supports standalone operation, containerized deployments, and integration with a variety of MCP clients (Claude Desktop, VS Code, Copilot, Cursor, Goose, Windsurf, and others).

    Sequential Thinking MCP Server

    Sequential Thinking MCP Server provides a dedicated MCP tool that guides problem-solving through a structured, step-by-step thinking process. It supports dynamic adjustment of the number of thoughts and allows revision and branching within a controlled workflow, making it ideal for complex analysis and solution hypothesis development. This server is designed to register a single tool, sequential_thinking, and is integrated with common MCP deployment methods (NPX, Docker) as well as editor integrations like Claude Desktop and VS Code for quick setup. The documentation provides exact configuration snippets, usage patterns, and building instructions to help you deploy and use the MCP server effectively, including Codex CLI, NPX, and Docker installation examples.

    N8N MCP Server

    N8N MCP Server is an MCP (Model Context Protocol) server designed to integrate Claude Desktop, Claude Code, Windsurf, and Cursor with n8n workflows. It enables users to build, test, and orchestrate complex workflows by exposing a set of tools that bridge Claude’s capabilities with n8n’s automation platform. The project emphasizes robust trigger handling, multi-tenant readiness, and progressive documentation to help developers understand how tools map to real-world workflow tasks. It also outlines future tooling integration points (such as getNodeEssentials and getNodeInfo) to further enhance node-structure awareness within MCP-powered automations.

    Shadcn UI MCP Server v4

    Shadcn UI v4 MCP Server is an advanced MCP (Model Context Protocol) server designed to give AI assistants comprehensive access to shadcn/ui v4 components, blocks, demos, and metadata. It enables multi-framework support (React, Svelte, Vue, and React Native) with fast, cache-friendly access to component source code, demos, and directory structures, empowering AI-driven development workflows. The project emphasizes production-readiness with Docker Compose, SSE transport for multi-client deployments, and smart caching to optimize GitHub API usage while providing rich metadata and usage patterns for rapid prototyping and learning across frameworks.

    Figma MCP server

    The Figma MCP server enables design context delivery from Figma files to AI agents and code editors, empowering teams to generate code directly from design selections. It supports both a remote hosted server and a locally hosted desktop server, allowing seamless integration with popular editors through Code Connect and a suite of tools that extract design context, metadata, variables, and more. This guide covers enabling the MCP server, configuring clients (VS Code, Cursor, Claude Code, and others), and using a curated set of MCP tools to fetch structured design data for faster, more accurate code generation. It also explains best practices, prompts, and integration workflows that help teams align generated output with their design systems. The documentation includes concrete JSON examples for configuring servers in editors like VS Code and Cursor, as well as command examples for Claude Code integration and plugin installation.