# Image Vision MCP
MCP Server that provides descriptions of images.
## Requirements
### Ollama
You must have [Ollama](https://ollama.com/) installed and running, with its API exposed via its built-in web server.
By default, the MCP uses the [llava:34b](https://ollama.com/library/llava) model, which must be installed on the same computer that Ollama is running on.
You can install the model via:
```bash
ollama run llava:34b
```
You can also specify a different Ollama model via the **--model** argument in the Claude config.
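Under the hood, vision models in Ollama take the image as base64 data in the `images` array of a request to the `/api/generate` endpoint. The sketch below shows what such a request payload looks like; `buildVisionRequest` is a hypothetical helper for illustration, not part of this MCP's API.

```javascript
// Sketch of the JSON payload for a vision request to Ollama's
// /api/generate endpoint. The image is sent as a base64 string
// (without a data: URI prefix) in the `images` array.
// buildVisionRequest is a hypothetical helper, not part of this MCP.
function buildVisionRequest(model, prompt, imageBase64) {
  return {
    model,                 // e.g. "llava:34b"
    prompt,                // e.g. "Describe this image."
    images: [imageBase64], // base64-encoded image data
    stream: false,         // ask for a single JSON response
  };
}

// Example: send it with fetch (Node 18+):
// const res = await fetch("http://127.0.0.1:11434/api/generate", {
//   method: "POST",
//   body: JSON.stringify(
//     buildVisionRequest("llava:34b", "Describe this image.", b64)
//   ),
// });
```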
## Installation
To install the MCP into Claude, add the following entry under the `mcpServers` object in the **claude_desktop_config.json** file.
```json
"image-vision": {
  "command": "npx",
  "args": [
    "-y",
    "node",
    "C:/Users/FOO/src/image-vision-mcp/src/image-vision-mcp.js",
    "--permitted",
    "C:/Users/FOO/mcp",
    "--host",
    "http://192.168.1.238:11434"
  ]
}
```
### Arguments
* **--permitted** (required) The paths that the MCP is allowed to access. You can include multiple entries.
* **--host** (optional) The host and port for the Ollama server. Defaults to http://127.0.0.1:11434
* **--model** (optional) The model installed in Ollama that should be used for image vision. Defaults to "llava:34b".
## Usage
To run the server:
```bash
node src/image-vision-mcp.js --permitted /path/to/dir1 /path/to/dir2
```
The `--permitted` flag specifies which directory roots the MCP is allowed to access, for security reasons.
## Development
You can run in development mode using the [MCP inspector](https://github.com/modelcontextprotocol/typescript-sdk?tab=readme-ov-file):
```bash
npx @modelcontextprotocol/inspector node src/image-vision-mcp.js --permitted /Users/FOO/Desktop/mcp/
```
## Questions, Feature Requests, Feedback
If you have any questions, feature requests, need help, or just want to chat, join the [discord](https://discord.gg/fgxw9t37D7).
You can also log bugs and feature requests on the [issues page](https://github.com/mikechambers/image-vision-mcp/issues).
## License
Project released under an [MIT License](LICENSE.md).