Spring AI brings AI capabilities into the Spring ecosystem as a first-class citizen. Instead of hand-rolling HTTP clients for OpenAI, Anthropic, or Ollama and wiring JSON parsing, retries, and secrets yourself, you get a consistent abstraction over chat models, embeddings, and vector stores — with the usual Spring benefits: dependency injection, configuration properties, auto-configuration, and optional observability. If you already think in @Service and application.yml, Spring AI will feel immediately familiar. This is Part 1 of a four-part series that ends with a working RAG app.
What it gives you
- Chat models — one API whether you call OpenAI, Anthropic, Azure OpenAI, or a local Ollama model. Swap providers via config, not code. (Part 2 covers this.)
- Embeddings — turn text into vectors for similarity search, with multiple backends (cloud or local). (Part 3.)
- Vector stores — persist and query embeddings, with adapters for Pgvector, Redis, Chroma, Pinecone, and more. (Part 3.)
- RAG — retrieval-augmented generation building blocks: load documents, embed them, store them, and augment prompts with retrieved context. (Part 4.)
The unifying idea is the portable abstraction: ChatClient, EmbeddingModel, and VectorStore are interfaces, so your application code never names a provider. The provider-specific implementation behind them is something you select with a dependency and configure in YAML.
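To make the pattern concrete without tying it to any particular Spring AI version, here is a minimal plain-Java sketch. Everything in it is hypothetical and invented for illustration: `ChatBackend`, `HostedBackend`, `LocalBackend`, `GreetingService`, and the `chat.provider` key are not Spring AI types or properties. In a real app the interface would be Spring AI's, and the implementation would come from the starter on your classpath rather than from a hand-written factory.

```java
import java.util.Map;

// Hypothetical stand-in for a provider-neutral chat interface (not a Spring AI type).
interface ChatBackend {
    String reply(String userMessage);
}

// Two toy "providers". Real ones would call a hosted API or a local model.
class HostedBackend implements ChatBackend {
    public String reply(String userMessage) {
        return "[hosted] " + userMessage;
    }
}

class LocalBackend implements ChatBackend {
    public String reply(String userMessage) {
        return "[local] " + userMessage;
    }
}

// Application code depends only on the interface, never on a concrete provider.
class GreetingService {
    private final ChatBackend backend;

    GreetingService(ChatBackend backend) {
        this.backend = backend;
    }

    String greet(String name) {
        return backend.reply("Say hello to " + name);
    }
}

class PortableChatSketch {
    // Picks an implementation from configuration; "chat.provider" is an assumed key.
    static ChatBackend fromConfig(Map<String, String> config) {
        String provider = config.getOrDefault("chat.provider", "local");
        return switch (provider) {
            case "hosted" -> new HostedBackend();
            case "local" -> new LocalBackend();
            default -> throw new IllegalArgumentException("Unknown provider: " + provider);
        };
    }

    public static void main(String[] args) {
        GreetingService service =
                new GreetingService(fromConfig(Map.of("chat.provider", "hosted")));
        System.out.println(service.greet("Ada")); // prints "[hosted] Say hello to Ada"
    }
}
```

Note that `GreetingService` does not change when the configured provider does. That is the property the real abstractions are meant to give you, with Spring's dependency injection doing the job of `fromConfig`.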
Why not just call the APIs directly?
You can — and for a one-off script, a raw HTTP client is perfectly fine. Spring AI earns its place when you’re building an application rather than a script:
- Provider portability. Prototype against local Ollama (free, private), ship against a hosted model — same code.
- Config, not constants. API keys and base URLs live in application.yml / env vars, not scattered string literals.
- Spring-native ergonomics. Constructor injection, @ConfigurationProperties, testing slices, and Micrometer observability all just work.
- Higher-level building blocks. Prompt templates, structured output mapping to Java objects, advisors, and RAG plumbing you’d otherwise write by hand.
The trade-off: it’s another abstraction layer, and APIs have shifted across early versions (see the version note below). For most Spring teams, the consistency is worth it.
What you need
- Java 17+ and Spring Boot 3.x (Spring AI is built on Spring Framework 6).
- A model starter for your provider, e.g. spring-ai-openai-spring-boot-starter for OpenAI or spring-ai-ollama-spring-boot-starter for local models.
- For RAG: a vector store dependency and, optionally, document readers/splitters.
Version heads-up
[needs source]: Spring AI’s APIs and starter artifact names changed in the run-up to and after its 1.0 GA (and some docs reference a 2.x line aligned with Spring Boot 4). Pin a version in your build and follow the docs for that version — copy-pasting across versions is the main source of “it doesn’t compile.”
How the series fits together
Each part builds on the last:
- Intro (you are here) — the why and the setup.
- Chat completions — call a model with ChatClient.
- Embeddings & vector stores — turn data into searchable vectors.
- RAG pipeline — retrieve relevant chunks and feed them into a prompt so the model answers from your data.
If you only care about “make the model answer using my documents,” that’s RAG — and it’s the most common production pattern. (Not sure whether you need RAG or fine-tuning? Here’s the comparison.)
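As a rough picture of what "retrieve relevant chunks and feed them into a prompt" means, here is a plain-Java sketch with no Spring AI dependency. The `Chunk` record, the `RagSketch` class, the three-dimensional vectors, and the prompt wording are all made up for illustration. In a real application an embedding model would produce the vectors and a vector store would run the similarity search; this only shows the shape of the two steps.

```java
import java.util.Comparator;
import java.util.List;

// A stored piece of text with a precomputed embedding (toy 3-dimensional vectors here).
record Chunk(String text, double[] embedding) {}

class RagSketch {
    // Cosine similarity: dot product divided by the product of the vector lengths.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Retrieval step: return the k chunks most similar to the query embedding.
    static List<Chunk> topK(List<Chunk> store, double[] query, int k) {
        return store.stream()
                .sorted(Comparator
                        .comparingDouble((Chunk c) -> cosine(c.embedding(), query))
                        .reversed())
                .limit(k)
                .toList();
    }

    // Augmentation step: put the retrieved text in front of the user's question.
    static String augment(String question, List<Chunk> context) {
        StringBuilder prompt = new StringBuilder("Answer using only this context:\n");
        for (Chunk c : context) {
            prompt.append("- ").append(c.text()).append('\n');
        }
        return prompt.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        List<Chunk> store = List.of(
                new Chunk("Refunds are processed within 5 days.", new double[] {0.9, 0.1, 0.0}),
                new Chunk("Our office is in Lisbon.", new double[] {0.0, 0.2, 0.9}));
        double[] queryEmbedding = {1.0, 0.0, 0.0}; // pretend embedding of the question
        System.out.println(augment("How long do refunds take?", topK(store, queryEmbedding, 1)));
    }
}
```

The string returned by `augment` is what would be sent to the chat model. The model never sees the whole document set, only the chunks that scored highest for this question.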
FAQ
Is Spring AI production-ready?
[needs source]
Which model providers does Spring AI support?
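The providers named in this post are OpenAI, Anthropic, Azure OpenAI, and Ollama. You pick one by adding its starter dependency and configuring it in application.yml.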
Do I need a cloud API key to try it?
No. Run a model locally with Ollama, which listens on http://localhost:11434 by default: no key, no cost, fully offline.
Spring AI or LangChain4j?
[needs source]
Key takeaway: Spring AI is a portable abstraction over chat models, embeddings, and vector stores that fits the Spring Boot programming model. Use it (over raw HTTP clients) when you want provider portability, config-driven setup, and ready-made RAG building blocks. Next up: calling a chat model.