Spring AI brings AI capabilities into the Spring ecosystem as a first-class citizen. Instead of hand-rolling HTTP clients for OpenAI, Anthropic, or Ollama and wiring JSON parsing, retries, and secrets yourself, you get a consistent abstraction over chat models, embeddings, and vector stores — with the usual Spring benefits: dependency injection, configuration properties, auto-configuration, and optional observability. If you already think in @Service and application.yml, Spring AI will feel immediately familiar. This is Part 1 of a four-part series that ends with a working RAG app.

What it gives you

  • Chat models — one API whether you call OpenAI, Anthropic, Azure OpenAI, or a local Ollama model. Swap providers via config, not code. (Part 2 covers this.)
  • Embeddings — turn text into vectors for similarity search, with multiple backends (cloud or local). (Part 3.)
  • Vector stores — persist and query embeddings, with adapters for Pgvector, Redis, Chroma, Pinecone, and more. (Part 3.)
  • RAG — retrieval-augmented generation building blocks: load documents, embed them, store them, and augment prompts with retrieved context. (Part 4.)

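The unifying idea is the portable abstraction: ChatModel, EmbeddingModel, and VectorStore are interfaces, and the fluent ChatClient you usually code against sits on top of whichever ChatModel is configured. The provider is an implementation you select with a dependency and configure in YAML.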
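To make the portability idea concrete, here is a minimal plain-Java sketch. It does not use Spring AI: ChatBackend, FakeHostedBackend, FakeLocalBackend, GreetingService, and BackendSelector are hypothetical names invented for this illustration, and the "providers" just echo their input. The point is only the shape: application code depends on an interface, and something outside that code decides which implementation is supplied.

```java
// Hypothetical stand-in for a provider-neutral chat interface.
// This is NOT Spring AI's API; it only illustrates the shape of the idea.
interface ChatBackend {
    String reply(String userMessage);
}

// Two fake "providers". Real ones would call a hosted API or a local model.
final class FakeHostedBackend implements ChatBackend {
    @Override
    public String reply(String userMessage) {
        return "[hosted] " + userMessage;
    }
}

final class FakeLocalBackend implements ChatBackend {
    @Override
    public String reply(String userMessage) {
        return "[local] " + userMessage;
    }
}

// Application code sees only the interface, received through its constructor.
final class GreetingService {
    private final ChatBackend backend;

    GreetingService(ChatBackend backend) {
        this.backend = backend;
    }

    String greet(String name) {
        return backend.reply("Say hello to " + name);
    }
}

// Stands in for the decision a starter dependency plus configuration would make:
// which implementation gets handed to the application code.
final class BackendSelector {
    static ChatBackend fromConfig(String providerName) {
        switch (providerName) {
            case "hosted":
                return new FakeHostedBackend();
            case "local":
                return new FakeLocalBackend();
            default:
                throw new IllegalArgumentException("Unknown provider: " + providerName);
        }
    }
}
```

Calling new GreetingService(BackendSelector.fromConfig("local")).greet("Ada") and then repeating the call with "hosted" exercises both backends without touching GreetingService. In a Spring application the container and auto-configuration would play the role of BackendSelector.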

Why not just call the APIs directly?

You can — and for a one-off script, a raw HTTP client is perfectly fine. Spring AI earns its place when you’re building an application rather than a script:

  • Provider portability. Prototype against local Ollama (free, private), ship against a hosted model — same code.
  • Config, not constants. API keys and base URLs live in application.yml / env vars, not scattered string literals.
  • Spring-native ergonomics. Constructor injection, @ConfigurationProperties, testing slices, and Micrometer observability all just work.
  • Higher-level building blocks. Prompt templates, structured output mapping to Java objects, advisors, and RAG plumbing you’d otherwise write by hand.

The trade-off: it’s another abstraction layer, and APIs have shifted across early versions (see the version note below). For most Spring teams, the consistency is worth it.

What you need

  • Java 17+ and Spring Boot 3.x (Spring AI is built on Spring Framework 6).
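  • A model starter for your provider, e.g. spring-ai-starter-model-openai for OpenAI or spring-ai-starter-model-ollama for local models (before the 1.0 release these were named spring-ai-openai-spring-boot-starter and spring-ai-ollama-spring-boot-starter).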
  • For RAG: a vector store dependency and, optionally, document readers/splitters.

Version heads-up: Spring AI’s APIs and starter artifact names changed in the run-up to and after its 1.0 GA (and some docs reference a 2.x line aligned with Spring Boot 4). Pin a version in your build and follow the docs for that version. Copy-pasting snippets across versions is a common source of “it doesn’t compile.”

How the series fits together

Each part builds on the last:

  1. Intro (you are here) — the why and the setup.
  2. Chat completions — call a model with ChatClient.
  3. Embeddings & vector stores — turn data into searchable vectors.
  4. RAG pipeline — retrieve relevant chunks and feed them into a prompt so the model answers from your data.

If you only care about “make the model answer using my documents,” that’s RAG — and it’s the most common production pattern. (Not sure whether you need RAG or fine-tuning? Here’s the comparison.)
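As a rough picture of what “retrieve relevant chunks and feed them into a prompt” means, here is a toy sketch in plain Java. It is not Spring AI code: ToyRag and its methods are hypothetical names, and the “embedding” is just a word-count map compared with cosine similarity, standing in for a real embedding model and vector store. The final prompt string is what would be sent to a chat model.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy retrieval-augmented prompt builder. Hypothetical illustration only.
final class ToyRag {

    // Toy "embedding": lowercase word counts. A real embedding model
    // would return a dense float vector instead.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two word-count vectors; 0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Retrieval step: rank chunks by similarity to the question, keep the top K.
    static List<String> retrieve(List<String> chunks, String question, int topK) {
        Map<String, Integer> q = embed(question);
        List<String> ranked = new ArrayList<>(chunks);
        ranked.sort(Comparator.comparingDouble((String c) -> cosine(embed(c), q)).reversed());
        return ranked.subList(0, Math.min(topK, ranked.size()));
    }

    // Augmentation step: put the retrieved chunks into the prompt as context.
    static String augment(String question, List<String> context) {
        return "Answer using only the context below.\n\nContext:\n- "
                + String.join("\n- ", context)
                + "\n\nQuestion: " + question;
    }
}
```

Given a few document chunks, ToyRag.retrieve(chunks, question, 1) picks the chunk sharing the most vocabulary with the question, and ToyRag.augment(question, top) wraps it into a prompt. Word-count matching is deliberately crude (it treats “refund” and “refunds” as unrelated words), which is exactly the gap real embeddings close.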

FAQ

Is Spring AI production-ready?
It reached a 1.0 GA and is actively developed. As with any young framework, pin your version and read release notes before upgrading, because the API moved quickly in early releases.

Which model providers does Spring AI support?
Many, through provider-specific starters: OpenAI, Anthropic, Azure OpenAI, and local models via Ollama, among others. Vector stores are selected the same way. You pick one with a dependency and configure it in application.yml.

Do I need a cloud API key to try it?
No. Run a local model with Ollama (see running LLMs locally) and point Spring AI at http://localhost:11434. That means no API key, no per-call cost, and fully offline operation once the model has been downloaded.

Spring AI or LangChain4j?
Both wrap LLM providers for Java. Spring AI is the natural fit if you’re already in the Spring ecosystem (auto-config, properties, DI); LangChain4j is framework-agnostic. Choose based on your stack.

Key takeaway: Spring AI is a portable abstraction over chat models, embeddings, and vector stores that fits the Spring Boot programming model. Use it (over raw HTTP clients) when you want provider portability, config-driven setup, and ready-made RAG building blocks. Next up: calling a chat model.