Part 1: Introduction to Spring AI

Spring AI brings AI capabilities into the Spring ecosystem as a first-class citizen. Instead of hand-rolling HTTP clients for OpenAI, Anthropic, or Ollama and wiring JSON parsing, retries, and secrets yourself, you get a consistent abstraction over chat models, embeddings, and vector stores — with the usual Spring benefits: dependency injection, configuration properties, auto-configuration, and optional observability. If you already think in @Service and application.yml, Spring AI will feel immediately familiar. This is Part 1 of a four-part series that ends with a working RAG app.

What it gives you

Chat models — one API whether you call OpenAI, Anthropic, Azure OpenAI, or a local Ollama model. Swap providers via config, not code. (Part 2 covers this.)
Embeddings — turn text into vectors for similarity search, with multiple backends (cloud or local). (Part 3.)
Vector stores — persist and query embeddings, with adapters for Pgvector, Redis, Chroma, Pinecone, and more. (Part 3.)
RAG — retrieval-augmented generation building blocks: load documents, embed them, store them, and augment prompts with retrieved context. (Part 4.)

The unifying idea is the portable abstraction: ChatModel (which the fluent ChatClient wraps), EmbeddingModel, and VectorStore are interfaces. The provider is an implementation you select with a dependency and configure in YAML.
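
To make the "interface plus swappable implementation" idea concrete, here is a minimal plain-Java sketch of the pattern. It does not use Spring AI's own types: ChatBackend, HostedStub, LocalStub, Greeter, and fromConfig are hypothetical names invented for illustration, and the stubs return canned strings instead of calling a model. In a real application, Spring's auto-configuration plays the role of fromConfig.

```java
// Hypothetical stand-in for a portable model interface; NOT a Spring AI type.
interface ChatBackend {
    String reply(String userMessage);
}

// Two stub "providers". Real ones would call a hosted API or a local server.
final class HostedStub implements ChatBackend {
    public String reply(String userMessage) {
        return "[hosted] " + userMessage;
    }
}

final class LocalStub implements ChatBackend {
    public String reply(String userMessage) {
        return "[local] " + userMessage;
    }
}

// Application code depends only on the interface, never on a concrete provider.
final class Greeter {
    private final ChatBackend backend;

    Greeter(ChatBackend backend) {
        this.backend = backend;
    }

    String greet(String name) {
        return backend.reply("Say hello to " + name);
    }
}

final class PortabilitySketch {
    // Plays the role of "dependency + YAML": a config value picks the implementation.
    static ChatBackend fromConfig(String provider) {
        switch (provider) {
            case "hosted":
                return new HostedStub();
            case "local":
                return new LocalStub();
            default:
                throw new IllegalArgumentException("Unknown provider: " + provider);
        }
    }

    public static void main(String[] args) {
        Greeter greeter = new Greeter(fromConfig("local"));
        System.out.println(greeter.greet("Ada")); // prints "[local] Say hello to Ada"
    }
}
```

Changing "local" to "hosted" swaps the backend without touching Greeter, which is the property the Spring AI interfaces are meant to give you at application scale.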

Why not just call the APIs directly?

You can — and for a one-off script, a raw HTTP client is perfectly fine. Spring AI earns its place when you’re building an application rather than a script:

Provider portability. Prototype against local Ollama (free, private), ship against a hosted model — same code.
Config, not constants. API keys and base URLs live in application.yml / env vars, not scattered string literals.
Spring-native ergonomics. Constructor injection, @ConfigurationProperties, testing slices, and Micrometer observability all just work.
Higher-level building blocks. Prompt templates, structured output mapping to Java objects, advisors, and RAG plumbing you’d otherwise write by hand.
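
As a rough illustration of "config, not constants", an application.yml for the OpenAI starter might look like the fragment below. The property names follow the Spring AI reference documentation as I understand it for the 1.0 line, so check them against the version you pin. The OPENAI_API_KEY variable name and the model value are assumptions chosen for the example.

```yaml
spring:
  ai:
    openai:
      # Resolved from the environment at startup, so the key never lands in source control.
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          # Assumption: replace with any model your account can access.
          model: gpt-4o-mini
```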

The trade-off: it’s another abstraction layer, and APIs have shifted across early versions (see the version note below). For most Spring teams, the consistency is worth it.

What you need

Java 17+ and Spring Boot 3.x (Spring AI is built on Spring Framework 6).
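A model starter for your provider, e.g. spring-ai-starter-model-openai for OpenAI or spring-ai-starter-model-ollama for local models. (Before the 1.0 renaming these were spring-ai-openai-spring-boot-starter and spring-ai-ollama-spring-boot-starter, which older tutorials still use.)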
For RAG: a vector store dependency and, optionally, document readers/splitters.

Version heads-up [needs source]: Spring AI’s APIs and starter artifact names changed in the run-up to and after its 1.0 GA (and some docs reference a 2.x line aligned with Spring Boot 4). Pin a version in your build and follow the docs for that version — copy-pasting across versions is the main source of “it doesn’t compile.”

How the series fits together

Each part builds on the last:

Intro (you are here) — the why and the setup.
Chat completions — call a model with ChatClient.
Embeddings & vector stores — turn data into searchable vectors.
RAG pipeline — retrieve relevant chunks and feed them into a prompt so the model answers from your data.

If you only care about “make the model answer using my documents,” that’s RAG — and it’s the most common production pattern. (Not sure whether you need RAG or fine-tuning? Here’s the comparison.)
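
For a feel of what "retrieve relevant chunks and feed them into a prompt" means before Part 4, here is a toy sketch in plain Java. It is not Spring AI code and not the series' implementation: the ToyRag class and its methods are hypothetical, the "embedding" is a simple word count standing in for a real embedding model, and the prompt wording is arbitrary.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyRag {
    // Toy "embedding": lower-cased word counts. A real app would call an embedding model.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Retrieve: return the stored chunk most similar to the question (null if none stored).
    static String retrieve(List<String> chunks, String question) {
        Map<String, Integer> queryVector = embed(question);
        String best = null;
        double bestScore = -1;
        for (String chunk : chunks) {
            double score = cosine(embed(chunk), queryVector);
            if (score > bestScore) {
                bestScore = score;
                best = chunk;
            }
        }
        return best;
    }

    // Augment: place the retrieved context in front of the user's question.
    static String augment(String context, String question) {
        return "Answer using only this context:\n" + context + "\n\nQuestion: " + question;
    }
}
```

Given the chunks "Refunds are issued within 14 days of purchase." and "Our office is closed on public holidays.", the question "How many days do refunds take?" retrieves the refund chunk, because only that chunk shares words with the question. The augmented prompt is what would then be sent to the chat model.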

FAQ

Is Spring AI production-ready?

It reached a 1.0 GA and is actively developed. As with any young framework, pin your version and read release notes before upgrading — the API moved quickly in early releases. [needs source]

Which model providers does Spring AI support?

Many, through provider-specific starters — including OpenAI, Anthropic, Azure OpenAI, and local models via Ollama, plus several vector stores. You select one with a dependency and configure it in application.yml.

Do I need a cloud API key to try it?

No. Run a local model with Ollama (see running LLMs locally) and point Spring AI at http://localhost:11434: no key, no per-call cost, and fully offline once the model has been downloaded.
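
A minimal application.yml for that local setup might look like this sketch. It assumes the Ollama starter is on the classpath and that the property names match the Spring AI version you pinned. The model name is an assumption and must be one you have already pulled with Ollama.

```yaml
spring:
  ai:
    ollama:
      # Ollama's default local endpoint; no API key is involved.
      base-url: http://localhost:11434
      chat:
        options:
          # Assumption: any model already pulled locally.
          model: llama3.2
```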

Spring AI or LangChain4j?

Both wrap LLM providers for Java. Spring AI is the natural fit if you’re already in the Spring ecosystem (auto-config, properties, DI); LangChain4j is framework-agnostic. Choose based on your stack. [needs source]

Key takeaway: Spring AI is a portable abstraction over chat models, embeddings, and vector stores that fits the Spring Boot programming model. Use it (over raw HTTP clients) when you want provider portability, config-driven setup, and ready-made RAG building blocks. Next up: calling a chat model.
