
Tool/Function Calling with Spring AI

Published: 29. January 2025  •  java, spring, llm

In this blog post, we will take a look at Spring AI, a library developed by the Spring team that makes it easy to call AI models and implement common AI workflows, such as Retrieval Augmented Generation (RAG) and Tool/Function calling, in Spring and Spring Boot applications.

Spring AI is comparable to LangChain in the Python world: both aim to make it easy to call AI models and implement common AI workflows.

Spring AI abstracts away the complexity of calling AI models and provides a common API to call different AI models. It supports all kinds of models, such as chat completion, embedding, text-to-image, audio transcription, text-to-speech, and moderation. This abstraction makes it easy to switch between models. Spring AI also provides a common API for accessing vector stores and implementing RAG workflows.

This blog post will focus on Spring AI's support for Tool/Function calling.

What is Tool/Function calling?

Large language models don't know everything, even though the largest models are trained on vast amounts of data. All models have a knowledge cut-off date: they are trained on data up to a certain date, and all events after that date are unknown to the model. Models are also usually trained only on publicly available data, so non-public data, for instance internal company data, is unknown to the model as well.

Another thing a model knows nothing about is real-time data, like the current weather or stock prices. An LLM does not have access to the internet; it's just a static blob of floating-point numbers that form a neural network.

Tool calling is a way to solve this problem and provide the missing information to the model. This allows the model to answer questions about data it was not trained on.

Tool calling works by sending a request to the LLM containing the prompt and a description of one or more tools. The LLM can then "call" these tools to get the missing information. "Call" here does not mean that the LLM executes a function; the LLM can only generate text. Calling a tool means that the LLM generates a special response, called a tool calling response, which contains the tool it wants to call and the arguments to call it with.
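
To make this concrete, here is a sketch of what a tool calling response can look like on the wire, loosely based on the OpenAI format (the exact field names vary between providers, and the tool name and arguments are made up for illustration):

{
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "fetchTemperature",
        "arguments": "{\"latitude\": 38.72, \"longitude\": -9.14}"
      }
    }
  ]
}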

The code that sent the request to the LLM can then call the tool or function. The tool's output is sent back to the LLM, which uses this additional information to answer the initial prompt.

The following sequence shows the interaction between the caller, the LLM, and the tool:

1. Caller → LLM: send the initial request
2. LLM → Caller: return a tool calling response
3. Caller → Tool: call the tool
4. Tool → Caller: return the tool output
5. Caller → LLM: send the tool output
6. LLM → Caller: return the final response

It's important to note here that when the caller sends the tool output back to the LLM (5), the request must also contain the history of the initial request (1) and the tool calling response (2). LLMs are stateless and do not have a memory of previous requests; therefore, the caller must always send the whole conversation history back to the LLM.

To get a better idea of what you can do with tool calling, I recommend checking out the list of pre-built tools that LangChain provides: https://python.langchain.com/docs/integrations/tools/

Spring AI does not provide any pre-built tools. It only provides a framework that allows you to write your own tools. The following sections will show you how to write your own tools in Spring AI.

Note that not all models are capable of tool calling, especially when you work with smaller LLMs running in Ollama. Spring AI can access models running in Ollama, but not all of them support tool calling. The Ollama model listing shows you all pre-trained models with tool-calling capabilities.
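
If you want to experiment with a local model, a minimal Ollama configuration could look like the following sketch (this assumes the Spring AI Ollama starter is on the classpath; llama3.1 is just one example of a tool-capable model from the Ollama listing):

spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama3.1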

Setting up Spring AI

The first thing we need to do is add the Spring AI dependency to our Spring Boot project. Because LLMs have different capabilities and APIs, Spring AI provides several starter libraries that contain the necessary dependencies depending on the model you want to use.

In this blog post, I will use OpenAI's GPT-4o-mini model; therefore, I add the openai starter library to my project:

    <dependency>
      <groupId>org.springframework.ai</groupId>
      <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    </dependency>

pom.xml

Check out the Spring AI documentation for a list of all supported models and what starter libraries you need to add to your project.
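
Note that the openai starter dependency above does not specify a version. This assumes that the Spring AI BOM manages the versions, along the lines of this sketch (the version shown is an example; use the release that matches your setup):

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.springframework.ai</groupId>
          <artifactId>spring-ai-bom</artifactId>
          <version>1.0.0-M5</version> <!-- example; use the version matching your setup -->
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>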

Next, we need to configure the credentials, usually an API key, to access the model. Additionally, I configured the model and some options. These options can be overridden at runtime.

spring.ai.openai.api-key=sk-
spring.ai.openai.chat.options.model=gpt-4o-mini
spring.ai.openai.chat.options.temperature=0.0
spring.ai.openai.chat.options.maxTokens=4096

application.properties

The official documentation provides a list of all available options and their default values.

Note that Spring AI also has a built-in retry mechanism that retries failed requests. By default, it makes up to 10 attempts.
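
If the defaults don't fit your use case, the retry behavior can be tuned with the spring.ai.retry.* properties. A sketch with assumed values (check the reference documentation for the full list):

spring.ai.retry.max-attempts=3
spring.ai.retry.backoff.initial-interval=2s
spring.ai.retry.backoff.multiplier=5
spring.ai.retry.backoff.max-interval=3m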

To access the LLM from the code, Spring AI provides the Chat Client API. A ChatClient is created using a ChatClient.Builder object. Spring Boot's autoconfiguration automatically creates an instance of the Builder during startup when it finds a Spring AI starter library on the classpath. This builder instance can be injected into the code like any other Spring bean.

  private final ChatClient chatClient;

  public DemoApplication(ChatClient.Builder chatClientBuilder) {
    this.chatClient = chatClientBuilder.build();
  }

DemoApplication.java

You can also programmatically create a ChatClient if you need more control over the process. See the official documentation for more information.
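
As a minimal sketch, assuming an autoconfigured ChatModel bean (for OpenAI, an OpenAiChatModel) is available for injection:

  private final ChatClient chatClient;

  public DemoApplication(ChatModel chatModel) {
    // Build the ChatClient manually instead of injecting ChatClient.Builder
    this.chatClient = ChatClient.builder(chatModel)
        .defaultSystem("You are a helpful assistant")
        .build();
  }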

With everything set up, we can now start exploring tool calling in Spring AI.

Tool/Function Calling in Spring AI

Knowledge Cut-off

The first example shows you the knowledge cut-off problem and how a tool can help to solve it.

When we ask the model about an event that occurred after the knowledge cut-off date, the model will not know the answer. gpt-4o-mini was trained on data up to October 2023. Therefore, it knows nothing about events in 2024.

  private void newsDemoWithoutFunction() {
    String prompt = "Who won the 2024 Men's ATP Tennis tournament in Rome?";
    var response = this.chatClient.prompt().user(prompt).call().chatResponse();
    Generation generation = response.getResult();
    if (generation != null) {
      System.out.println(generation.getOutput().getText());
    }
    else {
      System.out.println("No generation");
    }
  }

DemoApplication.java

When we run this code, we get the following response:

I'm sorry, but I don't have information on events that occurred after October 2023, including the 2024 Men's ATP Tennis tournament in Rome. 
You may want to check the latest sports news or the official ATP website for the most current results.

To solve this problem, we can write a tool that fetches the information from a website. Wikipedia is a good source for this kind of information. The tool I wrote for this example is called wikipediaArticleFetcher. It takes a WikipediaQuery object as input, then searches Wikipedia with the given query and returns the article text of the top 3 search results.

The important part here is that each tool has a description. The description tells the LLM what the tool does and how to use it. This is also the text you often have to tweak a few times to get the best results.

Additionally, you can add a description of the tool's expected input properties. In this case, I use the @JsonClassDescription and @JsonPropertyDescription annotations to describe the input properties of the tool. Spring AI generates a JSON schema for the input object, and these descriptions are included in the schema.

Like the description for the tool, these descriptions might need some tweaking to get the best result. Sometimes you don't need to provide a description when the name of the property is self-explanatory.

@JsonClassDescription("A query to search Wikipedia")
public record WikipediaQuery(
    @JsonPropertyDescription("The search query") String searchQuery) {
}

WikipediaQuery.java

The following code shows the implementation of the WikipediaArticleFetcher tool. The important part here is the @Tool annotation with the description. You can add multiple tools to a class; the descriptions of all these tools are sent to the LLM when the service class is used in a tool call.

@Service
public class WikipediaArticleFetcher {

  private final ObjectMapper objectMapper = new ObjectMapper();

  @JsonIgnoreProperties(ignoreUnknown = true)
  record SearchResponse(Query query) {
    @JsonIgnoreProperties(ignoreUnknown = true)
    record Query(List<SearchResult> search) {
    }

    @JsonIgnoreProperties(ignoreUnknown = true)
    record SearchResult(String title) {
    }
  }

  @Tool(description = "Searches for a Wikipedia article and returns the text")
  public String search(WikipediaQuery query) {
    System.out.println(
        "Calling WikipediaArticleFetcher with parameters: " + query.searchQuery());
    String searchUrl = "https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch="
        + query.searchQuery().replace(" ", "%20") + "&format=json";

    try (HttpClient client = HttpClient.newHttpClient()) {
      HttpRequest request = HttpRequest.newBuilder().uri(URI.create(searchUrl)).build();
      var response = client.send(request, HttpResponse.BodyHandlers.ofString());
      var responseBody = response.body();
      SearchResponse searchResponse = this.objectMapper.readValue(responseBody,
          SearchResponse.class);

      if (searchResponse.query().search().isEmpty()) {
        return "";
      }

      List<SearchResponse.SearchResult> searchResponses = searchResponse.query()
          .search().subList(0, Math.min(searchResponse.query().search().size(), 3));
      StringBuilder context = new StringBuilder();
      for (SearchResponse.SearchResult sr : searchResponses) {
        String url = "https://en.wikipedia.org/wiki/" + sr.title().replace(" ", "_");
        Document doc = Jsoup.connect(url).get();
        Element mainElement = doc.select("div[id=mw-content-text]").first();
        // Skip results whose content element cannot be found
        if (mainElement == null) {
          continue;
        }
        String text = mainElement.text();
        context.append(text.replaceAll("\\[.*?\\]+", ""));
        context.append("\n\n");
      }
      return context.toString();
    }
    catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

}

WikipediaArticleFetcher.java

We can then inject the service and send a request to the LLM that contains the tool descriptions. The only difference from the previous LLM call is the call to the tools method. Every method annotated with @Tool in the injected service results in one tool description.

  @Autowired
  private WikipediaArticleFetcher wikipediaArticleFetcher;

  private void newsDemo() {
    String prompt = "Who won the 2024 Men's ATP Tennis tournament in Rome?";
    var response = this.chatClient.prompt().user(prompt)
        .tools(this.wikipediaArticleFetcher).call().chatResponse();
    Generation generation = response.getResult();
    if (generation != null) {
      System.out.println(generation.getOutput().getText());
    }
    else {
      System.out.println("No generation");
    }
  }

DemoApplication.java

When we run this code, we get the following output:

Calling WikipediaArticleFetcher with parameters: 2024 Men's ATP Tennis tournament Rome
The winner of the 2024 Men's ATP Tennis tournament in Rome, also known as the Italian Open, was Alexander Zverev. He defeated Nicolás Jarry in the final with a score of 6–4, 7–5.

The first output is from the tool, and we see the search query that the LLM has sent. The second output is the final response from the LLM, after the tool had searched Wikipedia, extracted the text of the top 3 search results, and sent it back to the LLM.

You see that Spring AI abstracts away the complexity of tool calling.

Note that subsequent responses might also contain a tool call response. Some LLMs, like the OpenAI models, are capable of parallel tool calling. That means the tool call response can contain multiple calls for different tools or for the same tool. Spring AI can handle these responses, run all tools, and send all responses back to the LLM.

Real-time Data

The second example shows how to use a tool to get real-time data. In this example, we get the current temperature for a location from the free Open-Meteo weather API. This service is free for non-commercial use as long as you follow the terms of service.

This example implements and configures the tool in a different way. Instead of using a service class, the tool is a plain method that receives a Location object and returns the current temperature at that location.

  record Location(float latitude, float longitude) {
  }

  private float fetchTemperature(Location location) {
    System.out.println("Calling fetchTemperature with parameters: " + location.latitude
        + ", " + location.longitude);
    try (var client = HttpClient.newHttpClient()) {

      var request = HttpRequest.newBuilder()
          .uri(URI.create("https://api.open-meteo.com/v1/forecast?latitude="
              + location.latitude + "&longitude=" + location.longitude
              + "¤t=temperature_2m&format=flatbuffers"))
          .build();
      var response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());

      ByteBuffer buffer = ByteBuffer.wrap(response.body()).order(ByteOrder.LITTLE_ENDIAN);
      WeatherApiResponse mApiResponse = WeatherApiResponse
          .getRootAsWeatherApiResponse(buffer.position(4));
      VariablesWithTime current = mApiResponse.current();

      VariableWithValues temperature2m = new VariablesSearch(current)
          .variable(Variable.temperature).altitude(2).first();
      if (temperature2m == null) {
        return Float.NaN;
      }
      return temperature2m.value();
    }
    catch (IOException | InterruptedException e) {
      throw new RuntimeException(e);
    }

  }

DemoApplication.java

The code that calls the LLM first creates a FunctionToolCallback object with the name "fetchTemperature" and a java.util.function.Function implementation that calls the fetchTemperature method. The builder also supports implementations of java.util.function.BiFunction, java.util.function.Consumer, and java.util.function.Supplier.

Using the description method, you can add a description to the tool. The description is optional; if you don't provide one, Spring AI derives it from the tool name. The default description might not be self-explanatory in all cases, and the LLM might not call the tool because it does not know what the tool does. Therefore, it's usually a good idea to provide a custom description.

With inputType, you can specify the tool's input type. Spring AI generates a JSON schema based on this type. Instead of inputType, you can call inputSchema and provide a JSON schema directly.
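
For illustration, here is a sketch of the inputSchema variant with a hand-written schema for the Location record from the fetchTemperature example above (the schema string is my own; I assume inputType is still needed so Spring AI can convert the JSON arguments into a Location):

    FunctionToolCallback<Location, Float> callback = FunctionToolCallback
        .builder("fetchTemperature", (Location location) -> fetchTemperature(location))
        .description("Get the current temperature of a location")
        .inputType(Location.class)
        // Hand-written JSON schema instead of the one Spring AI would generate
        .inputSchema("""
            {
              "type": "object",
              "properties": {
                "latitude": { "type": "number" },
                "longitude": { "type": "number" }
              },
              "required": ["latitude", "longitude"]
            }
            """)
        .build();

The complete example below sticks with inputType and lets Spring AI generate the schema: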

  private void temperatureDemo() {
    String prompt = "What are the current temperatures in Lisbon, Portugal and Reykjavik, Iceland?";

    FunctionToolCallback<Location, Float> callback = FunctionToolCallback
        .builder("fetchTemperature", (Location location) -> fetchTemperature(location))
        .description("Get the current temperature of a location")
        .inputType(Location.class).build();

    var response = this.chatClient.prompt().user(prompt).functions(callback).call()
        .chatResponse();
    Generation generation = response.getResult();
    if (generation != null) {
      System.out.println(generation.getOutput().getText());
    }
    else {
      System.out.println("No generation");
    }
  }

DemoApplication.java

The output of this code is:

Calling fetchTemperature with parameters: 38.7223, -9.1393
Calling fetchTemperature with parameters: 64.1355, -21.8954
The current temperature in Lisbon, Portugal is approximately 13.9°C, while in Reykjavik, Iceland, it is around -6.1°C.

Note that this example uses the parallel tool calling feature of the OpenAI models. The LLM sent back two tool calls in the tool calling response, and Spring AI called the fetchTemperature method twice. Both responses are then sent back to the LLM, and the LLM generated the final response you see above.

Conclusion

In this blog post, you have seen how Spring AI abstracts away the complexity of tool calling. It provides a framework for writing your own tools and integrating them into your Spring Boot application.

Tool calling is a powerful feature that allows you to provide missing information to the LLM and to answer questions about data on which the LLM was not trained.

Be careful when you use a tool that extracts data from in-house documents or databases and then sends this data to a cloud-based LLM. Make sure that the data you send to the LLM does not violate any privacy or security policies.