In this article, we'll build a custom AI chatbot application using Hilla, Spring Boot, React, OpenAI (ChatGPT), and Pinecone. The chatbot is designed to answer Vaadin Flow and Hilla development questions using up-to-date documentation as reference material. My earlier blog post details the concept. This article focuses on the code needed to build your own AI assistant.
This article is part four of the Building an AI chatbot in Java series. It uses the services built in the previous three parts:
- Calling ChatGPT and OpenAI APIs from Spring Boot in Java
- Using a Pinecone vector database with Spring Boot
- Prompt engineering and token counting for a ChatGPT bot in Java
Requirements
The tutorial assumes you have a Hilla project based on Spring Boot and React that includes the services built in the previous three parts. If you are new to the series, you can find the complete source code below.
Add the following frontend dependencies to the project to render markdown content.
npm i react-markdown rehype-highlight highlight.js
If you want to follow the code strictly, you need to set up Tailwind using these instructions. You can also use plain CSS or the Lumo utility classes included with Hilla.
Source code for the completed application
You can find the completed source code for the application on my GitHub, https://github.com/marcushellberg/docs-assistant.
Application overview
Here's how the application works on a high level:
- A user enters a query in the browser.
- The input is moderated to ensure it adheres to the content policy.
- The parts of the documentation most relevant to answering the question are found by querying the Pinecone vector database.
- A ChatGPT completion request is constructed with a prompt, the relevant documentation, and the chat history. Tokens are counted to make maximal use of the context size without exceeding it.
- The response is streamed back to the user.
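The steps above can be sketched as a single request-building function. The moderation flag, retrieved documents, and token counts are stand-ins here, not the real services from the earlier parts of the series:

```typescript
type Doc = { text: string; tokens: number };

// Build the message list for a chat completion request: the prompt first,
// then as many relevant documentation snippets as fit the token budget,
// then the chat history and the new question.
function buildCompletionRequest(
  query: string,
  history: string[],
  relevantDocs: Doc[],
  tokenBudget: number,
  flaggedByModeration: boolean
): string[] {
  if (flaggedByModeration) {
    throw new Error("Query rejected by moderation");
  }
  const messages = ["system: Answer using only the documentation below."];
  let budget = tokenBudget;
  for (const doc of relevantDocs) {
    if (doc.tokens > budget) break; // never exceed the context size
    budget -= doc.tokens;
    messages.push(`context: ${doc.text}`);
  }
  return [...messages, ...history, `user: ${query}`];
}
```

The real token counting and retrieval were covered in the previous parts; this sketch only shows how the pieces fit together.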
The backend of the application is already complete; what remains is the React front end.
Create a type-safe RPC endpoint
Hilla uses type-safe RPC endpoints for communicating between the server and the client.
In the com.example.application.endpoints package, create the following two files:
package-info.java
@NonNullApi
package com.example.application.endpoints;
import org.springframework.lang.NonNullApi;
DocsAssistantEndpoint.java
import java.util.List;
import com.vaadin.flow.server.auth.AnonymousAllowed;
import dev.hilla.Endpoint;
import reactor.core.publisher.Flux;
// plus imports for DocsAssistantService, ChatCompletionMessage, and Framework
// from your application's packages
@Endpoint
@AnonymousAllowed
public class DocsAssistantEndpoint {
private final DocsAssistantService docsAssistantService;
public DocsAssistantEndpoint(DocsAssistantService docsAssistantService) {
this.docsAssistantService = docsAssistantService;
}
public Flux<String> getCompletionStream(List<ChatCompletionMessage> history, String framework) {
return docsAssistantService.getCompletionStream(history, framework);
}
public List<Framework> getSupportedFrameworks() {
return docsAssistantService.getSupportedFrameworks();
}
}
The package-info file instructs Hilla to consider all APIs in the package as non-nullable, simplifying TypeScript compatibility.
DocsAssistantEndpoint uses the DocsAssistantService we built in the previous article to provide two methods: getSupportedFrameworks, which returns a list of all frameworks (namespaces) that we support, and getCompletionStream, which returns a response stream given a list of messages and the framework we're interested in.
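On the client, Hilla generates a typed TypeScript caller for this endpoint, and the Flux-returning method surfaces as a subscription with chainable onNext/onComplete/onError callbacks. The mock below mimics that shape to show how such a stream is consumed; it is an illustrative sketch, not the generated client (the flush method is a test hook standing in for chunks pushed from the server):

```typescript
type Subscription = {
  onNext(callback: (chunk: string) => void): Subscription;
  onComplete(callback: () => void): Subscription;
  onError(callback: (error: unknown) => void): Subscription;
  // Test hook: the real client pushes chunks from the server instead
  flush(): void;
};

function mockCompletionStream(chunks: string[]): Subscription {
  let next: (chunk: string) => void = () => {};
  let complete: () => void = () => {};
  const subscription: Subscription = {
    onNext(callback) { next = callback; return subscription; },
    onComplete(callback) { complete = callback; return subscription; },
    onError() { return subscription; },
    flush() { chunks.forEach((c) => next(c)); complete(); },
  };
  return subscription;
}

// Consuming the stream looks the same with the real generated client:
let answer = "";
let done = false;
const stream = mockCompletionStream(["Hello", ", world"]);
stream
  .onNext((chunk) => { answer += chunk; })
  .onComplete(() => { done = true; });
stream.flush();
```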
Create a React view for chatting with the AI bot
Update App.tsx with the following implementation:
import { useEffect, useState } from "react";
import { Select } from "@hilla/react-components/Select.js";
import { VirtualList } from "@hilla/react-components/VirtualList.js";
import { MessageInput } from "@hilla/react-components/MessageInput.js";
import { DocsAssistantEndpoint } from "Frontend/generated/endpoints";
import ChatMessage from "./ChatMessage";
// ChatCompletionMessage, Role, Framework, and SelectItem come from the
// Hilla-generated TypeScript models under Frontend/generated; adjust the
// import paths to match your package structure.
export default function App() {
const [working, setWorking] = useState(false);
const [framework, setFramework] = useState("");
const [supportedFrameworks, setSupportedFrameworks] = useState<Framework[]>(
[]
);
const [messages, setMessages] = useState<ChatCompletionMessage[]>([]);
// Reset the messages when the framework changes
function changeFramework(newFramework: string) {
setFramework(newFramework);
setMessages([]);
}
return (
<div className="flex flex-col max-w-screen-lg mx-auto h-screen p-4">
<div className="flex gap-4 mb-6 items-center justify-between">
<h1 className="font-semibold text-lg md:text-2xl">
Vaadin Docs Assistant
</h1>
<Select
className="w-24 sm:w-48"
items={supportedFrameworks as SelectItem[]}
value={framework}
onChange={(e) => changeFramework(e.target.value)}
/>
</div>
<VirtualList items={messages} className="flex-grow">
{({ item }) => <ChatMessage content={item.content} role={item.role} />}
</VirtualList>
<MessageInput
className="p-0 pt-2"
onSubmit={(e) => getCompletion(e.detail.value)}
/>
</div>
);
}
Create a separate component, ChatMessage.tsx, to represent a single message:
import ReactMarkdown from "react-markdown";
import rehypeHighlight from "rehype-highlight";
import "highlight.js/styles/atom-one-light.css";
// ChatCompletionMessage and Role come from the Hilla-generated models under
// Frontend/generated; adjust the import path to match your package structure.
export default function ChatMessage({ content, role }: ChatCompletionMessage) {
return (
<div className="w-full mb-4">
<div className="flex flex-col md:flex-row md:gap-2">
<div className="text-2xl">{role === Role.ASSISTANT ? "🤖" : "🧑‍💻"}</div>
<div className="max-w-full overflow-x-scroll">
<ReactMarkdown
rehypePlugins={[[rehypeHighlight, { ignoreMissing: true }]]}
>
{content || ""}
</ReactMarkdown>
</div>
</div>
</div>
);
}
The component uses react-markdown and rehype-highlight to render the markdown and highlight code snippets.
In App.tsx, call the endpoint to fetch the supported frameworks and update the state:
useEffect(() => {
DocsAssistantEndpoint.getSupportedFrameworks().then((supportedFrameworks) => {
setSupportedFrameworks(supportedFrameworks);
setFramework(supportedFrameworks[0].value!);
});
}, []);
Finally, implement the method for getting a completion:
async function getCompletion(text: string) {
// Ignore new submissions while a response is still streaming
if (working) return;
setWorking(true);
const messageHistory = [
...messages,
{
role: Role.USER,
content: text,
},
];
// Display the question immediately while waiting for the answer
setMessages(messageHistory);
// Add a new message to the list on the first response chunk, then append to it
let firstChunk = true;
function appendToLastMessage(chunk: string) {
if (firstChunk) {
// Init the response message on the first chunk
setMessages((msg) => [
...msg,
{
role: Role.ASSISTANT,
content: "",
},
]);
firstChunk = false;
}
setMessages((msg) => {
const lastMessage = msg[msg.length - 1];
// Create a new message object instead of mutating state in place
return [
...msg.slice(0, -1),
{ ...lastMessage, content: (lastMessage.content || "") + chunk },
];
});
}
// Get completion as stream
DocsAssistantEndpoint.getCompletionStream(messageHistory, framework)
.onNext((chunk) => appendToLastMessage(chunk))
.onComplete(() => setWorking(false))
.onError(() => {
console.error("Error processing stream");
setWorking(false);
});
}
The method:
- Immediately appends the question to the message list to display it while waiting for the answer.
- Calls DocsAssistantEndpoint to get a completion for the message history and selected framework.
- Subscribes to response chunks and appends them to the last message as they come in.
Next steps
In the final part of the Building an AI chatbot in Java series, we'll take a look at Deploying a Spring Boot app as a native GraalVM image with Docker.