You can feel the craze around Large Language Model (LLM) based solutions today. People are racing to find a good product-market fit, or even a disruptive one. However, the scene is dominated by Python implementations, so building a solution on top of Go is rare today, and I hope interesting to many people.
Here we will use github.com/wejick/gchain to build the chatbot. It's a framework heavily inspired by LangChain, with enough tools to build a chatbot that connects to a vector DB as its knowledge base.
What are we building today?
- An indexer to put information to a vector DB
- A chatbot backed by openAI Chat API
- The chatbot will query the vector DB to answer user questions
Building Indexer
We're going to use the English Wikipedia pages Indonesia and History of Indonesia as the knowledge base, and put them into Weaviate as our vector DB. As both pages contain a lot of text and LLMs have constraints on context window size, chunking the text into smaller pieces will help our application.
In the indexer there are several main steps:
- Load the text
- Cut the texts into chunks
- Get each chunk embedding
- Store the text chunk + embedding to the vector DB
Loading the text
type source struct {
	filename string
	doc      string
}

sources := []source{
	{filename: "indonesia.txt"},
	{filename: "history_of_indonesia.txt"},
}
for idx, s := range sources {
	data, err := os.ReadFile(s.filename)
	if err != nil {
		log.Println(err)
		continue
	}
	sources[idx].doc = string(data)
}
Cut the texts into chunks using the textsplitter package
// create a text splitter for the embedding model
indexingSplitter, err := textsplitter.NewTikTokenSplitter(openai.AdaEmbeddingV2.String())
if err != nil {
	log.Fatal(err)
}

// split the documents into 500-token chunks with no overlap
var docs []string
docs = indexingSplitter.SplitText(sources[0].doc, 500, 0)
docs = append(docs, indexingSplitter.SplitText(sources[1].doc, 500, 0)...)
Create the Weaviate and embedding instances, then store the text chunks in the DB
// create the embedding model instance
embeddingModel = _openai.NewOpenAIEmbedModel(OAIauthToken, "", openai.AdaEmbeddingV2)

// create a new weaviate client
wvClient, err = weaviateVS.NewWeaviateVectorStore(wvhost, wvscheme, wvApiKey, embeddingModel, nil)
// handle err here

// store the documents in the weaviate db under the "Indonesia" class
bErr, err := wvClient.AddDocuments(context.Background(), "Indonesia", docs)
// handle both bErr and err here
Create the ChatBot
GChain has a premade chain named conversational_retrieval, which gives us the ability to easily create a chatbot that connects to a datastore. As we have prepared the knowledge base above, it's just a matter of wiring everything together and creating a text interface to handle user interaction.
Setup the chain
// memory to store the conversation history
memory := []model.ChatMessage{}

// create a text splitter for the GPT3Dot5Turbo0301 model
textSplitter, err := textsplitter.NewTikTokenSplitter(_openai.GPT3Dot5Turbo0301)
if err != nil {
	log.Fatal(err)
}

// create the chain
chain := conversational_retrieval.NewConversationalRetrievalChain(chatModel, memory, wvClient, "Indonesia", textSplitter, callback.NewManager(), "", 2000, false)
Create a simple text-based user interface: basically loop until the user wants to quit
fmt.Println("AI : How can I help you? I know many things about Indonesia")
scanner := bufio.NewScanner(os.Stdin)
for {
	fmt.Print("User : ")
	scanner.Scan()
	input := scanner.Text()
	if input == "quit" {
		break
	}

	output, err := chain.Run(context.Background(), map[string]string{"input": input})
	if err != nil {
		log.Println(err)
		return
	}
	fmt.Println("AI : " + output["output"])
}
You can find the full code in the GChain Example. It will look like this:
I wonder if this will become more common in other languages/projects too, since for the most part this integrates with a provider (like OpenAI), so we can simply do the work via HTTP requests.
Off the top of my head, I can imagine feeding the program's own source to it and using that to create an FAQ/support page/help messages (like a "--help").
Indeed, consuming hosted models will become more and more common, as it's just a matter of an API call.