I am not sure if it is just me or everyone else: whenever I hear the word "chat", a light bulb pops up in my brain saying "Sockets! You need sockets". I recently built a chatbot powered by GPT, and after working with it for a while, I realized that you don't need sockets to chat with GPT. To put it into perspective, ChatGPT itself does not use sockets if you take a peek under the hood.
Sockets are good for two-way, non-blocking communication, where both actors act as sender and receiver. If you are building a chatbot powered by GPT (or any other model from OpenAI), you don't need a socket server/client, because so far GPT has never initiated a conversation (let's hope it doesn't... Ever!). As the user, you initiate the conversation every time and wait for the model's response.
In the above clip, the concept you are witnessing is called Server-Sent Events (SSE). The client makes a simple REST API call to open a persistent connection with the server, and new messages from the server are sent as events over that open connection. In the case of ChatGPT, instead of keeping the user waiting until the model returns the entire response, the API streams the message chunks as events, as and when they become available. This gives the user the feeling that they are not waiting forever for a response from the server.
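If you have not seen SSE in action before, the browser ships a built-in EventSource API for consuming it. Here is a minimal, hypothetical sketch of the client side (the /api/events endpoint is made up purely for illustration; the demo later in this article streams over a plain fetch call instead):
// Opens a persistent HTTP connection to a hypothetical SSE endpoint
const source = new EventSource("/api/events");
source.onmessage = (event) => {
  console.log("New event from the server:", event.data); // Each server-sent event lands here
};
source.onerror = () => {
  source.close(); // Stop listening once the server closes the stream or errors out
};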
We will build a simple chatbot using React and the OpenAI client library for JS, and then we will see how the same can be achieved with a Node.js backend. We will use a simple REST endpoint in the backend and relay the response data to the UI without any socket implementation.
⚛️ Client-side Implementation (React + OpenAI library)
I have used Vite to bootstrap a new React app and installed the openai dependency alone.
yarn create vite #to create a new react app using vite
yarn add openai #installing the openai client library as a dependency
I don't want to deviate from the topic by installing a ton of other things, so I will keep it simple. We will just build a simple chat UI using vanilla React and inline styling.
OpenAI client setup
We will get started by creating a new instance of the OpenAI client.
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: import.meta.env.VITE_OPENAI_API_KEY, //Store this in your .env file
dangerouslyAllowBrowser: true,
});
export { openai };
OpenAI's guidelines specifically discourage using their API keys on the client side. However, if you understand the risk, you are allowed to use the key in the browser by setting the dangerouslyAllowBrowser field to true.
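For the key itself, Vite only exposes environment variables prefixed with VITE_ to client code via import.meta.env, so the .env file at the project root would contain something along these lines (the value is just a placeholder):
# .env (at the project root)
VITE_OPENAI_API_KEY="your-openai-api-key"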
Components
The next step is to create a component to wrap all the chat elements. I will make use of the default App.jsx file as the parent to hold all the components.
/* App.jsx */
import { useState, useCallback } from "react";
import { openai } from "./openai";
import InputForm from "./InputForm";
import Conversations from "./Conversations";
export default function App() {
const [conversations, setConversations] = useState([]); // To track all the conversations
const [aiMessage, setAiMessage] = useState(""); // To store the new AI responses
const [userMessage, setUserMessage] = useState(""); // To store the message sent by the user
const handleSubmit = useCallback(
async (e) => {
// We will handle the user message submission and streaming here
},
[]
);
// The app will have a vertical layout with the conversation section at the top
// The message textbox will be at the bottom
return (
<div
style={{
width: "100%",
height: "100%",
display: "flex",
flexDirection: "column",
justifyContent: "center",
background: "darkslateblue",
}}
>
{/* Conversations section */}
{/* Message input */}
</div>
);
}
The above is the wrapper that will contain the chat bubbles on top and the message input field at the bottom. Now we will create a new component that will hold a form to accept the user messages.
/* Message input */
/* InputForm.jsx */
export default function InputForm({
userMessage,
setUserMessage,
handleSubmit,
}) {
return (
<form
style={{
width: "60%",
display: "flex",
flexDirection: "row",
justifyContent: "center",
background: "white",
margin: "auto",
}}
onSubmit={handleSubmit}
>
<input
type="text"
placeholder="Ask something..."
value={userMessage}
onChange={(e) => setUserMessage(e.target.value)}
style={{
width: "90%",
outline: "none",
padding: "20px",
border: "none",
}}
></input>
<button
type="submit"
style={{
width: "10%",
outline: "none",
padding: "20px",
border: "none",
background: "mediumslateblue",
color: "white",
fontWeight: "bold",
cursor: "pointer",
}}
>
Send
</button>
</form>
);
}
The resultant UI will look like this. Neat, isn't it?
Now that we have an input form, we will proceed to create a basic chat bubble component and a scrollable section to display all the chat bubbles with the messages from the user and the AI.
/* ChatBubble.jsx */
export default function ChatBubble({ isHuman, message }) {
return (
<div
style={{
width: "100%",
display: "flex",
flexDirection: "row",
justifyContent: isHuman ? "flex-end" : "flex-start",
}}
>
<div
style={{
width: "fit-content",
maxWidth: "600px",
background: isHuman ? "lightblue" : "white",
padding: "10px",
borderRadius: "10px",
margin: "14px",
color: "black",
fontFamily: "sans-serif",
fontWeight: "400",
}}
>
{message}
</div>
</div>
);
}
If the message is from the user, it will float on the right. The messages from the AI will be on the left. Below is the look we are going for.
Now for the scrollable section. This will be the first child of the wrapper component.
/* Conversations section */
/* Conversations.jsx */
import ChatBubble from "./ChatBubble";
export default function Conversations({ conversations, aiMessage }) {
return (
<div
style={{
width: "100%",
height: "90%",
display: "flex",
flexDirection: "column",
justifyContent: "flex-end",
margin: "0 auto",
background: "#625a91",
}}
>
<div
style={{
maxHeight: "800px",
overflowY: "auto",
padding: "30px",
}}
>
{conversations &&
conversations.map((conversation, index) => {
return (
<ChatBubble
isHuman={conversation.isHuman}
message={conversation.message}
key={index}
/>
);
})}
{aiMessage && <ChatBubble isHuman={false} message={aiMessage} />}
</div>
</div>
);
}
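For reference, here is one way the placeholder comments inside App.jsx could be filled in once both children exist. This is just a sketch; the props match the components defined above:
/* App.jsx (inside the wrapper div) */
{/* Conversations section */}
<Conversations conversations={conversations} aiMessage={aiMessage} />
{/* Message input */}
<InputForm
  userMessage={userMessage}
  setUserMessage={setUserMessage}
  handleSubmit={handleSubmit}
/>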
Now, with all the components put together, the UI will look as below.
The UI-related things are all done now. The one major part that is pending is the actual thing for which you opened this article in the first place. THE STREAMING!
Streaming the response
We have created a function within the App component to handle the message submission from the user. This is where we will handle the logic of invoking the OpenAI API and streaming the response. The responsibilities of the submission handler are as follows:
1. Store the user message in the conversations state and empty the chat input field
2. Invoke the OpenAI API with the user message and stream the response
3. Store the AI response in the conversations state once the stream is closed
const handleSubmit = useCallback(
async (e) => {
e.preventDefault(); //To prevent the default form submission behaviour
// Storing the user message to the state
setConversations((prev) => {
return [
...prev,
{
message: userMessage,
isHuman: true,
},
];
});
setUserMessage(""); // Emptying the input field
const stream = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
messages: [{ role: "user", content: userMessage }],
stream: true, // This is required to stream the response
});
let streamedMessage = "";
// The stream will be read till it is closed
for await (const part of stream) {
const content = part.choices[0].delta.content ?? ""; // The final chunk carries no content
setAiMessage((prev) => prev + content);
// Once the entire message is received, the stream sends the finish_reason 'stop'
if (part.choices[0].finish_reason === "stop") {
setConversations((prev) => {
return [
...prev,
{
message: streamedMessage,
isHuman: false,
},
];
});
setAiMessage("");
break;
} else {
streamedMessage += content;
}
}
},
[userMessage]
);
The above handler will take care of the streaming and the storing. If you notice, we keep a temporary state for the in-progress AI message. This is to avoid mutating the conversations state (which is an array) on every streamed chunk. Once the entire stream is closed, we reset the temporary state and store the full message in the conversations list.
Demo
With everything put together, the app does what it was intended to do.
It streams the response and updates the UI as the chunks arrive. This gives the feeling that the user is not waiting a long time for a response; instead, they get visual feedback as and when chunks of the response become available.
This concludes the client-side setup for streaming responses from OpenAI.
🌐 Backend Implementation (Node.js + Express + OpenAI library)
Why a server? Well, as I mentioned above, OpenAI doesn't want you to use the API key on the client side, and they have a very strong reason for it. We will create a new Node.js project and install the required dependencies.
mkdir server && cd server
yarn init -y
yarn add express openai
Route to handle conversations
We will create a file named server.js and set up everything required within this single file. It is pretty straightforward with Express (note that the import syntax below requires "type": "module" in your package.json).
import express from "express";
import OpenAI from "openai";
const app = express();
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY, //Ensure this is a part of your env variables
});
app.use(express.json());
app.post("/api/chat", async (req, res) => {
const { message } = req.body;
const stream = await openai.chat.completions.create({
model: "gpt-3.5-turbo",
stream: true,
messages: [
{
role: "user",
content: message,
},
],
});
for await (const part of stream) {
const reason = part.choices[0].finish_reason;
if (reason === "stop") {
break;
}
const chunk = part.choices[0].delta.content ?? ""; // Some chunks (like the very first one) carry no content
res.write(chunk); //Write each chunk to the open stream
}
res.end(); //Close the connection once the streaming is done
});
app.listen(5000, () => {
console.log("Server is listening on port 5000");
});
In the above server-side code, the route /api/chat handles the communication with the OpenAI API. We send the message from the UI in the request body and pass it over to the chat completion API, and we use res.write(chunk) to continuously stream the data back to the UI.
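One thing worth noting: the UI calls a relative /api/chat, so during development the Vite dev server needs to know where the Express server lives. Here is a minimal sketch using Vite's dev-server proxy, assuming the Express server from above runs on port 5000 (alternatively, you could enable CORS on the server and call the absolute URL):
/* vite.config.js */
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";
export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      // Forward /api/* requests to the Express server during development
      "/api": "http://localhost:5000",
    },
  },
});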
Read from the stream in the UI
We need a new reader in the UI to read the streamed data and update the state accordingly. The handleSubmit function, with the logic to read from the server response, will look like this:
const handleSubmit = useCallback(
async (e) => {
e.preventDefault();
setConversations((prev) => {
return [
...prev,
{
message: userMessage,
isHuman: true,
},
];
});
setUserMessage("");
const res = await fetch("/api/chat", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({ message: userMessage }),
});
if (!res.ok) {
console.log("Error in response");
return;
}
let streamedMessage = "";
const reader = res.body.getReader();
const decoder = new TextDecoder("utf-8"); // Create the decoder once, outside the read loop
while (true) {
const { done, value } = await reader.read();
if (done) break;
// 'stream: true' handles multi-byte characters that get split across chunks
const decoded = decoder.decode(value, { stream: true });
streamedMessage += decoded;
setAiMessage((prev) => prev + decoded);
}
setConversations((prev) => {
return [
...prev,
{
message: streamedMessage,
isHuman: false,
},
];
});
setAiMessage("");
},
[userMessage]
);
We get the reader from the response body returned by fetch, and we keep reading until the stream is closed. When the stream is closed, the value of done will be true, and we use that to break out of the loop (ensure this is done, otherwise the loop becomes infinite). The data read from the stream is a Uint8Array of bytes, and we use the TextDecoder to convert it into a string. The rest of the logic is the same: we update the states to show the response inside the chat bubbles.
Demo
The UI invokes the /api/chat endpoint and reads the response from the stream. The connection will be in a pending state until the stream is closed.
Conclusion
Streaming completions from OpenAI is easy to implement. Even though the above demo is based on JS, the same could be implemented in Python or any other language of your choice.
Frameworks like langchain also provide callbacks to read from the stream. With streaming, you can avoid the feel of long wait times. You can write your own logic to split the responses into separate chat bubbles, perform on-the-fly formatting, and a bunch of other things. The above demo project can be found here. Clone it and get creative...
Happy hacking!