# ChatMate - Your AI-Powered Chatbot
## Overview
ChatMate is a Next.js application that provides a chatbot interface with local conversation storage, API integration for backend queries, conversation history in a sidebar, and user authentication. It leverages Genkit for AI-powered features such as conversation summarization and text-to-speech.
## Features
- User Authentication: Secure user authentication and authorization.
- Conversation History: Easy access to past conversations in a collapsible sidebar.
- Local Storage: Conversations are stored locally on the user's device.
- API Integration: Queries are sent to the backend API for processing.
- AI-Powered Summarization: Summarizes the last 10 messages of the current conversation using an AI-powered tool, displayed at the top of the chat window.
- Text-to-Speech (TTS): Converts AI responses to speech for accessibility.
- Voice Input: Allows users to transcribe audio to text for input.
- Theme Toggle: Option to switch between light and dark modes.
- Microphone and Speaker Audio Testing: Options to test and configure the microphone and speakers.
## Technologies Used
- Next.js - React framework for building performant web applications
- TypeScript - Typed superset of JavaScript
- Tailwind CSS - Utility-first CSS framework
- ShadCN UI - Reusable UI components built with Radix UI and Tailwind CSS
- Lucide React - Icons
- Genkit - Platform for building reliable, scalable, and observable AI-powered features
- Zod - TypeScript-first schema validation with static type inference
- Firebase - Backend services for authentication and data storage
- NextAuth.js - Authentication library for Next.js
## System Diagram
This diagram illustrates the overall architecture and data flow of the ChatMate application, focusing on the interactions within page.tsx and the AI flows.
```mermaid
graph TD
    subgraph "Browser (Client-Side)"
        U[User] --> UI["React UI (src/app/page.tsx)"]
        UI -- Interacts with --> StateHooks["React State Hooks (useState, useEffect, etc.)"]
        UI -- Renders --> Comp["Shadcn UI Components (@/components/ui/*)"]
        subgraph "User Input/Output"
            InputArea[Text Input Area] --> |"onKeyDown, onChange"| StateHooks
            MicBtn[Mic Button] --> |onClick| Record["Recording Logic (start/stopRecording)"]
            TTSBtn[TTS Button/Select] --> |"onClick, onValueChange"| Speak["Speak Logic (speakText)"]
        end
        subgraph "Storage"
            StateHooks -- Persists/Loads --> LS["Browser localStorage (messages, history, names, summary)"]
        end
        subgraph "Media Handling"
            Record --> |getUserMedia| BrowserAPI[Browser MediaRecorder API]
            BrowserAPI --> |Audio Blob| Record
            Record --> |createObjectURL| AudioDataURL[Client-Side Audio URL]
            Speak --> |"new Audio(ServerURL/DataURL)"| AudioPlayer[HTML Audio Element]
        end
        UI -- Triggers --> SendMsg[sendMessage Logic]
        UI -- Triggers --> NewChat[createNewConversation Logic]
        UI -- Triggers --> SelectChat[selectConversation Logic]
        UI -- Triggers --> RenameChat[handleRename Logic]
        UI -- Triggers --> DeleteChat[deleteConversation Logic]
        UI -- Triggers --> ReloadChat[handleReload Logic]
    end
    subgraph "Server-Side (Next.js API / Genkit Flows)"
        AIFlows["AI Flows (@/ai/flows/*)"]
        Genkit["Genkit AI Instance (@/ai/ai-instance.ts)"]
        subgraph "Specific Flows"
            AnswerFlow[answerMessage]
            SummarizeFlow[summarizeConversation]
            NameFlow[nameConversation]
            TTSFlow[textToSpeech]
            TranscribeFlow[transcribeAudio]
        end
        AIFlows -- Uses --> Genkit
        Genkit -- Interacts with --> ModelAPI["AI Model API (e.g., Gemini)"]
    end
    subgraph "Data Flow"
        SendMsg --> |"User Input, History, Summary"| AnswerFlow
        AnswerFlow --> |AI Response| SendMsg
        SendMsg --> |Update State| StateHooks
        SendMsg --> |Potentially Trigger| NameFlow
        NameFlow --> |AI Name| SendMsg
        StateHooks -- Triggers Summary Update --> SumLogic[updateSummary Logic]
        SumLogic --> |Messages| SummarizeFlow
        SummarizeFlow --> |Summary| SumLogic
        SumLogic --> |Update State| StateHooks
        Speak --> |"Text, Language"| TTSFlow
        TTSFlow --> |Audio URL or Data URL| Speak
        Speak --> AudioPlayer
        Record --> |Audio Data URL| TranscribeFlow
        TranscribeFlow --> |Transcription Text| Record
        Record --> |Update Input State| StateHooks
        StateHooks --> LS
    end
    %% Styling
    classDef client fill:#f9f,stroke:#333,stroke-width:2px;
    classDef server fill:#ccf,stroke:#333,stroke-width:2px;
    classDef storage fill:lightgrey,stroke:#333,stroke-width:1px;
    classDef state fill:#D6EAF8,stroke:#333,stroke-width:1px;
    classDef logic fill:#ABEBC6,stroke:#333,stroke-width:1px;
    classDef flows fill:#FDEDEC,stroke:#333,stroke-width:1px;
    class UI,StateHooks,Comp,InputArea,MicBtn,TTSBtn,Record,Speak,AudioPlayer,AudioDataURL,BrowserAPI client;
    class LS storage;
    class SendMsg,NewChat,SelectChat,RenameChat,DeleteChat,ReloadChat,SumLogic logic;
    class AIFlows,Genkit,AnswerFlow,SummarizeFlow,NameFlow,TTSFlow,TranscribeFlow,ModelAPI server;
    class AnswerFlow,SummarizeFlow,NameFlow,TTSFlow,TranscribeFlow flows;
```
**Client-Side (Browser):**

- **User Interaction:** The user interacts with the React UI rendered by `src/app/page.tsx`.
- **UI Components:** The UI is built using Shadcn UI components.
- **State Management:** User actions (typing, clicking buttons) update component state managed by React state hooks (`useState`, `useEffect`, etc.).
- **Logic Handlers:** Functions such as `sendMessage`, `speakText`, `startRecording`, and `createNewConversation` are triggered by UI interactions.
- **Storage:** Key application data (conversation messages, history list, conversation names, summaries) is persisted in the browser's `localStorage` to retain state across sessions.
- **Media Handling:** Recording uses the browser's MediaRecorder API to capture audio, creating a client-side Blob and audio URL. TTS playback uses the standard HTML audio element, playing audio from a URL provided by the backend (or potentially a data URL).

**Server-Side (Next.js API / Genkit Flows):**

- **Genkit AI Instance:** Acts as the core interface to the underlying AI model (e.g., Gemini).
- **AI Flows:** Specific tasks are encapsulated in server-side functions defined with Genkit (`answerMessage`, `summarizeConversation`, `nameConversation`, `textToSpeech`, `transcribeAudio`). These flows interact with the Genkit AI instance.
- **AI Model API:** Genkit communicates with the configured large language model API to perform text generation, summarization, naming, TTS, and transcription.

**Data Flow:**

- **Chatting:** User input triggers `sendMessage`, which calls the `answerMessage` flow. The response updates the state, which re-renders the UI. Messages are saved to `localStorage`. Summarization (`updateSummary` -> `summarizeConversation`) and naming (`sendMessage` -> `nameConversation`) flows are triggered based on conversation state.
- **TTS:** Clicking the speak button triggers `speakText`, which calls the `textToSpeech` flow. The returned audio URL is used to create and play an Audio object.
- **Transcription:** Clicking the mic button triggers `startRecording`. When recording stops, the audio data (as a client-side URL) is sent to the `transcribeAudio` flow. The returned text updates the input state.
- **Persistence:** State changes related to conversations (messages, names, history, summary) trigger updates to `localStorage`. Data is loaded from `localStorage` on initial load and when switching conversations.

This architecture uses client-side state management and `localStorage` for the core chat interface and history, while leveraging server-side Genkit flows for all AI-powered operations.
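The trigger conditions in the data flow above can be sketched as small pure helpers. The names, the naming condition, and the `localStorage` key scheme below are illustrative assumptions, not lifted from `page.tsx`:

```typescript
// Illustrative sketch of when the AI flows fire; the real page.tsx
// conditions may differ.

// nameConversation runs only for a brand-new conversation: the first
// message has just been sent and no AI-generated name exists yet.
function shouldNameConversation(messageCount: number, currentName: string | null): boolean {
  return messageCount === 1 && currentName === null;
}

// localStorage keys partition per-conversation data; this key scheme
// is a hypothetical example.
function storageKey(conversationId: string, field: "messages" | "name" | "summary"): string {
  return `chatmate:${conversationId}:${field}`;
}

console.log(shouldNameConversation(1, null)); // true
console.log(storageKey("abc", "summary")); // "chatmate:abc:summary"
```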
## Setup Instructions
Follow these steps to set up the ChatMate application:
### Prerequisites
- Node.js (version 18 or higher)
- npm or yarn
- Genkit CLI:

  ```bash
  npm install -g genkit-cli
  # or
  yarn global add genkit-cli
  ```
### Installation
1. Clone the repository:

   ```bash
   git clone [your_repository_url]
   cd ChatMate
   ```

2. Install dependencies:

   ```bash
   npm install
   # or
   yarn install
   ```

3. Environment Variables: Create a `.env` file in the root directory and add the following environment variables:

   ```bash
   GOOGLE_GENAI_API_KEY=YOUR_GEMINI_API_KEY
   GOOGLE_CLOUD_PROJECT=YOUR_GOOGLE_CLOUD_PROJECT
   GOOGLE_CLOUD_LOCATION=YOUR_GOOGLE_CLOUD_LOCATION
   ```

   - `GOOGLE_GENAI_API_KEY`: Obtain an API key from Google AI Studio.
   - `GOOGLE_CLOUD_PROJECT`: Your Google Cloud Project ID.
   - `GOOGLE_CLOUD_LOCATION`: The location for your Google Cloud project (e.g., `us-central1`).

4. Configure Firebase:

   - Set up a Firebase project on the Firebase Console.
   - Enable Authentication (e.g., Google Sign-In).
   - Add the Firebase configuration to your project.

5. Run the application:

   ```bash
   npm run dev
   # or
   yarn dev
   ```

   Open your browser and navigate to http://localhost:9002.
## Package.json Scripts

Here's a breakdown of the scripts in `package.json`:
-
"dev": Runs the Next.js development server with Turbopack on port 9002."dev": "next dev --turbopack -p 9002" -
"genkit:dev": Starts the Genkit development server, executingsrc/ai/dev.ts."genkit:dev": "genkit start -- tsx src/ai/dev.ts" -
"genkit:watch": Starts the Genkit development server with live reloading on changes tosrc/ai/dev.ts."genkit:watch": "genkit start -- tsx --watch src/ai/dev.ts" -
"build": Builds the Next.js application for production."build": "next build" -
"start": Starts the Next.js production server."start": "next start" -
"lint": Runs the ESLint linter for code quality checks."lint": "next lint" -
"typecheck": Runs the TypeScript compiler for type checking."typecheck": "tsc --noEmit"
## AI Flows

The AI-powered features in ChatMate are implemented using Genkit flows. These flows are defined in the `src/ai/flows` directory and registered in `src/ai/dev.ts`.
### 1. Answering User Messages (`src/ai/flows/answer-message.ts`)
- Purpose: Generates a response to the user's message based on conversation history and a summary.
- Input:
  - `message`: The user's message (string).
  - `conversationHistory`: The history of the conversation (array of `{ sender: string; text: string }`).
  - `summary`: The summary of the conversation (optional string).
- Output:
  - `response`: The AI's response (string).
- Prompt: Instructs the AI to act as a customer care agent for Luminous Power Technologies, using the conversation history and summary to provide relevant information.
- Usage: The `answerMessage` function is called with the user's input, conversation history, and summary to generate a response.
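The input and output contracts described above can be written out as plain TypeScript types. The real flow defines these with Zod schemas; the `buildAnswerInput` helper below is a hypothetical sketch of how `sendMessage` might assemble the payload:

```typescript
// Shapes inferred from the Input/Output description above.
interface AnswerMessageInput {
  message: string;
  conversationHistory: { sender: string; text: string }[];
  summary?: string;
}

interface AnswerMessageOutput {
  response: string;
}

// Hypothetical helper: assemble the payload that sendMessage would
// pass to the answerMessage flow, omitting the summary when absent.
function buildAnswerInput(
  message: string,
  history: { sender: string; text: string }[],
  summary?: string
): AnswerMessageInput {
  return summary
    ? { message, conversationHistory: history, summary }
    : { message, conversationHistory: history };
}
```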
### 2. Naming Conversations (`src/ai/flows/name-conversation.ts`)
- Purpose: Generates a descriptive name for a conversation based on the first message.
- Input:
  - `firstMessage`: The first message of the conversation (string).
- Output:
  - `conversationName`: The generated name for the conversation (string).
- Prompt: Creates a short, descriptive name based on the first message, aiming for conciseness and relevance.
- Usage: The `nameConversation` function is called with the first message of a new conversation to generate an appropriate name.
### 3. Summarizing Conversations (`src/ai/flows/summarize-conversation.ts`)
- Purpose: Summarizes the last 10 messages of a conversation using AI.
- Input:
  - `messages`: The last 10 messages of the conversation (array of `{ sender: string; text: string }`).
- Output:
  - `summary`: The summary of the conversation (string).
- Prompt: Summarizes the given conversation in a few sentences.
- Usage: The `summarizeConversation` function is called with the last 10 messages of the conversation to generate a summary.
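A minimal sketch of the client-side preparation, assuming the flow receives exactly the trailing window of 10 messages (the helper name is illustrative):

```typescript
interface ChatMessage {
  sender: string;
  text: string;
}

// Take the most recent 10 messages for summarization.
// slice(-10) returns the whole array when fewer than 10 messages exist,
// so short conversations are passed through unchanged.
function prepareSummarizeInput(all: ChatMessage[]): { messages: ChatMessage[] } {
  return { messages: all.slice(-10) };
}
```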
### 4. Text-to-Speech (`src/ai/flows/text-to-speech.ts`)
- Purpose: Converts text to speech using a Genkit flow.
- Input:
  - `text`: The text to convert to speech (string).
  - `language`: The language of the generated speech (optional string, defaults to English).
- Output:
  - `audioUrl`: The URL of the generated audio file (string).
- Prompt: Includes the input text and language to instruct the AI to convert the text to speech.
- Usage: The `textToSpeech` function is called with the text and language to generate an audio URL.
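As a hedged sketch, the optional-language behavior and the playback step could look like this on the client (the helper name is hypothetical; only the English default comes from the flow description above):

```typescript
// The flow's language input defaults to English when omitted.
function resolveTtsLanguage(language?: string): string {
  return language ?? "English";
}

// In the browser, speakText would then play the returned URL, e.g.:
//   const { audioUrl } = await textToSpeech({ text, language: resolveTtsLanguage(lang) });
//   new Audio(audioUrl).play();
console.log(resolveTtsLanguage()); // "English"
```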
### 5. Transcribing Audio (`src/ai/flows/transcribe-audio.ts`)
- Purpose: Converts audio to text using a Genkit flow.
- Input:
  - `audioUrl`: The URL of the audio file to transcribe (string).
- Output:
  - `transcription`: The transcribed text from the audio (string).
- Prompt: Transcribes the audio from the given URL.
- Usage: The `transcribeAudio` function is called with the audio URL to generate a transcription.
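The recording logic sends the captured audio as a data URL. In the browser this is typically done with `FileReader.readAsDataURL` on the recorded Blob; the equivalent encoding is sketched here with Node's `Buffer` for illustration (the helper name and MIME type are assumptions):

```typescript
// Encode raw audio bytes as a data URL suitable for transcribeAudio.
function toAudioDataUrl(bytes: Uint8Array, mimeType = "audio/webm"): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

console.log(toAudioDataUrl(new Uint8Array([1, 2, 3]))); // "data:audio/webm;base64,AQID"
```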
## Page.tsx Functionalities

The `src/app/page.tsx` file implements the main chatbot interface. Here's a breakdown of its functionalities:
- State Management: Uses `useState` to manage messages, input, summaries, conversation history, and other UI-related state.
- Conversation Handling: Implements logic for creating, deleting, selecting, and renaming conversations.
- Message Sending: Sends user messages to the server-side Genkit flows for processing and receives AI responses.
- Local Storage: Stores conversation data (messages, summaries, and conversation names) in `localStorage` to persist conversations across sessions.
- AI Integration: Integrates with Genkit flows to summarize conversations, generate conversation names, and answer user messages.
- Text-to-Speech: Implements text-to-speech using the `textToSpeech` flow.
- Voice Input: Implements voice input using the `transcribeAudio` flow.
- UI Rendering: Renders the chatbot interface with messages, input area, conversation history, and other UI elements.
- Audio Configuration: Provides options to configure audio settings for the microphone and speakers.
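The `localStorage` persistence described above can be sketched as a pair of helpers. The key scheme and function names are illustrative; the storage interface is injected so the sketch also runs outside a browser:

```typescript
interface StoredMessage {
  sender: string;
  text: string;
}

// Minimal subset of the browser Storage interface, so any key-value
// store (including the real window.localStorage) can be passed in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Serialize a conversation's messages under a per-conversation key.
function saveConversation(store: KeyValueStore, id: string, messages: StoredMessage[]): void {
  store.setItem(`conversation:${id}`, JSON.stringify(messages));
}

// Load a conversation's messages, falling back to an empty list.
function loadConversation(store: KeyValueStore, id: string): StoredMessage[] {
  const raw = store.getItem(`conversation:${id}`);
  return raw ? (JSON.parse(raw) as StoredMessage[]) : [];
}
```

In the browser, `window.localStorage` satisfies `KeyValueStore` directly, so the same helpers work unchanged in `page.tsx`-style code.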
## Further Development
- Implement user authentication and authorization.
- Enhance the UI with more styling and customization options.
- Add more AI-powered features, such as sentiment analysis and topic detection.
- Integrate with other backend services and APIs.
- Improve error handling and user feedback.
- Implement testing and monitoring.