Making a Student Attendance Tracking ChatBot


I built a chatbot that can take attendance and answer questions about attendance history, using LangChain and OpenAI "Function Calling".

Run in CodeSandbox (opens in a new tab) or View Source Code (opens in a new tab)

Note: don't forget to set the "OPENAI_API_KEY" environment variable with your OpenAI API key.

In the school software I built, there is a feature that enables teachers to take attendance. The teacher can view a list of students in the class and mark them as present, absent, late, or excused each day. The attendance is then saved to the database. The teacher can also access the attendance history of each student. Since this is a web app, the teacher has to use a computer for this task. Not to mention, the teacher has to manually mark each student one by one and day by day.

Now, with the help of AI, I want to explore the possibilities of building a simple chatbot that can assist teachers with such tedious tasks. The teacher can text or talk to the chatbot to take attendance in natural language. Imagine the teacher can just pull out their smartphone, launch the chat app, and say, "Mark everyone as present today except for John Doe." The chatbot will then dutifully update the attendance in the database as instructed.

On the other hand, parents can also use the chatbot to communicate with the teacher about their children's attendance. For example, the parent can say, "John Doe is sick today," or "John Doe will be late today" to the chatbot. The chatbot will then notify the teacher about the message. After a bit of brainstorming, I came up with the following examples:

1. Taking attendance:

2. Teacher can ask questions about attendance history:

3. Parents can interact with the chatbot:

For this experiment, I'll use the LangChain library and OpenAI to build the chatbot.

Version 1 - import student table with CSV Loader

First of all, I need to load a CSV file of 124 sample student data (opens in a new tab) into LangChain and ask questions about it. The CSV file contains the following columns:

LangChain libraries comes with a CSV Loader so I'll give this a try.

import { CSVLoader } from "langchain/document_loaders/fs/csv";

After loading the student data, I asked a simple question: "List all the female students in homeroom 7". However, I found that the result is not correct:

Received answer:


The expected result should have 9 students:


Why is the answer not correct? After a little research, it seems that the CSV Loader approach has some disadvantages:

  1. While the current LLM model can easily read and parse a huge corpus of natural language content, it is not efficient at processing relational data and the table structure.

  2. It cannot handle complex queries as efficiently and accurately as SQL queries.

  3. The limitation of token length will probably cause the model to miss some important information in the long run, making it unfeasible to scale up.

I need to find another way to let the chatbot understand the student data better and answer the questions correctly.

Run version 1 in CodeSandbox:

Run in CodeSandbox (opens in a new tab)

Version 2 - SQL Chain

So, the next approach is to use SQL Chain. SQL Chain is a library that allows you to query data in a relational database using natural language. You can think of it as an English-to-SQL translator. There is a risk of running raw SQL and messing up the database, but I'll give it a try just for learning.

The idea is to first ask the LLM to generate an SQL query from my natural language question, then run the SQL query in the database, and finally return the result to the user.

Run version 2 in CodeSandbox:

Run in CodeSandbox (opens in a new tab)

I found that although the idea of directly manipulating the database with generated SQL is very cool, the queries it generates are pretty inconsistent, and at worst, it would not work because of the data conflict. For example, I asked the script to "mark all students in homeroom 7 as present today," but it generated an INSERT query that can only be run once. If I run the query again, it will throw an error because the student data already exists in the database.

Giving up on the idea of directly manipulating the database, I decided to use Function Callings to update the attendance data instead. This way, I can protect the data the way I want and yet expose the necessary operations to the chatbot to do its job.

Version 3 - Function Calling

In this version, I'll use OpenAI Function Calling feature to update the attendance data. The idea is to first ask the LLM to generate function calls from my natural language question, then run the returned function and arguments, and finally return the result to the user.

Run version 3 in CodeSandbox:

Run in CodeSandbox (opens in a new tab)

The result is pretty close to what I imagined. I was able to interact with the chatbot with natural language questions. Example:

The following is the code history:

Loading students into the database...
✨ Hello Mr. Smith! The default homeroom is 1, and today's date is 2023-12-28. How can I assist you today?
> hello
✨ Hello Mr. Smith! How can I assist you today?
> show me the attendance
10001   Mia Chen        1       2023-12-01      1
10008   Aiden Kumar     1       2023-12-01      1
10015   Mila Lee        1       2023-12-01      1
10022   Ezra Wang       1       2023-12-01      0       fever and cough
> show me the attendance today
> what's the date of today
✨ Today's date is 2023-12-28.
> mark everyone as present today
Marked all students in homeroom 1 as present on 2023-12-28
10001   Mia Chen        1       2023-12-28      1
10008   Aiden Kumar     1       2023-12-28      1
10015   Mila Lee        1       2023-12-28      1
10022   Ezra Wang       1       2023-12-28      1
10029   Aaliyah Lopez   1       2023-12-28      1
10036   Liam Rao        1       2023-12-28      1
10043   Lily Wu 1       2023-12-28      1
10050   Kai Singh       1       2023-12-28      1
10057   Avery Lin       1       2023-12-28      1
10064   Mason Lam       1       2023-12-28      1
10071   Harper Gupta    1       2023-12-28      1
10078   William Kim     1       2023-12-28      1
10085   Ella Morales    1       2023-12-28      1
10092   Lucas Kumar     1       2023-12-28      1
10099   Ava Singh       1       2023-12-28      1
10106   Liam Hernandez  1       2023-12-28      1
10113   Lily Garcia     1       2023-12-28      1
10120   Harper Lam      1       2023-12-28      1
> who was absent this month?
10022   Ezra Wang       1       2023-12-01      0       fever and cough
> Lily was sick 2 days ago
Found multiple students with name Lily. Please be more specific.
> Lily Wu was sick 2 days ago
Set attendance for student 10043 on 2023-12-26 to false with reason sick
10043   Lily Wu 1       2023-12-26      0       sick
> who was absent this month?
10022   Ezra Wang       1       2023-12-01      0       fever and cough
10043   Lily Wu 1       2023-12-26      0       sick

There are still some issues to be resolved:

  1. The current version is not able to memorize the history of the conversation. Currently, I am just hardcoding the context into the init prompt and sending the same init prompt to the LLM every time. I need to find a way to let the chatbot remember the context of the conversation so that it's possible to ask follow-up questions.

  2. I cannot figure out how to get the chatbot to call multiple functions in one conversation. For example, I want to ask the chatbot to "mark everyone as present today except for John Doe," but it can only process the first request and simply ignore the rest.

  3. There are some discussions about whether to use OpenAI Function Callings or simply write a comprehensive prompt to teach the chatbot how to respond in the desired JSON format. Function Calling yields more consistent results, but it's not as flexible as engineering your own prompt.

  4. User interface - it would be cool to integrate the chatbot into common instant messaging apps like WhatsApp, Slack, or Telegram.


Follow-up: I have written new versions to address the issues above. Check out the following links: