28 Comments
Rob Nelson

I can tell you exactly how the Khanmigo Jefferson chatbot responds to questions about Sally Hemings, at least how it responded back in October when I wrote up my experience pretending to be a fifth grader looking to complete a homework assignment. It responds with an approximation of an encyclopedia entry. And if you ask it questions that border on the sexually explicit it refuses to answer.

As you point out, the first problem with using LLMs as a tool for history education is confabulation. The second, as you also point out, is that LLMs generate language that has emotional and moral impact, but unlike writers and editors of encyclopedias, they have no agency or ethical understanding.

The third reason is that, unlike watching Bill and Ted, talking to one is deadly boring.

Sadly, the first two reasons are not likely to stop people from using them as educational tools. The third reason might.

John Warner

Do you have a link to your writeup you could post in a comment? I think people who come across this would really benefit from seeing it. I can add a note to the main text linking to it as well.

One of the patterns that seems to keep repeating with these genAI applications is an initial sense of wonder/novelty that is relatively quickly replaced with boredom. Like the Sora video application or those instant song generators blow you away, but once the novelty wears off you're left asking "who cares?"

Even in Khan's book the paces he puts his own bot through are dull as dirt. The excitement for these applications seems to be built on a fantasy where we imagine a version much more capable than what's on offer.

Rob Nelson

Very kind of you to invite me to post a link! The original was posted in September on LinkedIn, but I appended it to a general piece on Khanmigo in December on Substack. Here is an internal link to the Substack post that should go straight to the Jefferson stuff. https://ailogblog.substack.com/i/139726012/what-happens-when-you-ask-khanmigos-thomas-jefferson-chatbot-about-sally-hemings

Rob Nelson

They are an excellent technology for producing short videos that tell stories about the future arriving. But getting people excited about a demo and getting people to pay a monthly subscription for an entertainment service are two different things.

John Warner

I wonder if your feelings about Khanmigo and its overall promise have evolved at all since the experience you wrote about in September. I'm a skeptic, as I made clear in my review of Khan's book (and here), so I'm interested in hearing from people who see more potential and what they're seeing.

Rob Nelson

My dial has been sliding toward the skeptical all year. I have always admired Khan Academy as a resource to help a kid and whoever the kid has helping with homework. Maybe not as highly as Jimmy Wales, but I rated Sal Khan pretty high on the list of heroes of the internet.

As I've grown more convinced that anthropomorphizing LLMs prevents us from understanding how they actually work, I've also realized how unhelpful it is in contexts where I'm trying to help my sixth-grader with homework. And hooking GPT-4o up to Khanmigo won't help. A video illustration is much better than an LLM.

I guess Khan was always part of Silicon Valley's deep commitment to decontextualized but technically personalized learning technology as the way to approach educational reform, but now he is the spokesmodel for the whole movement.

John Warner

It's interesting how thoroughly besotted they become with the concept of this infinitely patient tutor as a solution to the problem of education, rather than sticking with making good educational content available. From what I've seen Khanmigo's presentations of topics are considerably worse, from a content perspective, than the video explanations. The interactivity not only seems to add nothing, it makes it less useful.

Amy Pemberton

Ironically, my church had a Thomas Jefferson re-enactor once give a service. I remember someone in the congregation asking "Thomas Jefferson" about his views on women's rights (not high) and the re-enactor's response, including context (Thomas Jefferson was limited in his ability to see women outside the roles of mothers, wives, and caretakers). This was in front of a congregation of mostly adults who understood the game. Letting children in particular be exposed to a simulacrum of some historic figure in an educational setting seems highly irresponsible.

John Warner

I think this is inevitable once more people have specific exposure to the actual interface outside of the hype with which these products are introduced. Reading Khan's book, my primary response was "Is this it?"

One of the mistakes the boosters of this technology are making is forgetting that at some point people will actually use it, and when the experience fails to match the hype, disillusionment will arrive rather quickly.

Jason Gulya

I really appreciate this breakdown, John!

This use case has always felt ick (I know, a technical term) so I just never did it.

I get your point. This exercise creates a caricature of the person, but our students will see it as authoritative.

A lot to think about...

John Warner

Very glad to know it was helpful to you. With a lot of this stuff I feel like the novelty gets ahead of full consideration of how this stuff works.

Mike Kentz

You can put these chatbots in front of students without short-circuiting skepticism.

Have them interview the bot after they have received the content from you and then write an essay answering a prompt along these lines: "How effective was the Rembrandt bot at mimicking the Rembrandt you learned about from _____? How effective was the Rembrandt bot at furthering your understanding of the artist, his life, and his works?"

This forces the student to evaluate the bot against the historical texts that you provided. They will often/likely come to the conclusion that the bots are worthless, thereby delivering the lesson you hoped to deliver but letting them draw the conclusion themselves.

I had my students do this with a Holden Caulfield chatbot last September. Many of them concluded the bot was either ineffective at mimicking Holden or ineffective at furthering their understanding. To me, this is the best kind of critical thinking. Let them evaluate the bot themselves.

https://www.edsurge.com/news/2024-02-09-how-a-holden-caulfield-chatbot-helped-my-students-develop-ai-literacy

Later in the year I found out that the members of this class all now thought Character.ai was dumb. My other class - who did not do this project - was hooked to the platform. My only regret was that I didn't put the bot in front of them too.

John Warner

But why would I spend so much time having them interact with a bot? What is the end goal of this kind of learning experience? I don't intend to deliver any lesson about the bots because I don't think the bots are relevant or interesting when it comes to the experiences of learning.

There's definitely a use in familiarizing students with the tech and making them think critically about it in the way you describe, but if I do that once, mission accomplished. I'd be ready to move on to something genuinely more meaningful than bouncing my ideas and intellect off of something that cannot think, feel, or communicate with intention.

Why keep playing with the bots?

Mike Kentz

I see what you mean, but perhaps I should have told you about the students that didn't get it...

For some students you will see that their ability to critically review the bot-produced information and analyze it next to what you gave them simply is not there. They can't do it. They believe the bot when it says stupid stuff, or you see that they didn't actually understand Rembrandt at all. In those situations, it's a pretty good assessment technique for the content you delivered. Sometimes you can see right in the chat or in their review of the bot that they have no idea who Rembrandt is. In the age of ChatGPT, that's a better assessment than an at-home essay.

Beyond that, there are some kids who understood the content (Rembrandt) but didn't have the critical reviewing skills to cross-reference and/or compare it against the information they have on hand to draw the correct conclusions. They are kind of "on their way" but maybe at a B-level in terms of this particular skill. So, re-using it is a good way to establish the skill to engage in comparative analysis with their own domain expertise and to draw conclusions and communicate them clearly.

For the kids that "get it," you are right. There's not much else to say to them. But - that would be like saying, "This kid got a 100 on this essay, so I'm not gonna teach him essays anymore." AI Literacy is a writing skill. The content the bots produce is different every time. Meaning, those engagements are abstract experiences. They have to react in real-time, akin to writing a different type of essay on a different type of topic. So, while the student might be bored (they probably will be), it's still an effective reinforcement learning tool for them.

So - I am not saying to use it "all the time," but you'd be surprised what you can learn about a student from "grading their chats" and asking them to critically review and reflect on the effectiveness of a tool.

John Warner

I think that all sounds fine, but interacting with the bots isn't necessary to achieve any of those goals and we had the capacity to make those distinctions long before this technology arrived. Why do we now think these activities should be done by interacting with the bots? Isn't that a kind of failure of critical literacy by not questioning the employment of the technology in the first place, e.g., the moral injury of resurrecting the dead in the form of a bot?

I think where I probably disagree is that "AI literacy is a writing skill." This seems to presuppose that AI literacy is a necessity as part of a writing practice and I think we have no evidence of that, certainly not when it comes to learning to write.

I'm sure grading the chats reveals some insights, but are they demonstrably better than quality reflective writing or experiences? You've made the bot responsive to some of your pedagogical goals for sure, but I don't see anything unique to the bots that can't be (and hasn't been) accomplished without them.

I know some argue that we need to employ these things because the tech is "inevitable" but that's not a value I accept, personally, as I consider my own approaches.

Mike Kentz

All good points. I don't disagree with anything you've said.

A quote that comes to mind for me is, "We all want to save the world, we just disagree on how to do it."

It is definitely true that there are other ways to build the skills we've been talking about without using bots. But, there is only one way to build critical AI Literacy. Put the bots in front of students and ask them to evaluate them.

For me, that is the way to "save the world." I can keep them out of my classroom, sure, but I can't keep them out of my students' hands. So, I have a responsibility to teach this to them. If I can pair them with content understandings and evaluations, then I am really hitting the mark.

So, to be clear, I'm not using them because I think they are a great pedagogical tool. I'm using them because I feel like I have to. Sure, I can say "not in my classroom," but what good does that do my students? Based on your responses, I think that is the fundamental point on which we disagree (which is healthy and fine). Maybe I am beating a dead horse, but I think that is at the core.

I would also say that yes, in some cases, the chats that I graded were demonstrably better than the traditional reflective writing experiences I have employed. That may be a failure of my traditional writing assessments, but I did find (for some of my students) that the chats were stunning windows into their thinking processes that I never really had access to before as a teacher. (This is not limited to character bots, by the way.)

It's complicated. I think we agree that character bots are stupid and were not a good idea in the first place. We're just reacting to their presence differently. More than one way to skin a cat. I appreciate this chat, thank you for engaging in this way with me.

John Warner

I agree that classrooms can't be walled off from this stuff, and the ways and whys of how AI can or should be engaged are infinite, but I really worry about giving in to the "inevitability" narrative out of the gate as a frame for critically engaging with the technology. One of the avenues available to students (and teachers, etc...) must be rejection, otherwise we're not allowing critical engagement. If the signal we send is that interacting with bots is inevitable (because jobs, or whatever), I think that's an abrogation of the need to give students freedom and agency over their lives.

I'm pretty convinced that this tech is hastening an environmental apocalypse and as the AI companies are racing toward AGI as a savior, we're going to drain our groundwater and exhaust our energy while cooking the planet in the process. If these aren't also questions put in front of students, IMO, we're doing the bidding of tech companies that see a future where our lives require us to interact with bots.

Mike Kentz

Not a bad point. The environmental factor is a creeping issue that no one wants to pay attention to. Feels like it is lurking in the corner.

YoChatGPT!

Valid points. However, if there is a moderator (like the teacher) who verifies and checks the responses from the genAI historical figure, that is safer and more guided. On our platform YoChatGPT!, that is the goal: safe and guided use of genAI with teachers. Good for AI literacy courses, as well.

Clayton Ramsey

Absolutely. This is what well researched historical fiction is for. Or a classroom discussion with an expert. Humanizing the bots too much is perilous.

Stephen Fitzpatrick

I agree with this wholeheartedly. If you could actually replicate a real conversation with a historical figure, LLMs would not permit you to discuss anything interesting.

Bjorn Behrendt

While I think the author brings up good points about being cautious of AI, I think the approach of "do not use" is wrong. I did not see anything pertaining to student engagement in the article, which is why a teacher might use the new AI medium for sharing information. The article also doesn't acknowledge that AI contextual content will be this generation's primary means of information consumption. The author brings up good points, but where he says we should avoid using these resources, I think educational institutions need to teach students to understand those differences.

John Warner

There are a number of unevidenced assumptions in this comment that I'd like to unpack.

1. "I did not see anything pertaining to student engagement in the article, which is why a teacher might use the new AI medium for sharing information."

We have no evidence that this sort of activity enhances student "engagement." We also should be paying attention to what kind of engagement we wish to foster in students. Is surface-level chatting with a chatbot the same as reading and discussing a text? The engagement of chatting with a bot assumes a fairly low opinion of student curiosity and interest. I hope we can aim higher than this.

2. "The article also doesn't acknowledge that AI contextual content will be this generation's primary means of information consumption."

This inevitability argument is beneath what I would expect from a thinking person. Choosing to cede ourselves to a mode of interaction simply because it exists is to abandon our own humanity to the companies developing this technology. What are we signaling to students about the world and their lives if we tell them "this is inevitable"?

Most importantly, these bots based in historical figures are a moral abomination. To take a person who lived in the world and reduce them to a chatbot and tell students that this is an appropriate way to learn about and honor the life of this person is truly grotesque. I understand the enthusiasm for it among people who can make a buck on it, but we should maintain the power to reject it because it is offensive to us.

Joseph Micallef

How much of what you have been stating here is based on empirical studies where outcomes can be measured, may I ask? Or is this just an opinion piece?

John Warner

You can read, right? What do you think?

Joseph Micallef

It depends on how the bots - more properly called AI agents - are generated. I use ElevenLabs, and it lets me add reference documents and websites to minimize hallucinations, making them virtually non-existent. The bots are now basically as good as the reference documents offered.

John Warner

I think you're self-deluding if you think that you've created an LLM-interface where hallucinations are virtually non-existent, given that the developers of this technology themselves are now admitting that this is not a likely outcome.

The more important question, though, is why we would engage in this novelty when we have something far superior, the works of the unique intelligences of the historic persons as well as the writing and scholarship that has been developed by other unique intelligences concerning those historic personages.

What's wrong with students reading, thinking, conversing? This is at best a novelty, and at worst, a degradation of our humanity.