Let students use AI

I originally wrote this post for an essay contest at my school. It won the third-place prize and was published in a campus magazine under a slightly different title. I’ll admit that my discussion is rather parochial in its focus on Williams, but my main claims easily generalize to other institutions. As always, I invite comments, objections, and feedback.

Williams is due for a deep rethink on the role of artificial intelligence (hereafter AI) in college classes. To see why, let’s start by considering where we are now. Just in case you’ve somehow missed it, AI language models have gotten very good over the past few years. OpenAI’s ChatGPT has perhaps caused the largest public sensation, impressing users with its intelligent and uncannily human-like output, but it’s far from the only advanced large language model (LLM) available. Other models such as OpenAI’s GPT-4, Anthropic’s Claude, and Google’s PaLM 2 have all matched or exceeded ChatGPT’s performance, and new frontier LLMs continue to proliferate. Crucially, these models are already sophisticated enough to meaningfully assist humans on a wide range of intellectual tasks, including academic tasks such as editing, finding citations, and writing code.

Further, I can confidently predict that AI is only going to get better over the coming years, and not just incrementally. You can see this from the so-called scaling laws: highly consistent empirical relationships between the amount of data and compute used to train an LLM and a metric of performance known as the loss. These scaling laws imply that as AI labs run ever larger and longer training runs, they will produce increasingly powerful language models. Given the mountains of money labs are currently spending on compute, it’s a safe bet that the next few generations of LLMs will be very capable indeed. In fact, I believe that if current trends hold, we are rapidly heading toward a future in which almost no cutting-edge intellectual work will be done by humans unaided by AI. It is reasonable to predict, in other words, that most important research and writing will soon be done by human-AI centaurs rather than by humans alone.
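For readers curious about the math, these scaling laws take a strikingly simple form. One widely cited fit, from DeepMind’s 2022 Chinchilla paper, models the loss $L$ as a function of a model’s parameter count $N$ and its number of training tokens $D$:

$$
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
$$

Here $E$ is an irreducible loss floor, while $A$, $B$, $\alpha$, and $\beta$ are empirically fitted constants (the fitted exponents come out to roughly $0.3$). The upshot is that loss falls smoothly and predictably as models and datasets grow, which is exactly why bigger training runs keep yielding more capable models.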

Given all of these developments and trends, it would seem that the College’s academic policies need updating. It would be quite surprising, after all, if rules and guidelines originally adopted in the nineteenth century were still optimal now that all students have on-demand access to powerful thinking machines. Yet on an institutional level, Williams has barely begun to reckon with the ongoing LLM revolution. The College Dean formed an Ad Hoc Committee on Academic Integrity last spring to “facilitate dialogue about [the honor code’s] implementation” in the age of AI, but the Committee is not expected to issue any decisions before the end of this year. In the meantime, Williams has no uniform college-wide policy on AI, and professors are left to set policy on a class-by-class basis.

To their credit, some Williams professors are clearly paying attention and have taken it upon themselves to come up with nuanced approaches to AI. One English class due to run in the spring, for instance, will not only allow students to consult AI on some assignments, but will teach them to “write two of their essays in collaboration with chat AI.” This class is unusual, though. It’s much more typical for professors not to announce their AI policies at all. In three of my four classes last semester, the syllabi did not mention AI, and no guidelines on its use were ever given to the class. According to the chair of the Honor and Discipline Committee, most professors do not tell their students which uses of AI are permissible and which are impermissible, forcing the committee to make up policy as it goes along. Shockingly, even computer science—the department you might expect to be most on top of recent developments—says nothing about AI in its department honor code. And when professors do set explicit policies on AI, it is most common for them to ban it outright. My only professor from last semester who bothered to mention AI told us “it is a violation of the honor code to…use AI of any kind” full stop. Judging by what my classmates have told me, such blanket bans are fairly common at Williams.

I want to argue against both of these options: the non-policy of ignoring AI and the policy of banning it completely. The first has little to recommend it from any angle. Silence on AI might make sense if you believe LLMs are still too dumb to be worth banning, but as I’ve already noted, this is simply not the case. GPT-4 is more than smart enough to assist students on a wide range of academic tasks, and its successors are going to be even smarter. This gives students a clear incentive to use AI in their work, an incentive that they will follow unless professors set explicit restrictions on AI. It is unfair and arbitrary to punish students for crossing invisible boundaries when neither the College nor their professors have told them the rules ahead of time. In this sense, I think that any policy on AI would be better than no policy.

But that’s not to say that all policies would be equally good, and in particular, I think it would be a mistake for the College to ban students from using LLMs. I admit that there’s a certain perspective from which an AI ban would make sense. You might believe that the main purpose of higher education, or at least the purpose of Williams College, should be filtering, testing students’ intrinsic capacities and promoting those who are more capable over those who are less. If you buy this perspective, it’s not hard to see why you’d want to ban AI since work done in collaboration with AI reveals less about its author’s innate ability than work done without AI. Recent research suggests that giving workers access to an LLM improves their average productivity, but shrinks the gap in productivity between the best workers and the worst. If the same is true in education, allowing students to use LLMs would raise the average quality of their assignments while compressing the variance. AI would thus make students harder to filter.

However, I don’t believe that filtering ought to be the main purpose of a college education, and it certainly shouldn’t be the purpose of a Williams education. Rather, the College’s overriding goal should be teaching, giving its students new skills so that they can leave this place more capable than they were when they arrived. This is not just my eccentric opinion. Go and read the Williams College mission statement; you will see long, glowing tributes to the importance of teaching and learning and not a single word about filtering. The College’s purpose, according to the mission, is “to develop in students both the wisdom and skills they will need to become responsible contributors to whatever communities they join,” not to rank them as accurately as possible. And I wholeheartedly agree. Teaching should come first here, and filtering should be a distant second.

By banning AI, the College would fail to fulfill its teaching mission in two ways. First, it would not teach us to use AI effectively. Contrary to what’s sometimes thought, using an LLM well is not as easy as typing in the first query that pops into your head and letting the model do the rest. As experienced users can attest, naive prompts usually elicit mediocre answers, but you can learn to write better prompts through practice and by picking up techniques such as chain-of-thought prompting. An equally valuable skill is knowing what kinds of tasks particular models excel at. Some models are well suited to coding tasks, others do better on writing tasks, and there are some tasks that are beyond the abilities of all current LLMs. Knowing which bucket a given task falls into lets you choose the right tool for the job.
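To make the prompting point concrete, here is a minimal illustrative sketch of the difference between a naive prompt and a chain-of-thought prompt. The function names and prompt wording are my own invention for illustration, not any official API:

```python
def naive_prompt(question: str) -> str:
    """A bare question, typed in exactly as it popped into your head."""
    return question


def cot_prompt(question: str) -> str:
    """Wrap the question so the model is nudged to reason step by step
    before committing to an answer (chain-of-thought prompting)."""
    return (
        f"{question}\n\n"
        "Let's think step by step. Work through the problem carefully, "
        "then state your final answer on its own line."
    )


print(naive_prompt("What is 17 * 24?"))
print(cot_prompt("What is 17 * 24?"))
```

In practice, the wrapped version tends to elicit longer and more careful reasoning from a model, though how much it helps varies by model and task.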

Perhaps professors are not the right people to teach students prompt engineering. Very few Williams faculty members are experts on AI, and LLMs were invented so recently that most professors have little more experience using them than their students have. Even so, I think that an AI ban would get in the way of students teaching themselves to use LLMs more effectively. It would deny them the opportunity to practice working with models on their homework assignments, and it would also give them a sense that AI is ethically dodgy, which would discourage them from experimenting with it on their own personal projects. This would be a clear loss for students’ learning.

Second and just as importantly, an AI ban would prevent the College from teaching us to use AI ethically. AI is improving so quickly that many of the ethical quandaries it will raise have yet to be fully recognized, let alone resolved. Under these circumstances, it’s reasonable that many professors don’t want to wade into the ethical swamp and find it cleaner to simply ban AI. Still, I believe there is robust precedent for us to draw upon as we collectively hash out new rules for AI. For instance, if you are using an LLM as a research assistant, editor, or librarian, you should acknowledge it just as you would acknowledge the corresponding type of human assistant. If you are turning in language that was generated by an LLM, you must clearly say so, and you must flag the AI’s words the same way you would flag a human scholar’s words with quotation marks. Of course, my suggestions are sketchy, and I’ve hardly even begun to cover the full range of ethical issues raised by AI. But precisely because these issues are so numerous and so thorny, students need help navigating them. The College should offer us this help just as it helps us to handle more traditional issues of academic honesty.

Obviously the College must impose some restrictions on AI. As a recent Record op-ed emphasized, “AI should augment, rather than replace, human intelligence,” and this plausibly requires that AI be off-limits in certain situations. Even in upper-level courses, professors often give assignments that AI can easily do from start to finish: think calculation problems in a math course or translations in a language course. Allowing students to use AI on such assignments would be counterproductive to their learning, so professors can and should declare AI off-limits in these cases. The College also ought to impose some restrictions on AI to uphold academic honesty. If a professor or the College at large sets explicit citation rules for AI, and a student breaks them, that should be treated just like any other case of plagiarism.

Setting these cases aside though, Williams should permit students to use AI on most academic assignments. Further, the College should make this an official policy, so that if a professor doesn’t announce anything to the contrary, AI is permitted in their class by default. This policy would be much more transparent than the status quo, and on balance, it would promote students’ learning better than a flat ban. Like it or not, AI is here to stay, and the College has a responsibility to teach its students how to use it.

I asked Claude to critique a rough draft of this essay and used its responses to refine some of my arguments. However, all of the words you have just read were written by me.