"AI", and the trouble with inaccessible SaaS
I wasn't particularly enthused by the alpha release of Vercel's v0 UI generation tool this week. Given a prompt, it produces React code styled with Tailwind. It's being marketed as a prototyping tool, yet Vercel's CEO described it on Twitter as "production-grade". Well, which is it?
To be clear, my issue isn't with the concept of large language models (LLMs) in general: I use Copilot in my everyday work, and it's saved me a lot of time when writing tests! My concern is with the source material used to train it, and what people do with the output.
Hidde de Vries wrote a great post on the incredibly problematic dev community response after the accessibility of the UI was called out, so I won't go into that here.
Stop putting inaccessible prototypes into production
Prototyping is one thing. Noodling around with UI ideas? Fantastic. Please don't put it into production.
A pattern I've noticed with a huge number of tech/SaaS startups is:
- have idea
- build very quick, inaccessible UI "prototype" to test idea
- ship prototype to production
- prototype becomes the actual product
Inevitably, people copy existing patterns in the codebase, more features pile on top, and time passes. At that point you have a completely inaccessible SaaS product, and the work needed to go back and fix everything is so large, and so costly, that it's never going to be a "business priority". So many of the B2B SaaS tools I've used are terribly inaccessible; companies seem to forget that staff, as well as customers, may have access needs, and that those needs may extend beyond being blind or partially sighted.
In fact, a 2020 study by RNIB (docx) found that "50% of employers thought that there may be additional health and safety risks in the workplace [employing someone blind or partially sighted]; 33% of employers thought that they may not be able to operate a computer/laptop; 33% of employers thought that they may not be able to operate the necessary equipment, excluding computers/laptops". Isn't that just terribly fucking depressing?
I truly wish the founders of SaaS companies worldwide would learn some proper semantic HTML before they build these UIs that are going to form the basis of their entire product for years to come. I'd love to see SaaS accessibility viewed as a competitive advantage rather than an expensive afterthought.
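To make that concrete, here's a minimal sketch of the "clickable div" pattern I keep seeing, next to the semantic equivalent (the class name and saveDraft() handler are made up for illustration):

```html
<!-- Div soup: no keyboard focus, no role announced to screen readers,
     doesn't respond to Enter/Space. (saveDraft() is a hypothetical handler.) -->
<div class="btn" onclick="saveDraft()">Save draft</div>

<!-- Semantic: focusable, announced as a button, activated by keyboard -->
<button type="button" onclick="saveDraft()">Save draft</button>

<!-- Navigation belongs in a link, not a click handler -->
<a href="/settings">Settings</a>
```

None of this needs ARIA or JavaScript gymnastics: the native elements do the work for free.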
As Ashlee Boyer puts it:
The right time to work on accessibility is before you launch a product into alpha.
Before v0 was announced, I was planning to write a blog post about the inaccessibility of SaaS software and how it starts with prototypes making it into production – so I guess this is that post. The problem has existed for years, long before generative LLMs came along; LLMs are just making it worse, and faster.
People write inaccessible code, and LLMs copy people
Tools like v0 are destined to become the new "copying and pasting from open-source UI libraries". In the wider community, ChatGPT has already overtaken Stack Overflow as the first port of call for coding questions. Developers shouldn't unreservedly trust the output of code-generating LLMs, though. I double-triple-check everything Copilot produces for me and end up tweaking it about half the time, and I rarely let it write whole blocks of code unless I'm churning out repetitive tests. It's great for autocompleting variables, repeating things I've already written, and finishing lines, but I've found it has a tendency to make up functions that don't exist.
Likewise, ChatGPT and v0 may produce decent enough code, or they might give you <div>s that should be links. At least v0 seems to be adding labels to form inputs, which is more than I can say for a lot of developers on the web. But that's kind of the issue: people don't write accessible code, so why should we trust algorithms trained on other people's inaccessible code to write UI code for us? Vercel say it was trained on code written by their team, but considering the UI for v0 wasn't accessible itself, I don't have a huge amount of confidence that the training material was either.
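To show what that label point actually buys you, here's a minimal sketch (the field name is invented):

```html
<!-- A placeholder is not a label: it disappears as soon as you type,
     and assistive tech support for it is inconsistent -->
<input type="email" placeholder="Email address" />

<!-- An explicitly associated label: clicking the label focuses the
     input, and screen readers announce "Email address" for the field -->
<label for="email">Email address</label>
<input id="email" type="email" autocomplete="email" />
```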
This isn't purely a Vercel problem, of course; it's an everyone problem. But we as developers can make it better by learning how to write semantic HTML and proper CSS, and then training the models on that.
I do believe that LLM-backed tools can speed up the development process and make us more productive – but not if developers indiscriminately paste whatever they generate into their codebases and trust that it's all fine. Our users deserve a little more diligence than that.