Relying on AI to modernize a critical system like Ubuntu's Error Tracker carries real risk: some of the code it produces can be outright wrong. That is what happened when Microsoft's GitHub Copilot was tasked with updating the Error Tracker's Cassandra database code to meet modern standards. While the AI showed promise in some areas, a portion of the generated code, even for seemingly straightforward tasks, was deemed “plain wrong” by the developer overseeing the project.
Last week I covered how AI was being leveraged to breathe new life into Ubuntu's Error Tracker (https://www.phoronix.com/news/AI-Ubuntu-Error-Tracker-Improve). The idea was to use GitHub Copilot to streamline the modernization effort, making it easier to adopt modern coding practices and eliminate outdated code. Many Phoronix readers welcomed it as a sensible use of AI, and in many ways it is, but the follow-up makes clear that AI is no magic bullet.
Canonical engineer Skia recently provided an update on this AI-driven modernization effort in the Ubuntu Foundations Team’s weekly notes (https://discourse.ubuntu.com/t/foundations-team-updates-2025-12-04/73104/6). Skia noted, “We’re now reviewing and testing Copilot’s output. It’s not a total disaster, but it’s also not plug-and-play. For instance, it lacks access to a real database, and I didn’t include the schema in my prompt. Some functions were outright incorrect, though thankfully, those were the minority. You can see the details in my latest pull request.”
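Skia's point about the missing schema is the crux: without the real table definitions in its context, a model has to guess column names and types, and those guesses can silently produce invalid queries. As a rough illustration (the schema, table, and query below are hypothetical, not from the Error Tracker), even a simple check of a generated CQL statement against the actual schema catches this class of mistake before review:

```python
import re

# Hypothetical schema for illustration -- NOT the real Error Tracker schema.
SCHEMA = {
    "crash_reports": {"crash_id", "package", "version", "stacktrace", "reported_at"},
}

def referenced_columns(cql: str) -> tuple[str, set[str]]:
    """Extract the table name and selected columns from a simple CQL SELECT."""
    m = re.match(r"SELECT\s+(.+?)\s+FROM\s+(\w+)", cql, re.IGNORECASE)
    if not m:
        raise ValueError(f"unsupported statement: {cql!r}")
    columns = {c.strip() for c in m.group(1).split(",")}
    return m.group(2), columns

def unknown_columns(cql: str, schema: dict[str, set[str]]) -> set[str]:
    """Return columns the query references that the schema does not define."""
    table, columns = referenced_columns(cql)
    return columns - schema.get(table, set())

# The kind of query a model might emit without seeing the schema:
# "pkg_name" does not exist here; the actual column is "package".
generated = "SELECT crash_id, pkg_name FROM crash_reports"
print(unknown_columns(generated, SCHEMA))  # -> {'pkg_name'}
```

In practice you would pull the live schema from the cluster metadata rather than hard-code it, but the principle is the same: generated database code needs to be validated against the real schema, exactly the context Copilot did not have in this experiment.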
While the AI-generated code wasn't entirely unusable, it still required significant human intervention to correct errors and verify functionality. On the bright side, the experiment does appear to have saved some development time, even if the results weren't flawless. Those curious about the details, the AI's mistakes and the corrections alike, can explore the GitHub pull request (https://github.com/ubuntu/error-tracker/pull/4) for a revealing look at the current capabilities and limitations of AI in software development.
The broader question remains: if AI can produce “plain wrong” code even for relatively simple tasks, how far can it be trusted with more complex systems? For now at least, this experiment suggests that human oversight is still irreplaceable. Feel free to share your thoughts in the forums.