Dr Imogen Wright 18/02/2022
There are few feelings more distressing to a software developer than receiving news of a totally unknown error experienced by an important customer. At Hyrax Biosciences, we’re always trying to get just a little bit better at what we do: to improve our software, tests, review processes and deployment frameworks. We want our users to be able to forget about their NGS data analysis, and concentrate on what they do best: generating data. Trust is our most important asset, and when bioinformatics software breaks, trust breaks too.
However, we also operate at the coal-face of a fast-evolving pandemic, processing vast quantities of complex NGS data every week. At any kind of scale there’s no such thing as an edge case, and black swan events become daily problems. Relatedly, the cry of scientific progress is not “Eureka!” but “That’s funny…”, and novel data exercises code in novel ways.
I was on holiday in Rome when the South African National Institute for Communicable Diseases (NICD) first contacted us in November 2021, with an extremely urgent version of “That’s funny”. They reported that the Exatype SARS-CoV-2 bioinformatics software was returning an error state for a critical sample analysis, with no indication of what had gone wrong. The urgency from the customer’s side seemed even greater than usual. Pathology labs across the world have been under enormous pressure throughout the pandemic to turn data around in record time, but there was a particular tone to this conversation that felt different.
We set to work immediately - we receive notifications every time a customer experiences a failure, and it’s always “all hands on deck” until the problem is solved. This time, though, we had to scratch our heads a little more than usual, because the problem wasn’t a bug introduced in an update. Instead, we determined that the samples the NICD were running through Exatype SARS-CoV-2 were crashing a piece of bioinformatics software called Nextclade, used to determine the variant of a SARS-CoV-2 sample. Nextclade was not simply returning “Unknown”: whatever data was being passed to it was so foreign that it exercised an untested code pathway, leading to a crash.
We were aware that COVID-19 cases were spiking across South Africa. We weren’t aware at the time, however, that a pathologist in Johannesburg had already noticed that many SARS-CoV-2 samples were beginning to exhibit a phenomenon known as “S-gene dropout”. Essentially, these viruses’ spike proteins were too mutated to be detected by conventional SARS-CoV-2 testing. This alarming behaviour, combined with the sudden increase in cases, had led to the emergency samples the NICD were desperate to receive results for.
We realised only much later that both effects were caused by the same biological reality: a new variant of SARS-CoV-2 had emerged, with an astonishing total of 37 mutations in its spike protein. This was beyond anything either PCR testing or variant typing bioinformatics software was designed to work with. It’s hard to write code beyond the possibilities we can see in front of us, and such a highly mutated variant of SARS-CoV-2 was beyond imagination for our third-party providers.
Luckily, thanks to the dedication and experience of our amazing dev and R&D teams, we were quickly able to reroute our bioinformatics software, and it took only a couple of additional days from that first crash report for Nextclade to be updated externally. The data generated from our swift actions allowed the NICD to write the first, crucial report on what was soon to be known as the Omicron variant of SARS-CoV-2.
Hyrax Biosciences is proud to be part of the system that allows global health systems to monitor SARS-CoV-2 as it evolves. The work is time-sensitive and tiring, with no tolerance for error, yet it is some of the most rewarding work in software development. We hope that bioinformatics software will continue to evolve to respond to ongoing global health needs, but it will always be “all hands on deck” at Hyrax when a virus, as always, finds a way to outpace the software that monitors it.