All These Mutant Virus Strains Need New Code Names

GISAID started in 2008, after researchers around the world expressed some reticence at putting sequence data from their surveillance of bird flu into public domain databases. Under-resourced scientists didn’t want to drop a new sequence but then get scooped on the analysis by some other researcher with a zillion-dollar lab. And as GISAID got more and more data, the people who ran it had to come up with a way to identify each sequence and put them all into context with one another. Now it’s the main data repository for SARS-CoV-2 genomes.

But the world of Covid nomenclature has two more great and noble houses. Nextstrain, based at the Fred Hutchinson Cancer Research Center and the University of Basel, is one. Its organization revolves around clades, big branches on the phylogenetic tree of life. (Nextstrain started out doing the same job for influenza.) Its names have a cheat code: clades are organized by the year they’re discovered and a letter of the alphabet, and then according to specific mutations of interest. The de Oliveira team’s variant had a bunch of mutations, but the N501Y was important. (The mutation changes an asparagine, abbreviated with the letter N, to a tyrosine, abbreviated with a Y, at the 501st amino acid on the virus’ spike protein, in the RBD, or receptor binding domain, that attaches to the human ACE2 receptor, short for angiotensin-converting enzyme 2.)
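
For readers who think better in code, here is a minimal sketch of how a substitution name like N501Y breaks apart: original amino acid, position, replacement. The function names `parse_substitution` and `describe` are made up for this illustration and do not come from Nextstrain or any other tool.

```python
import re

# Standard one-letter codes for the 20 common amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Letter, one or more digits, letter: e.g. N501Y.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(name):
    """Split a name like 'N501Y' into (original, position, replacement)."""
    match = _SUBSTITUTION.match(name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    original, position, replacement = match.groups()
    for letter in (original, replacement):
        if letter not in AMINO_ACIDS:
            raise ValueError(f"unknown amino-acid code: {letter!r}")
    return original, int(position), replacement


def describe(name):
    """Spell out a substitution in words (hypothetical helper)."""
    original, position, replacement = parse_substitution(name)
    return (f"{AMINO_ACIDS[original]} to {AMINO_ACIDS[replacement]} "
            f"at position {position}")


print(describe("N501Y"))
```

The name only records what changed and where along the protein. Which protein it is (here, the spike) has to come from context.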

Easy, right? (Ahem.) But then things got even more complicated. The one the UK researchers were seeing had the same mutation, among many others. To distinguish it from de Oliveira’s, each got a new designation: “V1” was appended to the one from the UK and “V2” to the other. Another similar variant that traced back to Manaus, in Brazil, came to be “V3.”
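
Put together, a clade label has up to four parts: a year, a letter, a mutation of interest, and a variant number. The sketch below assumes the parts are joined in the form "20I/501Y.V1", with a two-digit year. That spelling is an assumption of this example, not something spelled out above, and `CladeLabel` and `parse_clade_label` are hypothetical names rather than part of Nextstrain's software.

```python
import re
from typing import NamedTuple, Optional


class CladeLabel(NamedTuple):
    year: int                # four-digit year, e.g. 2020
    letter: str              # letter assigned within that year
    mutation: Optional[str]  # mutation of interest, e.g. "501Y"
    variant: Optional[int]   # the number after "V", if any


# Assumed layout: YY + letter, then optionally "/" + mutation,
# then optionally ".V" + number.
_LABEL = re.compile(r"^(\d{2})([A-Z])(?:/(\d+[A-Z])(?:\.V(\d+))?)?$")


def parse_clade_label(label):
    """Split a clade label into its parts, under the layout assumed above."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized clade label: {label!r}")
    yy, letter, mutation, variant = match.groups()
    return CladeLabel(
        year=2000 + int(yy),
        letter=letter,
        mutation=mutation,
        variant=int(variant) if variant is not None else None,
    )


print(parse_clade_label("20I/501Y.V1"))
print(parse_clade_label("20A"))
```

A label with no mutation suffix still parses. The mutation and variant fields simply come back empty.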

“We’re not trying to name everything. In fact, we’re really explicitly trying not to have more than 10 or 20 names a year, and we’re interested in picking out the most important things,” Hodcroft says. “That’s, like, big changes in the tree. When we see groups that are different in their genetics and they spread, even if it takes a while, in a region or around the world, we give those a Nextstrain clade.”

That’s not what the other bigwig in the space does, though. It’s analytical software called Pangolin—“Phylogenetic Assignment of Named Global Outbreak LINeages.” So-called Pango lineages start with a letter, initially A or B, designating the first two diverging SARS-CoV-2 sequences that emerged from China in late 2019 and early 2020. Each generation gets a number, and its descendants get an additional number, preceded by a period—but only for three generations. Four or more, and the whole lineage gets assigned to a new letter. Imagine an Obed-begat-Jesse-and-Jesse-begat-David vibe, but with diagrams and genomic receipts. “Lineages are operating on a different resolution. You can have very big ones and small ones, but the idea is to capture the emerging edge of the pandemic,” says Áine O’Toole, an evolutionary biologist at the University of Edinburgh who created Pangolin and is now one of its main developers. “The idea is to have a cluster of sequences that is linked to some sort of epidemiological piece of information.”
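
The begat rule is easier to see written down. What follows is a rough sketch of the naming scheme as described above: a letter, up to three dotted numbers, and a fresh letter once a fourth generation comes along. It is not how the Pangolin software actually works, and the names `parse_lineage`, `parent`, and `name_child` are invented for this illustration.

```python
MAX_NUMERIC_LEVELS = 3  # per the rule above: only three generations of numbers


def parse_lineage(name):
    """Split 'B.1.1.7' into ('B', (1, 1, 7)), checking the shape."""
    prefix, *numbers = name.split(".")
    if not (prefix.isalpha() and prefix.isupper()):
        raise ValueError(f"bad lineage prefix in {name!r}")
    if len(numbers) > MAX_NUMERIC_LEVELS:
        raise ValueError(f"too many numeric levels in {name!r}")
    if not all(part.isdigit() for part in numbers):
        raise ValueError(f"non-numeric level in {name!r}")
    return prefix, tuple(int(part) for part in numbers)


def parent(name):
    """Return the parent lineage by name alone, or None for a bare letter."""
    prefix, numbers = parse_lineage(name)
    if not numbers:
        return None
    return ".".join([prefix, *(str(n) for n in numbers[:-1])])


def name_child(parent_name, child_number, new_letter=None):
    """Name a descendant, switching to a new letter past three levels.

    `new_letter` is a stand-in for whatever letter the curators would
    pick; this sketch does not model how that choice is made.
    """
    _, numbers = parse_lineage(parent_name)
    if len(numbers) < MAX_NUMERIC_LEVELS:
        return f"{parent_name}.{child_number}"
    if new_letter is None:
        raise ValueError("a fourth generation needs a new letter")
    return f"{new_letter}.{child_number}"


print(parent("B.1.1.7"))
print(name_child("B.1", 1))
print(name_child("B.1.1.7", 1, new_letter="Z"))  # "Z" is a made-up letter
```

One consequence of the relettering: once a lineage moves to a new letter, its name no longer shows where it came from, so `parent` here can only walk back as far as the bare letter.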

(After publication, O’Toole emailed me to note that while she had created the Pangolin software, she didn’t come up with the Pango notation used in the nomenclature—that was a bigger team. It’s an important distinction that also proves my point about how hard it is to name things, including the people who name things.)

Pangolin has a tricky bit. Anyone working on a viral genome can use the software to try to figure out whether they have something new, and where it might fit with all the known lineages (with data pulled from GISAID, just as Nextstrain does). But making a final call on whether a strain is indeed new, and deserves a different spot in the heuristic—its Pango lineage—is up to actual living people on the team and suggestions from scientists in the field. “I think maybe it’s something we need to work harder on, to try to convey there’s a difference between lineage designation and lineage assignment,” O’Toole says. “When we designate lineages, that’s just based on what we know. If you’ve got a new lineage and we haven’t seen it, Pangolin won’t be able to assign it, because it can’t predict lineages that will arise in the future. So there is a lag.”
