CNN  — 

The genetic sequence of SARS-CoV-2, the virus that causes Covid-19, was submitted to a National Institutes of Health database two weeks before its release by the Chinese government, according to documents that were shared with US lawmakers and released Wednesday.

The sequence doesn’t indicate the origin of the coronavirus but undermines the Chinese government’s claims about its knowledge of the information, one expert told CNN – and could have cost critical weeks in the development of a vaccine against the virus.

On December 28, 2019, virologist Dr. Lili Ren of the Institute of Pathogen Biology at the Chinese Academy of Medical Sciences & Peking Union Medical College submitted the genetic sequence to GenBank, a “genetic sequence repository that collects, preserves, and provides public access to assembled and annotated nucleotide sequence data from all domains of life,” according to a letter that Dr. Melanie Egorin, assistant secretary for legislation at the US Department of Health and Human Services, sent to House Energy and Commerce Committee Chair Cathy McMorris Rodgers last month.

GenBank is managed by the National Center for Biotechnology Information, part of the US National Institutes of Health.

Ren’s submission “was incomplete and lacked the necessary information required for publication,” the letter says. She was sent a resubmission request three days later, but “NIH never received the additional information requested.” The submission was removed from a processing queue on January 16, 2020, and “the sequence was never made publicly available on GenBank.”

However, a different submission of the genetic sequence that was “nearly identical” to Ren’s was published on GenBank on January 12, Egorin said, one day after the World Health Organization said it had received the sequence from China.

McMorris Rodgers, R-Washington; Subcommittee on Health Chair Brett Guthrie, R-Kentucky; and Subcommittee on Oversight and Investigations Chair Morgan Griffith, R-Virginia, said in a news release Wednesday that the committee’s investigation into the origins of Covid-19 will help policymakers strengthen the nation’s biosafety practices in addition to helping prepare for the next pandemic.

They noted that they received the new information almost two months after they informed the NIH of their intent to issue subpoenas for copies of documents related to any early coronavirus sequences, early Covid-19 cases or other pertinent information.

Dr. Jesse Bloom, a virologist at the Fred Hutchinson Cancer Center, wrote Wednesday in an analysis of Ren’s submission that it “clearly falsifies the Chinese government’s claim that the causative agent of the Wuhan pneumonia outbreak still had not been identified near the end of the first week of January 2020.”

The earlier submission “would have provided adequate information to initiate vaccine production in late 2019 if it had been made public,” he said, noting that drugmaker Moderna “used the spike sequence to design its COVID-19 vaccine” within two days of the January 12 release.

However, he said, the genetic sequence “is unlikely to represent the first virus that infected humans” and “does not provide any new insights into the origin or early spread of SARS-CoV-2 in Wuhan.”

“The belated discovery of the submission underscores the importance of rapid data sharing during outbreaks, since immediate public release of the sequence could have accelerated by several weeks the development of COVID-19 vaccines that saved thousands of lives per week in the United States alone,” he said.

Even two weeks “would have made a huge difference in the pandemic,” agreed Dr. Eric Topol, founder and director of the Scripps Research Translational Institute. The fact that the vaccine program began immediately on publication of the genetic sequence “shows you how important that sequence was.”

“When you sequence a virus – it’s not even just a vaccine – then you’ve nailed it. You know exactly the features, about the spike protein and all the other major components: the nucleocapsid, the envelope, the whole entire panoramic view of the virus. You can’t get that without the sequence.”

The documents should be read in the context of hindsight, says Dr. Kristian Andersen, an evolutionary biologist and director of infectious disease genomics at the Translational Institute.

“In late 2019, nobody knew that a pandemic would later ensue,” he wrote in an email. “This is a really critical part that most people seem to forget – nobody knew back then that a never-before-seen coronavirus only distantly related to SARS-CoV-1 was causing ‘mysterious’ illnesses in patients associated with a wet market in the middle of Wuhan, which would later spark a devastating pandemic.

“Should the sequence have been released at the time and [marked] as preliminary data? Sure, that would have been great, and is a good example of where we could hope to do better in the future,” he said. “Whoever reviewed the sequence at NCBI over the holiday period in 2019 would have no way of connecting this sequence to a ‘mysterious’ illness in Wuhan – because it was yet to be reported.”

CNN’s Jen Christensen and Brenda Goodman contributed to this report.