Seed World

Consortium Including Brazilians Sequences the Reference Genome of Arabica Coffee

Study may guide the development of varieties better adapted to climate change. Photo: Gian Barros

Coffee, one of the most traded commodities globally, is primarily derived from Coffea arabica, the most widely consumed among approximately 130 species. It is the product of the hybridization of two other species: Coffea canephora, known as Conilon or Robusta coffee in Brazil, and Coffea eugenioides. While nearly every major commodity has had its reference genome sequenced over the last decade, coffee has only recently been added to the list.

The reference genome is crucial for developing coffee varieties better suited to climate change and resistant to diseases. In a groundbreaking effort, a consortium of scientists sequenced the reference genome of Arabica coffee, according to a peer-reviewed publication and a press release from the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP). This allowed them to pinpoint candidate genes responsible for coffee’s resistance to rust and other diseases. They also identified the expression of certain genes associated with Arabica’s aroma.

“With the knowledge of the genome, it is possible to obtain information that allows us to go in two directions: the development of varieties by directing crossbreeding, in other words, as a reference to guide us in future crossbreeding that produces new varieties; and more direct interventions, such as modifying a gene specifically,” summarizes Douglas Domingues, currently a researcher at the Plant Genomics and Transcriptomics Group of the Luiz de Queiroz School of Agriculture at the University of São Paulo (ESALQ-USP), in Brazil, and one of the authors of the paper (developed when he was still working at the Rio Claro campus of the São Paulo State University).

Domingues believes there was a bit of a race to sequence the genome.

“The price of sequencing has come down a lot, and coffee was one of the few commodities that hadn’t had its reference genome sequenced. There were other groups trying, and there was a paper published just before ours. But most of them used the standard strategy: choosing an interesting plant for cultivation and sequencing its genome.” 

Domingues’ research group sequenced a plant that may not be of immediate agronomic interest but holds significant genetic value. “The advantage of our reference genome is that it’s derived from a ‘dihaploid’ individual. This results in a homogeneous reference genome that will be a superior standard for future research,” explains Patrick Descombes, the project coordinator and senior genomics expert at the Nestlé Institute of Food Safety & Analytical Sciences. He further explains that Arabica coffee is tetraploid: because it is a hybrid formed by the fusion of two other species, it carries two genomes in one.

The release notes that by sequencing a dihaploid derived from Arabica coffee rather than a common tetraploid variety, scientists get a clearer, simpler view of the genome. This makes it possible to identify variations between similar genes with greater precision, facilitating the use of molecular information in crop improvement studies.

In this study, the group was able to pinpoint with greater precision when the fusion occurred. No more than 600,000 years ago, C. canephora and C. eugenioides merged, giving rise to the tetraploid hybrid, which then continued its evolutionary journey. 

“We came to this conclusion using DNA information from Arabica, Robusta and Eugenioides: we were able to make a more accurate inference because previously this interval was dated at between 50,000 and 1 million years. We reduced that window to 350,000 to 600,000 years,” reports Domingues.

The article, published in Nature Genetics on April 15, was the result of a consortium of scientists from more than ten countries, including Brazil, which participated with more than one institution. In Domingues’ case, his participation was partially funded by FAPESP through a Young Researcher project and a postdoctoral fellowship awarded to Suzana Tiemi Ivamoto-Suzuki, also an author of the article.

“We used the reference sequence to understand the diversity that exists in wild Arabica coffees, from the African region of origin, and compare this with the Arabica coffees that are cultivated today,” says the ESALQ-USP scientist, explaining that the group resequenced Arabica coffee varieties planted in different parts of the world, as well as wild specimens collected in the forests of Ethiopia, and managed to understand the difference between the wild and cultivated ones.

To understand the genomic evolution of Arabica, the consortium sequenced 46 accessions, comprising three Robusta, two Eugenioides, and 41 Arabica specimens. Among the Arabica samples were an 18th-century type specimen, 12 cultivars with varied breeding histories, the Timor hybrid (a natural cross between Arabica and the pest-resistant C. canephora Robusta variety), along with five of its backcrosses with Arabica. Additionally, the study included 17 wild accessions and three wild/cultivated accessions collected from both the east and west sides of the Great Rift Valley in Ethiopia.

“We used the latest genomic technologies, i.e. long reads from the high-fidelity PacBio system [for gene sequencing] and proximity ligation with short reads from Illumina [an integrated system for analyzing genetic variation and biological function], to generate the chromosome assembly. This combination resulted in a chromosome-level assembly of the highest quality and integrity,” says Descombes.