Author(s): Qi Feng [1, 2]; Yujun Zhang [1, 2]; Pei Hao [1, 2]; Shengyue Wang [2, 3]; Gang Fu [3]; Yucheng Huang [1]; Ying Li [1]; Jingjie Zhu [1]; Yilei Liu [1]; Xin Hu [1]; Peixin Jia [1]; Yu Zhang [1]; Qiang Zhao [1]; Kai Ying [1]; Shuliang Yu [1]; Yesheng Tang [1]; Qijun Weng [1]; Lei Zhang [1]; Ying Lu [1]; Jie Mu [1]; Yiqi Lu [1]; Lei S. Zhang [1]; Zhen Yu [1]; Danlin Fan [1]; Xiaohui Liu [1]; Tingting Lu [1]; Can Li [1]; Yongrui Wu [1]; Tongguo Sun [1]; Haiyan Lei [1]; Tao Li [1]; Hao Hu [1]; Jianping Guan [1]; Mei Wu [1]; Runquan Zhang [1]; Bo Zhou [1]; Zehua Chen [1]; Ling Chen [1]; Zhaoqing Jin [1]; Rong Wang [1]; Haifeng Yin [3]; Zhen Cai [3]; Shuangxi Ren [3]; Gang Lv [3]; Wenyi Gu [3]; Genfeng Zhu [3]; Yuefeng Tu [3]; Jia Jia [3]; Yi Zhang [3]; Jie Chen [3]; Hui Kang [3]; Xiaoyun Chen [3]; Chunyan Shao [3]; Yun Sun [3]; Qiuping Hu [3]; Xianglin Zhang [3]; Wei Zhang [3]; Lijun Wang [3]; Chunwei Ding [3]; Haihui Sheng [3]; Jingli Gu [3]; Shuting Chen [3]; Lin Ni [3]; Fenghua Zhu [3]; Wei Chen [4]; Lefu Lan [4]; Ying Lai [4]; Zhukuan Cheng [5, 6]; Minghong Gu [5]; Jiming Jiang [6]; Jiayang Li [4]; Guofan Hong [1]; Yongbiao Xue [4]; Bin Han (corresponding author) [1]

Rice is the principal food for over half of the population of the world. With its genome size of 430 megabase pairs (Mb), the cultivated rice species Oryza sativa is a model plant for genome research [1]. Here we report the sequence analysis of chromosome 4 of O. sativa, one of the first two rice chromosomes to be sequenced completely [2]. The finished sequence spans 34.6 Mb and represents 97.3% of the chromosome. In addition, we report the longest known sequence for a plant centromere, a completely sequenced contig of 1.16 Mb corresponding to the centromeric region of chromosome 4. We predict 4,658 protein coding genes and 70 transfer RNA genes. A total of 1,681 predicted genes match available unique rice expressed sequence tags. Transposable elements have a pronounced bias towards the euchromatic regions, indicating a close correlation of their distributions to genes along the chromosome. Comparative genome analysis between cultivated rice subspecies shows that there is an overall syntenic relationship between the chromosomes and divergence at the level of single-nucleotide polymorphisms and insertions and deletions. By contrast, there is little conservation in gene order between rice and Arabidopsis.

The rice genome has been well mapped both genetically and physically [3, 4, 5] and has a syntenic relationship with other cereals [6]. Arabidopsis thaliana (Arabidopsis), a member of the Brassica family of dicotyledonous (dicot) plants, has become an important model flowering plant for studying many aspects of plant biology [7]. The completion of the Arabidopsis genome [8, 9, 10] has afforded an unprecedented opportunity for systematic studies of plant gene function. Equally, the complete rice genome sequence will provide...

