2010-12-26 11:25:15 | Category: Bioinformatics
Similar databases and browsers are found at NCBI and the University of California, Santa Cruz (UCSC).
In the Ensembl project, sequence data are fed into a software "pipeline" (written in Perl) which creates a set of predicted gene locations and saves them in a MySQL database for subsequent analysis and display. Ensembl makes these data freely accessible to the world research community. All the data and code produced by the Ensembl project are available to download, and there is also a publicly accessible database server allowing remote access. In addition, the Ensembl website provides computer-generated visual displays of much of the data.
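As a rough illustration of the "predict, store, then query" flow described above, the following Python sketch saves a few predicted gene locations in a database and retrieves those overlapping a region. It is a sketch under stated assumptions, not Ensembl's implementation: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table name, column names and gene names are hypothetical rather than taken from the Ensembl schema.

```python
import sqlite3


def store_predictions(conn, predictions):
    """Save predicted gene locations (hypothetical schema, not Ensembl's)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gene_prediction ("
        " name TEXT, seq_region TEXT, start_pos INTEGER,"
        " end_pos INTEGER, strand INTEGER)"
    )
    conn.executemany(
        "INSERT INTO gene_prediction VALUES (?, ?, ?, ?, ?)", predictions
    )
    conn.commit()


def genes_overlapping(conn, seq_region, start, end):
    """Return names of stored genes overlapping [start, end] (1-based, inclusive)."""
    rows = conn.execute(
        "SELECT name FROM gene_prediction"
        " WHERE seq_region = ? AND start_pos <= ? AND end_pos >= ?"
        " ORDER BY start_pos",
        (seq_region, end, start),
    )
    return [name for (name,) in rows]


# In-memory database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
store_predictions(conn, [
    ("geneA", "1", 1000, 5000, 1),
    ("geneB", "1", 7000, 9000, -1),
    ("geneC", "2", 1500, 2500, 1),
])
```

Calling `genes_overlapping(conn, "1", 4000, 7500)` then returns the made-up genes on region "1" whose intervals intersect the query window.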
Over time the project has expanded to include additional species (including key model organisms such as mouse, fruit fly and zebrafish) as well as a wider range of genomic data, including genetic variations and regulatory features. From late 2008 a new project, Ensembl Genomes, has been extending the scope of Ensembl into plants, fungi, bacteria and protists, whilst the original project continues to focus on vertebrates.
Central to the Ensembl concept is the ability to automatically generate graphical views of the alignment of genes and other genomic data against a reference genome. These are shown as data tracks, and individual tracks can be turned on and off, allowing the user to customise the display to suit their research interests. The interface also enables the user to zoom in to a region or move along the genome in either direction.
Other displays show data at varying levels of resolution, from whole karyotypes down to text-based representations of DNA and amino acid sequences, or present other types of display such as trees of similar genes (homologues) across a range of species. The graphics are complemented by tabular displays, and in many cases data can be exported directly from the page in a variety of standard file formats such as FASTA.
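For readers unfamiliar with FASTA, the format mentioned above is simple enough to sketch: each record is a header line beginning with ">" followed by the sequence, conventionally wrapped to a fixed line width. The Python function below is a minimal illustration; the function name, the 60-character default width and the record identifiers are assumptions for the example and do not describe Ensembl's export code.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text."""
    lines = []
    for identifier, sequence in records:
        # Header line: ">" followed by the record identifier.
        lines.append(">" + identifier)
        # Wrap the sequence to a fixed line width.
        for i in range(0, len(sequence), width):
            lines.append(sequence[i:i + width])
    return "".join(line + "\n" for line in lines)
```

For example, `to_fasta([("seq1", "ACGTACGT")], width=4)` yields a header line followed by two four-base sequence lines.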
Externally produced data can also be added to the display, either via a DAS (Distributed Annotation System) server on the internet, or by uploading a suitable file in one of the supported formats, such as BED or PSL.
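When preparing a BED file for upload, one detail is easy to get wrong: BED start positions are 0-based and end positions are exclusive, whereas coordinates shown in genome browsers are usually 1-based and inclusive. The Python sketch below converts one such interval into a minimal BED line. The function name and the example values are hypothetical, and the assumption that the input is 1-based inclusive should be checked against the source of your data.

```python
def to_bed_line(chrom, start, end, name=None):
    """Convert a 1-based inclusive interval to a tab-separated BED line."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # BED uses a 0-based start and an exclusive end, so only the start shifts.
    fields = [chrom, str(start - 1), str(end)]
    if name is not None:
        fields.append(name)  # optional fourth BED column
    return "\t".join(fields)
```

For instance, the interval 1000 to 5000 on "chr1" becomes a line with start 999 and end 5000.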
Graphics are generated using a suite of custom Perl modules based on GD, the standard Perl graphics display library.
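Ensembl is a bioinformatics research project that aims to develop a software system for producing and maintaining automatic annotation of eukaryotic genomes. It is run jointly by the Wellcome Trust Sanger Institute in the United Kingdom and the European Bioinformatics Institute, an outstation of the European Molecular Biology Laboratory.
The project is open source: all the data and software it produces can be obtained from the internet and used freely, at no cost.
Most of the software developed and used by the project is written in Perl and builds on the BioPerl framework. Other genome projects can readily use its Perl application programming interface (API), for example to annotate genes or clone catalogues.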
From Wikipedia