Python is one of the most popular programming languages today, and it is widely used in bioinformatics. For example, we can use Python to draw a gene map.
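    # Import the required libraries
    from reportlab.lib import colors
    from reportlab.lib.units import cm
    from Bio import SeqIO
    from Bio.Graphics import GenomeDiagram
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    # Read the sequence record from a local GenBank file.
    # NC_000913 is the RefSeq accession for E. coli K-12 MG1655
    # (NC_000964 is Bacillus subtilis 168, which would not match the title below).
    record = SeqIO.read("NC_000913.gbk", "genbank")

    # Create the diagram object
    gd_diagram = GenomeDiagram.Diagram("Escherichia coli str. K-12 substr. MG1655")

    # Add one track and a feature set to hold the annotations
    gd_track_for_features = gd_diagram.new_track(1, name="Annotated Features")
    gd_feature_set = gd_track_for_features.new_set()

    # Qualifier keys for internal annotations and the color used for each.
    # These are not standard GenBank qualifiers, so this only applies to
    # custom-annotated records. Assumption: each value is an absolute
    # genome coordinate stored as a string.
    site_colors = [
        ("TSS", colors.green),
        ("promoter", colors.blue),
        ("RBS", colors.purple),
        ("CDS", colors.orange),
    ]

    # Iterate over the features
    for feature in record.features:
        if feature.type != "gene":  # draw gene features only
            continue

        # Draw the gene itself as an arrow
        gd_feature_set.add_feature(feature, sigil="ARROW", color=colors.blue,
                                   label=True, label_size=14, label_angle=0)

        # Draw the internal annotations
        for site, site_color in site_colors:
            if site in feature.qualifiers:
                # Qualifier values are strings, so convert before doing arithmetic
                position = int(feature.qualifiers[site][0])
                site_feature = SeqFeature(
                    FeatureLocation(position, position + 1,
                                    strand=feature.location.strand))
                gd_feature_set.add_feature(site_feature, name=site, sigil="OCTO",
                                           color=site_color, label=True,
                                           label_size=10, label_color=site_color,
                                           label_angle=0)

    # Set the drawing parameters and write the output
    gd_diagram.draw(format="circular", circular=True, pagesize=(20*cm, 20*cm),
                    start=0, end=len(record), circle_core=0.5)
    gd_diagram.write("ecoli_gene_map.pdf", "PDF")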
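The code above reads a GenBank file for Escherichia coli str. K-12 substr. MG1655 (downloaded from NCBI beforehand) and draws a circular gene map, using functions and classes from reportlab and Biopython (the Bio package). Drawing gene maps in Python makes it easier to see how genes are arranged along the genome, which helps with further bioinformatics research and analysis.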
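The script reads coordinates for sites such as `TSS` from `feature.qualifiers`. In Biopython that mapping holds lists of strings, so a value has to be converted and range-checked before it can be used as a position. The following is a minimal sketch of that step in plain Python. The helper names `site_position` and `feature_label` are hypothetical (they are not part of Biopython), the sample dictionary is made up for illustration, and the sketch assumes the custom keys store absolute genome coordinates.

```python
def site_position(qualifiers, key, genome_length):
    """Return the integer coordinate stored under `key`, or None if unusable.

    `qualifiers` is expected to look like SeqFeature.qualifiers:
    a mapping from qualifier name to a list of strings.
    """
    values = qualifiers.get(key)
    if not values:
        return None
    try:
        position = int(values[0])
    except (TypeError, ValueError):
        return None
    # Reject coordinates that fall outside the sequence being drawn
    if 0 <= position < genome_length:
        return position
    return None


def feature_label(qualifiers, fallback="unnamed"):
    """Pick a label: the gene name if present, otherwise the locus tag."""
    for key in ("gene", "locus_tag"):
        values = qualifiers.get(key)
        if values:
            return values[0]
    return fallback


# Made-up qualifiers for illustration only (hypothetical gene and coordinate)
demo_qualifiers = {"gene": ["geneA"], "locus_tag": ["tag0001"], "TSS": ["148"]}
print(feature_label(demo_qualifiers))
print(site_position(demo_qualifiers, "TSS", genome_length=1000))
print(site_position(demo_qualifiers, "RBS", genome_length=1000))
```

In the drawing loop, a `None` result would simply mean skipping that site, and the label returned by `feature_label` could be passed to `add_feature` through its `name` argument.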