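Recently, the research group of Professor Hu Xuehai, part of the biostatistics team at the College of Informatics, Huazhong Agricultural University, developed a tool for predicting transcription factor binding sites in plants, together with a Docker image for it. The work was published in Bioinformatics, an international academic journal in the field of bioinformatics.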
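Transcription factor binding sites (TFBSs) are basic components of cis-regulatory elements and play an important role in the precise regulation of gene expression. Non-coding variants within the core motif of a TFBS can markedly change its binding affinity, which may be a biological mechanism explaining how genetic variation affects complex traits. Both the scarcity of experimental TFBS data in plants and the independent evolution of plant TFs have left computational methods for identifying plant TFBSs lagging behind the corresponding human research. In this study, the authors first used a deep convolutional neural network (DeepCNN) to build 265 Arabidopsis TFBS prediction models from the available Arabidopsis DAP-seq datasets, and then transferred these models to predict binding sites for homologous TFs in other plants.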
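To make the idea of a convolutional model reading DNA more concrete, the following is a minimal sketch of how a sequence can be one-hot encoded and scanned with a single convolutional kernel. It is not the authors' code: the function names, the toy sequence and the hand-built kernel (which rewards the G-box core CACGTG) are assumptions chosen purely for illustration, and a trained network would learn its kernel weights from data instead.

```python
# Illustrative sketch only; the published tool's preprocessing may differ.
BASES = "ACGT"


def one_hot_encode(seq):
    """Return one 4-element row per base; unknown bases such as N become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan_with_kernel(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence without padding.

    Returns one score per start position, which is what a single
    1D convolutional filter computes before any activation function.
    """
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores


# Hypothetical kernel: an exact-match detector for the motif "CACGTG".
gbox_kernel = one_hot_encode("CACGTG")
encoded = one_hot_encode("TTCACGTGAA")
scores = scan_with_kernel(encoded, gbox_kernel)
best_position = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_position)
```

The highest score marks the window that best matches the kernel, which is why inspecting learned kernels can reveal the motifs a model relies on.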

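The modeling results show that DeepCNN achieved high prediction accuracy on all 265 Arabidopsis datasets (average AUC of 0.96), demonstrating its feasibility for plant TFBS prediction. By further analyzing the properties of the convolutional kernels in DeepCNN, the authors provided a biological interpretation of the model: DeepCNN learns not only the key binding motif of the transcription factor being modeled, but also the binding motifs of transcription factors that cooperate with it.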
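For readers unfamiliar with the AUC metric quoted above, here is a small sketch of how it can be computed from labels and model scores. It uses the pairwise (Mann-Whitney) formulation and made-up numbers; the function name and the example data are assumptions, and this is not how the authors necessarily computed their reported values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = bound sequence, 0 = unbound background sequence.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 1.0 means every bound sequence outscores every unbound one, while 0.5 corresponds to random ranking.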
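Finally, when transfer learning was applied as a computational route around the current difficulties of plant TFBS research, the authors found that its performance varied widely across plant species. For three of the ten TFs in rice, the predictions were relatively good: BZIP23, ERF48 and MADS29 had PPVs (positive predictive values) of 0.752, 0.951 and 0.816, respectively. When the models were transferred to maize and soybean, however, the predictions were unsatisfactory. This suggests that transfer learning is feasible to some degree for cross-species prediction of transcription factor binding sites in plants, but that more effective transfer learning strategies still need to be designed.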
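The PPV reported above is the share of predicted binding sites that are true binding sites, TP / (TP + FP). The sketch below shows the calculation on invented labels; the function name and the data are assumptions for illustration and do not reproduce the paper's evaluation.

```python
def positive_predictive_value(labels, predictions):
    """Return TP / (TP + FP) for binary labels and binary predictions."""
    true_pos = sum(1 for y, p in zip(labels, predictions) if p == 1 and y == 1)
    false_pos = sum(1 for y, p in zip(labels, predictions) if p == 1 and y == 0)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions, so PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Toy data: four sites predicted as bound, three of them truly bound.
labels = [1, 1, 0, 1, 0, 0]
predictions = [1, 1, 1, 1, 0, 0]
ppv = positive_predictive_value(labels, predictions)
print(ppv)
```

A PPV of 0.951 would therefore mean that roughly 95 of every 100 predicted sites are confirmed by the reference data.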
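To provide a more convenient, higher-quality bioinformatics service, the group built a Docker image for these deep convolutional neural network models, which identify transcription factor binding sites with high precision. After downloading the image and configuring it locally, users can predict plant transcription factor binding sites offline (https://github.com/liulifenyf/TSPTFBS).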
[English abstract]
Motivation: Both the lack or limitation of experimental data of transcription factor binding sites (TFBS) in plants and the independent evolutions of plant TFs make computational approaches for identifying plant TFBSs lagging behind the relevant human researches. Observing that TFs are highly conserved among plant species, here we first employ the deep convolutional neural network (DeepCNN) to build 265 Arabidopsis TFBS prediction models based on available DAP-seq (DNA affinity purification sequencing) datasets, and then transfer them into homologous TFs in other plants.
Results: DeepCNN not only achieves greater successes on Arabidopsis TFBS predictions when compared with gkm-SVM and MEME, but also has learned its known motif for most Arabidopsis TFs as well as cooperative TF motifs with PPI (protein-protein-interaction) evidences as its biological interpretability. Under the idea of transfer learning, trans-species prediction performances on ten TFs of other three plants of Oryza sativa, Zea mays and Glycine max demonstrate the feasibility of current strategy.
Availability and implementation: The trained 265 Arabidopsis TFBS prediction models were packaged in a Docker image named TSPTFBS, which is freely available on DockerHub at https://hub.docker.com/r/vanadiummm/tsptfbs. Source code and documentation are available on GitHub at: https://github.com/liulifenyf/TSPTFBS.
Contact: huxuehai@mail.hzau.edu.cn
Original article: https://academic.oup.com/bioinformatics/article/37/2/260/6069568