D-I-TASSER O95786

D-I-TASSER results for O95786

[Click result.zip to download all results on this page]

Summary of the Protein

Uniprot ID: O95786
Protein Name: Antiviral innate immune response receptor RIG-I
Gene Name: DDX58

Input Sequence in FASTA format

>O95786 (925 residues)
MTTEQRRSLQAFQDYIRKTLDPTYILSYMAPWFREEEVQYIQAEKNNKGPMEAATLFLKF
LLELQEEGWFRGFLDALDHAGYSGLYEAIESWDFKKIEKLEEYRLLLKRLQPEFKTRIIP
TDIISDLSECLINQECEEILQICSTKGMMAGAEKLVECLLRSDKENWPKTLKLALEKERN
KFSELWIVEKGIKDVETEDLEDKMETSDIQIFYQEDPECQNLSENSCPPSEVSDTNLYSP
FKPRNYQLELALPAMKGKNTIICAPTGCGKTFVSLLICEHHLKKFPQGQKGKVVFFANQI
PVYEQQKSVFSKYFERHGYRVTGISGATAENVPVEQIVENNDIIILTPQILVNNLKKGTI
PSLSIFTLMIFDECHNTSKQHPYNMIMFNYLDQKLGGSSGPLPQVIGLTASVGVGDAKNT
DEALDYICKLCASLDASVIATVKHNLEELEQVVYKPQKFFRKVESRISDKFKYIIAQLMR
DTESLAKRICKDLENLSQIQNREFGTQKYEQWIVTVQKACMVFQMPDKDEESRICKALFL
YTSHLRKYNDALIISEHARMKDALDYLKDFFSNVRAAGFDEIEQDLTQRFEEKLQELESV
SRDPSNENPKLEDLCFILQEEYHLNPETITILFVKTRALVDALKNWIEGNPKLSFLKPGI
LTGRGKTNQNTGMTLPAQKCILDAFKASGDHNILIATSVADEGIDIAQCNLVILYEYVGN
VIKMIQTRGRGRARGSKCFLLTSNAGVIEKEQINMYKEKMMNDSILRLQTWDEAVFREKI
LHIQTHEKFIRDSQEKPKPVPDKENKKLLCRKCKALACYTADVRVIEECHYTVLGDAFKE
CFVSRPHPKPKQFSSFEKRAKIFCARQNCSHDWGIHVKYKTFEIPVIKIESFVVEDIATG
VQTLYSKWKDFHFEKIPFDPAEMSK
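
As a quick sanity check, a FASTA record like the one above can be read back and its length compared with the residue count in the header. The following is a minimal sketch, not part of the D-I-TASSER pipeline; `parse_fasta` is a hypothetical helper, and the toy record holds only the first 12 residues of the sequence for brevity.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may wrap over several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Toy record (assumption: truncated to 12 residues for illustration).
example = """>toy (12 residues)
MTTEQR
RSLQAF
"""

records = parse_fasta(example)
for header, seq in records.items():
    print(header, len(seq))
```

Run on the full record, the reported length should equal the 925 residues stated in the header line.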

Predicted Contact and Distance Map Used in D-I-TASSER simulation

[Figure panels: Contact Map (left), Distance Map (right)]

D-I-TASSER simulation is guided by the consensus contact map (left figure) and distance map (right figure), both derived from the confidence scores of AttentionPotential. In the contact map, distance map, and hydrogen bond network, the axes mark the residue index along the sequence. In the contact map, each dot represents a residue pair predicted to be in contact; in the distance map and hydrogen bond network, a color scale represents a distance of 1-20+ angstroms or an angle of 0-180 degrees.
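
To illustrate how a contact map relates to a distance map, the sketch below thresholds a small distance matrix into a list of contacting residue pairs. The 8 angstrom cutoff and the minimum sequence separation are assumptions (a common convention, not something stated on this page), and `contacts_from_distances` is a hypothetical helper rather than D-I-TASSER code.

```python
def contacts_from_distances(dist, cutoff=8.0, min_separation=2):
    """Return 1-based residue pairs (i, j), i < j, whose distance is below `cutoff`.

    `dist` is a symmetric square matrix (list of lists) of distances in angstroms.
    Pairs closer than `min_separation` positions along the sequence are skipped.
    """
    n = len(dist)
    pairs = []
    for i in range(n):
        for j in range(i + min_separation, n):
            if dist[i][j] < cutoff:
                pairs.append((i + 1, j + 1))
    return pairs


# Toy 4-residue distance matrix (assumed values, for illustration only).
toy = [
    [0.0, 3.8, 6.5, 12.0],
    [3.8, 0.0, 3.8, 9.1],
    [6.5, 3.8, 0.0, 3.8],
    [12.0, 9.1, 3.8, 0.0],
]

print(contacts_from_distances(toy))  # [(1, 3)]
```

Each returned pair corresponds to one dot in a contact-map plot whose axes are residue indices.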

Information of Domain

No. of Domain	Domain	Domain Boundary	Length of Domain	Models of Domain	Sequence
1	O95786-D1	1-92	92		MTTEQRRSLQAFQDYIRKTLDPTYILSYMAPWFREEEVQYIQAEKNNKGPMEAATLFLKFLLELQEEGWFRGFLDALDHAGYSGLYEAIESW
2	O95786-D2	93-224,772-925	286		DFKKIEKLEEYRLLLKRLQPEFKTRIIPTDIISDLSECLINQECEEILQICSTKGMMAGAEKLVECLLRSDKENWPKTLKLALEKERNKFSELWIVEKGIKDVETEDLEDKMETSDIQIFYQEDPECQNLSEDEAVFREKILHIQTHEKFIRDSQEKPKPVPDKENKKLLCRKCKALACYTADVRVIEECHYTVLGDAFKECFVSRPHPKPKQFSSFEKRAKIFCARQNCSHDWGIHVKYKTFEIPVIKIESFVVEDIATGVQTLYSKWKDFHFEKIPFDPAEMSK
3	O95786-D3	225-442	218		NSCPPSEVSDTNLYSPFKPRNYQLELALPAMKGKNTIICAPTGCGKTFVSLLICEHHLKKFPQGQKGKVVFFANQIPVYEQQKSVFSKYFERHGYRVTGISGATAENVPVEQIVENNDIIILTPQILVNNLKKGTIPSLSIFTLMIFDECHNTSKQHPYNMIMFNYLDQKLGGSSGPLPQVIGLTASVGVGDAKNTDEALDYICKLCASLDASVIATV
4	O95786-D4	443-465,607-771	188		KHNLEELEQVVYKPQKFFRKVESENPKLEDLCFILQEEYHLNPETITILFVKTRALVDALKNWIEGNPKLSFLKPGILTGRGKTNQNTGMTLPAQKCILDAFKASGDHNILIATSVADEGIDIAQCNLVILYEYVGNVIKMIQTRGRGRARGSKCFLLTSNAGVIEKEQINMYKEKMMNDSILRLQTW
5	O95786-D5	466-606	141		RISDKFKYIIAQLMRDTESLAKRICKDLENLSQIQNREFGTQKYEQWIVTVQKACMVFQMPDKDEESRICKALFLYTSHLRKYNDALIISEHARMKDALDYLKDFFSNVRAAGFDEIEQDLTQRFEEKLQELESVSRDPSN
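
Several domains in the table are discontinuous: their boundary is a comma-separated list of residue ranges. The sketch below shows one way to parse such boundaries, recompute the domain lengths, and confirm that the five domains together cover all 925 residues. The helper names are hypothetical; only the boundary strings come from the table.

```python
def parse_boundary(boundary):
    """Turn '93-224,772-925' into [(93, 224), (772, 925)] (1-based, inclusive)."""
    segments = []
    for part in boundary.split(","):
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def domain_length(boundary):
    """Number of residues covered by all segments of a domain."""
    return sum(end - start + 1 for start, end in parse_boundary(boundary))


def extract_domain(sequence, boundary):
    """Concatenate the segments of `sequence` that make up a domain."""
    return "".join(sequence[start - 1:end] for start, end in parse_boundary(boundary))


# Boundaries copied from the domain table above.
domains = {
    "O95786-D1": "1-92",
    "O95786-D2": "93-224,772-925",
    "O95786-D3": "225-442",
    "O95786-D4": "443-465,607-771",
    "O95786-D5": "466-606",
}

lengths = {name: domain_length(b) for name, b in domains.items()}
for name, length in lengths.items():
    print(name, length)
print("total", sum(lengths.values()))  # total 925
```

The recomputed lengths (92, 286, 218, 188, 141) match the "Length of Domain" column.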

Top 1 final model from D-I-TASSER

Rank^a	Download	Estimated TM-score^b
1	model1.pdb.gz	0.72

(a)	D-I-TASSER simulations generate a large ensemble of structural conformations, called decoys. SPICKER clusters these decoys by pairwise structural similarity and reports up to five final models from the five largest clusters. Models are ranked in descending order of cluster size. If the simulations converge well, fewer than five models may be generated, which usually indicates good model quality.
(b)	Model confidence is quantified by the estimated TM-score (eTM-score), which is calculated from the significance of the threading template alignments, the contact map satisfaction rate, the mean absolute error between the model distances and the AttentionPotential distances, and the convergence of the D-I-TASSER simulations. The eTM-score typically lies in the range [0, 1]; a higher eTM-score signifies higher model confidence.
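
The ranking rule in note (a) can be sketched as follows. This only illustrates ordering clusters by size and keeping at most five; it does not reproduce the SPICKER clustering itself, and the cluster names and sizes are made-up placeholders.

```python
def rank_clusters(cluster_sizes, max_models=5):
    """Return cluster names ordered by descending size, keeping at most `max_models`."""
    ranked = sorted(cluster_sizes.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:max_models]]


# Hypothetical cluster sizes (number of decoys per cluster), for illustration only.
sizes = {"c1": 120, "c2": 870, "c3": 45, "c4": 300, "c5": 12, "c6": 7}

print(rank_clusters(sizes))  # ['c2', 'c4', 'c1', 'c3', 'c5']
```

A well-converged run would have few clusters, so the returned list can be shorter than five, as with the single model reported here.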

Proteins with similar structure

Top 10 structural analogs in PDB (as identified by TM-align)

Rank	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	Download Alignment
1	7jl1A	0.55	4.41	0.731	0.624	model1_7jl1A.pdb.gz
2	7jl0A	0.54	4.68	0.300	0.627	model1_7jl0A.pdb.gz
3	5jc3A	0.53	4.33	0.289	0.609	model1_5jc3A.pdb.gz
4	5jajA	0.53	4.54	0.262	0.612	model1_5jajA.pdb.gz
5	4f91B	0.49	6.32	0.100	0.623	model1_4f91B.pdb.gz
6	4bgdA	0.49	6.62	0.066	0.632	model1_4bgdA.pdb.gz
7	5m59A	0.48	6.76	0.074	0.625	model1_5m59A.pdb.gz
8	7askF	0.45	6.26	0.083	0.568	model1_7askF.pdb.gz
9	6sxaF	0.42	6.52	0.077	0.552	model1_6sxaF.pdb.gz
10	5aorA	0.42	6.76	0.099	0.551	model1_5aorA.pdb.gz

(a)	Query structure is shown in cartoon, while the structural analog is displayed using backbone trace.
(b)	Ranking of proteins is based on TM-score of the structural alignment between the query structure and known structures in the PDB library.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the sequence identity in the structurally aligned region, reported as a fraction between 0 and 1.
(e)	Cov. represents the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by length of the query protein.
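
The page does not spell out how these columns are computed. As a rough illustration, the sketch below uses the standard published TM-score definition, TM = (1/L) * sum of 1 / (1 + (d_i/d0)^2) over aligned pairs with d0 = 1.24 * (L - 15)^(1/3) - 1.8, together with the coverage definition from note (e). It assumes the per-residue distances after superposition are already available; it does not perform the structural alignment that TM-align carries out, and all function names are hypothetical.

```python
import math


def d0(length):
    """Length-dependent distance scale of the TM-score (valid for length > 15)."""
    return 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, query_length):
    """TM-score from distances (angstroms) of aligned residue pairs, normalized by query length."""
    scale = d0(query_length)
    return sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances) / query_length


def rmsd(distances):
    """Root-mean-square deviation over the aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def coverage(n_aligned, query_length):
    """Fraction of query residues that are structurally aligned."""
    return n_aligned / query_length


# A coverage of 0.624 for the 925-residue query corresponds to about 577 aligned residues.
print(round(coverage(577, 925), 3))  # 0.624
```

Because unaligned residues contribute nothing to the sum, a hit with coverage near 0.62 cannot exceed a TM-score of roughly 0.62 when normalized by the query length, which is consistent with the values in the table.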

Predicted Gene Ontology (GO) Terms

Molecular Function (MF)

GO term	Cscore^GO	Name
GO:0097159	0.85	organic cyclic compound binding
GO:1901363	0.84	heterocyclic compound binding
GO:0003676	0.81	nucleic acid binding
GO:0003723	0.78	RNA binding
GO:0003824	0.75	catalytic activity
GO:0016787	0.73	hydrolase activity

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range between 0 and 1; a higher value indicates greater confidence in the function predicted using the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Molecular Function. Confidently predicted terms are color coded by Cscore^GO in the following intervals: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0].
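
The color-coding intervals are half-open except for the last one, which includes 1.0. A minimal sketch of assigning a Cscore^GO value to its interval is shown below; `cscore_bin` is a hypothetical helper, and scores below 0.4 are treated as "not confidently predicted" on the assumption that they are simply not colored.

```python
# Interval edges taken from the legend; only the last interval is closed on the right.
BINS = [(0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9), (0.9, 1.0)]


def cscore_bin(score):
    """Return the legend interval containing `score`, or None if it is below 0.4 or above 1.0."""
    for index, (low, high) in enumerate(BINS):
        is_last = index == len(BINS) - 1
        if low <= score < high or (is_last and score == high):
            return f"[{low},{high}" + ("]" if is_last else ")")
    return None


for go_term, score in [("GO:0097159", 0.85), ("GO:0003723", 0.78), ("GO:0016787", 0.73)]:
    print(go_term, cscore_bin(score))
```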

Biological Process (BP)

GO term	Cscore^GO	Name
GO:0009987	0.94	cellular process
GO:0065007	0.73	biological regulation
GO:0071704	0.72	organic substance metabolic process
GO:0044238	0.72	primary metabolic process
GO:0050789	0.71	regulation of biological process
GO:0044237	0.71	cellular metabolic process
GO:0043170	0.71	macromolecule metabolic process
GO:0044260	0.70	cellular macromolecule metabolic process
GO:0034641	0.67	cellular nitrogen compound metabolic process
GO:0090304	0.66	nucleic acid metabolic process
GO:0050794	0.60	regulation of cellular process
GO:0016070	0.60	RNA metabolic process
GO:0050896	0.51	response to stimulus
GO:0006396	0.51	RNA processing

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range between 0 and 1; a higher value indicates greater confidence in the function predicted using the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Biological Process. Confidently predicted terms are color coded by Cscore^GO in the following intervals: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0].

Cellular Component (CC)

GO term	Cscore^GO	Name
GO:0044424	1.00	intracellular part
GO:0043229	0.82	intracellular organelle
GO:0043231	0.70	intracellular membrane-bounded organelle
GO:0005634	0.67	nucleus
GO:0005737	0.66	cytoplasm
GO:0032991	0.60	macromolecular complex
GO:0044444	0.55	cytoplasmic part
GO:0030529	0.53	intracellular ribonucleoprotein complex

(a)	Cscore^GO is the confidence score of a predicted GO term. Cscore^GO values range between 0 and 1; a higher value indicates greater confidence in the function predicted using the template.
(b)	The graph shows the predicted terms within the Gene Ontology hierarchy for Cellular Component. Confidently predicted terms are color coded by Cscore^GO in the following intervals: [0.4,0.5), [0.5,0.6), [0.6,0.7), [0.7,0.8), [0.8,0.9), [0.9,1.0].

Predicted Enzyme Commission (EC) Numbers

Top 5 enzyme homologs in PDB

Rank	Cscore^EC	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	EC Number	Predicted Active Site Residues
1	0.060	1dmrA	0.280	8.23	0.029	0.420	1.7.2.3	NA
2	0.060	3l9oA	0.378	6.13	0.101	0.480	3.6.4.13	264,266,270
3	0.060	1eulA	0.199	7.80	0.023	0.283	3.6.3.8	NA
4	0.060	3eiqD	0.309	4.87	0.126	0.368	3.6.4.13	NA
5	0.060	2w00B	0.364	4.92	0.068	0.433	3.1.21.3	726,732

(a)	Cscore^EC is the confidence score for the Enzyme Commission (EC) number prediction. Cscore^EC values range between 0 and 1; a higher score indicates a more reliable EC number prediction.
(b)	TM-score is a measure of global structural similarity between query and template protein.
(c)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(d)	IDEN^a is the sequence identity in the structurally aligned region, reported as a fraction between 0 and 1.
(e)	Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.

Predicted Ligand Binding Sites

Template proteins with similar binding site:

Rank	Cscore^LB	PDB Hit	TM-score	RMSD^a	IDEN^a	Cov.	BS-score	Lig. Name	Download Complex	Predicted binding site residues
1	0.54	3tmiA	0.537	4.81	0.702	0.630	1.18	UUU	complex1.pdb.gz	241,242,244,247,266,267,269,270,271,272,372,373
2	0.01	3jv2A	0.356	5.68	0.084	0.439	0.47	III	complex2.pdb.gz	247,267,268

(a)	Cscore^LB is the confidence score of the predicted binding site. Cscore^LB values range between 0 and 1; a higher score indicates a more reliable ligand-binding site prediction.
(b)	BS-score is a measure of local similarity (sequence and structure) between the template binding site and the predicted binding site in the query structure. Based on large-scale benchmarking, a BS-score >1 reflects a significant local match between the predicted and template binding sites.
(c)	TM-score is a measure of global structural similarity between query and template protein.
(d)	RMSD^a is the RMSD between residues that are structurally aligned by TM-align.
(e)	IDEN^a is the sequence identity in the structurally aligned region, reported as a fraction between 0 and 1.
(f)	Cov. represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein.
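
The BS-score criterion in note (b) can be applied to the two predictions above as follows. This is a minimal sketch using values copied from the table; the dictionary layout and the `significant_sites` helper are assumptions made for illustration, not an output format of the server.

```python
# Values copied from the ligand binding site table above.
predictions = [
    {
        "rank": 1, "cscore_lb": 0.54, "pdb_hit": "3tmiA", "bs_score": 1.18, "ligand": "UUU",
        "residues": [241, 242, 244, 247, 266, 267, 269, 270, 271, 272, 372, 373],
    },
    {
        "rank": 2, "cscore_lb": 0.01, "pdb_hit": "3jv2A", "bs_score": 0.47, "ligand": "III",
        "residues": [247, 267, 268],
    },
]


def significant_sites(preds, bs_cutoff=1.0):
    """Keep predictions whose BS-score exceeds the cutoff (BS-score > 1 per note (b))."""
    return [p for p in preds if p["bs_score"] > bs_cutoff]


kept = significant_sites(predictions)
print([p["pdb_hit"] for p in kept])  # ['3tmiA']

# Residues predicted by both templates.
shared = sorted(set(predictions[0]["residues"]) & set(predictions[1]["residues"]))
print(shared)  # [247, 267]
```

Only the rank 1 site passes the cutoff, in line with its much higher Cscore^LB.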

References:
1.	Wei Zheng, Chengxin Zhang, Yang Li, Robin Pearce, Eric W. Bell, Yang Zhang. Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. In preparation, 2020.