INSTALLATION AND IMPLEMENTATION OF GPU-I-TASSER SUITE
   (Copyright 2020 by Zhang Lab, University of Michigan, All rights reserved)
                    (Version 1.0, 2020/11/10)

1. What is GPU-I-TASSER Suite?
   
   GPU-I-TASSER Suite is a GPU-based composite package of programs for protein
   structure prediction and function annotation. The Suite
   includes the following programs:

   a) GPU-I-TASSER: A hierarchical GPU-based program for protein structure prediction
   b) COACH: A function annotation program based on COFACTOR, TM-SITE and S-SITE
   c) COFACTOR: A program for ligand-binding site, EC number & GO term prediction
   d) TM-SITE: A structure-based approach for ligand-binding site prediction
   e) S-SITE: A sequence-based approach for ligand-binding site prediction
   f) MUSTER: A threading program for protein template identification
   g) LOMETS: A meta-server approach consisting of multiple threading programs
   h) SPICKER: A clustering program for structure decoy selection
   i) HAAD: A program for quickly adding hydrogen atoms to protein heavy-atom structures
   j) EDTSurf: A program for constructing triangulated surfaces of protein molecules
   k) ModRefiner: A program for constructing and refining atomic models from C-alpha traces
   l) NWalign: A program for protein sequence alignment by the Needleman-Wunsch algorithm
   m) PSSpred: A program for Protein Secondary Structure PREDiction
   n) ResQ: An algorithm to estimate B-factor and residue-level error of models

2. How to install the GPU-I-TASSER Suite?

   a) Download the GPU-I-TASSER Suite 'GPU-ITASSER1.0.tar.bz2' from
      http://zhanglab.dcmb.med.umich.edu/I-TASSER/download
      and unpack it with
      > tar -xvf GPU-ITASSER1.0.tar.bz2
      The root path of the unpacked package is referred to as $pkgdir, e.g.
      /home/yourname/I-TASSER5.0. All the programs should be under this
      directory. You can install the package at any location on your computer.
   
   b) Download the GPU-I-TASSER and COACH library files from
      http://zhanglab.dcmb.med.umich.edu/library/
      http://zhanglab.dcmb.med.umich.edu/BioLiP/
      A script 'download_lib.pl' is provided in the package to download
      and update the libraries automatically.
      We recommend putting the library files under the path /home/yourname/ITLIB.

   c) Third-party software installation:

      While the majority of programs in the package 'I-TASSER5.0.tar.bz2' were
      developed in the Zhang Lab, which grants permission for their use,
      some programs and databases (including blast, nr and GOparser)
      were developed by third-party groups. Default versions of blast
      and nr are included in the package. It is the user's obligation to obtain
      license permission from the developers of all third-party software
      before using it. A detailed list of addresses and guidance for installing
      these programs is available at
      http://zhanglab.dcmb.med.umich.edu/I-TASSER/addition.
      In addition, your system needs to have Java installed.
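
   The unpacking step in a) can be wrapped in a small guard so that a missing
   download is reported clearly. This is only a sketch: the function name
   unpack_suite and the ARCHIVE variable are hypothetical helpers, not part
   of the package. The archive name and the tar command are the ones given
   in step a).

```shell
#!/bin/sh
# Archive name from step a); override with ARCHIVE=... if yours differs.
# (ARCHIVE is a hypothetical variable used only in this sketch.)
archive="${ARCHIVE:-GPU-ITASSER1.0.tar.bz2}"

# Unpack the archive given as $1, or fail with a message if it is absent.
unpack_suite() {
    if [ ! -f "$1" ]; then
        echo "archive not found: $1" >&2
        return 1
    fi
    tar -xvf "$1"
}

if unpack_suite "$archive"; then
    echo "Unpacked $archive"
else
    echo "Download $archive first, then rerun."
fi
```

   Run the sketch from the directory that holds the downloaded archive. The
   directory created by tar is the one to use as $pkgdir.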

3. Bug report:

   Please report bugs and post suggestions on the I-TASSER message board:
   http://zhanglab.dcmb.med.umich.edu/forum


   #######################################################
   #                                                     #
   # 4. Installation and implementation of GPU-I-TASSER  #
   #                                                     #
   #######################################################
   
4.1. Introduction to GPU-I-TASSER
   
   GPU-I-TASSER is an integrated package for protein structure and function 
   predictions. For a given sequence, GPU-I-TASSER first identifies template proteins 
   from the Protein Data Bank (PDB) by multiple threading techniques (LOMETS). 
   The continuous fragments excised from the template alignments are used to 
   assemble full-length models by iterative Monte Carlo simulations. The best 
   models are then selected from the Monte Carlo trajectories by decoy 
   clustering. The final atomic models are rebuilt from the structure clusters 
   by atomic-level structural refinements. To run GPU-I-TASSER, see Section 4.2.

   For function annotation, the I-TASSER structure model is matched against
   the function library (BioLiP) to identify functional templates. The biological
   insights (including ligand-binding, enzyme classification, and gene ontology)
   are inferred from the functional templates by COACH based on the consensus
   of predictions from COFACTOR, TM-SITE and S-SITE.

4.2. How to run GPU-I-TASSER?
   
   a) The main script for running I-TASSER is $pkgdir/I-TASSERmod/runI-TASSER.pl.
      Running it without arguments prints the help information.

   b) The following arguments are mandatory. One example is:

      "$pkgdir/I-TASSERmod/runI-TASSER.pl -libdir /home/yourname/ITLIB -seqname example -datadir /home/yourname/I-TASSER5.0/example -runstyle gnuparallel"

      -libdir    means the path of the template libraries
      -seqname   means the unique name of your query sequence
      -datadir   means the directory which contains your sequence 
      -runstyle  means the style in which to run jobs: "serial", "parallel" or "gnuparallel".
                 "serial" (the default) runs sequential I-TASSER simulations.
                 "parallel" runs parallel GPU simulation jobs on a cluster
                 using the PBS/torque job scheduling system.
                 "gnuparallel" runs parallel GPU simulation jobs on one
                 GPU-enabled computer with multiple cores using GNU parallel.
                 "gnuparallel" or "parallel" must be specified when running
                 GPU-I-TASSER, and GPU nodes must be available to support
                 the GPU runs.

   c) The other arguments are optional and have default values.
      Users can reset one or more of them. One example command line is:

      "$pkgdir/I-TASSERmod/runI-TASSER.pl -pkgdir /home/yourname/I-TASSER5.0 -libdir /home/yourname/ITLIB -seqname example -datadir /home/yourname/I-TASSER5.0/example -runstyle parallel -homoflag benchmark -idcut 0.3 -LBS true -EC true -GO true -java_home /usr"

      -pkgdir     means the path of the I-TASSER package. The default is to
                  guess it from the location of the runI-TASSER.pl script
      -java_home  means the path that contains the java executable "bin/java"
                  (your system needs to have Java installed)
      -homoflag   [real, benchmark], "real" will use all templates, "benchmark"
                  will exclude homologous templates
      -idcut      sequence identity cutoff for "benchmark" runs, default
                  value is 0.3, range is [0,1]
      -ntemp      number of top templates output for each threading program,
                  default is 20, range is [1,50]
      -nmodel     number of final models output by I-TASSER, default value
                  is 5, range is [1,10]
      -LBS        [false or true], whether to predict ligand-binding site, default is false
      -EC         [false or true], whether to predict EC number, default is false
      -GO         [false or true], whether to predict GO terms, default is false
      -restraint1 specify distance/contact restraints (read more at 
                  http://zhanglab.dcmb.med.umich.edu/I-TASSER/option1.html )
      -restraint2 specify template with alignment (read more at 
                  http://zhanglab.dcmb.med.umich.edu/I-TASSER/option4.html )
      -restraint3 specify template name without alignment (read more at 
                  http://zhanglab.dcmb.med.umich.edu/I-TASSER/option2.html )
      -restraint4 specify template file without alignment (read more at 
                  http://zhanglab.dcmb.med.umich.edu/I-TASSER/option3.html )
      -temp_excl  exclude specific templates from template library (read more 
                  at http://zhanglab.dcmb.med.umich.edu/I-TASSER/option6.html )
      -traj       deposit the trajectory files
      -light      run I-TASSER in fast mode (each simulation runs
                  for at most 5 hours by default)
      -hours      maximum hours of simulations (default=5 when -light=true)
      -outdir     where the final results should be saved (default is the value of -datadir)
   
   d) To make an HTML webpage for the GPU-I-TASSER Suite output, follow the documentation at
      $pkgdir/file2html/readme
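
   The command lines in b) and c) can be assembled by a small wrapper that
   checks -runstyle before anything is launched. This is a sketch under
   stated assumptions: build_cmd is a hypothetical helper, and the default
   paths are the placeholder paths used in the examples above. The flag
   names and the three run styles are taken from this document. The wrapper
   only prints the command; it does not execute it.

```shell
#!/bin/sh
# Placeholder paths from the examples above; replace with your own.
pkgdir="${pkgdir:-/home/yourname/I-TASSER5.0}"
libdir="${libdir:-/home/yourname/ITLIB}"

# Print a runI-TASSER.pl command line.
# $1 = sequence name, $2 = data directory, $3 = run style,
# any further arguments are appended unchanged as optional flags.
build_cmd() {
    if [ $# -lt 3 ]; then
        echo "usage: build_cmd seqname datadir runstyle [optional flags]" >&2
        return 1
    fi
    name=$1; datadir=$2; runstyle=$3
    shift 3
    # Accept only the run styles listed in this document.
    case "$runstyle" in
        serial|parallel|gnuparallel) ;;
        *) echo "unknown runstyle: $runstyle" >&2; return 1 ;;
    esac
    cmd="$pkgdir/I-TASSERmod/runI-TASSER.pl -libdir $libdir -seqname $name -datadir $datadir -runstyle $runstyle"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    echo "$cmd"
}

# Example: mandatory arguments plus the three function-annotation flags.
build_cmd example "$pkgdir/example" gnuparallel -LBS true -EC true -GO true
```

   Printing the command first makes it easy to inspect the arguments before
   a long simulation is started.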

   NOTE:
   a) Outline of steps for running GPU-I-TASSER by 'runI-TASSER.pl':
      a1) standardize 'seq.fasta' to 'seq.txt' and get the sequence length
      a2) run 'psiblast' to generate 'chk', 'out', 'pssm', 'mtx' files
          run 'PSSpred' to get 'seq.dat', 'seq.dat.ss'
          run 'solve' to get 'exp.dat'
          run 'pairmod' to get 'pair1.dat' and 'pair3.dat'
      a3) run threading programs sequentially
          run 'mkinit.pl' to generate restraints
      a4) run I-TASSER simulation
      a5) run SPICKER clustering program
          run 'get_cscore.pl' to get confidence score
          run 'EMrefinement.pl' to get full-atomic models
          run 'get_rsq_bfp.pl' to get local accuracy and B-factor estimations
      a6) run 'runCOACH.pl' to generate ligand-binding sites, EC number and 
          GO terms predictions.
   b) 'seq.fasta' is the query sequence file in FASTA format, and is the
      only input file needed for running I-TASSER. This file should be
      put in $datadir before running the job.
   c) I-TASSER structure assembly simulations contain 14 independent
      runs by default. This number can be modified if the user wants to run
      more simulations, especially for large proteins without good templates.
   d) If working on a cluster with multiple nodes, it is recommended to set
      $runstyle="parallel". You need to have a PBS server installed on your system.
      Parallel jobs will run faster since jobs are distributed among different
      nodes. The default setting $runstyle="serial" will run all the jobs on a
      single computer.
   e) If the job has been partially executed and encountered an error, you can
      rerun the main script without modification. It will check the existing
      files and resume from the correct position.
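
   NOTE b) requires a 'seq.fasta' file in $datadir, and step a1) reads the
   sequence length from it. The preparation can be sketched as follows.
   The helpers stage_query and seq_length are hypothetical and are not the
   scripts shipped in the package; the sequence is a made-up placeholder,
   and a temporary directory stands in for $datadir.

```shell
#!/bin/sh
# Create a data directory that contains seq.fasta for one query.
# $1 = data directory, $2 = sequence name, $3 = amino-acid sequence
stage_query() {
    mkdir -p "$1" || return 1
    printf '>%s\n%s\n' "$2" "$3" > "$1/seq.fasta"
}

# Count the residues in a FASTA file, ignoring header lines and whitespace.
# $1 = path to the FASTA file
seq_length() {
    grep -v '^>' "$1" | tr -d ' \t\r\n' | wc -c | tr -d ' '
}

datadir=$(mktemp -d)                          # stand-in for $datadir
stage_query "$datadir" example "MKTAYIAKQR"   # placeholder sequence, not a real protein
echo "query staged in $datadir, length $(seq_length "$datadir/seq.fasta")"
```

   Checking the length before launching is a cheap way to catch an empty or
   truncated input file.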

4.3. System requirements:

   a) x86_64 machine, Linux kernel OS, and more than 60G of free disk space.
   b) GPU nodes for running GPU-I-TASSER.
   c) Perl and Java interpreters should be installed. GO:Parser should be installed
      if you want to predict GO terms.
   d) Basic compression and decompression tools should be installed,
      including tar and bunzip2.
   e) If you are using a computer cluster, the PBS server job management software
      should support 'qsub' and 'qstat'. If you use other job management software,
      such as SGE or Slurm, some changes should be made following the instructions at:
      http://zhanglab.dcmb.med.umich.edu/bbs/?q=node/3561
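
   The command-line tools named in these requirements can be checked before
   installation. This is a minimal sketch: missing_tools is a hypothetical
   helper, and the tool lists contain only the names mentioned above. It
   does not check the GPU, the disk space, or GO:Parser.

```shell
#!/bin/sh
# Print the name of every tool in "$@" that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools named in the requirements; qsub and qstat matter only on PBS clusters.
required="tar bunzip2 perl java"
cluster="qsub qstat"

missing=$(missing_tools $required)
if [ -n "$missing" ]; then
    echo "Missing required tools:" $missing
else
    echo "All required tools found."
fi

missing_cluster=$(missing_tools $cluster)
if [ -n "$missing_cluster" ]; then
    echo "PBS commands not found (needed for -runstyle parallel):" $missing_cluster
fi
```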

4.4. How to cite GPU-I-TASSER?

   If you are using the GPU-I-TASSER package, please cite:

   E MacCarthy, C Zhang, Y Zhang, D KC. GPU-I-TASSER: a GPU accelerated 
   I-TASSER protein structure prediction tool. Bioinformatics 2021.