---------------------
README.piana_examples
---------------------

This file describes a few examples of how to run piana for
different purposes.

Each of these examples has an associated piana configuration file
(found under piana/code/execs/conf_files/). To create your
own configuration file (which should contain the parameters and commands
you need), follow the instructions in
piana/code/execs/conf_files/general_template.piana_conf

In order to try these examples on your machine you must make sure that
all installation instructions have been followed:
piana/README.piana_installation

---------------------------------------------------------------------
A generalized example of using piana would be:

1. follow instructions on
   piana/code/execs/conf_files/general_template.piana_conf to define
   your parameters and execution commands

2. execute piana.py on the command line, giving as argument the
   configuration file you wrote

   $> python2.3 piana.py --configuration-file=your_configuration_file

   [alternatively, you can write a configuration file that leaves
    some parameters set to 'blank', and set the values of those
    parameters through the command line:
   
      $> python2.3 piana.py --configuration-file=your_configuration_file --piana-dbname=pianaDB_limited --piana-dbhost=piana_server
    
   ]
   

3. analyze results from output files
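
The command line of step 2 can also be assembled by a small helper
script. The following is only a sketch in modern Python, not part of
PIANA: the function name build_piana_command and its arguments are
hypothetical, and only the flags already shown above are used.

```python
def build_piana_command(config_file, overrides=None,
                        python="python", script="piana.py"):
    """Return the argument list for one piana.py run.

    `overrides` maps parameter names (without the leading '--') to the
    values that should be passed on the command line.
    """
    command = [python, script, "--configuration-file=%s" % config_file]
    # Sort the overrides so the generated command is reproducible.
    for name, value in sorted((overrides or {}).items()):
        command.append("--%s=%s" % (name, value))
    return command


command = build_piana_command(
    "your_configuration_file",
    {"piana-dbname": "pianaDB_limited", "piana-dbhost": "piana_server"},
)
print(" ".join(command))
```

The list form can be passed to any process launcher. It is joined with
spaces here only for display.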


----------------------------------------------------------------------
EXAMPLES OF PIANA EXECUTION
----------------------------------------------------------------------

--> all examples are shown for a mysql database pianaDB_limited
    located on the same machine as the code (i.e. localhost) with a
    mysql server that does not require a password
      --> if your mysql server is on a machine different from the one
          where your code is, write the name of that machine instead of
          localhost
      --> if your mysql server requires a password, you need to add
          the following arguments to all command lines:
          --piana-dbuser=username and --piana-dbpass=password

DON'T FORGET TO set the correct piana-dbname and piana-dbhost in the
commands described below!!!
  (unless your database name is pianaDB_limited and your mysql server
   is localhost)

Attention! The protein lists provided as examples DO NOT correspond to
real cases (except for example 8): therefore, biological
interpretation of the examples described below is not advised. They are
presented here just to give you a clearer idea of how to use PIANA.

Attention! The results described in this file or in directory
piana/code/execs/dummy_files/output are not always updated to reflect
the latest developments of PIANA. Therefore, some details such as
format or information might differ from the results you will get
when following the instructions of the examples. For an up-to-date
description of PIANA formats and information, see file
piana/code/execs/conf_files/general_template.piana_conf


********************************************************************
EXAMPLE 1 ==> Your first PIANA example
********************************************************************

Situation: we want to create the interaction network for a single
           protein and view the information associated with the
           proteins in that network

1.1. go to the piana execution directory
     $> cd piana/code/execs

1.2. create a file with one protein per line, using your preferred type
     of protein identifier 
            
     --> we have placed one example protein in file
         piana/code/execs/dummy_files/input/example_protein.txt

1.3. write a configuration file to obtain results for this protein:
            
     --> we have written for you the configuration file:
         piana/code/execs/conf_files/first_example.piana_conf

         - look at this configuration file to better understand how
            this file has been written
         - descriptions of all PIANA parameters and commands are 
           provided in the template for configuration files:
           piana/code/execs/conf_files/general_template.piana_conf   

1.4. execute piana with this configuration file to get the interaction
     network and table
           
   $> python piana.py --configuration-file=conf_files/first_example.piana_conf


        --> Attention! If you look into first_example.piana_conf
            you'll see that we have asked PIANA to print results in
            two format modes: html and txt. This means that if you
            have executed the command as described above, you'll have
            results files '.txt' and '.html'. Both format modes show
            more or less the same information (txt is always more
            complete) but have to be viewed differently: txt files
            are easy to parse (or to view with any text editor)
            and html files have to be viewed with a web browser.

            - 'txt' mode is mainly intended for quick
               inspection or for being parsed afterwards by the
               user.
            - 'html' mode is intended for visually
               interpreting the results

            In most of the following examples, to speed things up
            when looking at results, we are going to use 'txt' format
            mode. If you are interested in viewing your
            results in a web browser, just change the format-mode
            argument of the piana command in your configuration file
            from txt to html

        --> Attention! If you look into first_example.piana_conf 
            you'll see that we have asked PIANA to print the network
            in DOT format (format-mode=dot). This can be changed to
            other formats (eg. SIF format, if you want to visualize
            the network using cytoscape).


     the previous command created the following files:

      - example_results.all.print-network.dot              
            -> DOT file of the network 
              (to be converted into an image as explained below in 1.5)

      - example_results.all.print-table.html               
            -> HTML file with the table of protein interactions

      - example_results.all.print-table.txt                
            -> Text file with the table of protein interactions
         
      - example_results.compact.print-all-prots-info.html  
            -> HTML file with information about the proteins in the
               network

      - example_results.compact.print-all-prots-info.txt   
            -> Text file with information about the proteins in the
               network

1.5. to visualize the network you can use any software that reads .dot
     format (eg. neato from Graphviz, see README.piana_requirements):

     $> neato -Tgif -o example_results.gif example_results.all.print-network.dot
     $> xview example_results.gif

       -> the image you just produced 'example_results.gif' should be
          identical to
          piana/code/execs/dummy_files/output/example1/example_results.gif
          (unless your database contains interactions different from
           those in the pianaDB_limited version provided on our website)

       -> more instructions on network visualization can be read at
          piana/code/execs/README.visualize_piana_network

       -> color codes used in the network are explained in command
          print-network of file
          piana/code/execs/conf_files/general_template.piana_conf
           ---> color meanings of the network are also shown in this
                image: piana/docs/documentation/network_colors.gif

1.6. look at the interaction table in html format in your web browser
     with option 'open file' and then searching in the directory for
     'example_results.all.print-table.html'

1.7. look at the information for the network proteins in html format
     in your browser with option 'open file' and then searching in the
     directory for 'example_results.compact.print-all-prots-info.html'
           

Attention! Most PIANA commands and parameters can also be given to
           piana interactively. Just do...

$piana/code/execs> python piana.py

... and PIANA will show you the possibilities of the interactive
mode. The interactive mode also accepts a configuration file where you
can set the execution parameters: just set the exec-mode parameter of
the configuration file to interactive. The following examples are all
shown for batch mode but they could have also been achieved using the
interactive mode.

   --> in interactive mode, the commands section of the configuration
       file is ignored
   --> some parameters and commands are not available in interactive mode


***********************************************************************
EXAMPLE 2 ==> Getting standard results (interactions, network, proteins 
              that connect root nodes, etc) for a list of proteins 
***********************************************************************

Situation: we want to get all standard results for the list of
           proteins in
           piana/code/execs/dummy_files/input/liver_cancer_proteins.txt

    --> this file contains one protein of interest per line (we are
        supposedly studying proteins related to liver cancer)

        -> in PIANA, the proteins that are used to build a network (in
           this case, proteins from file liver_cancer_proteins.txt)
           are called root proteins

        -> these proteins come from any type of wet lab experiment
           where they have been found to be somehow related to liver
           cancer

    --> we want to analyze the protein interaction network formed by
        these proteins and their interaction partners

          -> by visualizing the network

          -> by listing the information associated to the proteins in
             the network

          -> by identifying other relevant proteins with a high
             probability of being related to liver cancer
               --> for example, linker proteins (proteins that connect
                   more than one root protein) are of special interest
                   because they connect two proteins that we know are
                   related to liver cancer.

2.0. the input file liver_cancer_proteins.txt contains one protein per
     line, where the code used for proteins is uniprot accession number
   
    --> the first important thing to do is to find out which is the
        'piana identifier name' for uniprot accession numbers

        -> you can get a list of all 'piana names for identifier types' by
           doing:  

          $piana/code/execs> python piana.py --print-reference-card --piana-dbname=pianaDB_limited --piana-dbhost=localhost

        -> in the case of uniprot accession numbers, the 'piana name'
           is 'uniacc'
             -> other piana names are 'unientry' (for uniprot
                entries), 'gi' (for ncbi GenBank gi), 'geneName' (for
                gene names), 'geneID' (for ncbi Gene ID), ...

2.1. In order to get the results for these proteins, we need to create
     a configuration file that sets the parameters and executes the
     commands needed for getting standard results.
   
   To create this configuration file, we use
   general_template.piana_conf as a guide to creating our own configuration
   file

     -> the result of modifying general_template.piana_conf can be
        seen in piana/code/execs/conf_files/get_example_results.piana_conf

     -> instead of setting all the parameters in the configuration
        file itself, we leave some of them set to 'blank'. This means
        that parameters set to 'blank' have to be set on the
        command line when calling PIANA.

         --> This is done so that we can use one configuration
             file for many experiments. For example, in this case we
             might want to use get_example_results.piana_conf for
             other proteins, and therefore parameters 'input-file',
             'input-id-type', 'output-id-type' and
             'results-prefix' of the configuration file have been left
             as 'blank'.

                As you saw in example 1, if you often repeat
                the same PIANA run, you can fix all these parameters
                in the configuration file and call PIANA with just the
                argument
                --configuration-file=your_configuration_file.piana_conf

                In get_example_results.piana_conf, we have also
                left 'piana-dbname' and 'piana-dbhost' as blank,
                because we want to run PIANA using different piana
                databases. However, if you only have one piana
                database, you can fix the parameters 'piana-dbname'
                and 'piana-dbhost' in all your configuration files, so
                you don't have to write them each time on the command
                line.

                 Note: parameters set through the command line
                       override parameters set in the configuration
                       file. Therefore, you can have a configuration
                       file with the database name you normally use
                       and, for those cases in which you want to use
                       another database, set it through the command
                       line.
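
The precedence rule in the note above can be pictured with a few lines
of Python. This is a minimal sketch of the idea, not PIANA's actual
parsing code: the function resolve_parameters is hypothetical, the
database name otherDB is made up for the illustration, and whether a
parameter that remains 'blank' is a problem depends on the parameter.

```python
BLANK = "blank"


def resolve_parameters(config_values, command_line_values):
    """Merge two parameter dictionaries; command-line values win.

    Returns the merged dictionary and the sorted names of the
    parameters that are still set to 'blank' after merging.
    """
    merged = dict(config_values)
    merged.update(command_line_values)  # command line overrides the file
    still_blank = sorted(name for name, value in merged.items()
                         if value == BLANK)
    return merged, still_blank


# Values as they might appear in a configuration file (illustrative).
config = {
    "piana-dbname": "pianaDB_limited",
    "piana-dbhost": BLANK,
    "input-file": BLANK,
}
# Values given on the command line; otherDB is a made-up database name.
cli = {"piana-dbname": "otherDB", "piana-dbhost": "localhost"}

merged, still_blank = resolve_parameters(config, cli)
print(merged["piana-dbname"])  # the command-line value
print(still_blank)             # parameters nobody has set yet
```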

2.2. Execute PIANA, giving the configuration file as argument, as well
     as the parameters that we left to 'blank' in the configuration
     file

   # go to the piana execution directory
     $> cd piana/code/execs
   
   # execute piana.py
     $> python piana.py --configuration-file=conf_files/get_example_results.piana_conf --piana-dbname=pianaDB_limited --piana-dbhost=localhost  --input-file=dummy_files/input/liver_cancer_proteins.txt --input-id-type=uniacc --output-id-type=uniacc --results-prefix=liver_cancer_results


2.3. In the directory where you executed piana (in the configuration
     file you can change the directory where results are printed),
     you'll find the results files:

   liver_cancer_results.all.print-network.dot                 
            --> the network in .dot format that you can convert into a
                network image using neato

   liver_cancer_results.all.print-table.txt                   
            --> the text table with all the interactions

   liver_cancer_results.compact.print-all-prots-info.txt      
            --> text information for all proteins in your network

   liver_cancer_results.compact.print-connect-prots-info.txt  
            --> text information for proteins that connect your root
                proteins (linker proteins)

   liver_cancer_results.all.print-table.html                   
            --> the html table with all the interactions (use a
                browser to visualize)
   liver_cancer_results.compact.print-all-prots-info.html      
            --> html table with information for all proteins in your
                network (use a browser to visualize)
   
   liver_cancer_results.compact.print-connect-prots-info.html  
            --> html table with information for linker proteins (use a
                browser to visualize)

	
   ( we have placed the files we have obtained under
     piana/code/execs/dummy_files/output/liver_cancer_results/liver_cancer_results.*
     Of course, if you are using a database different from the one we
     provide in our website, the content of files will be different.  )
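
When running many experiments it can help to check that all the files
listed above were produced. The sketch below assumes only the file
names shown in this step. The helper names expected_result_files and
missing_result_files are hypothetical and are not part of PIANA.

```python
import os

# Suffixes of the results files listed in step 2.3.
SUFFIXES = [
    "all.print-network.dot",
    "all.print-table.txt",
    "all.print-table.html",
    "compact.print-all-prots-info.txt",
    "compact.print-all-prots-info.html",
    "compact.print-connect-prots-info.txt",
    "compact.print-connect-prots-info.html",
]


def expected_result_files(prefix, directory="."):
    """Return the paths PIANA is expected to write for `prefix`."""
    return [os.path.join(directory, "%s.%s" % (prefix, suffix))
            for suffix in SUFFIXES]


def missing_result_files(prefix, directory="."):
    """Return the expected paths that do not exist on disk."""
    return [path for path in expected_result_files(prefix, directory)
            if not os.path.isfile(path)]


for path in missing_result_files("liver_cancer_results"):
    print("missing: " + path)
```

If the configuration file requests fewer commands or format modes, the
SUFFIXES list has to be shortened accordingly.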


2.4. Analyze the results you obtained

    2.4.1 -> to visualize the network you can use any software that
             reads .dot format (eg. neato from Graphviz, see
             README.piana_requirements):
       
         $> neato -Tgif -o liver_cancer_results.gif liver_cancer_results.all.print-network.dot
         $> xview liver_cancer_results.gif


     -> you can see that there is only one interaction coming from
        DIP (in red) and the rest are predictions by structural
        similarity (in green)

     -> the proteins that were used to build the network are in
        yellow (root proteins). Their interaction partners are in blue.

     -> as you see, no interactions were found for root protein P53985
        
     -> the image we have obtained can be found in
        piana/code/execs/dummy_files/output/liver_cancer_results.gif
     
     -> more instructions on network visualization can be read in
        piana/code/execs/README.visualize_piana_network
       -> color codes used in the network are explained in command
          print-network of file general_template.piana_conf
           ---> color meanings of the network are also shown in this
                image: piana/docs/documentation/network_colors.gif


    2.4.2 -> You can use results file
             liver_cancer_results.compact.print-all-prots-info.txt to
             search for specific protein information you are
             interested in (it may be easier to read in its html
             version)

             --> However, for parsing, txt mode is more convenient:
                 the format of the output txt files is explained in
                 file
                 piana/code/execs/conf_files/general_template.piana_conf

             --> Other searches can be manual: for example, if we were
                 interested in chaperones, we could do a manual search
                 on this file to see if there are any chaperones in
                 the network

                (If you are using the pianaDB_limited version provided
                on our website, you won't observe this result, as the
                descriptions have been deleted from the database due to
                copyright and database size)

             $> grep "chaperone" liver_cancer_results.compact.print-all-prots-info.txt
        ----------------------------------------------------------------------------------------------------------
        Q9NU22  ['MDN1, midasin homolog (Yeast).', 'Midasin (MIDAS-containing protein).', 'midasin', 'MDN1, midasin homolog (yeast)']   
        ['May function as a nuclear chaperone and be involved in the assembly/disassembly of macromolecular complexes in the nucleus.']
        root=0  expression=None      fitness=no      emblpid:CAI13203        emblpid:CA ............................... 
        ...............................
        ----------------------------------------------------------------------------------------------------------
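
The same search can be done from a script when the result has to be
processed further. The following Python sketch is an illustration only:
the function find_lines is hypothetical, and the sample lines are
shortened or invented stand-ins for the real file content. Unlike the
grep command above, it ignores case.

```python
def find_lines(lines, keyword):
    """Return the lines that contain `keyword`, ignoring case."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


# Shortened or invented sample lines. With a real results file, pass
# the open file object instead, e.g. find_lines(open(path), "chaperone").
sample = [
    "Q9NU22  ['May function as a nuclear chaperone and be involved in ...']\n",
    "P00000  ['invented entry that does not mention the keyword']\n",
]

matches = find_lines(sample, "chaperone")
for line in matches:
    print(line)
```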

    2.4.3 -> As you saw in the image liver_cancer_results.gif, there
             is one linker protein: P56715, connecting root proteins
             P43304 and P04792

       We have found by working with experimentalist collaborators
       that 'linker proteins' are usually of great interest to them.
       Why? Very simple: if we have produced a network from a list of
       'interesting proteins' and in the network there are proteins
       connecting these 'interesting proteins', it is very likely that
       the proteins that act as connectors are also 'interesting'.

       Therefore, P56715 is probably also involved in liver cancer,
       and it should be subject of further studies. (remember this
       is a dummy example... don't try to submit an article to Nature
       saying that you have found a new protein involved in liver cancer...)

       In this case it was easy to visually identify the linker
       protein. However, in more complex networks it isn't that easy:
       but don't worry! PIANA does it for you.
       
       You can see the list of linker proteins (with additional info
       about them) of the network in the text results file
       'liver_cancer_results.compact.print-connect-prots-info.txt' or
       as an html table in
       'liver_cancer_results.compact.print-connect-prots-info.html'

       In addition to this information about linker proteins, if you
       have created a local GO database (see README.populate_piana_db
       to learn how to create a local GO database) you can also
       produce an html table describing the linkers with their GO
       terms (optionally fixing the level of the GO hierarchy from
       which the GO terms are taken)

       To do this, we use a separate parser located in
       piana/code/evaluation/tests
            --> the parser takes as input files
                *.print-connect-prots-info.txt
                -> therefore, if you want to get GO information, you
                   should get the linker proteins in 'txt'
                -> as you saw before, you can also get the linker
                   proteins in html, but this file is not parseable
                   by parse_linkers.py
       
       $> cd piana/code/evaluation/tests	
       $> python parse_linkers.py --input-file=../../execs/liver_cancer_results.compact.print-connect-prots-info.txt --input-id-type=uniacc --piana-dbname=pianaDB_limited --piana-dbhost=localhost --results-prefix=liver_cancer_linkers.go --output-format=html --print-go-info --go-dbname=goDB --go-dbhost=localhost --go-level=-1 --label-size=all
 
       (do '$> python parse_linkers.py --help' for more parsing
        options, such as changing the GO level or highlighting keywords)

      This command will print results to files :
 
        liver_cancer_linkers.go.linkers_table.html    
             --> HTML with a table where each linker is described
        
        liver_cancer_linkers.go.dot                   
             --> DOT file for the linkers and roots (to be visualized
                 using neato) with their GO terms
	
      to create the image of the network using GO terms:
         $> neato -Tgif -o liver_cancer_linkers.go.gif liver_cancer_linkers.go.dot

              ( you can see these files under
                dummy_files/output/liver_cancer_linkers.*, and decide
                whether it is worth creating a local GO
                database or not.)


2.5. Now, imagine that you have a set of keywords that you want to
     use to check if the proteins in your network are involved in
     certain processes.
     For example, for liver cancer we could check if keywords cancer,
     stress, carcinoma, tumor, apoptosis or death appear in the
     network.

   Note: this example does not work with the current version of the
   PIANA database on the web, as the description and function
   information has been removed. If you are interested in using this
   information, you must create the database from scratch.

   PIANA does this automatically for you by coloring the nodes in red
   and adding labels to the tables whenever the keywords appear in the 
   protein function, name or description.
   All you have to do is a small change to the configuration file
   get_example_results.piana_conf (or create a new configuration file):

      where it says     list-keywords=blank 
      you should write  list-keywords=cancer:stress:carcinoma:tumor:apoptosis:death

   -> Then, repeat step 2.2 setting the command line argument
      '--results-prefix' to 'liver_cancer_results.keywords' (this ensures
      that results are written to different files)

   -> Then, repeat step 2.4.1. to visualize the network, using neato
      to convert liver_cancer_results.keywords.all.print-network.dot 
      into liver_cancer_results.keywords.gif

   If you do '$> xview liver_cancer_results.keywords.gif' you'll see
   that protein P04792 is now orange, which means that it is a root
   protein that contains a keyword
   
       -> you have the new image we have obtained in
          dummy_files/output/liver_cancer_results.keywords.gif

       -> more instructions on network visualization can be read in
          piana/code/execs/README.visualize_piana_network

       -> color codes used in the network are explained in command
          print-network of file general_template.piana_conf

           ---> color meanings of the network are also shown in this
                image: piana/docs/documentation/network_colors.gif

   If you look at the other results files, you'll see that now in the
   html tables, the proteins that contained a keyword are highlighted
   in red and underlined. In the text results files, the keywords
   found in each protein are written as tokens
   'user_keyword=word'

       -> you can see the new files with keywords highlighted in
          dummy_files/output/liver_cancer_results.keywords.*

       -> for example, if you open with your browser
          liver_cancer_results.keywords.all.print-table.html you can
          see that 'P04792' is highlighted in the three interactions
          where it appears.
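
The keyword handling of step 2.5 can be scripted on both ends: building
the colon-separated list-keywords value and collecting the
'user_keyword=word' tokens from a text results file. The sketch below
is an assumption-laden illustration: the helper names are hypothetical,
the tokens are assumed to be separated by whitespace, and the sample
line is invented, not real PIANA output.

```python
import re

# One token per keyword found, assumed to be whitespace-delimited.
TOKEN = re.compile(r"user_keyword=(\S+)")


def keywords_argument(keywords):
    """Build the list-keywords line for a configuration file."""
    return "list-keywords=" + ":".join(keywords)


def keywords_in_line(line):
    """Return the keywords reported in one line of a txt results file."""
    return TOKEN.findall(line)


print(keywords_argument(
    ["cancer", "stress", "carcinoma", "tumor", "apoptosis", "death"]))

# Invented line, only to show the token extraction.
invented_line = "P04792  root=1  user_keyword=stress  user_keyword=death"
print(keywords_in_line(invented_line))
```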

**********************************************************************
EXAMPLE 3 ==> Getting protein code equivalences between uniprot accession
              numbers and uniprot entry identifiers
**********************************************************************

Situation: we need the uniprot accession equivalents of proteins that 
           we have in a different type of identifiers (eg. uniprot entry names)

3.0. our input file
     piana/code/execs/dummy_files/input/proteosome.uniprot_entries contains
     one uniprot entry per line

3.1. configuration file to be used is
     piana/code/execs/conf_files/protein_code_2_protein_code.piana_conf

   -> we have left input-id-type and output-id-type to
      blank so this configuration file can be used to translate
      between any type of protein identifiers

   -> in this file, we use just one piana command:
      protein-code-2-protein-code

      (as always, look at this file to better understand how PIANA
       works. The command is written twice: once with
       format-mode txt and once with format-mode html)

3.2. execute PIANA to get the equivalences

   # go to the piana execution directory
   $> cd piana/code/execs

   ( before executing, find out which are the piana type names for
     'uniprot accession numbers' and 'uniprot entry names'

      -> you can get a list of all 'piana names for identifier types' by doing 
          $piana/code/execs> python piana.py --print-reference-card --piana-dbname=pianaDB_limited --piana-dbhost=localhost

      -> in the case of uniprot accession numbers, the 'piana name' is 'uniacc' 
      -> for uniprot entry identifiers, the 'piana name' is 'unientry' 
    )

   $> python piana.py --piana-dbname=pianaDB_limited --piana-dbhost=localhost --configuration-file=conf_files/protein_code_2_protein_code.piana_conf --input-file=dummy_files/input/proteosome.uniprot_entries --input-id-type=unientry --output-id-type=uniacc --results-prefix=proteosome_translation

   
   -> this creates a file
      "proteosome_translation.protein-code-2-protein-code.unientry2uniacc.txt"
      looking like this:

      -------------------------------
      PSB2_YEAST      P22141
      PSA6_YEAST      P21243  P15708
      PSA1_YEAST      P40302
      PSB5_YEAST      P30656
      PSA2_YEAST      P23639
      PSB4_YEAST      P30657
      ..............................
      ..............................
      ------------------------------
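
A file in this layout is easy to load into a dictionary for later use.
The sketch below assumes that each line holds the input identifier
followed by one or more equivalent identifiers, separated by
whitespace, as in the listing above. The function parse_translation is
hypothetical and not part of PIANA.

```python
def parse_translation(lines):
    """Map each input identifier to the list of its equivalents."""
    table = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        table[fields[0]] = fields[1:]
    return table


# Lines copied from the listing above. With a real file, pass the open
# file object instead of this list.
sample = [
    "PSB2_YEAST      P22141",
    "PSA6_YEAST      P21243  P15708",
    "PSA1_YEAST      P40302",
]

table = parse_translation(sample)
print(table["PSA6_YEAST"])
```

Keeping a list per identifier matters because one uniprot entry can map
to several accession numbers, as PSA6_YEAST does here.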


   If this operation (i.e. translating from uniprot entry to accession)
   becomes routine in your work, you can create a new configuration
   file uniprot_entry2uniacc.piana_conf that sets most of the parameters
   in the configuration file itself.

   For example, if you are always going to translate from uniprot entries
   to uniprot accession numbers, in your configuration file you would change:

     input-id-type=blank     to   input-id-type=unientry
     output-id-type=blank    to   output-id-type=uniacc

   Then, if you always use the same piana database, in your
   configuration file you would change:

     piana-dbname=blank   to  piana-dbname=pianaDB_limited
     piana-dbhost=blank   to  piana-dbhost=localhost

   If these parameters are set in the configuration file, the command
   line only needs the configuration file, the input file and the results
   prefix.
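
These substitutions can also be applied with a short script when
several configuration files have to be derived from one template. The
sketch assumes that parameters appear as one 'name=value' pair per
line, as in the lines quoted above. The function set_parameters is
hypothetical, and the template text is a reduced stand-in for a real
configuration file.

```python
def set_parameters(config_text, new_values):
    """Return `config_text` with the given parameters replaced.

    Lines whose name (the part before '=') is not in `new_values`
    are kept unchanged.
    """
    result = []
    for line in config_text.splitlines():
        name, separator, _old_value = line.partition("=")
        name = name.strip()
        if separator and name in new_values:
            line = "%s=%s" % (name, new_values[name])
        result.append(line)
    return "\n".join(result)


# Reduced stand-in for the relevant lines of a configuration file.
template = "\n".join([
    "input-id-type=blank",
    "output-id-type=blank",
    "piana-dbname=blank",
    "piana-dbhost=blank",
    "input-file=blank",
])

print(set_parameters(template, {
    "input-id-type": "unientry",
    "output-id-type": "uniacc",
    "piana-dbname": "pianaDB_limited",
    "piana-dbhost": "localhost",
}))
```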

   Remember that if you are using gene names it is advised to set in the
   configuration file input-proteins-species and output-proteins-species 
   to prevent using gene names that are of a species different from the 
   one being analyzed


********************************************************************
EXAMPLE 4 ==> Doing interaction predictions for proteins in an input
              file
********************************************************************

Situation: we want to get interaction predictions (eg expansion by
           COG) for proteins in
           piana/code/execs/dummy_files/input/liver_cancer_proteins.txt

   --------------------------------------------------------------------
   "expansion by COG" is a prediction based on interologs:

   Each expand-interactions piana command does the following:

   For each protein in the network:

    1. find interactions of this protein in the current network
    2. find proteins in the database that share a certain
       characteristic with this protein (e.g cog code)
    3. for each protein that shares that characteristic:
       - find interactions for protein that shares the characteristic
         in the database
       - find interactions for protein that shares the characteristic
         in the network
       - assign to protein being processed all interactions of protein
         that shares the characteristic
       - assign to protein that shares that characteristic all
         interactions of protein being processed

   This process can be repeated more than once, to reach more distant
   deductions.
   For example, if the root protein is A, and if we know that C and D
   (yeast) interact, and that A =cog= C and B =cog= D
      ( X =cog= Y means that X and Y have the same COG code)
    - simple expansion will predict that A interacts with D
    - double expansion will predict that A interacts with D and that A
      interacts with B
       (ie double expansion predicts interactions from a previous
        prediction)
       (this is achieved by executing two consecutive
        expand-interactions piana commands)
  ---------------------------------------------------------------------
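
  The A, B, C, D example in the box above can be reproduced with a toy
  Python sketch. This is not PIANA's implementation: the function names
  and the COG codes COG1 and COG2 are made up, interactions are stored
  as unordered pairs, and the database holds only the C-D interaction.

```python
def partners(protein, edges):
    """Return the proteins that interact with `protein` in `edges`."""
    found = set()
    for edge in edges:
        if protein in edge:
            found.update(edge - {protein})
    return found


def expand_by_cog(nodes, network, database, cog):
    """Run one expansion step; edges are frozensets of two names."""
    known = database | network
    new_network = set(network)
    for protein in nodes:
        code = cog.get(protein)
        if code is None:
            continue
        sharers = [p for p, c in cog.items() if c == code and p != protein]
        for sharer in sharers:
            # The protein inherits the interactions of the sharer ...
            for partner in partners(sharer, known):
                if partner != protein:
                    new_network.add(frozenset((protein, partner)))
            # ... and the sharer inherits those of the protein.
            for partner in partners(protein, network):
                if partner != sharer:
                    new_network.add(frozenset((sharer, partner)))
    new_nodes = set(nodes)
    for edge in new_network:
        new_nodes.update(edge)
    return new_nodes, new_network


cog = {"A": "COG1", "C": "COG1", "B": "COG2", "D": "COG2"}  # made-up codes
database = {frozenset(("C", "D"))}

nodes1, net1 = expand_by_cog({"A"}, set(), database, cog)
nodes2, net2 = expand_by_cog(nodes1, net1, database, cog)
print(sorted("-".join(sorted(edge)) for edge in net1))
print(sorted("-".join(sorted(edge)) for edge in net2))
```

  In this toy run the first step yields only A-D. The second step adds
  A-B and also carries the known C-D interaction into the network.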

  For this example, we will use the user interface
  run_multiple_pianas.py instead of piana.py

  - Why? Because instead of building a complete network with all the
    proteins in the input file, we just build the network for one
    protein at a time and then do the predictions based on that network.
    This is faster and uses less memory, and the results are
    the same.

  - Attention! In run_multiple_pianas.py you cannot set
               hub-threshold in the configuration file; it must
               be set through the command line (refer to
               general_template.piana_conf if you do not know what
               hub-threshold is for.)

    $> cd piana/code/execs

    $> python run_multiple_pianas.py --input-file=dummy_files/input/liver_cancer_proteins.txt  --input-id-type=uniacc --output-id-type=uniacc --piana-dbname=pianaDB_limited --piana-dbhost=localhost --results-prefix=liver_cancer_predictions --configuration-file=conf_files/get_double_cog_expansions.piana_conf --hub-threshold=0


      --> This will produce files
          'protein_name'.liver_cancer_predictions.expand-interactions.cog.root
          where each line is a protein interaction prediction.

           $> ls -lh *.liver_cancer_predictions.*
           --------------------------------------------------------------------------
	   39K Jul  6 17:46 P04792.liver_cancer_predictions.one_protein_file.txt.liver_cancer_predictions.expand-interactions.cog_thres0.root
	   0 Jul  6 17:41 P43304.liver_cancer_predictions.one_protein_file.txt.liver_cancer_predictions.expand-interactions.cog_thres0.roo
           ------------------------------------------------------------------------

     For P43304 no predictions were made. For the other protein, you
     can use the predictions as you see fit or, if you wish, insert
     them into your piana database using parser
     piana/code/dbParsers/expansionParser/expansion2piana.py
     by doing:

       (Attention! These are real predictions made by PIANA,
                   but they are not related in any way with liver cancer)

       $> cd piana/code/dbParsers/expansionParser

       $> python expansion2piana.py --piana-dbname=pianaDB_limited --piana-dbhost=localhost --expansion-file=../../execs/P04792.liver_cancer_predictions.one_protein_file.txt.liver_cancer_predictions.expand-interactions.cog_thres0.root --num-expansions=2 --input-id-type=uniacc --verbose --database-name="expansion"

       As you can see in the parser's verbose output (if you have
       enabled it), only 178 of 735 interactions were inserted into
       pianaDB_limited, because most of the predictions were made
       between proteins of different species. If you add the flag
       --no-species to the previous command, all interactions will be
       inserted into the database regardless of the species of the
       proteins.

	
       --> if you now repeat example 2, you'll see that there are new
           interactions in the network, shown in orange.

       --> Expansions introduced to a piana database using
           expansion2piana are labeled with the name you specify in
           the "database-name" parameter. In this case, it is
           'expansion'. Predictions not
           added to the piana database will not appear in subsequent
           piana executions. However, you can add the predictions to
           the network and continue working with it, by setting argument
           exp-output-mode to add in command expand-interactions. For
           a detailed explanation on which options are available when
           doing predictions read description of command
           expand-interactions in
           piana/code/execs/conf_files/general_template.piana_conf

       --> when doing expansions that are going to be inserted into
           a PIANA database, we recommend using proteinPiana as the
           type of output identifier (ie. output-id-type=proteinPiana)
           Since the interactions are going to be inserted into the
           database, it is better not to do code translations in
           between the two steps. In any case, never use geneName as
           output type! It will introduce a lot of noise in your
           predictions, because gene names are ambiguous even within
           a species.

       --> You can also do predictions based on SCOP and InterPro
           codes (ie proteins of the same SCOP family will tend to
           interact with the same proteins). For those predictions,
           you can create a configuration file similar to
           conf_files/get_double_cog_expansions.piana_conf but
           changing the parameters as explained in command
           'expand-interactions' of
           conf_files/general_template.piana_conf

       --> We do not recommend doing predictions based on predictions:
           ie. we do not recommend executing command
           expand-interactions on networks that were built from a
           database with predictions. What we do in our lab is that we
           have a piana database that only contains experimental data
           (DIP, MIPS, HPRD, BIND, ...) and another database with all
           interaction data (DIP, MIPS, HPRD, BIND, STRING,
           expansions, ....). Then, when we want to get predictions,
           we use the experimental database. The predictions made by
           PIANA are then inserted into the database that contains all
           interaction data. In this way, we avoid predictions that
           are based on predictions.

            -> having two separate piana databases is not strictly
               necessary, since PIANA allows you to choose which
               databases are used in each analysis with the
               parameter list-source-dbs. But it is more convenient to
               separate experimental interactions from predictions,
               since introducing restrictions has a side effect: it
               slows down the creation of the network. Therefore, if
               you do not have disk space problems, it is easier to
               have two (synchronized) piana databases: one with only
               experimental interactions and the other one with all
               interactions.

***************************************************************
EXAMPLE 5 ==> Matching proteins in the network to spots in a 2D
              electrophoresis gel
***************************************************************

Situation: we have spot ids from a 2D electrophoresis gel, with their
           molecular weights (MW) and isoelectric points (IP). Some of
           those spots were identified by mass spectrometry (that was
           how we obtained the list of proteins in
           liver_cancer_proteins.txt) but other spots were
           unassigned. We can use PIANA to identify some of those
           unassigned spots, by comparing the MW and IP of the spots
           with the MW and IP of the proteins in the network.

           This is based on the fact that it is very likely that the
           proteins in the 2D gel also appear in the network, since
           the network has been built from the list of proteins of the
           gel that could be identified. And, since all proteins in
           the gel are related, the proteins of the spot will probably
           appear in the network.

	   For example, in the liver cancer experiment, only 4
           proteins could be assigned by mass spectrometry to the 2D
           gel spots.
           Using PIANA, by comparing MW and IP of all spots in the 2D
           gel with MW and IP of all proteins in the network built
           from the root proteins (ie. those 4 proteins that could be
           identified by mass spectrometry), we can make some
           predictions on the correspondences between spots and
           proteins. Then, these predictions can be validated in the
           wet lab.


     Note: This example won't work with the current PIANA database
     on the web, as sequence IP information is not available there.

5.0. we have a text file with spot ids, MW and IP, formatted as
     indicated in command 'match-proteins-to-spots' of
     general_template.piana_conf

   --> format description as extracted from
       piana/code/execs/conf_files/general_template.piana_conf:
   #
   #   - spots-file-name is a file name following the structure (one spot per line): spot_id<TAB>molecular_weight<TAB>isoeletric_point
   #      -> where decimals are expressed with "."

   --> We will use a dummy file with spot ids, MW and IP for a 2D
       electrophoresis with proteins involved in liver cancer:
          -> you can see this file in
            piana/code/execs/dummy_files/input/formatted_spots_liver_cancer.txt
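
   Before running PIANA it can be worth checking that your own spots
   file really follows this format. The sketch below is a hypothetical
   checker (not PIANA's parser, Python 3 assumed): it expects three
   tab-separated fields per line and rejects decimals that are not
   written with ".".

```python
def parse_spots(lines):
    """Parse lines of the form spot_id<TAB>molecular_weight<TAB>isoelectric_point.

    Returns a list of (spot_id, mw, ip) tuples. Blank lines are skipped.
    A ValueError is raised for a wrong number of fields or for numbers
    that float() cannot read (for instance decimals written with ",").
    """
    spots = []
    for line_number, line in enumerate(lines, 1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                "line %d: expected 3 tab-separated fields, got %d"
                % (line_number, len(fields))
            )
        spot_id, mw, ip = fields
        spots.append((spot_id, float(mw), float(ip)))
    return spots


def parse_spots_file(path):
    """Convenience wrapper that reads the spots from a file."""
    with open(path) as handle:
        return parse_spots(handle)
```

   For example, parse_spots_file("my_spots.txt") (a made-up file name)
   either returns the list of spots or stops at the first badly
   formatted line.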


5.1. we create a configuration file that builds a network from an
     input file, and then executes the command that matches proteins
     to spots
      -> you can see what this file looks like in
         piana/code/execs/conf_files/match_proteins_to_spots.piana_conf

5.2. execute piana with this configuration file and the command line
     parameters required (because we left them blank in the
     configuration file)

   $> cd piana/code/execs

   $> python piana.py --configuration-file=conf_files/match_proteins_to_spots.piana_conf --piana-dbname=pianaDB_limited --piana-dbhost=localhost --input-file=dummy_files/input/liver_cancer_proteins.txt --input-id-type=uniacc --output-id-type=uniacc --results-prefix=matching_proteins_spot_cancer --depth=1 --spots-file-name=dummy_files/input/formatted_spots_liver_cancer.txt

5.3. the result of the command has been written to the results file
     'matching_proteins_spot_cancer.match-proteins-to-spots.txt'
     (you can also see the results file in html format using a
      browser:
      'matching_proteins_spot_cancer.match-proteins-to-spots.html')

   -> we have placed it in piana/code/execs/dummy_files/output/
   -> it looks something like this:

     $> more matching_proteins_spot_cancer.match-proteins-to-spots.txt
       --------------------------------------------------------------------------------
        error level 6 (mw_error 0.1 - ip_error 0.1) spot_id 8201 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 1305 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 1306 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 1307 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 1301 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 1303 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 6904 matches protein P00488
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 6903 matches protein P00488
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 303 matches protein P04792
        error level 7 (mw_error 0.2 - ip_error 0.2) spot_id 8104 matches protein P61771
        error level 8 (mw_error 0.3 - ip_error 0.3) spot_id 1304 matches protein P04792
        .................................................................................
        .................................................................................
        .................................................................................
        ---------------------------------------------------------------------------------

     where mw_error 0.1 - ip_error 0.1 means allowing 10% error for MW and
     IP when searching for matches

 Attention! Correspondences that appear in a given error level will
            not be shown in higher error levels.
            For example, "spot_id 8201 matches protein P04792" does
            not appear in error level 7, although it is clear that,
            since it was found at 10% error, it would also be found at
            20% error.

 Attention! One spot can be assigned to several proteins, and
            vice versa. This just means that the MW and IP values
            are within a short range of each other and therefore
            several assignments can be made.
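
 The reporting rules described above can be sketched as follows. This
 is only an illustration under stated assumptions, not PIANA's code:
 the function names are hypothetical, the error is taken relative to
 the protein's value, and the same tolerance is applied to MW and IP
 (as in "mw_error 0.1 - ip_error 0.1"). Python 3 is assumed.

```python
def within(observed, reference, tolerance):
    """True if `observed` differs from `reference` by at most
    `tolerance` (a fraction, e.g. 0.1 for 10%) of `reference`."""
    return abs(observed - reference) <= tolerance * abs(reference)


def match_spots(spots, proteins, tolerances=(0.1, 0.2, 0.3)):
    """Match spots to proteins by MW and IP at increasing tolerances.

    spots:    list of (spot_id, mw, ip)
    proteins: list of (protein_id, mw, ip)

    Returns a list of (tolerance, spot_id, protein_id). Each pair is
    reported only at the smallest tolerance where it matches, so it is
    not repeated at higher error levels. One spot may match several
    proteins and vice versa.
    """
    already_reported = set()
    matches = []
    for tolerance in tolerances:
        for spot_id, spot_mw, spot_ip in spots:
            for protein_id, protein_mw, protein_ip in proteins:
                pair = (spot_id, protein_id)
                if pair in already_reported:
                    continue
                if (within(spot_mw, protein_mw, tolerance)
                        and within(spot_ip, protein_ip, tolerance)):
                    already_reported.add(pair)
                    matches.append((tolerance, spot_id, protein_id))
    return matches
```

 With invented values such as proteins = [("protA", 100.0, 5.0)] and
 spots = [("s1", 105.0, 5.2), ("s2", 118.0, 5.5)], spot s1 is reported
 at tolerance 0.1 and spot s2 only at tolerance 0.2.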


**************************************************************
EXAMPLE 6 ==> Clustering proteins by their molecular function
**************************************************************


Situation: we have a list of proteins for which we want to build their
           interaction network and then analyze their relationship in
           terms of molecular function.


  ==> to do this we are going to use configuration file
       piana/code/execs/conf_files/get_clustered_go_network.piana_conf.

         - The parameters that tell the clustering when to stop are
           detailed in the configuration file.
         - Depending on how specific or general you want the network
           to be, you can play with these parameters.

  ==> we are going to perform the clustering for proteins of the
     proteasome 'piana/code/execs/dummy_files/input/proteosome.uniprot_entries'


Attention! In order to do the clustering, you must have information
           on distances between GO terms in your piana database
           (pianaDB_limited only has it for GO terms involved in
           this example).

           If you do not have GO information in your piana
           database, the clustering will not know which
           criteria to use for grouping proteins. Parsing GO takes a long
           time if you want to calculate the distances between all
           the GO terms. Therefore, if you do not have that time
           but you still want to do the clustering, there is the
           option of calculating the distances only between specific 
           GO terms.
             -> How do you do that? Read
   piana/code/dbParsers/goParser/README.limiting_parsing_to_specific_gos
   (it has already been done for pianaDB_limited)


6.1 run piana with the configuration file described above:
    get_clustered_go_network.piana_conf

    $> cd piana/code/execs
    $> python piana.py --configuration-file=conf_files/get_clustered_go_network.piana_conf --input-file=dummy_files/input/proteosome.uniprot_entries --input-id-type=unientry --input-proteins-species=yeast --results-prefix=clustering_proteosome --piana-dbname=pianaDB_limited --piana-dbhost=localhost


6.2 visualize the clustered network

    $> neato -Tgif -o clustering_proteosome.0.2.molecular_function.1.min.3.cluster-by-go-terms.gif clustering_proteosome.0.2.molecular_function.1.min.3.cluster-by-go-terms
    $> xview clustering_proteosome.0.2.molecular_function.1.min.3.cluster-by-go-terms.gif


     (you can see the result of the clustering in
      piana/code/execs/dummy_files/output/clustering_proteosome.0.2.molecular_function.1.min.3.cluster-by-go-terms.gif )
     
  ==> the interpretation of this network is not always
      straightforward... However, in some cases it is very helpful to
      visualize the network from this perspective.

  ==> the clustering can be also performed in terms of biological
      process and cellular location (using GO terms).

      - read description of command 'cluster-by-go-terms' in
        piana/code/execs/conf_files/general_template.piana_conf to
        learn more about changing the molecular_function to
        biological_process and other functionalities of the clustering

      - for this example, we have set level-threshold to 1, which
        results in very general terms in the network

  ==> the clustering that is implemented right now in PIANA is far
      from optimal. We are working on it to make it faster and more
      relevant to biological problems. We are also working on
      providing the user with more information retrieved during
      the clustering, such as which proteins belong to each cluster.

*******************************************************************
EXAMPLE 7 ==> Getting a flavor of (almost) all the formats in which
              PIANA can produce outputs
*******************************************************************

This example is just for showing you the different formats in which
PIANA can print results.

   --> using configuration file
       get_all_formats_summary_results.piana_conf

   --> using the dummy file
       piana/code/execs/dummy_files/input/liver_cancer_proteins.txt

run:

   $> cd piana/code/execs
   $> python piana.py --configuration-file=conf_files/get_all_formats_summary_results.piana_conf --piana-dbname=pianaDB_limited --piana-dbhost=localhost --input-file=dummy_files/input/liver_cancer_proteins.txt --input-id-type=uniacc --input-proteins-species=all --results-prefix=trying_all_formats --output-id-type=uniacc


Now, take a look at the files trying_all_formats.*

  trying_all_formats.all.print-all-prots-info.html          
              --> complete info for all proteins in html format (use
                  browser to visualize)
  trying_all_formats.compact.print-all-prots-info.html      
              --> limited info for all proteins in an html table (use
                  browser to visualize)
  trying_all_formats.all.print-connect-prots-info.html      
              --> complete info for linker proteins in html format
                  (use browser to visualize)
  trying_all_formats.compact.print-connect-prots-info.html  
              --> limited info for linker proteins in html format (use
                  browser to visualize)
 
  trying_all_formats.all.print-all-prots-info.txt           
              --> complete info for all proteins in text format
  trying_all_formats.compact.print-all-prots-info.txt       
              --> limited info for all proteins in text format
  trying_all_formats.all.print-connect-prots-info.txt       
              --> complete info for linker proteins in text format
  trying_all_formats.compact.print-connect-prots-info.txt   
              --> limited info for linker proteins in text format

  trying_all_formats.all.print-network.dot                  
              --> DOT file with all interactions in network (use neato
                  to visualize)
  trying_all_formats.connecting.print-network.dot           
              --> DOT file with interactions for root proteins and
                  linker proteins (use neato to visualize)
  trying_all_formats.all_root.print-network.dot             
              --> DOT file for interactions with at least one root
                  involved (use neato to visualize)
  trying_all_formats.only_root.print-network.dot            
              --> DOT file for interactions between root proteins (use
                  neato to visualize)

  trying_all_formats.all.print-table.html                   
              --> html table with all interactions in the network (use
                  browser to visualize)
  trying_all_formats.connecting.print-table.html            
              --> html table with interactions for root and linker
                  proteins (use browser to visualize)
  trying_all_formats.all_root.print-table.html              
              --> html table with interactions with at least one root
                  involved (use browser to visualize)
  trying_all_formats.only_root.print-table.html             
              --> html table with interactions between root proteins
                  (none in this case)

  trying_all_formats.all.print-table.txt                    
              --> text table with all interactions in the network
  trying_all_formats.connecting.print-table.txt             
              --> text table with interactions for root and linker
                  proteins
  trying_all_formats.all_root.print-table.txt               
              --> text table with interactions with at least one root
                  involved
  trying_all_formats.only_root.print-table.txt              
              --> text table with interactions between root proteins
                  (none in this case)
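
Rendering the four DOT files one by one gets repetitive. The sketch
below (hypothetical helper names, Python 3 assumed) builds the same
neato command used elsewhere in this file for each DOT file of this
example and runs it only for the files that exist. It assumes neato is
installed and on your PATH.

```python
import os
import subprocess

# DOT files produced by this example (names taken from the list above).
DOT_FILES = [
    "trying_all_formats.all.print-network.dot",
    "trying_all_formats.connecting.print-network.dot",
    "trying_all_formats.all_root.print-network.dot",
    "trying_all_formats.only_root.print-network.dot",
]


def neato_command(dot_file, image_format="gif"):
    """Return the neato argument list that renders `dot_file` to an image
    with the same base name."""
    base, _extension = os.path.splitext(dot_file)
    image_file = base + "." + image_format
    return ["neato", "-T" + image_format, "-o", image_file, dot_file]


def render_all(dot_files=DOT_FILES):
    """Run neato on every DOT file that exists; return the image names."""
    images = []
    for dot_file in dot_files:
        if not os.path.exists(dot_file):
            continue
        command = neato_command(dot_file)
        subprocess.check_call(command)
        images.append(command[3])
    return images
```

Calling render_all() from piana/code/execs after the run above would
leave one .gif next to each .dot file.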


For a full description of the information contained in these files, as
well as which parameters are needed for each kind of input, read
the command descriptions in
piana/code/execs/conf_files/general_template.piana_conf


*****************************************************************
EXAMPLE 8 ==> Finally! A real example! 
*****************************************************************

Situation: All examples shown up to this point used lists of proteins
           unrelated to the problem we said we were studying... Now,
           let's look at a real example.

           We have used genes that mediate breast cancer metastasis
           to lung, discovered by the team of J. Massague and
           published in Nature some time ago:

        Minn AJ, Gupta GP, Siegel PM, Bos PD, Shu W, Giri DD, Viale A,
        Olshen AB, Gerald WL, Massague J.
        Genes that mediate breast cancer metastasis to lung.
        Nature. 2005 Jul 28;436(7050):518-24.


   The list of genes can be found in
   piana/projects/metastasis/data/metastasis_gene_names.txt

   Starting from this list of genes (hereafter referred to as root
   proteins), we are going to do the following:
   (genes are not proteins, of course, but we are going to work with
    their products: PIANA does it automatically for you)

8.1 - create configuration files for the different analyses that we
      want to perform:
      (to create each configuration file, we follow the instructions
       on piana/code/execs/conf_files/general_template.piana_conf)

    (a) -> print the interaction table to obtain a complete
           description of the interactions where these root proteins
           are involved
           -> highlighting proteins that contain a keyword in their
              description, name or function
           -> the list of keywords we are going to use is:
              cancer:carcinoma:tumor:metastasis:apoptosis:death

    (b) -> print the interaction network, highlighting proteins in the
           network that contain keywords related to cancer
           -> the list of keywords we are going to use is:
              cancer:carcinoma:tumor:metastasis:apoptosis:death
           -> this will highlight other proteins that interact with
              "metastasis proteins" and are known to be involved in
              the disease being studied

    (c) -> print all the information associated to the proteins in the
           network
           -> this file can be used to do manual searches of specific
              information we are interested in

    (d) -> identify linkers, proteins that connect at least two root
           nodes
           -> these linker proteins must be looked at very carefully,
              since it is very likely that they are also involved
              in the mediation of breast cancer metastasis to lung.

    (e) -> print a network only with experimental interactions, not
           taking into account the predictions by structural
           similarity
           -> to do so, just change the parameter list-source-dbs in
              the configuration file
           -> the network obtained will contain less information but
              it will be more reliable than the network built using
              predictions as well

    (f) -> predict new interactions for these genes using interologs
           -> these predictions might be useful for better
              understanding the pathways related to the root proteins,
              because interactions of these gene products have been
              detected between orthologous proteins
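
    The keyword list used in (a) and (b) is a single string with the
    keywords separated by ':'. To get a feeling for what such a list
    selects, here is a small sketch. It is not PIANA's matching code:
    the function names are hypothetical and a simple case-insensitive
    substring search is assumed (Python 3).

```python
def parse_keywords(keyword_string):
    """Split a ':'-separated keyword string into lowercase keywords."""
    return [k.strip().lower() for k in keyword_string.split(":") if k.strip()]


def matching_keywords(text, keywords):
    """Return the keywords found in `text` (case-insensitive substring
    search), in the order in which they appear in the keyword list."""
    text = text.lower()
    return [keyword for keyword in keywords if keyword in text]


# Keyword list copied from (a) and (b) above.
keywords = parse_keywords("cancer:carcinoma:tumor:metastasis:apoptosis:death")

# Invented protein description, only to show the behaviour.
description = "Tumor necrosis factor; induces cell death"
print(matching_keywords(description, keywords))  # ['tumor', 'death']
```

    Note that a substring search also matches inside longer words; the
    actual rules PIANA applies are described with the corresponding
    commands in general_template.piana_conf.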


    ===> The configuration file that executes commands (a) to (d) for
         metastasis_gene_names.txt is
         piana/code/execs/conf_files/metastasis.piana_conf
         -> Read this configuration file to better understand how
            PIANA is going to perform the analyses
                        
    ===> For interaction predictions (f) we are going to use the
         configuration file get_double_cog_expansions.piana_conf
         and the interface to PIANA 'run_multiple_pianas.py'
         (read explanation in example 4)

    ===> Just to show you another way of doing predictions and
         visualizing interactions, we have created the configuration
         file
         piana/code/execs/conf_files/metastasis_only_dip.piana_conf
                           
         -> This configuration file shows the process from just having
            a network with dip interactions (e) to adding predictions
            by interologs to the network and then printing out again
            the network


8.2 - execute PIANA with the configuration files detailed above


   8.2.1 commands (a) to (d)

     $> cd piana/code/execs
     $> python piana.py --configuration-file=conf_files/metastasis.piana_conf --piana-dbname=pianaDB_limited --piana-dbhost=localhost


   8.2.2 Now, get the PIANA predictions for these proteins: (f)

     $> python run_multiple_pianas.py --input-file=../../projects/metastasis/data/metastasis_gene_names.txt --input-id-type=geneName --output-id-type=geneName --piana-dbname=pianaDB_limited --piana-dbhost=localhost --results-prefix=metastasis_predictions --configuration-file=conf_files/get_double_cog_expansions.piana_conf --hub-threshold=0


   8.2.3 print network with dip interactions (e), add interologs to
         the network and print the new network

     $> python piana.py --configuration-file=conf_files/metastasis_only_dip.piana_conf --piana-dbname=pianaDB_limited --piana-dbhost=localhost


8.3 - analyze the results

    8.3.0 -> these are the files that contain the results:

	- from 8.2.1:

        metastasis_results.all.print-network.dot                 
              --> the network in DOT format for all interactions in
                  the database
        metastasis_results.connecting.print-network.dot          
              --> the network in DOT format for roots and linkers
        metastasis_results.all.print-table.html                  
              --> the table with all interactions   (HTML format)
        metastasis_results.compact.print-all-prots-info.html     
              --> information for all proteins in the network (HTML
                  format)
        metastasis_results.compact.print-connect-prots-info.txt  
              --> information about linker proteins (proteins that
                  connect root nodes)
        metastasis_results.compact.print-connect-prots-info.html 
              --> information about linker proteins (HTML format)

	- from 8.2.2:

    'protein_name'.metastasis_predictions.expand-interactions.cog_thres0.root  
              --> interaction predictions for protein_name


	- from 8.2.3:

        metastasis_results.only_dip.all.print-network.dot         
              --> network in DOT format for DIP interactions in the
                  database
        metastasis_results.only_dip.expanded_network.dot          
              --> network in DOT format for DIP interactions and
                  interologs predictions


    8.3.1 -> visualize the networks:


          - network with all interactions in the database:

           $> neato -Tgif -o metastasis_results.all.gif metastasis_results.all.print-network.dot
           $> xview metastasis_results.all.gif

           Since it is quite a big network you might want to play with
           the parameters of the DOT file
           metastasis_results.all.print-network.dot
             - removing proteins and interactions that do not look
               interesting
             - removing overlap=scale and increasing the len of the
               edges to 10.

           PIANA has what we think are optimal DOT parameters, but in
           some cases, the user has to manually modify the DOT file
           to optimize the image for that particular case.

           In the network you can identify proteins that contain
           keywords (in red and orange).
           If you find interactions that are of particular interest,
           you can look at file
           metastasis_results.all.print-table.html for a more detailed
           description of the interaction


          - network for roots and linkers

           $> neato -Tgif -o metastasis_results.connecting.gif metastasis_results.connecting.print-network.dot
           $> xview metastasis_results.connecting.gif

            This network is very useful for looking at the root
            proteins that are connected directly or via another
            protein. In this case, you can see that there are many
            roots that are not connected to the others, and some
            others that belong to the same graph component.

          - network with only dip interactions

           $> neato -Tgif -o metastasis_results.only_dip.gif metastasis_results.only_dip.all.print-network.dot
           $> xview metastasis_results.only_dip.gif

          - network with dip interactions and interologs

            This network is too big to be visualized with the standard
            PIANA parameters for DOT files. Therefore, you must edit
            file metastasis_results.only_dip.expanded_network.dot and:
               - remove this from the header line: ', pack=true, overlap=scale'
               - do a 'replace all' of 'len=1' to 'len=4'
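
            The two manual edits above can also be scripted. The
            sketch below is a hypothetical helper (not shipped with
            PIANA, Python 3 assumed). It removes the header attributes
            by plain text replacement and changes 'len=1' to 'len=4'
            without touching values such as 'len=10'.

```python
import re


def relax_dot_layout(dot_text, new_len=4):
    """Apply the two edits described above to the text of a DOT file."""
    # Remove the packing/overlap attributes from the header line.
    dot_text = dot_text.replace(", pack=true, overlap=scale", "")
    # Replace len=1 with the new length, but leave len=10, len=1.5,
    # minlen=1 and similar attributes untouched.
    return re.sub(r"\blen=1(?![\d.])", "len=%d" % new_len, dot_text)


def rewrite_dot_file(source_path, target_path, new_len=4):
    """Read a DOT file, relax its layout and write the result to a new
    file, leaving the original untouched."""
    with open(source_path) as source:
        dot_text = source.read()
    with open(target_path, "w") as target:
        target.write(relax_dot_layout(dot_text, new_len))
```

            For example, rewrite_dot_file(
            "metastasis_results.only_dip.expanded_network.dot",
            "expanded_network.relaxed.dot") writes the edited copy to
            a new file (the target name is made up); you would then
            give that file to neato instead of the original.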

            Then, you can create the network image (although it is not
            that helpful, due to the large number of interactions
            that have been added when doing the prediction).

            $> neato -Tgif -o metastasis_results.only_dip.expanded_network.gif  metastasis_results.only_dip.expanded_network.dot
            $> xview metastasis_results.only_dip.expanded_network.gif 

            A more practical way to see the predictions would be to
            add the interactions to the database and then visualize
            the network for your root proteins. In the previous
            visualization, you were seeing all (double cog)
            predictions for root proteins, all (single cog)
            predictions for all proteins in the initial network and
            interactions for other proteins in the database.
            If you add the predictions to your piana 
            database and then visualize the network for your root 
            proteins, you'll only see (double cog) predictions 
            for your input proteins. See comments on 8.3.3.

         --> color codes are described in PianaGlobals.py and in
             piana/docs/documentation/network_colors.gif
                         

     8.3.2  -> analyze linker proteins: 

         Proteins that connect root proteins to each other are probably
         also involved in the mediation of breast cancer metastasis to
         lung.
         PIANA identifies these linker proteins and produces an output
         that can be analyzed by the biologist to try to detect
         functions or biological processes that might have a role in
         his/her problem of interest.

	   - open file
             metastasis_results.compact.print-connect-prots-info.html
             in your web browser to see the list of linker
             proteins. Moreover, linker proteins that have the cancer
             keywords in their name, description or function appear
             in red and underlined.
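
         To make the notion of a linker concrete, the sketch below
         finds them in a plain list of interactions. It is only an
         illustration, not PIANA's implementation: the function name
         is hypothetical, and it assumes that a linker is a non-root
         protein that interacts directly with at least two different
         root proteins (Python 3).

```python
def find_linkers(interactions, roots):
    """Return {linker: sorted list of the roots it connects}.

    interactions: iterable of (protein_a, protein_b) pairs
    roots:        collection of root protein identifiers
    """
    roots = set(roots)
    root_neighbours = {}
    for protein_a, protein_b in interactions:
        # Only interactions between one root and one non-root protein
        # can make the non-root protein a linker candidate.
        if protein_a in roots and protein_b not in roots:
            root_neighbours.setdefault(protein_b, set()).add(protein_a)
        elif protein_b in roots and protein_a not in roots:
            root_neighbours.setdefault(protein_a, set()).add(protein_b)
    return {
        protein: sorted(neighbours)
        for protein, neighbours in root_neighbours.items()
        if len(neighbours) >= 2
    }
```

         With made-up identifiers, the interactions R1-X, X-R2, R1-Y
         and roots R1 and R2 give {"X": ["R1", "R2"]}: X links two
         roots, while Y touches only one root and is not a linker.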


             If you have created a local GO database (see
             README.populate_piana_db) you can also produce an html
             table describing the linkers with their GO terms (see
             comments on example 2.4.3):
       
             $> cd piana/code/evaluation/tests
             $> python parse_linkers.py --input-file=../../execs/metastasis_results.compact.print-connect-prots-info.txt --input-id-type=geneName --piana-dbname=pianaDB_limited --piana-dbhost=localhost --results-prefix=metastasis_linkers.go --output-format=html --print-go-info --go-dbname=goDB --go-dbhost=localhost --go-level=-1 --label-size=all


             The results of this command are printed to files:

               metastasis_linkers.go.dot                   
                    --> network for roots and linkers using GO terms
                        (use neato to visualize)
               metastasis_linkers.go.linkers_table.html    
                    --> html table with linkers and their GO terms

              ( one node of the metastasis_linkers.go.dot contains a
                lot of information and makes the GIF image difficult
                to visualize. You can edit metastasis_linkers.go.dot
                to remove irrelevant information from the nodes and
                then use neato again to create the new image )


     8.3.3 -> analyze predictions of interactions made by PIANA

          results files
          'protein_name'.metastasis_predictions.expand-interactions.cog_thres0.root
          contain predictions of interactions for each root
          protein. You can insert these interactions into PIANA as
          explained in example 4, or just analyze these interactions
          separately.

          If you are going to insert predictions into your database,
          when parsing the files with expansion2piana.py, instead of
          executing the parsing separately for each file, you can
          merge all *.root files into a single file and do the parsing
          just once:

          $> cat *.metastasis_predictions.expand-interactions.cog_thres0.root > all.metastasis_predictions.expand-interactions.cog_thres0
          $> cd piana/code/dbParsers/expansionParser
          $> python expansion2piana.py --piana-dbname=pianaDB_limited --piana-dbhost=localhost --expansion-file=../../execs/all.metastasis_predictions.expand-interactions.cog_thres0 --num-expansions=2 --code-type-name=geneName
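
          On systems without cat, or when some *.root files do not
          end with a newline, the merge step can be done with a few
          lines of Python. This is a hypothetical helper (not part of
          PIANA, Python 3 assumed). It skips empty files and makes
          sure that lines from different files are not glued together.

```python
import glob
import os


def merge_prediction_files(pattern, merged_path):
    """Concatenate all non-empty files matching `pattern` into
    `merged_path`. Returns the number of files that were merged."""
    # Collect the inputs first and never read the output file itself.
    merged_abs = os.path.abspath(merged_path)
    paths = [
        path for path in sorted(glob.glob(pattern))
        if os.path.abspath(path) != merged_abs
    ]
    merged_count = 0
    with open(merged_path, "w") as merged:
        for path in paths:
            with open(path) as handle:
                content = handle.read()
            if not content:
                continue  # no predictions for this protein
            if not content.endswith("\n"):
                content += "\n"  # keep the last line separate from the next file
            merged.write(content)
            merged_count += 1
    return merged_count
```

          For this example you would call it from piana/code/execs as
          merge_prediction_files(
          "*.metastasis_predictions.expand-interactions.cog_thres0.root",
          "all.metastasis_predictions.expand-interactions.cog_thres0")
          before running expansion2piana.py.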


     Attention! all files generated by following this example can be
                seen in piana/projects/metastasis/results/

                These are the results obtained when using interactions
                from DIP and from predictions based on
                sequence/structure distant patterns (these predictions
                are labeled internally as 'ori'). However, if you had
                populated your database with other interaction
                databases as described in README.populate_piana_db,
                the analysis would have been far more complete.


--> TODO!!!!

  - example over/under expressed
  - example special proteins
  - explain command create-report
  - example classify-network-proteins
  

**********************************************************
==> Other PIANA commands: read general_template.piana_conf
**********************************************************

Now that you are an expert in using PIANA, just by looking at
conf_files/general_template.piana_conf you should be able to find out
what else can be done with PIANA. Take a look at all the piana
commands listed in that file and decide which ones you want to use.


We have included a configuration file that tests most of the PIANA
commands using dummy proteins. You can take a look at it to see some
PIANA commands at work that we haven't used in the examples above:

  piana/code/execs/conf_files/test_all_commands.piana_conf

We have placed some comments in this configuration file to guide the
user through the different PIANA possibilities.

To execute it, you can do:
$piana/code/execs> python piana.py --configuration-file=conf_files/test_all_commands.piana_conf


Some things (apart from the ones you've seen in the examples above)
that can be done using PIANA:

   --> ignoring all unreliable interactions (using parameter 
       ignore-unreliable)

   --> doing the intersection of different protein interaction
       databases

   --> using files with under/over-expressed genes (eg. from a
       microarray experiment) to visualize in your network which
       proteins are under/over-expressed

   --> building the protein interaction network for a given species

   --> limiting the network to contain interactions that were detected
       by a given method (eg. y2h)

   --> creating several networks using just one configuration file

   --> keeping proteins that have too many interactions out of the
       network

   --> getting a list of proteins that are at distance X from another
       protein

   --> creating a network from a text file with interaction pairs, no
       need to have the interactions in the database.

   --> deleting interactions from the piana database using
       piana/code/dbModification/delete_interactions_from_db.py
       --> for example, if you want to delete predictions made by
           expansion (ie. interactions labeled 'expansion'), you can do:
            $> python2.3 delete_interactions_from_db.py --piana-dbname=pianaDB_limited --piana-dbhost=localhost --db-to-delete=expansion

       --> attention! deleting directly from the PIANA database
           (ie. via sql commands) is very dangerous, because you
           might lose the correspondences between the different
           tables, or you might delete an interaction from expansion
           and, at the same time, delete the same interaction that
           was also present in another database... CONCLUSION: use the
           script described above to delete interactions from the
           database (ie. never manipulate the interaction tables
           directly)

   --> finding the shortest route between two given proteins: finds 
       the minimal path that goes from one protein in the network
       to another protein in the network

        --> Attention: this command requires changing your PIANA
            mode to 'advanced' or 'developer'. Read 
            general_template.piana_conf for a complete description
            of this command

        --> there is an example configuration file in
            conf_files/test_shortest_route.piana_conf

   --> ................. ................ ................ ........
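
To give an idea of what the 'distance X' and 'shortest route' items
above compute, here is a sketch in plain python. It is NOT the PIANA
implementation and it does not use the PIANA database: it assumes a
text file with one interaction per line, written as two protein names
separated by whitespace (the format that PIANA expects for such files
may differ, read general_template.piana_conf). All function names are
made up for this example.

```python
from collections import deque


def read_pairs(lines):
    """Build an undirected network (protein -> set of partners) from pair lines."""
    network = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        a, b = fields[0], fields[1]
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def distances_from(network, start):
    """Breadth-first search: number of interactions between start and each reachable protein."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in network.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def proteins_at_distance(network, start, x):
    """Sorted list of proteins exactly x interactions away from start."""
    return sorted(p for p, d in distances_from(network, start).items() if d == x)


def shortest_route(network, start, end):
    """Return one shortest path as a list of proteins, or None if there is none."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(network.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

With a file object, read_pairs(open('my_pairs.txt')) builds the
network (my_pairs.txt is an illustrative name), and
shortest_route(network, 'protA', 'protB') returns the list of proteins
on one minimal path between the two.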


If there is something you would like to do with PIANA but you can't
find a piana command for it, there are two possibilities:

    a - send us an email (boliva at imim.es) explaining that
        'something' (and wait until we do it, which can take some
        time...)

    b - modify the code so PIANA does that 'something'. It is much
        easier than you might think: you do not need to know SQL
        or graph theory... just a little python and a careful
        reading of PianaApi.py and piana.py

If you find bugs or have suggestions about PIANA, please send us an
email to boliva at imim.es

If you develop code based on PIANA that you think might be useful to
other people, please send us an email and we will include your code in
the next release.


**********************************************************
==> Using PIANA as a framework? Programming based on PIANA
**********************************************************

PIANA has been designed so that it is easy to use as a library to
develop your own protein interaction network code.

You can use PIANA at different levels:

- as a user: examples shown above in this file

- as a library: use PianaApi methods

   Take a look at piana/code/execs/piana.py. It is basically a script
   that reads arguments and then makes calls to PianaApi with those
   arguments. For more information, read the PianaApi documentation:
   piana/docs/documentation/pydoc_docs/PianaApi.html


- as a developer of new tools: use PIANA classes and methods

    All the code developed for this project is an example of how to
    use PIANA. For example, the Clustering uses the class Graph to
    create a ClusterGraph class.

    For more information, read piana/README.piana_developers and piana
    documentation piana/docs/documentation/piana_documentation.html
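
The 'library' level described above follows a simple pattern: read the
command line arguments, create an API object, call its methods. The
sketch below shows only that pattern. StubApi, run_command and the
--command option are made up for this example and do NOT correspond to
real PianaApi methods or piana.py options (--piana-dbname and
--piana-dbhost are the only names taken from this file). Read
PianaApi.html for the real method names.

```python
import argparse


class StubApi(object):
    """Hypothetical stand-in for PianaApi; the real class has its own methods."""

    def __init__(self, dbname, dbhost):
        self.dbname = dbname
        self.dbhost = dbhost

    def run_command(self, name):
        # Placeholder: a real API method would query the database here.
        return "%s on %s@%s" % (name, self.dbname, self.dbhost)


def parse_args(argv):
    """Read the database options plus a repeatable, hypothetical --command."""
    parser = argparse.ArgumentParser(description="sketch of a piana.py-like driver")
    parser.add_argument("--piana-dbname", required=True)
    parser.add_argument("--piana-dbhost", default="localhost")
    parser.add_argument("--command", action="append", default=[])
    return parser.parse_args(argv)


def main(argv):
    """Create the API object from the arguments, then make one call per command."""
    args = parse_args(argv)
    api = StubApi(args.piana_dbname, args.piana_dbhost)
    return [api.run_command(name) for name in args.command]
```

Calling main(['--piana-dbname=pianaDB_limited', '--command=first'])
returns one result string per --command given. The point is the
separation: argument handling stays in the script, and everything that
touches the network goes through the API object.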