Structural BioInformatics Research Lab
 
PIANA example
Analysis of genes that mediate breast cancer metastasis to lung

 
   

To illustrate some of the analyses that can be performed with PIANA, we have used the genes that mediate breast cancer metastasis to lung, identified by the team of J. Massagué and published in Nature: Minn AJ et al. Genes that mediate breast cancer metastasis to lung. Nature. 2005 Jul 28;436(7050):518-24.

Suppose that we are interested in analyzing these genes, as well as in finding other genes that might be involved in the same biological process. One way of doing this is with PIANA. These are the main steps a user would take to analyze these genes using PIANA:

  • Create a PIANA configuration file that sets the input parameters, the way results are output, and the PIANA commands to be executed. This is an example of a configuration file for analyzing these genes.
  • Execute PIANA on the command line with the previous configuration file as argument:
    $> python piana.py --configuration-file=metastasis.piana_conf
  • Analyze the results (read below for more information on the types of PIANA results you might expect)
For a step-by-step explanation of the process followed to analyze these genes, read example 8 in this file: README.piana_examples
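
The command-line step above can also be driven from a script. The following is a minimal sketch, not part of PIANA itself: the helper names build_piana_command and run_piana are hypothetical, and it assumes that piana.py and metastasis.piana_conf sit in the current directory and that "python" resolves to an interpreter your PIANA installation supports.

```python
import subprocess


def build_piana_command(config_file, script="piana.py", interpreter="python"):
    """Return the argument list for one PIANA run.

    Only the --configuration-file option shown on this page is used.
    The script location and interpreter name are assumptions.
    """
    return [interpreter, script, "--configuration-file=" + config_file]


def run_piana(config_file, **kwargs):
    """Run PIANA and raise CalledProcessError if it exits with an error."""
    command = build_piana_command(config_file, **kwargs)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where PIANA is not installed.
    print(" ".join(build_piana_command("metastasis.piana_conf")))
```

Calling run_piana("metastasis.piana_conf") would then launch the same command as the one typed by hand above.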

Starting from this list of genes (hereafter referred to as root proteins), we have done the following:

(genes are not proteins, of course, but we work with their products: PIANA maps genes to their products automatically)

(all images have been downscaled for web presentation purposes)

  • print the interaction network highlighting proteins in the network that contain keywords related to cancer
    • the list of keywords we have used is: cancer:carcinoma:tumor:metastasis:apoptosis:death
    • the color code used can be seen in the following image: network_colors.gif

  • print the interaction table highlighting proteins in the table that contain keywords related to cancer
    • the list of keywords we have used is: cancer:carcinoma:tumor:metastasis:apoptosis:death

  • print all the information associated with the proteins in the network
    • this file can be used to do manual searches of specific information we are interested in

  • identify linkers, i.e. proteins that connect at least two root proteins to each other
    • these linker proteins must be examined very carefully, since it is very likely that they are also involved in mediating breast cancer metastasis to lung. The reasoning is that if two root proteins interact via a linker protein, the linker protein is probably involved in the same function as the root proteins.


  • print a network with experimental interactions only (i.e. interactions from DIP), not taking into account the predictions by structural similarity
    • this network is more reliable than the network that also includes predictions, since all of its interactions have been detected experimentally.

  • predict novel interactions for root proteins using interologs (interactions of these gene products that have been detected for orthologous proteins)

  • print a network with DIP interactions and PIANA predictions using interologs
    • this network is quite chaotic because there is too much information. Read the detailed explanation on how this network was generated to better understand other possibilities for visualizing predictions and large networks

  • Other analyses that we could have done with the information available are: finding the intersection between predictions and experimentally detected interactions, predicting interactions based on a shared SCOP family code, removing non-human proteins from the output, clustering the interaction network based on GO terms (e.g. biological process), ...

  • If we had had extra information about these proteins (their expression levels, 2D gels, stress levels, ...), it could have been used to further analyze the network. Depending on your problem of interest, different data can be used to identify relevant proteins, remove uninteresting cases or print the results in different formats.
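
Two of the analyses listed above, keyword highlighting and linker identification, can be illustrated on a toy network. This is a sketch under stated assumptions rather than PIANA's implementation: the protein names and descriptions are invented placeholders, keywords are matched as case-insensitive substrings of a free-text description, and a linker is taken to be a non-root protein that interacts directly with at least two distinct root proteins. PIANA's own criteria may differ.

```python
from collections import defaultdict

# Keyword list from this page, in PIANA's colon-separated form.
KEYWORDS = "cancer:carcinoma:tumor:metastasis:apoptosis:death".split(":")


def matching_keywords(description, keywords=KEYWORDS):
    """Return the keywords found in a description (case-insensitive substrings)."""
    text = description.lower()
    return [kw for kw in keywords if kw in text]


def find_linkers(interactions, roots):
    """Map each linker to the set of root proteins it interacts with.

    Assumption: a linker is a non-root protein with direct interactions
    to at least two distinct root proteins.
    """
    roots = set(roots)
    root_partners = defaultdict(set)
    for a, b in interactions:
        if a in roots and b not in roots:
            root_partners[b].add(a)
        elif b in roots and a not in roots:
            root_partners[a].add(b)
        # root-root and non-root/non-root edges do not define linkers
    return {p: partners for p, partners in root_partners.items()
            if len(partners) >= 2}


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the metastasis gene set.
    roots = ["ROOT_A", "ROOT_B", "ROOT_C"]
    interactions = [
        ("ROOT_A", "P1"), ("P1", "ROOT_B"),  # P1 bridges ROOT_A and ROOT_B
        ("ROOT_C", "P2"),                    # P2 touches a single root
        ("ROOT_A", "ROOT_B"),                # direct root-root interaction
    ]
    descriptions = {"P1": "Putative tumor suppressor",
                    "P2": "Ribosomal protein"}

    linkers = find_linkers(interactions, roots)
    for protein in sorted(linkers):
        print(protein, sorted(linkers[protein]),
              matching_keywords(descriptions[protein]))
```

In this toy network only P1 is reported as a linker, because P2 interacts with a single root protein. A real analysis would take the interactions and protein annotations from the PIANA output files instead of hard-coded lists.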

These results were obtained as described in README.piana_examples, using the following configuration files:

For clarity's sake, these results were obtained using a limited version of the piana database (see description of this database). For complete PIANA analyses it is recommended to use as much data as possible, using all the parsers designed for populating PIANA databases.


If you encounter problems using PIANA, or have suggestions on how to improve it, send an e-mail to boliva at imim.es
Copyright © 2005

PIANA is under the GNU General Public License.
 



Structural BioInformatics Research Laboratory webmaster: agonzalez at imim.es

Last Updated Wed, 8 Feb 2006