Python: module Graph

Graph

index
../../../../piana/code/Graph/Graph.py

File : Graph.py Author : Daniel Jaeggi and Ramon Aragues Creation : 8.2003 Contents : main class providing graph processing methods Called from : programs/classes that need to use graphs ======================================================================================================= This file implements class Graph, which has methods for graph creation, analysis and manipulation. The easiest way to create a Graph would be: from Graph import * graph = Graph("1") node1 = graph.get_node(1) node2= graph.get_node(2) edge1 = graph.get_edge(node1, node2) graph.add_node(node1) graph.add_node(node2) graph.add_edge(edge1) If you want to use a database with edges (eg. interactions) to build your graph, then you need to use a GraphBuilder. If this is the case, you'll need to implement method GraphBuilder.get_links(node). This method gets all the links for a node, and it is problem specific since the links are differently retrieved depending on the problem being faced. For example, look at PianaGraph, a subclass of Graph that handles protein-protein interaction networks (piana_access is just an interface to the DB, but it is not required if you are implementing your own type of Graph):

Modules

PianaGlobals
copy
numarray
piana_configuration_parameters
sys

Classes



__builtin__.object

Graph

class Graph(__builtin__.object)

    The main graph class. Implements all methods related to creation and analysis of general-purpose graphs Other Graphs (eg. PianaGraph) will be subclasses of this class.

Methods defined here:

__init__(self, graphID=None, memory_mode='memory')
Optionally, give a graphID to your graph. "memory_mode" is the flag that tells how to manage memory. It can be:               "memory" Edge objects and Edge Attributes will be stored in memory. It takes more memory, but it is faster. So, memory usage will be high               "speed" Edge Objects will not be stored in memory: they will be generated on the flight when are needed. So, speed will be decreased but memory usage will decrease considerably

__str__(self)

add_edge(self, edge_object)
Adds an edge to the graph. Argument is a GraphEdge object. Attention! The nodes have to be added to the graph before adding the edge!!!!   -> add_edge does not add the nodes! adds the edge to the graph after checking its existance. Therefore, it can be used wihtout checking if it was in the network

add_node(self, node_object)
Adds a node to the graph. "node_object" must be previously created from GraphNode

add_root_node(self, node_id)
Adds a root node id to the set of root nodes

build_graph(self, user_graph_builder)

clean_nodes(self)
Looks for and removes nodes with no edges. This is applied to the UNFILTERED graph - filtering has no effect so a node which has no edges when filtered will NOT be removed.

connected(self)
Returns 1 if graph is fully connected, 0 otherwise. Uses fortran subroutine.

create_edge(self, node_id1=None, node_id2=None, attribute_object=None)
Returns a tuple [edge object, new], where new= 1 means edge didn't exist in the graph before If edge already existed, in edge returned attribute_object is merged with attribute of existing edge "node_id1" and "node_id2" are node identifiers Attention!!! The edge is not added to the graph! That must be done by the user if he wants to do so...

edge_exists(self, identifier1=None, identifier2=None)
Returns a 1 if edge already exists. 0 otherwise if identifier1 and identifier2 are not None, both identifiers are node_id if identifier2 is None, then identifier1 is an edge_id

expand(self, expansion=None, behaviour='fresh', expansion_mode='all', exp_output_mode='add', output_target=None, id_type=None, alternative_id_types=None, expansion_threshold=0, hub_threshold=0, class_name='all')
Expands node and edges of the graph based on behaviour defined by an object Expansion   -> expand is used to "propagate" edges of one node to another node in case they share a certain characteristic       - if node A has interactions a, b and c       - if node B has interactions d, e and f       - if node A and node B share characteristic X (defined by object expansion)       - if we perform a expand on both nodes, the new network will be:          - A will have interactions a, b, c, d, e and f          - B will have interactions a, b, c, d, e and f "expansion" is an expansion object, created using a subclass of Expansion (and specific to the database you're working with) "expansion_mode" determines if all nodes are expanded or only those that were used as root to create the network   - expansion_mode == "all" creates expanded edges for all nodes in the graph   - expansion_mode == "root" creates expanded edeges only for root nodes "expansion_threshold" sets the threshold determining whether to expand a node or not   - if a node shares a certain characteristic with lots of other nodes, it might not be a relevant characteristic     for these cases, we introduce a threshold that determines that if there are more than "expansion_threshold" nodes sharing the characteristic     with the node, then edges from those nodes will not be propagated to the node.       - if 0, no thresholds will be applied "hub_threshold" sets the threshold for how many edges can a node have in order to be added to the network   - if a node has more than "hub_threshold" edges, these edges will not be added to the network.      - if 0, no threshold is applied "output_mode" determines if expansions performed will be added to the current graph or printed to stdout   - output_mode == "add" adds expansions to graph   - output_mode == "print" will print expansions to stdout "id_type" is a code type used in database, and is the code type that will be used for printing out propagated edges (in exp_output_mode "print") "alternative_id_types" are the alternative code types in case nothing is found for "id_type" (identifier will be preceded by "type:xxxx") "class_name" is used in output_mode print to choose the class of nodes we want to print out    - "all" will print all classes of nodes    - other classes are defined by the subclass of Graph you are using (eg. in PianaGraph, class_name is the protein taxonomy id)    - expansions will only be printed out if both nodes in the edge are of class "class_name"

filter(self, filters, behaviour='fresh')
Hides node and edges in the graph based on filters. Behaviour can either be "fresh" or "inc". "fresh" (the default) unhides all nodes/edges previously hidden before applying the new filters. "inc" incrementally applies the new filters on the current visible graph. not working at the minute!

find_routes(self, node1_id, node2_id)
When passed two node objects, finds routes between them. Make sure nodes are retreived from the graph by extracting them with g.get_node(node_nodeID,get_mode="error") for example. THIS IS BROKEN AT THE MINUTE - IT'S NOT RETURNING ALL THE PATHS

find_shortest_route(self, start_node_id, end_node_id)
Uses Dijkstra's algorithm to find the shortest route between start_node_id and end_node_id returns a tuple ( distance, route)

get_all_distances(self)
Uses Dijkstra's algorithm to find every pairwise distance in the graph. This is more efficient than calling get_distances() for each node. NOT IMPLEMENTED YET!!!

get_ave_degree(self)
Returns the average degree of the graph. This variable may be directly accessed as Graph.ave_degree but interface through this method is preferable as it ensures that the value is current.

get_clus_coef(self)
Returns the clustering coefficient of the graph. This is an intelligent interface - if the graph has been modified since the last calculation of the clustering coefficient (via a call to Graph.calc_clus_coef()), its value will be re-calculated before being returned.

get_connecting_nodes_dic(self, distance=None)
Returns a dictionary of node_ids that connect root_nodes of the graph format is: {connecting_node_id:[list of root nodes that connects], connecting_node_id:[], ...} These are the nodes that belong to the path between two root nodes, at a maximun distance "distance" Note: it doesn't return nodes that are only connected to one root node. The minimum number of root nodes is 2.

get_distances(self, query_node_id)
Implements Dijkstra's algorithm for finding the distance to all nodes of the graph from one query node. This is _much_ more efficient than calling find_route() for all node combinations. Returns a dictionary of distances, the keys being the node ids and the contents the distance at which that node can be found

get_edge(self, identifier1, identifier2=None, attribute_object=None, get_mode='new')
Returns an edge object. "identifier1" and "identifier2" can be edgeIDs,  edge objects GraphEdge, nodeIDs or nodes objects GraphNode. If the edge described in the arguments already exists in the graph, returns that edge object. Otherwise the behaviour is defined by parameter get_mode (see below) get_mode can be "new" or "error" (default is "new")   - "new" - returns a new edge if it doesn't exist   - "error" - raises error if it doesn't exist Both get_modes return a reference to the edge when it already existed Attention: get_edge does not add the edge to the Graph!!!!!

get_edge_attribute_parameters(self, edgeID)
Returns the tuple that contains the values needed to construct the edge attribute object

get_edge_ids_list(self)
Returns a list with edge ids of the graph

get_edge_object_list(self)
Returns a list with edge objects of the graph Avoid to use it because Edge Object are generated on the flight!

get_graph_id(self)

get_hidden_edges(self)
Returns a list of hidden edges of the graph.

get_hidden_nodes(self)
Returns a list of hidden node objects in the graph.

get_memory_mode(self)

get_node(self, identifier, attribute=None, get_mode='new')
Returns a node object. "identifier" can be a nodeID or a GraphNode object. Attention: get_node doesn't add the node to the graph!!! If the node nodeID already exists in the graph, returns that node. Otherwise the behaviour is defined by parameter get_mode (see below) get_mode can be "new" or "error" (default is "new")   - "new" returns a new node in case node.nodeID doesn't exist   - "error" returns an error in case node.nodeID doesn't exist

get_node_clus_coef(self, node_id)
Returns the clustering coefficient for an specific node

get_node_ids_list(self)
Returns a list with node ids of current graph

get_node_ids_list_deep_copy(self)
Returns a new list (a deep copy, not just a reference to node_ids) with node ids of current graph

get_node_node_links(self)
returns a list of node_id pairs that have an edge between them returned list looks like this: [ [node1, node2], [node1, node3], [node2,node4], ...]

get_node_object_list(self)
Returns a list with node objects of the graph

get_node_root_connectivity(self, node)
returns the number of root nodes that node "node" connects

get_nodes_connectivities(self, distance=1)
returns a dictionary of connectivities of all nodes in the graph the dictionary returned follows format:    {node_id 1: list of node ids at distance 1 of node id 1,     node_id 2: list of node ids at distance 1 of node id 2,     .............................    } "distance" is currently not being used

get_root_node_ids(self)
Returns a set of root node ids of the graph.

get_root_node_ids_deep_copy(self)
Returns a list of root nodes (a deep copy, not just a reference to node objects) of the graph.

get_root_node_ids_set(self)

get_unhidden_edges(self)
Returns a list of unhidden edges of the graph.

get_unhidden_nodes(self)
Returns a list of unhidden nodes objects in the graph.

has_root_node_id(self, node_id)
Returns 1 if the graph contains a root with node_id equal to "node_id". If not, returns None

hide_edge(self, edge_id)
Adds a hidden edge to the set of hidden edges This does not check in edge_id exists in the graph, so, be careful

hide_node(self, node_id)
Adds a hidden node to the set of hidden nodes This does not check if node_id exists in the graph, so, be careful

is_hidden(self, node_id=None, edge_id=None)
Returns True or False if node/edge is hidden

join_graphs(self, graph_object)
adds to current graph the nodes and edges from graph passed as argument "graph_object"

load_graph_from_text_file(self, file_name)
loads graph data described in a text file into the current graph object input file must follow format: node--node node--node where node are identifiers of nodes and each line is a link between those identifiers

nodes_connected(self, node1, node2)
Returns 1 is node1 is connected to node2 (at any network depth), 0 otherwise.

old_connected(self)
Returns 1 if graph is fully connected, 0 otherwise. Not very efficient! Use connected() instead! If you want to use it, you'll have to:     - add the import line: import LinearAlgebra     - uncomment the following lines if self.laplacian is None:     _build_laplacian() try:     ev = LinearAlgebra.eigenvalues(self.laplacian)     eigenvalues = []     k_node = len(get_unhidden_nodes())     for i in range(k_node):         eigenvalues.append(ev[i])         eigenvalues.sort()     # print eigenvalues     # secont smallest eigenvalue gives info about connectivity     # check smallest is zero first though     if not ( abs(eigenvalues[0]) < 0.000001 ):         raise Exception("Internal error: Eigenvalue problem")     if eigenvalues[1] > 0.0:         return 1     else:         return 0 except:     # having problems here - need to sort this out     return 1

output_dot_file(self, filter_mode='all', output_target=None, use_alternative_id='no')
Generates .dot file output that can be fed directly into the GraphViz program neato. This produces very nice network images... "output_target" is a file object where the dot file will be written filter_mode can be: - "all": prints all edges of the graph - "hidden": prints hidden edges of the graph - "unhidden": prints unhidden edges of the graph "use_alternative_id" can be   - "yes" --> uses alternative id for printing graph   - "no"  --> uses internal id for printing graph    use_alternative_id is used to codify in the node which is the label to be used for that node. This is useful when    we want to label the node with something different than the node id. to produce a GIF file from the .dot file returned by this method, do the following: cat output_of_this_method | neato -Tgif -o output_network.gif

output_edges_table(self, filter_mode='all', output_target=None)
prints  pairs of interacting nodes to output_target "output_target" is a file object (sys.stdout to print to screen) filter_mode can be: - "all": prints all edges of the graph - "hidden": prints hidden edges of the graph - "unhidden": prints unhidden edges of the graph The output will look like this:     node1 id<TAB>node2 id     node3 id<TAB>node4 id     ..............

print_edges(self, filter_mode='all')
prints edges in the graph filter_mode can be: - "all": prints all edges of the graph - "hidden": prints hidden edges of the graph - "unhidden": prints unhidden edges of the graph

print_nodes(self, filter_mode='all')
Prints nodes in Graph filter_mode can be: - "all": prints all nodes of the graph - "hidden": prints hidden nodes of the graph - "unhidden": prints unhidden nodes of the graph

print_route(self, distance, route)
prints the route codified in "route" (which is just a list of node objects)

replace_filters(self, new_filters)
Replaces current set of filters on the graph with some new filters. Saves the old filters for restoration through restore_filters() not working at the minute!

restore_filters(self)
Restores an old set of filters replaced by replace_filters. not working at the minute!

rm_edge(self, edge_id)
Removes a edge from the graph. "edge_object" is a  GraphEdge object of the graph Attention!!! This method only deletes the edge. Affected nodes are not deleted, even though they are not implied in other edges

rm_node(self, node_object)
Removes a node from the graph. "node_object" is a  GraphNode object of the graph

set_graph_id(self, new_graph_id)

unfilter(self)
Un-hides all nodes and edges of the graph. not working at the minute!

unhide_edge(self, edge_id)
Unhides an edge of the set of hidden edges

unhide_node(self, node_id)
Unhiddes node to the set of hidden nodes If the node is not hidden, print error message to stderr

Static methods defined here:

build_cluster_graph(node_list=None, old_edges_dictionary=None, old_node_2_new_node_dictionary=None, new_node_to_old_node_list_dictionary=None, newGraph=None)
Method that builds a new cluster_graph from: "node_list": a list of new nodes to be added to the new cluster_graph "old_edge_dictionary": a dictionary that in each position there is a list of nodes that had edge with this node "old_node_2_new_node_dictionary": a dictionary that references old nodes_id to new_node_id e.g: old_node_2_new_node[old_node_id]=new_node_id "node_to_old_node_list_dictionary": a dictionary that references new nodes_id to a list of old nodes_id that contents inside                                     eg: node_to_old_node_list_dictionary[new_node_id]=[old_node_id1,old_node_id2...]

build_expanded_graph(node_list=None, old_edges_dictionary=None, old_node2new_node_dic=None, new_node2old_node_dic=None, newGraph=None)
Method that builds a new cluster_graph from: "node_list": a list of new nodes to be added to the new cluster_graph "old_edge_dictionary": a dictionary that in each position there is a list of nodes that had edge with this node "old_node2new_node_dic": a dictionary that references old nodes_id to new_node_id                          e.g:old_node_2_new_node[old_node_id]=[new_node_id,new_node_id2...] "new_node2old_node_dic": a dictionary that references new nodes_id to a list of old nodes_id that contents inside                          eg: node_to_old_node_list_dictionary[new_node_id]=[old_node_id1,old_node_id2...]

Data and other attributes defined here:

__dict__ = <dictproxy object>
dictionary for instance variables (if defined)

__weakref__ = <attribute '__weakref__' of 'Graph' objects>
list of weak references to the object (if defined)

Data

root_nodes = {}
verbose = 0
verbose_add_edge_detailed = 0
verbose_add_edge_shallow = 0
verbose_expansion_detailed = 0
verbose_expansion_minimal = 0
verbose_expansion_shallow = 0
verbose_get_edge = 0
verbose_get_node = 0
verbose_join = 0
verbose_propagate = 0
verbose_propagate_print = 0

Data
		root_nodes = {} verbose = 0 verbose_add_edge_detailed = 0 verbose_add_edge_shallow = 0 verbose_expansion_detailed = 0 verbose_expansion_minimal = 0 verbose_expansion_shallow = 0 verbose_get_edge = 0 verbose_get_node = 0 verbose_join = 0 verbose_propagate = 0 verbose_propagate_print = 0