Package biana :: Package utilities :: Module utilities

Module utilities

BIANA: Biologic Interactions and Network Analysis Copyright (C) 2009 Javier Garcia-Garcia, Emre Guney, Baldo Oliva

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Functions
 
return_non_commented_non_empty_lines(file_name=None)
Returns a list in which each element is one line of the file.
 
sequence2md5(sequence)
Returns the MD5 code for sequence "sequence" (MD5 hex digest of the sequence + its leading 4 chars + its last 4 chars).
 
get_id_type(protein_id)
Returns a list of the potential code types (i.e. database columns) for a protein identifier "protein_id" whose code type is not known.
 
get_clean_sequence(input_sequence)
Cleans an input sequence of all spaces, tabs, and special characters it might have, leaving only a contiguous list of amino acids.
 
parse_string_field_value(input_string=None, separator_field_value=None, global_separators=None)
parses a string that has field_names and values and returns a list of pairs [[field_name,value], [field_name, value], ...]
 
return_dic_gi_vs_tax(file_name=None)
Returns a dictionary { gi: tax_id, gi: tax_id, ...... }
Variables
  verbose = 0
  verbose_detailed = 0
  verbose_very_detailed = 0
  verbose_matrix = 0
  verbose_string_utilities = 0
  verbose_blast_report = 0
Function Details

return_non_commented_non_empty_lines(file_name=None)

Returns a list in which each element is one line of the file.

None of the elements will be a line that started with '#', and none will be an empty line.
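
The filtering described above can be sketched as follows. This is only an illustration, not the BIANA source: the function name read_non_comment_lines is hypothetical, and it assumes that surrounding whitespace (including the newline) is stripped from each line and that a line counts as a comment when its first non-blank character is '#'. The real function may differ on both points.

```python
def read_non_comment_lines(file_name):
    """Hypothetical stand-in for the documented behaviour (not the BIANA code)."""
    lines = []
    with open(file_name) as handle:
        for line in handle:
            # Assumption: whitespace and the trailing newline are stripped.
            stripped = line.strip()
            # Skip empty lines and lines that start with '#'.
            if stripped and not stripped.startswith("#"):
                lines.append(stripped)
    return lines
```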

get_id_type(protein_id)

Returns a list of the potential code types (i.e. database columns) for a protein identifier "protein_id" whose code type is not known.

This method should be called prior to PianaDBaccess.get_list_proteinPiana() if the identifier type is not known.

Attention: this function is currently used only by the STRING parser string2piana, which is why it only looks for codes that might appear in STRING.

parse_string_field_value(input_string=None, separator_field_value=None, global_separators=None)

parses a string that has field_names and values and returns a list of pairs [[field_name,value], [field_name, value], ...]

global_separators is a list of all the strings that can act as a separator between field/value pairs (e.g. [" ", "|", ";" ])

separator_field_value can only be one character

input_string must follow the format:

[global_separator]*field[separator_field_value]value[global_separator]*field[separator_field_value]value[global_separator]*.....

meaning that each field has a value

for example, stringX

"    ;Name=Ramon    ;   and   Name=Pedro   , Synonim=Juan  ;  "

could be converted into the list [[Name, Ramon], [Name, Pedro], [Synonim, Juan]] by calling

parse_string_field_value(input_string=stringX,
                         separator_field_value="=",
                         global_separators=[" ",";"])

Attention: even if a space (i.e. " ") is not in global_separators, strip() is applied before the pairs are returned, which removes surrounding spaces from the field names and field values. If those spaces are needed, they have to be preserved by some other means.
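
A minimal sketch of the parsing rules described above, written for illustration only. The name parse_field_value_pairs is hypothetical and this is not the BIANA implementation. It assumes that global_separators is non-empty and that tokens without the field/value separator (such as "and" or "," in the example) are silently skipped, which is consistent with the documented output but is not confirmed by the source.

```python
import re


def parse_field_value_pairs(input_string, separator_field_value, global_separators):
    """Hypothetical re-implementation of the documented behaviour."""
    # Build one regular expression that matches any of the global separators.
    pattern = "|".join(re.escape(sep) for sep in global_separators)
    pairs = []
    for token in re.split(pattern, input_string):
        if separator_field_value not in token:
            # Assumption: tokens that carry no field/value separator are ignored.
            continue
        field, value = token.split(separator_field_value, 1)
        # strip() is applied even when " " is not a global separator.
        pairs.append([field.strip(), value.strip()])
    return pairs


stringX = "    ;Name=Ramon    ;   and   Name=Pedro   , Synonim=Juan  ;  "
print(parse_field_value_pairs(stringX, "=", [" ", ";"]))
```

Under these assumptions the call reproduces the list given in the example, and a call such as parse_field_value_pairs("Name = Ramon ; Age= 3", "=", [";"]) shows the strip() behaviour: the spaces around the names and values are removed even though " " is not a separator.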

return_dic_gi_vs_tax(file_name=None)

returns a dictionary { gi: tax_id,
                       gi: tax_id,
                       ......
                       }

filled with info from "file_name" (gis and tax_ids are both integers)

"file_name" is the name of a file that has two tab-separated columns:
the 1st one is the gi code
the 2nd one is the tax id for that gi
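
Reading such a file could look roughly like the sketch below. The function name load_gi_to_tax is hypothetical and the code is not taken from BIANA. It assumes that blank lines and lines with fewer than two columns are skipped, and that a gi appearing more than once keeps its last tax id; the source does not say how either case is handled.

```python
def load_gi_to_tax(file_name):
    """Hypothetical sketch: build {gi: tax_id} from a two-column, tab-separated file."""
    gi_to_tax = {}
    with open(file_name) as handle:
        for line in handle:
            columns = line.rstrip("\n").split("\t")
            if len(columns) < 2:
                # Assumption: blank or malformed lines are skipped.
                continue
            # Both columns are converted to integers, as stated in the description.
            gi_to_tax[int(columns[0])] = int(columns[1])
    return gi_to_tax
```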