Updating Protein Omega Counts

From TheGPMWiki

The protein_omega_count table is a part of the peakdb database. The table consists of five columns:

label: the accession number of a protein, found in the proseq table in GPMDB
seq: the residue sequence of a peptide identified as part of that protein, found in the peptide table in GPMDB
z1total: an integer for the number of times this peptide has been identified in the corresponding protein in a singly charged state
z2total: an integer for the number of times this peptide has been identified in the corresponding protein in a doubly charged state
z3total: an integer for the number of times this peptide has been identified in the corresponding protein in a triply charged state
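
As a rough illustration of the five columns, the table could be declared along the following lines. This is a sketch only: the column types, lengths and the index are assumptions made for the example and are not taken from the actual peakdb schema.

```sql
-- Hypothetical definition; types and index are assumed, not the real peakdb DDL.
CREATE TABLE protein_omega_count (
    label   VARCHAR(255) NOT NULL,            -- protein accession number (from proseq)
    seq     TEXT         NOT NULL,            -- peptide residue sequence (from peptide)
    z1total INT UNSIGNED NOT NULL DEFAULT 0,  -- identifications in the 1+ charge state
    z2total INT UNSIGNED NOT NULL DEFAULT 0,  -- identifications in the 2+ charge state
    z3total INT UNSIGNED NOT NULL DEFAULT 0,  -- identifications in the 3+ charge state
    KEY label_idx (label)                     -- assumed index for lookups by protein
);
```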

Updating the table is accomplished by two scripts: /thegpm/scripts/refresh_protein_omega_count_table.pl and /thegpm/scripts/dump_protein_omega_count_script.sql. Executing the Perl script renames the previously dumped contents of the table (if they exist), executes the SQL commands to create a new dump file, and then imports the newly dumped information into the database.

The SQL script builds the full content of the table with a single SQL command from the content of the peptide and proseq tables in GPMDB. By default, the dump file is created in the /gpmdb/ directory, and its name is hard-coded into both the Perl and SQL scripts. The total runtime for the process is between three and four hours with a peptide table size of 17 GiB.
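
The general shape of such a single-command build might look like the sketch below. It is not the content of the actual script. The join column proseqid, the charge column z and the dump file name DUMP_FILE_NAME are all hypothetical, chosen only to show how per-charge counts can be produced in one pass and written to a file.

```sql
-- Sketch under assumed column names: peptide(proseqid, seq, z), proseq(proseqid, label).
-- DUMP_FILE_NAME is a placeholder; the real name is hard-coded in the scripts.
SELECT ps.label,
       p.seq,
       SUM(CASE WHEN p.z = 1 THEN 1 ELSE 0 END) AS z1total,
       SUM(CASE WHEN p.z = 2 THEN 1 ELSE 0 END) AS z2total,
       SUM(CASE WHEN p.z = 3 THEN 1 ELSE 0 END) AS z3total
FROM peptide AS p
JOIN proseq AS ps ON ps.proseqid = p.proseqid
WHERE p.z IN (1, 2, 3)          -- assumption: skip peptides seen only at other charges
GROUP BY ps.label, p.seq
INTO OUTFILE '/gpmdb/DUMP_FILE_NAME';
```

SELECT ... INTO OUTFILE writes the file on the database server and refuses to overwrite an existing file, which is consistent with the Perl script renaming any previous dump before running the SQL.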

Once the process is complete, the newly created dump file can be transferred to any machine with a peakdb installation. From the MySQL command prompt, a user with the proper permissions can delete the existing content of the protein_omega_count table and load the new set.
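
A minimal sketch of that last step, assuming the dump is in MySQL's default tab-delimited format and has been copied to a path readable by the server. The path and DUMP_FILE_NAME are placeholders.

```sql
-- Run from the MySQL prompt with the peakdb database selected.
-- '/path/to/DUMP_FILE_NAME' is a placeholder for wherever the dump was copied.
DELETE FROM protein_omega_count;

LOAD DATA INFILE '/path/to/DUMP_FILE_NAME'
    INTO TABLE protein_omega_count
    (label, seq, z1total, z2total, z3total);
```

LOAD DATA INFILE reads the file on the server host and requires the FILE privilege. If the server's secure_file_priv setting names a directory, the file must be placed there.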

Retrieved from "http://wiki.thegpm.org/wiki/Updating_Protein_Omega_Counts"
