Technical Overview, GPMDB Updates

From TheGPMWiki
Revision as of 21:49, 7 April 2011 by WikiSysop (Talk | contribs)

In the workflow below, script names are in bold and database table names are in italic.

Image:Gpmdb_update_workflow.jpg

  1. calls function set_lock
    1. filesystem change: makes a lock file
  2. calls system "gzip"
    1. filesystem change: unzips all xml files in archive directory
  3. calls script anon.pl
    1. filesystem change: removes some data from xml files in archive directory
  4. calls function archive_files
    1. filesystem change: copies files from archive directory to their proper subdirectory.
  5. calls function gzip_files
    1. filesystem change: gzips all xml files in archive directory.
  6. calls function ftp_archive
    1. calls system "ftp" on file list for each client server.
  7. calls function process_archives
    1. calls script processArchives.pl
      1. for each GPM file in archive directory
      2. calls script popGPMDB.pl
        1. parses the xml file for data to add to the gpmdb tables:
          1. tables inserted into: result, proseq, protein, peptide, aa
          2. tables updated/inserted into: best_expect
          3. calls script populate_gpmnotes_table.pl
            1. parses the xml file for data to add to enspmapdb
              1. tables inserted into or updated: gpmnotes
  8. calls function ftp_statistics
    1. calls system "ftp" to send the statistics file to each client server
      1. filesystem change: creates and deletes ftp script
  9. calls function update_slaves
    1. uses LWP to trigger script gpm_process_slave.pl on each client server
      1. calls function update_words
  10. calls script update_peptide_words_table_new.pl
    1. script for updating the peptide_words_index table for faster tag searches
    2. calls script generate_seq_word_list_new.pl
      1. checks for peptide sequences not yet added to the peptide_words table, and creates a new file if the oldest file found is out of date.
      2. filesystem change: may add a file to the log directory containing a nonredundant list of peptide sequences added since the last update.
      3. calls script add_seq_words_info_new.pl
        1. queries for pepids newer than the previous max pepid to add to the peptide_words table
          1. tables inserted into: peptide_word_index
  11. calls function update_peptide_mutations
    1. script for updating the peptide_mut table.
    2. calls script update_peptide_mut_table.pl
      1. calls script generate_peptide_mut_list.pl
        1. generates a list of recently added files, peptides, proteins and spectrum ids that contain mutations, if any.
        2. filesystem change: may add a file to the pep_mut directory that contains location information for peptides that contain a mutation.
      2. calls script add_peptide_mut_info.pl
        1. updates the database with mutation information in the just-added files.
          1. tables inserted into: peptide_mut
  12. calls function cleanup_files
    1. cleans up the archive directory
    2. calls system "del"
      1. filesystem change: deletes the gzip files in the base archive directory.
  13. calls function remove_lock
    1. filesystem change: deletes the lock file
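
The overall shape of the workflow above (make a lock file, run the steps in order, delete the lock file) can be sketched as follows. This is only an illustrative sketch in Python, not the actual GPMDB code, which is written in Perl and is not shown on this page. The names update_lock, run_update and UpdateAlreadyRunning are hypothetical, and releasing the lock when a step fails is an assumption; the page does not say how set_lock and remove_lock behave on errors.

```python
import os
from contextlib import contextmanager


class UpdateAlreadyRunning(RuntimeError):
    """Hypothetical error: raised when the lock file already exists."""


@contextmanager
def update_lock(lock_path):
    """Hold a lock file for the duration of the update (cf. set_lock / remove_lock)."""
    # O_CREAT | O_EXCL makes creation atomic: os.open fails if the file exists.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # The lock belongs to another run, so it is left untouched.
        raise UpdateAlreadyRunning(lock_path)
    try:
        # Record the process id so a stale lock can be traced by hand.
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield
    finally:
        # Assumption: the lock is removed even if a step raised an exception.
        os.remove(lock_path)


def run_update(lock_path, steps):
    """Run each (name, callable) step in order while holding the lock.

    Returns the names of the steps that completed.
    """
    completed = []
    with update_lock(lock_path):
        for name, step in steps:
            step()
            completed.append(name)
    return completed
```

In this sketch each entry in steps would stand for one numbered step of the workflow (for example archive_files or cleanup_files), passed in as a plain callable. Creating the lock file atomically means a second update started while the first is still running fails immediately instead of touching the archive directory.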