Technical Overview, peptide word index table

From TheGPMWiki

Revision as of 19:24, 16 February 2009 by WikiSysop

The peptide_word_index table supports peptide residue subsequence searches. It is an on-disk index that records, for each four-residue string, the peptides in which that string can be found, so it is very large. A full worked example of how this table is used can be found at http://www.thegpm.org/GPMDB/subsequence_searches.pdf
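As a rough illustration of the idea behind a four-residue word index, the sketch below splits a query into overlapping four-residue words and intersects the sets of peptides recorded for each word. It is not taken from the GPMDB code: the function names and the in-memory dictionary standing in for the table are assumptions made for this example, and the worked example in the PDF above remains the authoritative description.

```python
WORD_LENGTH = 4  # the table indexes four-residue words


def four_residue_words(sequence):
    """Return the distinct four-residue words in a sequence, in order of first appearance."""
    words = []
    for start in range(len(sequence) - WORD_LENGTH + 1):
        word = sequence[start:start + WORD_LENGTH]
        if word not in words:
            words.append(word)
    return words


def candidate_peptides(query, index):
    """Return ids of peptides that contain every four-residue word of the query.

    `index` is a hypothetical in-memory stand-in for the table: a dict that
    maps each word to the set of peptide ids in which the word occurs.
    """
    words = four_residue_words(query)
    if not words:
        raise ValueError("query must contain at least four residues")
    candidates = None
    for word in words:
        ids = index.get(word, set())
        candidates = set(ids) if candidates is None else candidates & ids
    return candidates


# Toy index, invented for illustration only.
toy_index = {"ELVI": {1, 2}, "LVIS": {2, 3}, "VISK": {2}}
print(candidate_peptides("ELVISK", toy_index))
```

In a scheme like this the result is only a candidate set: a peptide can contain every word of the query without containing the query as one contiguous run, so each candidate would still need to be checked against its full sequence.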

The contents of this table are updated incrementally as part of the daily update procedure.

Columns:

keyid: the unique identifier for a given row of data. A single keyid will be associated with a single four-residue word, but a four-residue word can be associated with many keyids.
word: the four-residue word. The only letters that will not appear in this field are J and O; U appears for selenocysteine, and all three placeholder residue letters (B, X and Z) are also stored in some records in GPMDB.
pepid_list: a list of peptide identifiers, at most 65,000 characters long, in which each identifier is preceded and followed by a pipe ("|") character; e.g. "|123||456||789|".
ts_created: the system timestamp recording when the row of data was added or last updated. This is used only in troubleshooting; it has no effect on the usage of the table or the search results.
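The pepid_list format described above can be handled with a few lines of Python. This is a minimal sketch, not GPMDB code: the helper names are hypothetical, and the way identifiers are split across several strings once the 65,000 character limit is reached is an assumption, offered as one plausible reason why a single word can be associated with many keyids.

```python
MAX_LIST_CHARS = 65000  # documented maximum length of pepid_list


def parse_pepid_list(pepid_list):
    """Split a string such as "|123||456||789|" into [123, 456, 789]."""
    return [int(token) for token in pepid_list.split("|") if token]


def contains_pepid(pepid_list, pepid):
    """Test membership by substring; the enclosing pipes stop 12 matching 123."""
    return f"|{pepid}|" in pepid_list


def format_pepid_lists(pepids, max_chars=MAX_LIST_CHARS):
    """Pack ids into pipe-delimited strings, none longer than max_chars.

    Starting a new string when the limit is reached is an assumption
    about how one word could end up spread over several rows.
    """
    rows, current = [], ""
    for pepid in pepids:
        entry = f"|{pepid}|"
        if len(entry) > max_chars:
            raise ValueError("a single identifier exceeds max_chars")
        if current and len(current) + len(entry) > max_chars:
            rows.append(current)
            current = ""
        current += entry
    if current:
        rows.append(current)
    return rows


print(parse_pepid_list("|123||456||789|"))
print(format_pepid_lists([123, 456, 789], max_chars=10))
```

Enclosing every identifier on both sides means a plain substring test for "|123|" cannot accidentally match a longer identifier such as 1234, which may be part of the reason for the format.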

Retrieved from "http://wiki.thegpm.org/wiki/Technical_Overview,_peptide_word_index_table"
