Technical Overview, peptide word index table


From TheGPMWiki

(Difference between revisions)

Revision as of 19:24, 16 February 2009

The peptide_word_index table is used for peptide residue subsequence searches. It is an on-disk index that records, for each four-residue string, the peptides in which that string can be found, so it is very large. A full worked example of the operation of this table can be found at http://www.thegpm.org/GPMDB/subsequence_searches.pdf
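As a rough illustration of how a four-residue word index can support subsequence searches, the sketch below splits a query into overlapping four-residue words, intersects the peptide identifiers recorded for each word, and then checks the surviving candidates against the full sequences. It is not the GPMDB implementation (the PDF above describes the actual procedure); the in-memory dictionaries, the function names and the sample peptides are all assumptions made for the example.

```python
def four_residue_words(sequence):
    """Return every overlapping four-residue word in a sequence."""
    return [sequence[i:i + 4] for i in range(len(sequence) - 3)]


def build_index(peptide_sequences):
    """Build a word -> set of peptide ids mapping (hypothetical stand-in for the table)."""
    index = {}
    for pepid, seq in peptide_sequences.items():
        for word in four_residue_words(seq):
            index.setdefault(word, set()).add(pepid)
    return index


def candidate_peptides(query, word_index):
    """Intersect the peptide-id sets of every four-residue word in the query."""
    words = four_residue_words(query)
    if not words:
        raise ValueError("query must contain at least four residues")
    candidates = None
    for word in words:
        ids = word_index.get(word, set())
        candidates = set(ids) if candidates is None else candidates & ids
        if not candidates:
            break  # no peptide contains every word seen so far
    return candidates or set()


def search(query, word_index, peptide_sequences):
    """Confirm candidates, since shared words need not be contiguous in a peptide."""
    return sorted(
        pepid
        for pepid in candidate_peptides(query, word_index)
        if query in peptide_sequences[pepid]
    )


# Invented sample data, not taken from GPMDB.
peptides = {1: "ELVISLIVESK", 2: "LIVESAGAIN", 3: "ELVISK"}
index = build_index(peptides)
print(search("LIVES", index, peptides))
print(search("ELVIS", index, peptides))
```

The final check against the full sequence matters because the index only shows that a peptide contains each word somewhere, not that the words overlap in the order the query requires.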

The contents of this table are updated incrementally as part of the daily update procedure.

Columns:

keyid: the unique identifier for a given row of data. A single keyid will be associated with a single four-residue word, but a four-residue word can be associated with many keyids.
word: the four-residue word. The only letters that will not appear in this field are J and O; U appears for selenocysteine, and all three placeholder residue letters (B, X and Z) also occur in some GPMDB records.
pepid_list: a list of peptide identifiers, at most 65,000 characters long, in which each identifier is enclosed by a pair of pipe ("|") characters; e.g. "|123||456||789|".
ts_created: the system timestamp for the addition or updating of a row of data. This is used only in troubleshooting; it has no effect on the usage of the table or the search results.
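The pepid_list format described above can be handled with a few lines of Python. The sketch below parses a list into integers, tests membership, and packs identifiers into strings of at most 65,000 characters. How GPMDB actually splits a word across several keyids is not documented here, so the packing strategy, like the function names, is an assumption.

```python
MAX_LIST_CHARS = 65000  # maximum pepid_list length given in the column description


def parse_pepid_list(pepid_list):
    """Split a string such as '|123||456||789|' into [123, 456, 789]."""
    return [int(token) for token in pepid_list.split("|") if token]


def contains_pepid(pepid_list, pepid):
    """Test membership; the enclosing pipes stop 23 from matching inside 123."""
    return f"|{pepid}|" in pepid_list


def format_pepid_rows(pepids, max_chars=MAX_LIST_CHARS):
    """Pack ids into one or more pepid_list strings (assumed splitting strategy)."""
    rows, current = [], ""
    for pepid in pepids:
        token = f"|{pepid}|"
        if len(token) > max_chars:
            raise ValueError("a single identifier exceeds max_chars")
        if len(current) + len(token) > max_chars:
            rows.append(current)  # this row is full; start another one
            current = ""
        current += token
    if current:
        rows.append(current)
    return rows


print(parse_pepid_list("|123||456||789|"))
print(format_pepid_rows([123, 456, 789], max_chars=10))
```

Enclosing every identifier in its own pair of pipes means a plain substring test is unambiguous, which may be why the format was chosen. Under the assumed strategy, each string returned by format_pepid_rows would be stored as a separate row with its own keyid for the same word.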


@@ Line 6: / Line 6: @@
 *'''keyid''': the unique identifier for a given row of data.  A single keyid will be associated with a single four-residue word, but a four-residue word can be associated with many keyids.
 *'''word''': the four-residue word.  The only letters that will not appear in this field are ''J'' and ''O''; ''U'' appears for selenocysteine, and all three placeholder residue letters are also stored in some records in GPMDB.
-*'''pepid_list''': a maximum 65,000 character list of peptide identifiers, each enclosed by a single pipe('|') character; e.g. "|123||456||789|".
+*'''pepid_list''': a maximum 65,000 character list of peptide identifiers, each enclosed by a single pipe ("|") character; e.g. "|123||456||789|".
 *'''ts_created''': the system timestamp for the addition or updating of a row of data.  This is used only in troubleshooting; it has no effect on the usage of the table or the search results.
