The wealth of structural information stored in the Protein Data Bank is the result of the work of thousands of researchers and millions of work-hours. Properly exploited, this information may yield solutions to a wide spectrum of health-related problems and illnesses that debilitate or kill hundreds of millions of people every year.
With widely available computational techniques, such as well-developed data structures, algorithms and database applications, together with low-cost, reliable and powerful computer hardware, one would expect a plethora of fully automated algorithmic solutions for handling the Protein Data Bank (PDB).
Unfortunately, this is not the case. Most probably the discrepancy arises because the Protein Data Bank began as a depository for crystallographic data that complemented journal publications: researchers solved the structure of a protein, wrote a paper on the result, and deposited the data in the publicly available PDB.
The irregularities of a deposited structure (such as missing atomic coordinates, broken chains or unidentified substructures) are mostly noted in the cited publication and in the remark fields of the PDB file. These textual annotations, however, make automatic processing of protein structures difficult.
We at Math-for-Health.com developed a new database, called RS-PDB (Rich-Structure PDB). Starting from the mmCIF format of the data, we cleaned numerous inconsistencies and rebuilt the database in a strictly logical and easily searchable way. The main result of our work is the reliable identification and description of the protein-ligand complexes found in the PDB. Moreover, we are able to conduct refined structural searches on the whole PDB, involving even the most unusual criteria.
For a short technical description of the database, see this link.
For some examples of the queries and results, consult this link.
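To make the idea of protein-ligand identification concrete, here is a minimal Python sketch. It is not the RS-PDB method, which is not described in this text; it only illustrates the simplest distance-based approach. The record layout, the function name `find_ligand_contacts`, the 4.0 Å cutoff and the toy coordinates are all assumptions made for illustration. The only fact borrowed from the file formats is that atom records are labelled ATOM or HETATM.

```python
from math import dist

# Hypothetical atom record layout (an assumption for this sketch):
# (record_type, residue_name, chain_id, residue_number, (x, y, z))

def find_ligand_contacts(atoms, cutoff=4.0, ignore=("HOH",)):
    """Map each hetero group to the protein residues within `cutoff` angstroms.

    Naive O(n*m) scan; `cutoff` and `ignore` are illustrative defaults.
    """
    protein = [a for a in atoms if a[0] == "ATOM"]
    hetero = [a for a in atoms if a[0] == "HETATM" and a[1] not in ignore]
    contacts = {}
    for h in hetero:
        ligand_key = (h[1], h[2], h[3])
        for p in protein:
            if dist(h[4], p[4]) <= cutoff:
                contacts.setdefault(ligand_key, set()).add((p[1], p[2], p[3]))
    return contacts

# Toy data, not taken from any real PDB entry.
atoms = [
    ("ATOM", "SER", "A", 10, (0.0, 0.0, 0.0)),
    ("ATOM", "GLY", "A", 11, (10.0, 0.0, 0.0)),
    ("HETATM", "ATP", "A", 501, (3.0, 0.0, 0.0)),
    ("HETATM", "HOH", "A", 601, (1.0, 0.0, 0.0)),  # water, ignored
]
contacts = find_ligand_contacts(atoms)
```

A rule this simple fails on many real entries: modified residues, covalently bound ligands, ions and crystallization additives are all hetero groups too. This is why reliable identification needs a cleaned, consistently structured database rather than a scan of raw records.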
We also developed a spatial extension of the RS-PDB database, the SRS-PDB database. Using a simplicial decomposition of the atomic structure of the proteins, it can answer spatial queries concerning the 3D structure of PDB entries.
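As a hedged illustration of how a simplicial decomposition can support spatial queries, the Python sketch below implements one basic building block: testing whether a point lies inside a single tetrahedron whose vertices are atom centres. It makes no claim about how SRS-PDB is implemented. The function names and the toy coordinates are assumptions, and the decomposition itself (for example a Delaunay tetrahedralization) is assumed to have been computed elsewhere.

```python
def signed_volume(a, b, c, d):
    """Return six times the signed volume of the tetrahedron (a, b, c, d)."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    wx, wy, wz = (d[i] - a[i] for i in range(3))
    # Determinant of the 3x3 matrix with rows u, v, w.
    return (ux * (vy * wz - vz * wy)
            - uy * (vx * wz - vz * wx)
            + uz * (vx * wy - vy * wx))

def point_in_tetrahedron(p, tet):
    """True if p lies inside or on the boundary of the tetrahedron `tet`.

    Replacing one vertex at a time by p gives the four barycentric
    coordinates (up to the common factor `total`); p is inside exactly
    when none of them has the opposite sign to `total`.
    """
    a, b, c, d = tet
    total = signed_volume(a, b, c, d)
    if total == 0:
        return False  # degenerate (flat) tetrahedron
    parts = (
        signed_volume(p, b, c, d),
        signed_volume(a, p, c, d),
        signed_volume(a, b, p, d),
        signed_volume(a, b, c, p),
    )
    return all(part * total >= 0 for part in parts)

# Toy tetrahedron; the four vertices stand in for four atom centres.
tet = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
inside = point_in_tetrahedron((0.1, 0.1, 0.1), tet)
outside = point_in_tetrahedron((1.0, 1.0, 1.0), tet)
```

With a full decomposition, a query such as "which atoms surround this point" reduces to locating the tetrahedron that contains the point and reading off its four vertices.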
We offer fast, comprehensive answers for your structural queries concerning the whole PDB.
For further details, inquiries or orders, please contact:
research AT math-for-health.com