13 if contactType ==
'full':
24 def __init__(self, partnerA_res, partnerB_res, atomContactList):
31 if contactType ==
'full':
34 for atom_contact
in atomContactList:
35 if atom_contact.isHbond:
39 tmp_copy = copy.copy(self)
42 for i
in range(Natomcontacts):
43 tmp_copy.atomContactList[i] = atom_contact \
49 This changes the form of resContact from a contact-map type format
50 to a contact-profile type format. A contact map is a list of
51 contacts, whereas a contact profile is a list of residues with
54 In a contact profile, each residue only needs to know which other
55 residues it contacts, not the complete resContact format.
57 When creating a contact profile, each contact in the map is added
58 to both residues involved. This means changing the format from:
76 For example, for residue i, I want to keep only the info for j,k, and
77 l since I already have the info for residue i
79 for residue j, I want to keep only the information for k,l, and m. For
80 the i-j contact it is more complicated because I need to invert the
81 order to j-i before removing residue j.
86 2) delete the info for residue i ('res1', 'index1')
87 3) add contact to residue i
90 5a) invert identity of res1/res2, index1/index2
91 b) invert each of the atom contacts
92 6) delete res1, index1 which now match j.
94 Input parameter res determines which option is picked
95 If 'res1', carry out steps 1-3
96 If 'res2', carry out steps 4-6.
104 atom_contact.numSwap()
105 delattr(self,
'res1')
106 delattr(self,
'index1')
113 printid: printid of the atom's residue
114 index: index of the atoms residue
115 atomname: the atom name
116 count: number of times that shellAtom occurs (for multiple monomers)
117 atomid: a tag for easy comparison of shell atoms
127 if loadedPDBres.max_radius !=
None:
130 for atom
in loadedPDBres.coordlist:
132 max_radius_sq =
max(atomdist_sq, max_radius_sq)
133 setattr(loadedPDBres,
'max_radius', math.sqrt(max_radius_sq))
137 Add certain attributes to loadedPDB to make it a contactPDB:
138 centroid: centroid of the residue or ligand
139 max_radius: farthest atom from the centroid
141 To be computed later:
142 contactList: a list of all contacts made by that residue
146 for res
in loadedPDB:
148 if getAtoms ==
'centroid':
152 if getAtoms ==
'sccentroid':
155 if contactType ==
'full':
156 setattr(res,
'Ncontacts', 0)
157 setattr(res,
'Nhbonds', 0)
158 setattr(res,
'contactList', [])
160 setattr(res,
'NresContacts', 0)
164 This is the major function that allows one find-contacts script to
165 calculate contacts in a variety of different ways.
166 Using the parserule, the appropriate segment(s) of the pdb are extracted.
168 Two types of contact calculations are enabled by this routine, and the
169 type of calculation determines the inputs and actions:
170 internal contacts (no '-' in the parserule)
171 interface/binding site contacts ('-' in the parserule determining which
172 subsets of the protein to use)
174 For internal calculations:
175 The following types of input are accepted:
176 'all' - all residues in the protein, including all chains and any ligand
178 'A' - chain A only, all internal contacts
179 'AB' - chains A and B, all internal + A-B interface
181 action: this subroutine extracts the appropriate portion of the pdb and
182 then passes it to pair_close_internal(contactPDB), and returns the result
184 For interface calculations:
185 'A-B', 'AB-C', 'AB-CD', etc: docking partners. Can be multiple chains
187 'A-ligand', 'AB-ligand', 'ligand-A' - subset of protein vs. all ligands
188 'all-ligand' - all protein vs. all ligands
189 action: extracts the units separated by '-' and passes them through
190 pair_close_interface, and returns the result.
193 Input: contactPDB (see class contactProtein)
194 Output: [partnerA, partnerB] where partnerA and partnerB are subsets of
195 the contactPDB such that partnerA-partnerB contacts are calculated but not
196 internal contacts within either partner A or partner B. A parsing rule is
197 used to split loadePDB into partners A and B.
198 Currently, 3 parsing rules are implemented:
200 This routine is flexible so that new parsing rules can be
201 added independently of any previous rules
205 splitRule = string.split(parserule,
'-')
207 if 'ligand' in splitRule:
209 if splitRule[0] ==
'ligand':
210 proteinChain = splitRule[1]
213 proteinChain = splitRule[0]
217 if proteinChain ==
'all':
218 partnerA = contactPDB
230 if parserule ==
'all':
240 Input: subset of contact PDB
242 Output: all pairs of residue pairs i, j where a contact is possible
243 according to the reference (max radius to centroid) distances for
246 Nres =
len(contactPDBsubset)
248 for i
in range(0, Nres):
249 for j
in range(i+1, Nres):
252 if not(contactPDBsubset[i].chain
in refChain
or
253 contactPDBsubset[j].chain
in refChain):
269 filter_sq = (contactPDBsubset[i].max_radius +
270 contactPDBsubset[j].max_radius + contactFilter +
273 contactPDBsubset[j].centroid)
274 if cendist_sq <= filter_sq:
278 close_pair_list.append([contactPDBsubset[i],
279 contactPDBsubset[j]])
280 return close_pair_list
284 Input: partner A and partner B of a contactPDB
285 In internal case, just [partnerA] is given to save memory
287 Output: all pairs of residues between partner A and
288 partner B for which a contact is possible according to the reference
289 distances for each group
292 NresA =
len(partnerA)
293 NresB =
len(partnerB)
294 for i
in range(0, NresA):
295 for j
in range(0, NresB):
308 filter_sq = (partnerA[i].max_radius + partnerB[j].max_radius +
309 contactFilter + 1.0)**2
311 partnerB[j].centroid)
312 if cendist_sq <= filter_sq:
315 close_pair_list.append([partnerA[i], partnerB[j]])
316 return close_pair_list
320 Filter a close_residue_list (output of pair_close_refAtoms) for residues
321 with at least one atom-atom contact of < 4.0A+
322 (the filter can be set to another distance in class contactProtein)
326 for close_pair
in close_residue_list:
328 partnerA_res = close_pair[0]
329 partnerB_res = close_pair[1]
333 contactfilter_sq = contactFilter**2
336 if contactAtoms ==
'all':
337 idA =
int(partnerA_res.resid)
338 idB =
int(partnerB_res.resid)
339 A_atoms = partnerA_res.atomlist
340 B_atoms = partnerB_res.atomlist
345 A_atoms = B_atoms = [contactAtoms]
347 for A_atom
in A_atoms:
348 for B_atom
in B_atoms:
350 partnerB_res[B_atom])
351 if atomdist_sq >= bondfilter_sq
and atomdist_sq > 16.0
and atomdist_sq <= 64.0:
361 if atomContactList != []:
363 resContactList.append(
resContact(partnerA_res, partnerB_res,
366 return resContactList
370 Takes the contactMap of a contactProtein structure and adds the contacts
371 into the original loadedPDB
372 In makeContactPDB(loadedPDB), an empty contactList was added to each
373 residue in the loaded PDB. This list will now be filled up with all the
374 contacts that each residue makes.
376 Also fill in Ncontacts and Nhbonds for each residue, which were initialized
383 for contact
in contactMap:
384 res1 = loadedPDB[contact.index1]
385 res2 = loadedPDB[contact.index2]
388 contact_copy1 = contact.copy()
389 contact_copy2 = contact.copy()
390 contact_copy1.reduceContact(
'res1')
391 contact_copy2.reduceContact(
'res2')
392 res1.contactList.append(contact_copy1)
393 res2.contactList.append(contact_copy2)
394 res1.NresContacts = res1.NresContacts + 1
395 res2.NresContacts = res2.NresContacts + 1
397 if contactType ==
'full':
398 res1.Ncontacts = res1.Ncontacts + contact.Ncontacts
399 res1.Nhbonds = res1.Nhbonds + contact.Nhbonds
400 res2.Ncontacts = res2.Ncontacts + contact.Ncontacts
401 res2.Nhbonds = res2.Nhbonds + contact.Nhbonds
405 This converts a contact profile (list of residue-residue contacts for each
406 residue) to a shell profile (list of the atomic shell for each residue).
408 Contact profile format is useful for listing residues that contact a
409 particular residue, but it is tedious for analyzing the atomic shell of a
410 residue since one must search the atoms of each residue in the list. Here
413 contact profile format:
415 contacts for residue LYS A 4:
419 contact shell format:
421 shell for residue LYS A 4:
426 This allows direct iteration over the atomic shell without having to iterate
427 through particular residues first.
429 Practically, for each atom in the shell, I want to keep track of printid
430 (residue label), index (list identifier used to compare different pdbs),
431 atom name, counts (used if multiple monomers are added), and an
432 I will keep an atom label that combines the index and the name for easy
433 referencing. These attributes will be recorded in the shellatom class
436 for res
in contactProfile:
437 setattr(res,
'atomShell', [])
438 for resContact
in res.contactList:
440 for atom_contact
in resContact.atomContactList:
442 if not(atom_contact.atom2
in atomList):
445 atomLabel = resContact.res2.name + atom_contact.atom2
448 res.atomShell.append(atomLabel)
449 atomList.append(atom_contact.atom2)
450 delattr(res,
'contactList')
457 getAtoms =
'centroid'
468 global flags dependent on mode:
470 smartfilter: filter on centroid-centroid distance in full atom mode,
471 but not in CA-CA mode or sidechain-sidechain centroid mode
473 contactAtoms: defines atom list to iterate over when calculating contacts.
474 Allows full atom mode and single atom modes to use the same contact-
477 contactType: control creation and printing of certain contact attributes
478 (Ncontacts, Nhbonds) only relevant to full atom mode
480 contactFilter: the contact distance cutoff is dependent on the mode.
481 Currently, it is 4.0A for full atom mode, 7.0A for CA-CA mode, and 6.0A for
482 sidechain-sidechain centroid mode.
484 getAtoms: global flag to determine if (whole residue) centroid and
485 sidechain centroid need to be initialized. To save computing time.
487 global smartfilter, contactType, contactAtoms, contactFilter, getAtoms
488 if mode ==
'fullatom':
492 if mode
in [
'calpha',
'sccentroid',
'disulf']:
494 contactType =
'brief'
499 if mode ==
'sccentroid':
500 contactAtoms =
'sccentroid'
502 getAtoms =
'sccentroid'
510 Add contacts to a Protein object from loadPDB.py
511 This is somewhat complicated because I need to convert the Protein into a
512 loadedPDB to calculate the contacts and then convert it back again.
513 However, this works efficiently because the getAllRes() function of a
514 Protein object maintains the original order of residues so that when
515 the protein is recreated using parse(), the original structure is
520 newLoadedPDB = startProtein.getAllRes()
522 newLoadedPDB = startProtein.getAllProteinRes()
523 print 'ligand flag off; not counting ligands in contacts'
525 newContactProtein.contactPDB = newLoadedPDB
526 newContactProtein.find_contacts(parserule)
527 newContactProtein.initContactProfile()
531 newProtein.parse(newContactProtein.contactPDB)
532 startProtein = newProtein
536 Calculate contacts between any two pdbs
546 The contactProtein class contains information about the protein's contacts
549 contact profile (contacts by residue)
565 print 'You are trying to calculate contacts without a loadedPDB'
566 print 'Call initializePDB(PDBfile) to load the PDB from a PDB'
567 print 'file or load the PDB externally and add it to the \
570 if contactType ==
'brief':
580 if self.
mode ==
'shell':
585 convert from contact profile format (list of residue contacts for each
586 residue) to shell format (list of atoms in the atomic shell of a given
593 Output method to either print contacts or write them to a file
594 dest = 'print' outputs the contact map to the screen
595 dest = 'file' outputs the contact map to a file
598 pdbid = string.split(PDBfile,
'.')[0]
599 outfname = pdbid +
'.contactmap'
600 output =
open(outfname,
'w')
609 printList = [resContact.res1.printid, resContact.res2.printid]
614 for item
in printList:
615 outstr = outstr + str(item) +
' '
619 output.write(outstr +
'\n')
623 Output method to print contact profile or write them to a file.
624 Number of contacts and hbonds for each residue
625 equivalent to find_interface.pl in interface mode
626 dest = 'print' sends the output to the screen
627 dest = 'file sends the output to a file (pdb.contactprofile)
629 print 'wtf!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
631 pdbid = string.split(PDBfile,
'.')[0]
632 outfname = pdbid +
'.contactprofile'
633 output =
open(outfname,
'w')
634 header =
'restype chain resnum NresContacts'
635 if contactType ==
'full':
636 header = header +
' Ncontacts Nhbonds'
640 output.write(header +
'\n')
642 if res.NresContacts > 0:
643 printList = [res.printid, res.NresContacts]
644 if contactType ==
'full':
645 printList = printList + [res.Ncontacts, res.Nhbonds]
647 for item
in printList:
648 outstr = outstr + str(item) +
' '
652 output.write(outstr +
'\n')
BooleanOptionKey const range
Fstring::size_type len(Fstring const &s)
Length.
bool open(utility::io::izstream &db_stream, std::string const &db_file, bool warn)
Open a database file on a provided stream.