Application overview
Michał Nowotka
ChEMBL Group
EMBL-EBI
Languages used:
Network architecture: (thin) client-server
Model View Controller
Object-relational mapping (ORM): a technique for converting data between incompatible type systems in object-oriented programming languages. This creates, in effect, a "virtual object database" that can be used from within the programming language.
In simple words:
class CompoundStructures(ChemblCoreAbstractModel):
    molecule = models.OneToOneField(MoleculeDictionary)
    molfile = ChemblTextField(null=True)
    standard_inchi = ChemblCharField(max_length=4000)
    standard_inchi_key = ChemblCharField(unique=True)
    canonical_smiles = ChemblCharField(db_index=True)
    molformula = ChemblCharField(help_text="Molecular formula of compound")
    ...
The class code is generated semi-automatically from the existing database.
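To make the mapping idea concrete, here is a minimal sketch built only on the standard library's sqlite3 module. It is not the Django ORM and not the ChEMBL code: the `Molecule` class, its `get` method and the three-column table are hypothetical simplifications, and only the sample values (molregno 97, PRAZOSIN, Small molecule) come from the examples in this deck.

```python
import sqlite3


class Molecule:
    """Hypothetical stand-in for a model class: one table row becomes one object."""

    table = "molecule_dictionary"
    fields = ("molregno", "pref_name", "molecule_type")

    def __init__(self, **values):
        for name in self.fields:
            setattr(self, name, values.get(name))

    @classmethod
    def get(cls, conn, **criteria):
        # Only known column names may be interpolated into the SQL text.
        for column in criteria:
            if column not in cls.fields:
                raise ValueError("unknown field: %s" % column)
        where = " AND ".join("%s = ?" % column for column in criteria)
        sql = "SELECT %s FROM %s WHERE %s" % (", ".join(cls.fields), cls.table, where)
        row = conn.execute(sql, tuple(criteria.values())).fetchone()
        if row is None:
            raise LookupError("no matching row")
        # Map the row's columns back onto object attributes.
        return cls(**dict(zip(cls.fields, row)))


# Assumed, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecule_dictionary "
    "(molregno INTEGER PRIMARY KEY, pref_name TEXT, molecule_type TEXT)"
)
conn.execute("INSERT INTO molecule_dictionary VALUES (97, 'PRAZOSIN', 'Small molecule')")

molecule = Molecule.get(conn, molregno=97)
print(molecule.pref_name, "/", molecule.molecule_type)
```

The caller never writes SQL: keyword arguments become the WHERE clause and the result comes back as an ordinary Python object, which is the pattern the model classes in this deck rely on.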
from chembl.models import MoleculeDictionary
molecule = MoleculeDictionary.objects.get(molregno=97)
assertEqual(molecule.pref_name, 'PRAZOSIN')
assertEqual(molecule.molecule_type, 'Small molecule')
Assays.objects.filter(curated_by__curated_by__startswith='Expert')
Assays.objects.filter(description__icontains='affinity')
Assays.objects.filter(assay_cell_type__startswith='CHO')
Assays.objects.filter(assay_tissue__endswith='Brain')
Assays.objects.filter(chembl__isnull=False).exclude(chembl__entity_type__exact='ASSAY')
Assays.objects.filter(updated_on__range=(start_date, end_date))
Assays.objects.filter(activity_count__isnull=False).exclude(activity_count__gte=5)
Assays.objects.filter(doc__doc_id=9964)
Assays.objects.filter(src__src_id=1).exists()
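The `field__lookup` keyword syntax used in the filters above can be illustrated with a small pure-Python sketch. It evaluates a handful of lookups against dictionaries instead of generating SQL, and it does not follow relations such as `doc__doc_id`. The `matches` and `filter_records` helpers and the sample records are hypothetical, written only to show what each lookup means.

```python
def matches(record, **lookups):
    """Return True if a dict record satisfies every Django-style lookup."""
    operators = {
        "exact": lambda value, arg: value == arg,
        "startswith": lambda value, arg: value is not None and value.startswith(arg),
        "endswith": lambda value, arg: value is not None and value.endswith(arg),
        "icontains": lambda value, arg: value is not None and arg.lower() in value.lower(),
        "gte": lambda value, arg: value is not None and value >= arg,
        "range": lambda value, arg: value is not None and arg[0] <= value <= arg[1],
        "isnull": lambda value, arg: (value is None) == arg,
    }
    for key, arg in lookups.items():
        # "description__icontains" -> field "description", operator "icontains"
        field, _, op = key.partition("__")
        if not operators[op or "exact"](record.get(field), arg):
            return False
    return True


def filter_records(records, **lookups):
    """Keep the records that satisfy all lookups (conditions are ANDed)."""
    return [record for record in records if matches(record, **lookups)]


# Made-up sample data, not taken from ChEMBL.
assays = [
    {"description": "Binding affinity at receptor", "assay_cell_type": "CHO-K1",
     "assay_tissue": "Brain", "activity_count": 3},
    {"description": "Enzyme inhibition", "assay_cell_type": None,
     "assay_tissue": "Liver", "activity_count": None},
]

print(len(filter_records(assays, description__icontains="AFFINITY")))
print(len(filter_records(assays, assay_tissue__endswith="Brain")))
print(len(filter_records(assays, activity_count__isnull=True)))
```

In the real ORM the same keywords are compiled into a WHERE clause and evaluated by the database, so no rows are loaded into Python just to be filtered.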
# test OneToOneFields:
assertEqual(molecule.compoundproperties.molecular_species, 'NEUTRAL')
assertEqual(molecule.moleculehierarchy.parent_molecule, molecule)
assertEqual(molecule.compoundstructures.standard_inchi_key, 'IENZQIKPVFGBNW-UHFFFAOYSA-N')
act = molecule.activities_set.all()[0]
assertEqual(act.activity_type, 'ED50')
rec = molecule.compoundrecords_set.all()[0]
synonyms = molecule.moleculesynonyms_set.all()[0]
assertEqual(synonyms.synonyms, 'CP-12299')
td = TargetDictionary.objects.get(pk=104088)
docs = td.docs.all()
doc = Docs.objects.get(pk=57482)
targets = doc.targetdictionary_set.all()
ctabs = CompoundMols.objects.with_substructure(smiles)
ids = ctabs.values_list('molecule_id').distinct()
ctabs = CompoundMols.objects.similar_to(smiles, simscore)
TargetType.objects.filter(parent_type__isnull=False).exclude(
    parent_type__in=map(lambda x: x[0],
        TargetType.objects.values_list('target_type').distinct())).exists()
def checkOSRA(molecule):
    # PNG rendering of the compound as stored in the database
    img = molecule.compoundimages.png_500
    im = Image.open(StringIO.StringIO(img))
    canonical_smiles = molecule.compoundstructures.canonical_smiles
    smile = smileFromImage(img, OSRA_BINARIES_LOCATION, canonical_smiles)
    im.show()
    # True when the SMILES read back from the image matches the stored one
    return canonical_smiles == Chem.MolToSmiles(Chem.MolFromSmiles(smile[0]), True)
All examples are taken from the test.py file (2267 lines!). All classes/fields/relations are covered.
python manage.py test chembl
python manage.py shell
The shell can be configured to display the SQL statements executed by the middleware:
python manage.py debugsqlshell
>>> from chembl.models import MoleculeDictionary
>>> molecules = MoleculeDictionary.objects.all()
>>> molecules.count()
SELECT COUNT(*)
FROM "MOLECULE_DICTIONARY" [1.82ms]
1254575
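The same idea, echoing every SQL statement as it runs, can be tried outside Django with the standard library's sqlite3 trace callback. This is only an illustration of the concept, not how `debugsqlshell` is implemented; the table and its five rows are made up for the demo.

```python
import sqlite3

executed = []  # every SQL statement the connection runs is appended here

conn = sqlite3.connect(":memory:")
conn.set_trace_callback(executed.append)

# Assumed toy table; the real MOLECULE_DICTIONARY has many more columns.
conn.execute("CREATE TABLE molecule_dictionary (molregno INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO molecule_dictionary VALUES (?)", [(n,) for n in range(1, 6)]
)
count = conn.execute("SELECT COUNT(*) FROM molecule_dictionary").fetchone()[0]

for statement in executed:
    print(statement)
print(count)
```

Seeing the generated SQL next to the Python expression is mainly useful for spotting queries that are issued more often, or fetch more, than expected.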
limit
python manage.py migrate --sourceDatabase=ora --targetDatabase=pg
Migrating chemtst to curation_interface
ChemblIdLookup [################################] 2010/2010
Version [################################] 1/1
Docs [################################] 48/48
Source [################################] 1/1
MoleculeDictionary [# ] 51/1281
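A table-by-table copy with a progress bar, as in the output above, might look roughly like the following sketch. It uses two in-memory SQLite databases in place of Oracle and PostgreSQL, and the `copy_table` and `progress_bar` helpers and the two-column `docs` schema are hypothetical; the actual `migrate` command is not shown in this deck.

```python
import sqlite3
import sys


def progress_bar(done, total, width=32):
    """Render a fixed-width bar such as '[####    ] 4/8'."""
    filled = width * done // total if total else width
    return "[%s%s] %d/%d" % ("#" * filled, " " * (width - filled), done, total)


def copy_table(source, target, table, columns, out=sys.stdout):
    """Copy all rows of one table from source to target, reporting progress."""
    column_list = ", ".join(columns)
    rows = source.execute("SELECT %s FROM %s" % (column_list, table)).fetchall()
    placeholders = ", ".join("?" for _ in columns)
    insert = "INSERT INTO %s (%s) VALUES (%s)" % (table, column_list, placeholders)
    for done, row in enumerate(rows, start=1):
        target.execute(insert, row)
        # "\r" redraws the bar on the same terminal line.
        out.write("\r%-20s %s" % (table, progress_bar(done, len(rows))))
    out.write("\n")
    target.commit()
    return len(rows)


# Assumed toy schema, created identically on both sides.
schema = "CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, title TEXT)"
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute(schema)
target.execute(schema)
source.executemany(
    "INSERT INTO docs VALUES (?, ?)", [(i, "doc %d" % i) for i in range(1, 49)]
)

copied = copy_table(source, target, "docs", ("doc_id", "title"))
```

A real migration also has to copy tables in dependency order so that foreign keys resolve, which is consistent with the lookup tables appearing before MoleculeDictionary in the output above.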
All model classes are automatically exposed as REST resources. Documentation is generated automatically as well!
http://localhost:8000/api/v1/moleculedictionary/?format=json
http://localhost:8000/api/v1/compoundstructures/2?format=json
http://localhost:8000/api/v1/compoundstructures/2.json
http://localhost:8000/api/v1/targetdictionary/set/11283;11292/
/api/v1/moleculedictionary/?molregno__gte=3&format=json
/api/v1/moleculedictionary/?molregno__gte=3&structure_type__startswith=M&format=json
/api/v1/compoundstructures/?molformula__icontains=H20&format=json
http://localhost:8000/api/v1/moleculedictionary/search/?q=TRIMETHOPRIM&format=json
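Filter URLs like the ones above are easy to build programmatically. The sketch below only constructs and parses the query string with the standard library; it does not contact a server. The `resource_url` helper is hypothetical, and the base URL is the localhost address shown on these slides.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://localhost:8000/api/v1"  # base URL as shown on the slides


def resource_url(resource, **filters):
    """Build a list URL for a resource; filters use the ORM lookup syntax."""
    params = dict(filters)
    params.setdefault("format", "json")  # ask for JSON unless told otherwise
    return "%s/%s/?%s" % (BASE, resource, urlencode(params))


url = resource_url(
    "moleculedictionary", molregno__gte=3, structure_type__startswith="M"
)
print(url)

# Parsing the query string recovers the same lookups on the receiving side.
print(parse_qs(urlsplit(url).query))
```

The query parameters reuse the lookup names from the ORM examples (`__gte`, `__startswith`, `__icontains`), so a filter written for the Python API carries over to the REST interface almost unchanged.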
Questions?