Systems Biology, Systems Pharmacology, Biomedical Big Data, Bioinformatics, Computational Biology, Data Mining, Software Engineering, Network Analysis, Artificial Intelligence
Research Team:
Program Director: Sherry Jenkins, MS
Research Assistant Professor: Alexander Lachmann, PhD
Data Scientist: Daniel Clarke, MS
Bioinformatician: John Erol Evangelista, MS
Bioinformatics Software Engineers: Nasheath Ahmed, BS, AB; Anna Byrd, MEng; Ido Diamant, BS; Giacomo Marino, ScB, AB
Systems Analyst: Heesu Kim, MBA, MS
2024 Undergrad and Post-bac Research Trainees: Bilal Ali, Eugenia Ampofo, BA, Andrew Chung, Sophie Gideon, Eric Lee, Kareena Legare, Nathania Lingam, Tejal Nair, Lucas Sasaya, Andrew Stein
Summary of Research Studies:
Largest and Most Diverse Collection of Annotated Gene Sets
Gene set enrichment analysis is central to many biological and biomedical projects that measure mRNA and protein expression at the whole-genome scale. Such analysis is typically limited to a few literature-based background knowledge libraries, such as those created from the Gene Ontology and from pathway databases such as KEGG, WikiPathways, and Reactome. We have demonstrated that enrichment analysis can be expanded to use data from many other biological domains. In developing the tools Enrichr, Enrichr-KG, Rummagene, RummaGEO, kinase enrichment analysis (KEA), ChIP-seq enrichment analysis (ChEA), and Harmonizome, we integrated data from many key biomedical resources into useful gene set libraries. These libraries better inform enrichment analyses of omics studies. So far, over 2 million unique users have used these bioinformatics software applications, at a current rate of approximately 4,000 unique users per day.
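As an illustration of what enrichment against a gene set library involves, the following is a minimal sketch of a generic over-representation test: a right-tailed hypergeometric test on the overlap between a query gene list and each library term. It is not the scoring used by any of the tools named above. The function names, the toy library, the placeholder gene symbols, and the background size are all hypothetical, and the sketch assumes every gene in the query and in each gene set belongs to the background.

```python
from math import comb


def overlap_pvalue(query, gene_set, background_size):
    """Right-tailed hypergeometric p-value for the overlap of two gene lists.

    Assumes (for this sketch) that all genes in `query` and `gene_set`
    are drawn from a background of `background_size` genes.
    """
    query, gene_set = set(query), set(gene_set)
    k = len(query & gene_set)   # observed overlap
    n = len(query)              # query size
    K = len(gene_set)           # gene set size
    N = background_size         # background size
    # P(overlap >= k) when n genes are drawn at random from N
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


def rank_terms(query, library, background_size):
    """Return (term, p-value) pairs sorted from most to least enriched."""
    scored = [
        (term, overlap_pvalue(query, genes, background_size))
        for term, genes in library.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy library; gene symbols are placeholders, not real annotations.
toy_library = {
    "term_A": ["G1", "G2", "G3", "G4", "G5"],
    "term_B": ["G10", "G11", "G12", "G13", "G14"],
}
toy_query = ["G1", "G2", "G3", "G20"]

ranked = rank_terms(toy_query, toy_library, background_size=20)
```

In practice the p-values would also be corrected for multiple testing across all terms in a library, which this sketch omits.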
Original Methods to Identify Differentially Expressed Genes, Perform Gene Set Enrichment Analyses, and Benchmark these Data Analysis Methods
One of the key statistical tasks in the field of transcriptomics is the identification of differentially expressed genes. We developed a multivariate method called the Characteristic Direction to better identify the “correct” differentially expressed genes. The Characteristic Direction method was extended to also perform improved enrichment analysis using a similar concept. Using a unique benchmarking strategy, we can objectively evaluate the Characteristic Direction method against many other leading methods for differential expression and enrichment analyses, such as limma, GSEA, and DESeq.
Translational Computational Research in Cancer and Kidney Disease
In collaboration with other experimental and computational biology laboratories, we have made great strides in the past several years in studying kidney disease, diabetes, HIV, and cancer. We have developed unique computational methods that led to the identification of potential targets and drugs for attenuating kidney fibrosis, diabetic kidney disease, and HIV-associated nephropathy (HIVAN). Our collaborative work also proposed treatment combinations for early-stage kidney disease intervention. These advances were made possible by applying algorithms that we developed, including Expression2Kinases, SigCom LINCS, and TargetRanger.
Innovative Bioinformatics Software Infrastructure
To lower the barrier to entry for bioinformaticians and to streamline the development of bioinformatics software applications, we developed Appyters. With Appyters, bioinformaticians can rapidly develop full-stack web-based bioinformatics applications from their Jupyter Notebooks. Currently, over 100 Appyters are available from the Appyters Catalog. For a CFDE Partnership project, our team developed the Playbook Workflow Builder, a platform that facilitates the visual, dynamic construction of bioinformatics workflows. Alongside these efforts, we also created FAIRshake, a flexible framework for performing manual and automated evaluation of digital objects for adherence to defined, community-established standards.
For more information, please visit the Ma'ayan Laboratory website.