
IBD Plexus Data Scientist

Play an influential role in a multimillion-dollar initiative to build a novel Inflammatory Bowel Diseases (IBD) resource to speed progress toward precision medicine, leading to better diagnostics, treatments and, ultimately, cures for Crohn’s disease and ulcerative colitis.

About IBD Plexus:

IBD Plexus (named for its complex network of parts) is a first-of-its-kind, IBD-specific research and information exchange platform.

  • IBD Plexus is transforming how research is being conducted by involving every stakeholder in both the design and operations of the platform. Stakeholders include:
    • Researchers and scientists from the top academic institutions
    • Leading experts from both the pharmaceutical and nutrition industry
    • Clinicians and other healthcare providers
    • Patients of all ages
  • IBD Plexus utilizes best-in-class technology and novel data sharing / collaboration mechanisms to enable scientists, researchers, clinicians and patients to capture, organize, mine and analyze data for new insights into Crohn's disease and ulcerative colitis
  • IBD Plexus aggregates data and information from adult and pediatric clinical registries, a central biobank, molecular reference labs, a quality of care improvement program, and a patient-powered research network
  • IBD Plexus will link clinical data such as electronic medical record data, biosample metadata including derived genetic / multi-omics data, and patient-reported / patient-generated data from mobile health applications and wearables


Job Description

Position Summary

The Crohn’s & Colitis Foundation is offering an exciting opportunity for a biologically focused data scientist to dramatically impact the field of IBD. Under the general supervision of the director of IBD Plexus, the IBD Plexus Data Scientist will lead activities across the data management and analytics lifecycle: use case definition, experimental and clinical data collection, cleaning, integration, visualization, and exploration. The Data Scientist will have in-depth knowledge of methods for managing clinical and multi-omic data. S/he will work closely with academic and industry researchers to provision research-ready datasets and provide services to aid in study design, data management, and the analysis and interpretation of study findings. The Data Scientist will also work within and across the IBD Plexus study cohorts to facilitate demonstration projects and to help train IBD Plexus users to better leverage the platform.


This position offers the flexibility of being based at the Foundation’s New York City headquarters or working remotely from home.

Some travel is required for internal meetings, conferences, etc.

Essential Functions & Responsibilities

Data Management

  • Provide direction for data management and scientific analysis activities for IBD Plexus
  • Maintain the data glossary and data dictionary
  • Lead adoption of technologies related to ingestion and integration of data such as metadata management tools
  • Aid in establishing data governance standards, policies, and processes, including data quality plans
  • Assist in the development of reporting tools, metrics and queries to monitor the program
  • Act as scientific contact for data integration vendors

Data Services

  • Collaborate with consortium partners and investigators from IBD Plexus study cohorts to ensure data integration and analysis needs are met
  • Provide consultation to data users (e.g., researchers from industry, academia, government, and other stakeholder groups) on study design and feasibility, and facilitate understanding and appropriate use of existing data assets. This may also include assisting with “preparatory to research” queries.
  • Provide services to aid in querying the database and disseminating data, including preparing research-ready datasets and conducting data curation
  • Provide training to users (internal and external, ranging from data analysts to researchers) on analytics capabilities in terms of end users’ data requirements


The ideal candidate will have:

  • At least 4 years of experience working in an analytic/data analysis environment, with substantial experience in the use of healthcare, translational or biological data and the application of appropriate analyses and reporting
  • An advanced degree, or equivalent experience, in Computer Science, Biostatistics, Biophysics, Epidemiology or a related field with a substantial quantitative and computational component
  • Working knowledge of data flows, data architecture, ETL and processing of structured, unstructured and semi-structured data
  • Demonstrated knowledge of retrospective and prospective observational study designs and related methodology
  • Knowledge of medical terminology, clinical epidemiology and biostatistics
  • Experience working with diverse data sources:  clinical study data, electronic health record data, laboratory information management data, claims data, and disease/patient registry data
  • Experience working with large molecular datasets
  • Working knowledge of SQL and relational databases
  • Excellent problem identification, analysis, and solving skills
  • Excellent written and oral communication skills
  • Self-starter, willingness to learn various technical skills, resourcefulness
  • Highly motivated and independent

Preferred skill set:

  • Knowledge of or familiarity with research-focused analytics platforms such as i2b2 and tranSMART
  • Experience with Hadoop platforms (e.g., Cloudera) and related “big data” technologies
  • Experience working in Cloud Computing environments
  • Knowledge of statistical packages and languages such as R, Python, SAS, SPSS, or Stata