Subfamily Classification and Functional Analysis of CAZy Family GH5
Developed a bioinformatics workflow for the classification and functional characterization of the CAZy GH5 glycoside hydrolase family, a highly diverse enzyme family involved in carbohydrate metabolism. The project integrated sequence analysis, domain annotation, network biology, and phylogenetics to identify GH5 subfamilies and investigate their potential functional relationships.
Key Technical Work
- Processed and curated large-scale protein sequence datasets
- Performed domain annotation using HMMER and dbCAN
- Extracted protein modules based on conserved domain architectures
- Constructed and analyzed Sequence Similarity Networks (SSNs) using SSNpipe
- Integrated characterized proteins and EC numbers for functional interpretation
- Visualized biological networks and protein relationships using Cytoscape
- Performed multiple sequence alignment with MAFFT as the basis for phylogenetic analysis
- Developed Linux/Bash-based workflows and automated analysis pipelines in High-Performance Computing (HPC) environments
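
As an illustration of the module-extraction step listed above, the following is a minimal Python sketch, not the project's actual code. It assumes domain hits have already been parsed into (protein ID, domain name, start, end) records with 1-based inclusive coordinates. The `DomainHit` type, the `extract_modules` function, and the "GH5" domain label are hypothetical names introduced only for this example.

```python
from typing import Dict, Iterable, NamedTuple


class DomainHit(NamedTuple):
    """One annotated domain on a protein (hypothetical record layout)."""
    protein_id: str
    domain: str
    start: int  # 1-based, inclusive (assumed coordinate convention)
    end: int    # 1-based, inclusive


def extract_modules(
    sequences: Dict[str, str],
    hits: Iterable[DomainHit],
    target: str = "GH5",
) -> Dict[str, str]:
    """Cut out the subsequence covered by each hit to the target domain.

    Returns a mapping from "protein|domain|start-end" to the module sequence.
    Hits on unknown proteins or with empty ranges are skipped.
    """
    modules: Dict[str, str] = {}
    for hit in hits:
        if hit.domain != target or hit.protein_id not in sequences:
            continue
        seq = sequences[hit.protein_id]
        # Clamp coordinates to the sequence bounds before slicing.
        start = max(hit.start, 1)
        end = min(hit.end, len(seq))
        if start > end:
            continue
        key = f"{hit.protein_id}|{hit.domain}|{start}-{end}"
        # Convert 1-based inclusive coordinates to a Python slice.
        modules[key] = seq[start - 1:end]
    return modules
```

Keying each module by protein ID and coordinates keeps multi-domain proteins distinguishable when the extracted modules are later used as input for similarity networks or alignments.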
GitHub Repository
Technologies
- Python
- Bash
- Linux
- HMMER
- dbCAN
- SSNpipe
- Cytoscape
- MAFFT
- High-Performance Computing (HPC)
Skills
- Protein Family Classification
- Bioinformatics Workflow Development
- Linux & Bash/Shell Scripting
- Phylogenetic Analysis
- Sequence Similarity Network Analysis
