An interactive Streamlit application that demonstrates AI-powered protein structure prediction with ESMFold, a faster model similar to AlphaFold. Designed for biology students with no programming background!
- Predicts 3D protein structures from amino acid sequences
- Visualizes proteins in interactive 3D
- Explains the difference between creating vs using AI models
- Introduces modern bio-AI tools and resources
- Encourages hands-on exploration ("vibe coding")
You'll need:
- A computer (Windows, Mac, or Linux)
- Internet connection
- 15 minutes for setup
We've included automated setup scripts to make installation easier!
Windows:
- Double-click setup.bat
- Wait for installation to complete
- Double-click run.bat to start the app
Mac/Linux:
- Open Terminal in the project folder
- Run: bash setup.sh
- Run: bash run.sh to start the app
The scripts will automatically:
- Create a virtual environment
- Install all dependencies
- Launch the app in your browser
Windows:
- Go to https://www.python.org/downloads/
- Download Python 3.10 or newer
- Run the installer
- IMPORTANT: Check the box "Add Python to PATH"
- Click "Install Now"
Mac:
- Open Terminal (search for "Terminal" in Spotlight)
- Install Homebrew if you don't have it:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install Python:
brew install python
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install python3 python3-pip python3-venv
Option A: Using Git (if you have it)
git clone <your-repository-url>
cd alphafolddemo
Option B: Download ZIP
- Click the green "Code" button on GitHub
- Click "Download ZIP"
- Extract the ZIP file
- Open Terminal/Command Prompt and navigate to the folder:
cd path/to/alphafolddemo
A virtual environment keeps this project's packages separate from your system Python.
Windows:
# Create virtual environment
python -m venv venv-windows
# Activate it
venv-windows\Scripts\activate
# Install packages
pip install -r requirements.txt
Mac/Linux:
# Create virtual environment
python3 -m venv venv
# Activate it
source venv/bin/activate
# Install packages
pip install -r requirements.txt
You should see (venv-windows) (Windows) or (venv) (Mac/Linux) at the beginning of your command prompt, indicating the virtual environment is active.
This will install:
- streamlit - The web app framework
- py3Dmol - 3D molecule visualization
- stmol - Streamlit integration for py3Dmol
- requests - For making API calls
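If you want to confirm that the install worked before launching the app, a small check like the one below can help. It is only a sketch: the module names are the import names of the four packages listed above, the 3.10 minimum comes from the Python download step, and the helper names `python_is_new_enough` and `missing_modules` are made up for this example.

```python
import importlib.util
import sys

# Import names of the packages this project depends on.
REQUIRED_MODULES = ["streamlit", "py3Dmol", "stmol", "requests"]


def python_is_new_enough(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the given Python version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_is_new_enough())
    print("Missing packages:", missing_modules() or "none")
```

Run it with the virtual environment activated. If any package is reported missing, repeat the pip install step.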
Make sure your virtual environment is activated (you should see (venv) or (venv-windows) in your prompt), then run:
streamlit run app.py
The app will automatically open in your web browser at http://localhost:8501
If it doesn't open automatically, copy that URL into your browser.
Important: Every time you open a new terminal to run the app, you need to activate the virtual environment first:
- Windows: venv-windows\Scripts\activate
- Mac/Linux: source venv/bin/activate
- Choose a protein: Select from the dropdown menu or paste your own sequence
- Click "Predict Structure": Wait 30-60 seconds for the prediction
- Explore the 3D visualization: Rotate, zoom, and examine the structure
- Read the educational content: Check the sidebar and tabs for learning materials
- Download structures: Save PDB files for further analysis
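As one example of "further analysis", a downloaded PDB file can be read with plain Python. The sketch below counts residues through their C-alpha atoms and averages the B-factor column, which structure predictors commonly use to store per-residue confidence. Whether this app's files follow that convention is an assumption, and `summarize_pdb` is a hypothetical helper, not part of app.py.

```python
def summarize_pdb(pdb_text):
    """Count residues and average the B-factor column over C-alpha atoms."""
    b_factors = []
    for line in pdb_text.splitlines():
        # ATOM records use fixed columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            b_factors.append(float(line[60:66]))
    if not b_factors:
        raise ValueError("no C-alpha ATOM records found")
    return {
        "residues": len(b_factors),
        "mean_b_factor": sum(b_factors) / len(b_factors),
    }
```

For a saved file, read its contents into a string first and pass that string to `summarize_pdb`.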
- app.py - The main Streamlit application
- requirements.txt - Python package dependencies
- PRESENTATION.md - Complete presentation slides (20+ slides)
- README.md - This file!
- setup.sh / setup.bat - Automated setup scripts (Mac/Linux & Windows)
- run.sh / run.bat - Quick run scripts (Mac/Linux & Windows)
- venv/ - Mac/Linux virtual environment folder (created after setup.sh)
- venv-windows/ - Windows virtual environment folder (created after setup.bat)
- Interactive 3D protein visualization with py3Dmol
- Pre-loaded example proteins (insulin, lysozyme, myoglobin)
- Educational sidebar explaining:
  - What is AlphaFold?
  - Creating vs Using AI models
  - Other bio-AI tools (Benchling, PubMed, BioRender, etc.)
  - Career paths in computational biology
- Three educational tabs:
  - About Proteins (biology basics)
  - About the AI (how AlphaFold works)
  - Vibe Coding Tips (getting started with programming)
- Download predictions as PDB files
- Free API usage (ESMFold from Meta)
- Human Insulin - Hormone regulating blood sugar
- Human Lysozyme - Antibacterial enzyme
- Myoglobin - Oxygen storage protein
- Custom - Enter your own sequence!
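When pasting a custom sequence, stray header lines, spaces, or non-amino-acid characters are a common source of failed predictions. The sketch below shows one way to tidy a pasted sequence and list anything that is not one of the 20 standard one-letter codes. The function names are hypothetical, and app.py may already do its own cleaning.

```python
# One-letter codes of the 20 standard amino acids.
VALID_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw_text):
    """Drop FASTA header lines and whitespace, then upper-case the residues."""
    lines = [line.strip() for line in raw_text.splitlines()]
    residues = "".join(line for line in lines if not line.startswith(">"))
    return "".join(residues.split()).upper()


def find_invalid_residues(sequence):
    """Return the sorted characters that are not standard amino acids."""
    return sorted(set(sequence) - VALID_AMINO_ACIDS)
```

An empty list from `find_invalid_residues` means the cleaned sequence contains only standard residues.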
Make sure your virtual environment is activated first:
Windows:
venv-windows\Scripts\activate
streamlit run app.py
Mac/Linux:
source venv/bin/activate
streamlit run app.py
If that doesn't work, run streamlit via Python:
Windows:
python -m streamlit run app.py
Mac/Linux:
python3 -m streamlit run app.py
Make sure your virtual environment is activated, then reinstall:
# Activate venv first (see above)
pip install --upgrade -r requirements.txt
Windows (PowerShell): You may need to enable script execution:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Then try activating again:
venv-windows\Scripts\activate
Mac/Linux: Make sure you're using source:
source venv/bin/activate
- Check your internet connection
- Try a shorter protein sequence (< 200 amino acids)
- Wait a few minutes and try again (API might be busy)
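The "wait and try again" advice can also be automated. The following is a minimal sketch of retrying with a growing delay, not something the app is known to do. `call_with_retries` is a hypothetical helper, and the attempt count and delays are arbitrary starting values.

```python
import time


def call_with_retries(action, attempts=3, first_delay=5.0, sleep=time.sleep):
    """Run action(); on failure wait and retry, doubling the delay each time."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
```

Pass it any zero-argument function that performs the prediction request. The `sleep` parameter exists only so the waiting can be replaced in tests.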
Change the port:
streamlit run app.py --server.port 8502
Manually open: http://localhost:8501
# 1. Import libraries
import streamlit as st # Web app framework
import requests # API calls
import py3Dmol # 3D visualization
from stmol import showmol # Display molecules
# 2. Set up the page
st.set_page_config(...) # Configure appearance
# 3. Create UI elements
st.header("...") # Headers
st.text_area("...") # Input boxes
st.button("...") # Buttons
# 4. Make predictions
response = requests.post(api_url, data=sequence)
# 5. Visualize results
view = py3Dmol.view() # Create viewer
view.addModel(pdb_data) # Add structure
showmol(view) # Display
Your Sequence
↓
requests.post() sends to ESMFold API
↓
ESMFold AI model predicts structure
↓
Returns PDB file (3D coordinates)
↓
py3Dmol visualizes in browser
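The request step of this flow can be sketched as a single function. The endpoint URL below is an assumption (it is not stated in this README, so check app.py for the one actually used), and `predict_structure` is a hypothetical name. The `post` parameter lets you swap in a fake for offline testing.

```python
# Assumed endpoint of the public ESMFold service; confirm it against app.py.
ESMFOLD_URL = "https://api.esmatlas.com/foldSequence/v1/pdb/"


def predict_structure(sequence, post=None, timeout=120):
    """Send a sequence to the folding API and return the PDB text it replies with."""
    if post is None:
        import requests  # imported lazily so the function can be tested offline
        post = requests.post
    response = post(ESMFOLD_URL, data=sequence, timeout=timeout)
    response.raise_for_status()  # turn HTTP error codes into exceptions
    pdb_text = response.text
    if "ATOM" not in pdb_text:
        raise ValueError("response does not look like a PDB file")
    return pdb_text
```

The returned string is what gets handed to py3Dmol for display or saved as a .pdb file.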
- Add your own protein examples:
  - Find a protein on UniProt (https://www.uniprot.org/)
  - Copy the sequence
  - Add it to the example_proteins dictionary in app.py
- Change colors:
  - Find view.setStyle({'cartoon': {'color': 'spectrum'}})
  - Change 'spectrum' to 'red', 'blue', 'green', etc.
- Add information:
  - Edit the sidebar expanders
  - Add your favorite proteins or diseases
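For the first idea, adding an example might look roughly like this. The dictionary shape (name mapped to sequence) is a guess, so the real `example_proteins` in app.py may be structured differently. `add_example` is a hypothetical helper, and the two insulin chains are only sample data, so copy the sequence you actually want from UniProt.

```python
# Hypothetical shape for the dictionary; the real one in app.py may differ.
example_proteins = {
    "Insulin B chain": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",
}


def add_example(proteins, name, sequence):
    """Add a named sequence, normalising case and refusing duplicate names."""
    if name in proteins:
        raise KeyError(f"{name!r} is already in the examples")
    # Remove spaces and line breaks that often come along when copying.
    proteins[name] = "".join(sequence.split()).upper()
    return proteins


add_example(example_proteins, "Insulin A chain", "giveqcctsi cslyqlenycn")
```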
- Add protein properties:
  - Calculate molecular weight
  - Count amino acid types
  - Predict isoelectric point
- Batch predictions:
  - Upload a FASTA file with multiple sequences
  - Predict all structures
  - Download as a ZIP file
- Compare structures:
  - Predict two proteins
  - Visualize side-by-side
  - Calculate RMSD (structural similarity)
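The property and batch ideas need only the standard library to get started. The sketch below counts amino acids, estimates molecular weight, and splits FASTA text into records. The residue masses are approximate averages rounded to two decimals, and all three function names are hypothetical. BioPython offers more complete versions of the same calculations.

```python
from collections import Counter

# Approximate average residue masses in daltons (rounded to two decimals).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.11, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.02


def composition(sequence):
    """Count how many times each amino acid occurs."""
    return Counter(sequence.upper())


def molecular_weight(sequence):
    """Sum the residue masses and add one water for the free chain ends."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def parse_fasta(text):
    """Return {header: sequence} for every record in FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif line and header is not None:
            records[header] += line.upper()
    return records
```

Each sequence returned by `parse_fasta` could then be predicted in a loop, which is the core of the batch idea.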
- Use real AlphaFold:
  - Set up Google Colab
  - Run full AlphaFold2
  - Get highest accuracy predictions
- Add analysis tools:
  - Ramachandran plots
  - Secondary structure prediction
  - Binding site identification
- Create database:
  - Store predictions in SQLite
  - Track prediction history
  - Compare multiple versions
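For the database idea, Python's built-in sqlite3 module is enough for a first version. This is a minimal sketch under assumed names: the `predictions` table, its columns, and the three helper functions are all invented for illustration.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS predictions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               name TEXT NOT NULL,
               sequence TEXT NOT NULL,
               pdb_text TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def save_prediction(conn, name, sequence, pdb_text):
    """Store one prediction and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO predictions (name, sequence, pdb_text) VALUES (?, ?, ?)",
            (name, sequence, pdb_text),
        )
    return cursor.lastrowid


def history_for(conn, name):
    """Return (id, sequence) rows for one protein, oldest first."""
    return conn.execute(
        "SELECT id, sequence FROM predictions WHERE name = ? ORDER BY id",
        (name,),
    ).fetchall()
```

Passing a file path instead of the default ":memory:" keeps the history between runs.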
- Protein Data Bank: https://www.rcsb.org/
- UniProt: https://www.uniprot.org/
- NCBI: https://www.ncbi.nlm.nih.gov/
- AlphaFold Database: https://alphafold.ebi.ac.uk/
- ESMFold: https://esmatlas.com/
- Original Paper: Nature (2021) "Highly accurate protein structure prediction with AlphaFold"
- Rosalind (bioinformatics problems): http://rosalind.info/
- Python for Biologists: https://pythonforbiologists.com/
- BioPython Tutorial: https://biopython.org/wiki/Documentation
- Kaggle (data science): https://www.kaggle.com/learn
- PyMOL: https://pymol.org/ (professional visualization)
- ChimeraX: https://www.cgl.ucsf.edu/chimerax/ (free, powerful)
- Mol*: https://molstar.org/ (web-based)
- Claude: https://claude.ai/ (AI assistant with bio MCP servers)
- Benchling: https://www.benchling.com/ (lab management)
- BioRender: https://biorender.com/ (scientific figures)
- PubMed: https://pubmed.ncbi.nlm.nih.gov/ (literature)
The PRESENTATION.md file contains a complete 20-slide presentation covering:
- Introduction to protein folding
- The AlphaFold breakthrough
- Creating vs using AI models
- Foundation models in biology
- Other bio-AI tools
- Career paths
- Vibe coding philosophy
- Hands-on exercises
- Real-world impact
- Getting started guide
Option 1: Convert to PowerPoint
Use a markdown-to-slides tool:
- Marp: https://marp.app/
- Slidev: https://sli.dev/
- Deckset (Mac): https://www.deckset.com/
Option 2: Present from Markdown
- Open in a markdown viewer
- Use presentation mode in VS Code
- Convert to PDF and present
Option 3: Create Custom Slides
- Use the content as a script
- Create slides in PowerPoint/Google Slides
- Add images and animations
- Introduction (5 min)
  - Slides 1-3: Problem setup
- AlphaFold Breakthrough (10 min)
  - Slides 4-6: The solution
- Creating vs Using (10 min)
  - Slides 5-8: Key distinction
- Live Demo (10 min)
  - Slide 10: Use the app!
- Career Paths (10 min)
  - Slides 11-12: Future opportunities
- Hands-On (15 min)
  - Slide 13: Let students try it
- Q&A (10 min)
  - Slide 19: Discussion
Total: ~70 minutes (adjust as needed)
- Learn structural biology: See how sequence determines structure
- Explore proteins: Visualize your favorite proteins
- Class projects: Use for presentations and reports
- Research: Predict structures for your lab work
- Lecture demos: Show real AI in action
- Lab exercises: Let students predict structures
- Assignments: "Predict and analyze protein X"
- Inspiration: Encourage computational thinking
- Quick predictions: Fast structure estimation
- Hypothesis generation: "What if this mutant..."
- Preliminary analysis: Before expensive experiments
- Visualization: Share structures with collaborators
If you use this demo in your work or teaching, please cite:
AlphaFold:
Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure
prediction with AlphaFold. Nature 596, 583–589 (2021).
https://doi.org/10.1038/s41586-021-03819-2
ESMFold:
Lin, Z., Akin, H., Rao, R. et al. Evolutionary-scale prediction of atomic-level
protein structure with a language model. Science 379, 1123–1130 (2023).
https://doi.org/10.1126/science.ade2574
This demo is for educational purposes. The AlphaFold and ESMFold models have their own licenses:
- AlphaFold: Apache 2.0 License
- ESMFold: MIT License
- This demo code: MIT License (free to use, modify, share)
Want to improve this demo? Great! Here's how:
- Fork the repository
- Make your changes
- Test thoroughly
- Submit a pull request
- Describe what you changed and why
Ideas for contributions:
- Add more example proteins
- Improve visualization options
- Add more educational content
- Fix bugs or improve performance
- Translate to other languages
- Add accessibility features
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions and share ideas
- Email: [Your email if you want to provide support]
- ESMFold is slightly less accurate than AlphaFold2
- Long sequences (>400 aa) may be slow or fail
- No support for protein complexes (use AlphaFold-Multimer instead)
- Requires internet connection
- API may have rate limits
- DeepMind for AlphaFold
- Meta AI for ESMFold and the free API
- Streamlit for the web framework
- 3Dmol.js for visualization
- The open-source bioinformatics community
Q: Is this real AlphaFold? A: This demo uses ESMFold, a similar but faster model from Meta. For the highest accuracy, use AlphaFold2.
Q: Can I use this for my research? A: Yes! But validate critical predictions experimentally.
Q: Do I need a GPU? A: No! The prediction happens on Meta's servers.
Q: Is it free? A: Yes! ESMFold's API is currently free to use.
Q: How accurate is it? A: ~85-90% accuracy on average, similar to experimental methods for many proteins.
Q: Can I predict protein complexes? A: Not with this demo. Use AlphaFold-Multimer for complexes.
Q: What's the maximum sequence length? A: Technically ~1000 amino acids, but shorter (<400) is recommended for speed.
Q: Can I run this offline? A: No, it requires internet for the API. You could set up local AlphaFold/ESMFold for offline use.
Q: Can I modify and share this? A: Yes! It's open source (MIT License).
After trying this demo, consider:
- Explore AlphaFold Database: Download pre-computed structures
- Learn PyMOL/ChimeraX: Professional visualization tools
- Try BioPython: Analyze sequences and structures programmatically
- Take a course: Rosalind, Coursera, or edX bioinformatics
- Join a lab: Get hands-on research experience
- Build something: Modify this app or create your own!
- v1.0 (2024): Initial release
  - Basic prediction and visualization
  - Educational content
  - Example proteins
Questions? Suggestions? Found a bug?
- GitHub: [Your GitHub]
- Email: [Your email]
- Twitter/X: [Your handle]
Remember: You don't need to understand everything to get started. Just start exploring!
The best way to learn is by doing. Pick a protein, predict its structure, and see what you discover.
Happy folding!