Welcome to ecocompute-dynamic-eval Discussions! #4
hongping-zh announced in Announcements
👋 Welcome to EcoCompute Discussions!
Thank you for your interest in energy-efficient LLM inference research!
🎯 What is this project?
We discovered that default bitsandbytes INT8 consumes 17–147% more energy than FP16, contrary to common belief. Through systematic ablation experiments across three NVIDIA GPU architectures, we identified mixed-precision decomposition as the root cause.
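For readers wanting to reproduce comparisons like this: GPU energy is usually obtained by sampling board power during a run (e.g. via NVML's `nvmlDeviceGetPowerUsage`) and integrating over time. A minimal sketch of the integration step, with hypothetical names and synthetic samples (the project's actual measurement harness may differ):

```python
def energy_joules(samples):
    """Trapezoidal integration of (timestamp_s, power_w) samples into joules.

    In a real benchmark, `samples` would come from polling the GPU's power
    sensor during inference; the name and shape here are illustrative.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total

# Synthetic example: a constant 100 W draw sampled over 2 s -> 200 J
print(energy_joules([(0.0, 100.0), (1.0, 100.0), (2.0, 100.0)]))  # 200.0
```

Percentage overheads like the 17–147% above then fall out as the ratio of the two integrated energies for the same workload.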
🔥 Key Findings
Root cause: INT8↔FP16 type conversions in outlier-aware decomposition, not INT8 arithmetic itself.
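To make that mechanism concrete, here is an illustrative NumPy sketch of an LLM.int8()-style decomposed matmul. This is my own simplified rendering for explanation, not the bitsandbytes kernel: the point is that the inlier path requires an FP16→INT8 quantize and an INT32→FP dequantize on every call, which is the conversion overhead described above.

```python
import numpy as np

def int8_decomposed_matmul(x, w, threshold=6.0):
    """Illustrative mixed-precision decomposition (LLM.int8()-style sketch).

    Feature columns of x whose magnitude exceeds `threshold` ("outliers")
    take a floating-point path; the rest are quantized to INT8, multiplied
    with INT32 accumulation, and dequantized. The quantize/dequantize round
    trips are the type conversions, not the INT8 multiply itself.
    """
    outliers = np.abs(x).max(axis=0) > threshold   # per-feature outlier mask
    y_fp = x[:, outliers] @ w[outliers, :]         # floating-point outlier path

    x_in, w_in = x[:, ~outliers], w[~outliers, :]
    sx = np.abs(x_in).max(axis=1, keepdims=True) / 127.0 + 1e-12  # row-wise scale
    sw = np.abs(w_in).max(axis=0, keepdims=True) / 127.0 + 1e-12  # col-wise scale
    xq = np.round(x_in / sx).astype(np.int8)       # quantize: FP -> INT8
    wq = np.round(w_in / sw).astype(np.int8)
    # INT8 matmul with INT32 accumulation, then dequantize back to FP
    y_int8 = (xq.astype(np.int32) @ wq.astype(np.int32)) * sx * sw

    return y_fp + y_int8
```

Because the scales are per-row of `x` and per-column of `w`, they factor out of the inner sum and the dequantized result closely approximates the plain `x @ w`.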
📊 Research Quality
📁 View Metadata →
📊 Interactive Dashboard →
🤝 How to Participate
💡 Share Ideas
Have suggestions for new experiments or visualizations? Start a discussion in Ideas!
🙋 Ask Questions
Confused about the methodology or results? Ask in Q&A and I'll respond within 24-48 hours.
📊 Share Your Results
Run benchmarks on your GPU and share findings in Results Sharing. We especially need:
🎓 Academic Discussion
Discuss methodology, statistical approaches, or related research in Research.
🤝 Find Collaborators
Looking for co-authors for extended studies? Post in Collaboration!
🚀 Quick Links
📧 Contact
Author: Hongping Zhang
Email: zhanghongping1982@gmail.com
Location: Changsha, Hunan, China
🌟 Current Focus
I'm actively working on:
💬 Discussion Guidelines
Looking forward to your contributions and discussions! 🚀
— Hongping Zhang
"Measure, don't assume. Reproduce, don't trust. Share, don't hoard."