EdgeCompress is a graduation project developed by senior Computer Science students from Capital University, Egypt.
Our work focuses on compressing Large Language Models (LLMs) to make them efficient enough to run on edge devices with limited computational resources.
Large Language Models typically require significant memory, storage, and computational power. This makes them difficult to deploy on edge hardware such as embedded systems, IoT devices, and low-power GPUs.
Our project explores different model compression techniques to reduce the size and resource requirements of LLMs while maintaining acceptable performance.
We investigate multiple model compression approaches and compare their trade-offs between model size and output quality.
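As an illustrative sketch of one common compression technique (not code from this project), post-training weight quantization stores a layer's float32 weights as 8-bit integers plus a single scale factor, cutting memory by 4x. The function names below are hypothetical; only NumPy is assumed:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization of a float weight matrix."""
    scale = np.abs(weights).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float matrix from the int8 representation."""
    return q.astype(np.float32) * scale

# A random stand-in "layer" of weights: int8 storage is 4x smaller than float32.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q.nbytes / w.nbytes)  # 0.25 — one quarter of the memory
```

Rounding introduces an error of at most half the scale factor per weight, which is the size/accuracy trade-off such techniques must balance.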
After compression, the models are evaluated in edge computing environments to determine their resource usage and whether their performance remains acceptable.
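Evaluation on edge hardware typically includes measuring inference latency. The helper below is a generic timing sketch (not the project's evaluation harness); the matrix-vector product merely stands in for a real model forward pass:

```python
import time
import numpy as np

def measure_latency(fn, warmup: int = 3, runs: int = 20) -> float:
    """Average wall-clock seconds per call of fn(), after warmup calls."""
    for _ in range(warmup):
        fn()  # warm caches / JIT before timing
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Stand-in workload: one matrix-vector product, roughly one linear layer.
w = np.random.randn(1024, 1024).astype(np.float32)
x = np.random.randn(1024).astype(np.float32)
print(f"{measure_latency(lambda: w @ x) * 1e6:.1f} us per call")
```

Averaging over many runs after a warmup phase keeps one-off costs (cache misses, allocator start-up) out of the reported latency.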
This organization hosts the repositories for this project.
Our goal is to enable efficient deployment of LLMs on edge devices, making advanced AI models more accessible in real-world and resource-constrained environments.