211 changes: 211 additions & 0 deletions device-anomaly-detection-demo/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,211 @@
# Device Anomaly Detection Demo
PR [#6543](https://github.com/kubeedge/kubeedge/pull/6543) introduces device anomaly detection capabilities to KubeEdge. This project demonstrates an IoT device anomaly detection system built on the KubeEdge mapper framework, gRPC communication, and machine-learning-based anomaly detection. It serves as a reference for developers who want to use the features introduced in that PR.

## Project Structure

```
device-anomaly-detection-demo/
├── api/                            # Protocol definitions
│   └── external_anomaly_detector.proto
├── external-anomaly-detector/      # Anomaly detection service
│   ├── main.py
│   ├── pyproject.toml
│   ├── data/                       # Training and observation data
│   ├── externalanomalydetectorpb/  # Generated gRPC code (Python)
│   └── model_training/             # ML model training logic
├── me-restful-mapper/              # KubeEdge mapper application
│   ├── cmd/                        # Main entry point
│   ├── api/                        # Generated gRPC code (Go)
│   ├── data/                       # Database and streaming adapters
│   ├── device/                     # Device management logic
│   ├── devicecrds/                 # Kubernetes device CRDs
│   ├── driver/                     # Device driver implementations
│   ├── resource/                   # Kubernetes deployment resources
│   ├── build.sh                    # Build script
│   ├── config.yaml                 # Mapper configuration
│   └── Dockerfile_nostream         # Docker build file
├── room-simulator/                 # Room device simulator
│   ├── main.py
│   ├── pyproject.toml
│   └── README.md
└── generate_grpc_code.sh           # Script to generate gRPC code from proto
```

### Component Overview

#### `api/`
Contains the Protocol Buffer definition (`external_anomaly_detector.proto`) for gRPC communication between the mapper and the anomaly detection service.

#### `external-anomaly-detector/`
A Python-based gRPC service that receives device data from the mapper, stores it in memory, and performs anomaly detection using a machine learning model. The service:
- Collects device property updates via gRPC
- Trains a Bayesian Network model on historical observation data
- Detects anomalies by comparing actual vs. predicted device states
- Localizes the faulty device by systematically flipping reported light states and re-running inference
- Updates DeviceStatus CRD `.status.extensions.anomaly` in Kubernetes for each device
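
The localization step in the list above can be illustrated with a small sketch. It is not the detector's actual code: the Bayesian Network inference is swapped for the simulator's deterministic brightness rule (`light1 × 1 + light2 × 2`), and the names `predict_brightness` and `localize_fault` are hypothetical. The message format mirrors the example `.status.extensions.anomaly` value shown in Step 3.

```python
from typing import Dict, Optional


def predict_brightness(lights: Dict[str, int]) -> int:
    """Stand-in for model inference: the simulator's deterministic rule."""
    return lights["light1"] * 1 + lights["light2"] * 2


def localize_fault(reported: Dict[str, int], brightness: int) -> Optional[Dict[str, str]]:
    """Flip each reported light state in turn and re-run the prediction.

    Returns {} when the reports already explain the brightness, a
    {device: message} dict for the single light whose flipped state explains
    it, or None when no single flip does.
    """
    if predict_brightness(reported) == brightness:
        return {}
    for name, state in reported.items():
        flipped = dict(reported)
        flipped[name] = 1 - state
        if predict_brightness(flipped) == brightness:
            message = f"Device fault: reported state={state} but should be={1 - state}"
            return {name: message}
    return None


# light1 reports "off", yet the sensor reads 1 (dim), which only light1 being on explains.
result = localize_fault({"light1": 0, "light2": 0}, brightness=1)
print(result)
```

Because the two lights contribute different weights to the brightness value, a single misreporting light can be identified unambiguously under this simplified rule. A learned model would compare probabilities rather than exact values.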

#### `me-restful-mapper/`
A KubeEdge mapper written in Go that:
- Interfaces with edge devices via RESTful APIs
- Manages device lifecycle and state synchronization
- Communicates with the cloud anomaly detection service via gRPC
- Supports multiple data persistence backends (InfluxDB, MySQL, Redis, TDengine)
- Publishes telemetry data via HTTP, MQTT, or OpenTelemetry

#### `room-simulator/`
A Python FastAPI application that simulates IoT devices in a room:
- Two controllable light devices (`light1`, `light2`)
- One brightness sensor whose value equals `light1_state × 1 + light2_state × 2` (range 0–3: 0=dark, 1=dim, 2=moderate, 3=bright)
- RESTful API for device control and status queries
- Fault injection capabilities for testing anomaly detection
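
As a rough illustration of the simulated room, the sketch below models the brightness rule and a fault flag. It assumes that a faulty light reports the opposite of its actual state and that the sensor follows the actual states. The simulator's real `main.py` may behave differently, and the `Light` class and `brightness` function are hypothetical names.

```python
from dataclasses import dataclass

BRIGHTNESS_LABELS = {0: "dark", 1: "dim", 2: "moderate", 3: "bright"}


@dataclass
class Light:
    actual_state: int = 0  # 1 = on, 0 = off
    fault: bool = False    # assumption: a faulty light reports the opposite state

    def reported_state(self) -> int:
        return 1 - self.actual_state if self.fault else self.actual_state


def brightness(light1: Light, light2: Light) -> int:
    """Sensor value: light1_state * 1 + light2_state * 2, using actual states."""
    return light1.actual_state * 1 + light2.actual_state * 2


light1 = Light(actual_state=1, fault=True)
light2 = Light(actual_state=0)
value = brightness(light1, light2)
print(light1.reported_state(), value, BRIGHTNESS_LABELS[value])
```

Here `light1` is on but reports off, so the sensor reading disagrees with the reported light states. That mismatch is what the detector looks for.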

## Architecture Diagram

<p align="center">
<img src="./images/architecture_diagram.png" alt="Architecture Diagram" width="800"/>
</p>

### Architecture Flow

1. **Device Simulation**: The room-simulator exposes RESTful APIs to simulate IoT devices (lights and brightness sensor)
2. **Device Management**: The me-restful-mapper discovers and manages devices through HTTP requests to the simulator
3. **Edge-Cloud Sync**: The mapper communicates with EdgeCore via DMI protocol, which syncs with CloudCore
4. **Data Collection**: Device property updates are sent to the external-anomaly-detector via gRPC
5. **Anomaly Detection**: The detector analyzes device data using ML models and identifies anomalies
6. **Status Update**: When anomalies are detected, the detector updates DeviceStatus CRD `.status.extensions` in Kubernetes
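
For the final step, a minimal sketch of a JSON merge-patch body that sets `.status.extensions.anomaly`. The function name `build_anomaly_patch` is hypothetical, and how the detector actually submits the update (client library, endpoint, patch type) is not shown in this README, so that part is left out.

```python
import json


def build_anomaly_patch(message: str) -> str:
    """Build a JSON merge-patch body that sets .status.extensions.anomaly."""
    return json.dumps({"status": {"extensions": {"anomaly": message}}})


patch_body = build_anomaly_patch("Device fault: reported state=0 but should be=1")
print(patch_body)
```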

## How to Run the Demo(One-Click with Docker Compose)

Before running the demo, please ensure your KubeEdge `cloudcore` and `edgecore` are properly set up and running, and include codes from PR [#6543](https://github.com/kubeedge/kubeedge/pull/6543). For convenience, the `room-simulator`, `external-anomaly-detector`, and `me-restful-mapper` are planned to be run in Docker containers, so please ensure Docker is installed on your system. If you prefer to run them directly on your host machine, please use `uv` to install the required Python dependencies for `room-simulator` and `external-anomaly-detector`, and build binary with Go for `me-restful-mapper`.
Comment on lines +77 to +79
medium

There are some minor grammatical and formatting issues in the documentation that could be improved for clarity.

Suggested change
- ## How to Run the Demo(One-Click with Docker Compose)
- Before running the demo, please ensure your KubeEdge `cloudcore` and `edgecore` are properly set up and running, and include codes from PR [#6543](https://github.com/kubeedge/kubeedge/pull/6543). For convenience, the `room-simulator`, `external-anomaly-detector`, and `me-restful-mapper` are planned to be run in Docker containers, so please ensure Docker is installed on your system. If you prefer to run them directly on your host machine, please use `uv` to install the required Python dependencies for `room-simulator` and `external-anomaly-detector`, and build binary with Go for `me-restful-mapper`.
+ ## How to Run the Demo (One-Click with Docker Compose)
+ Before running the demo, please ensure your KubeEdge `cloudcore` and `edgecore` are properly set up and running, and include the changes from PR [#6543](https://github.com/kubeedge/kubeedge/pull/6543).


In principle, the `external-anomaly-detector` can be deployed on either the cloud side or the edge side, because it communicates with `me-restful-mapper` over standard gRPC and only needs network connectivity between the two. To keep the demo easy to run, all components are deployed on a single KubeEdge edge node here.

### Step 1. Create DeviceModel and Device CRDs in Kubernetes

First, enter the `me-restful-mapper/devicecrds/` directory and apply the DeviceModels.

```bash
cd me-restful-mapper/devicecrds/
kubectl apply -f brightness-sensor-model.yaml -f light-model.yaml
```

Then, you need to set `.spec.nodeName` to your edge node name in the Device CRD files `brightness-sensor-1.yaml`, `light-1.yaml`, and `light-2.yaml`. After that, apply the Device CRDs.

```bash
kubectl apply -f brightness-sensor-1.yaml -f light-1.yaml -f light-2.yaml
```
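
If you would rather script the `.spec.nodeName` edit than change the three files by hand, the following is a minimal sketch. It assumes each manifest contains a single `nodeName:` line, and the helper `set_node_name` and the sample manifest text are hypothetical.

```python
import re


def set_node_name(manifest: str, node_name: str) -> str:
    """Replace the value of the first `nodeName:` key, keeping its indentation."""
    updated, count = re.subn(
        r"^([ \t]*nodeName:).*$",
        lambda match: f"{match.group(1)} {node_name}",
        manifest,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no nodeName key found in manifest")
    return updated


# Hypothetical manifest fragment, used only to exercise the helper.
sample = "spec:\n  nodeName: edge-node\n"
print(set_node_name(sample, "my-edge-node"))
```

To apply it, read each of `brightness-sensor-1.yaml`, `light-1.yaml`, and `light-2.yaml`, pass the text through `set_node_name`, and write the result back before running `kubectl apply`.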

### Step 2. Run three components with Docker Compose

```bash
bash start.sh
```

After this process, you should see three containers running:

```
$ docker ps
a2cc34c56101 huajuan6848/external-anomaly-detector:1.0.0 "uv run python main.…" About an hour ago Up About an hour 0.0.0.0:6804->6804/tcp, :::6804->6804/tcp external-anomaly-detector
6bc6d74a36de huajuan6848/room-simulator:1.0.0 "uv run uvicorn main…" About an hour ago Up About an hour 0.0.0.0:8000->8000/tcp, :::8000->8000/tcp room-simulator
098a371611f7 huajuan6848/me-restful-mapper:1.0.0 "/kubeedge/main --co…" About an hour ago Up About an hour me-restful-mapper
```

### Step 3. Test Anomaly Detection

Wait for a few seconds, then use `kubectl get devicestatus light-1 -o yaml` to check the DeviceStatus of `light-1`. You should see the `.status.extensions` field showing anomaly detection results, for example:

```yaml
apiVersion: devices.kubeedge.io/v1beta1
kind: DeviceStatus
metadata:
  creationTimestamp: "2025-12-09T06:31:44Z"
  generation: 923
  name: light-1
  namespace: default
  ownerReferences:
  - apiVersion: devices.kubeedge.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Device
    name: light-1
    uid: 18e0ee48-e85e-4dce-8ce3-d03876a057ad
  resourceVersion: "709890"
  uid: ecae6dc5-8b2b-4865-ad52-59b4b2e6bd30
spec: {}
status:
  extensions:
    # No anomaly. This field is propagated by external-anomaly-detector
    anomaly: no anomaly
  lastOnlineTime: "2025-12-09T09:05:08Z"
  state: ok
  twins:
  - observedDesired:
      metadata:
        timestamp: "1765271099157"
        type: string
      value: ""
    propertyName: working_state
    reported:
      metadata:
        timestamp: "1765271099157"
        type: string
      value: "0"

Next, inject a fault into `light-1` by calling the room-simulator API. The faulty light reports a working_state of "0" (off) while it is actually "1" (on), which produces an anomaly.

```bash
curl -X POST "http://localhost:8000/lights/light1/fault" -d '1'
```

high

The curl command for injecting faults uses `-d '1'` (plain text), but the `README_zh.md` example for the same operation uses `json={"fault": true}`. The Python `main.py` for `inject_fault` expects plain text. Please ensure consistency in the documentation and with the API implementation. The English README example is correct for the current Python implementation.
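
The fault-injection call above can also be issued from Python with the standard library. This sketch only builds the request so that it can be inspected without a running simulator. The helper name `build_fault_request` is hypothetical, and the plain-text body `1` is copied from the curl example.

```python
import urllib.request


def build_fault_request(base_url: str, light: str, payload: str = "1") -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    return urllib.request.Request(
        url=f"{base_url}/lights/{light}/fault",
        data=payload.encode("utf-8"),
        method="POST",
    )


request = build_fault_request("http://localhost:8000", "light1")
print(request.get_method(), request.full_url, request.data)
# To send it against a running simulator: urllib.request.urlopen(request)
```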

Because the working_state of `light-1` is collected every 10 seconds and the `external-anomaly-detector` runs detection every 10 seconds, the anomaly should show up within at most 20 seconds. Check the DeviceStatus of `light-1` again; the `.status.extensions.anomaly` field should now indicate that an anomaly was detected:

```yaml
apiVersion: devices.kubeedge.io/v1beta1
kind: DeviceStatus
metadata:
  creationTimestamp: "2025-12-09T06:31:44Z"
  generation: 979
  name: light-1
  namespace: default
  ownerReferences:
  - apiVersion: devices.kubeedge.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Device
    name: light-1
    uid: 18e0ee48-e85e-4dce-8ce3-d03876a057ad
  resourceVersion: "710172"
  uid: ecae6dc5-8b2b-4865-ad52-59b4b2e6bd30
spec: {}
status:
  extensions:
    # Anomaly detected, details shown here. This field is propagated by external-anomaly-detector
    anomaly: "Device fault: reported state=0 but should be=1"
  lastOnlineTime: "2025-12-09T09:09:38Z"
  state: ok
  twins:
  - observedDesired:
      metadata:
        timestamp: "1765271379164"
        type: string
      value: ""
    propertyName: working_state
    reported:
      metadata:
        timestamp: "1765271379164"
        type: string
      value: "0"
```
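
The 20-second bound mentioned in this step follows from adding the two periods. The sketch below spells out that arithmetic; it ignores network and processing time, and the function name is hypothetical.

```python
def worst_case_detection_delay(collect_cycle_s: float, detect_interval_s: float) -> float:
    """Upper bound on the delay between a fault and its detection.

    Worst case: the fault occurs just after a collection, and the next
    sample arrives just after a detection pass has run.
    """
    return collect_cycle_s + detect_interval_s


print(worst_case_detection_delay(10, 10))
```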

### Step 4. Clean Up

You can stop all services and remove the created DeviceModels and Devices by running:

```bash
cd me-restful-mapper/devicecrds/
kubectl delete -f light-1.yaml -f light-2.yaml -f brightness-sensor-1.yaml
kubectl delete -f light-model.yaml -f brightness-sensor-model.yaml
cd ../../
bash stop.sh
```
27 changes: 27 additions & 0 deletions device-anomaly-detection-demo/api/external_anomaly_detector.proto
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
syntax = "proto3";

package externalanomalydetector;

import "google/protobuf/timestamp.proto";

// External Anomaly Detector Service
service ExternalAnomalyDetector {
// Detects anomalies based on the provided data
rpc DetectAnomaly (AnomalyRequest) returns (AnomalyResponse);
Comment on lines +8 to +10
medium

The RPC name DetectAnomaly is a bit misleading. Based on its usage, this RPC is for reporting device property data, while the actual anomaly detection happens asynchronously. Renaming it to something like ReportPropertyData would better reflect its purpose and improve the API's clarity.

Suggested change
- service ExternalAnomalyDetector {
- // Detects anomalies based on the provided data
- rpc DetectAnomaly (AnomalyRequest) returns (AnomalyResponse);
+ service ExternalAnomalyDetector {
+ // Reports property data for anomaly detection
+ rpc ReportPropertyData (AnomalyRequest) returns (AnomalyResponse);

}

// Request message containing the anomaly detection data
message AnomalyRequest {
  google.protobuf.Timestamp timestamp = 1; // Timestamp of the data
  string deviceName = 2; // Name of the device
  string namespace = 3; // Namespace of the device
  string propertyName = 4; // Property name to check for anomalies
  string propertyValue = 5; // Value of the property
}

// Response message containing the anomaly detection result
message AnomalyResponse {
  bool success = 1; // Indicates if the upload was successful
  int32 errorCode = 2; // Error code if any issue occurred
  string message = 3; // Additional information or error message
}
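
To make the message shapes concrete, here is a hedged sketch of how a receiver might keep the latest reported value per device property in memory. It does not use the generated gRPC code. `PropertyReport` and `ReportStore` are hypothetical names that mirror the `AnomalyRequest` and `AnomalyResponse` fields, and the error code values are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PropertyReport:
    """Plain-Python mirror of the AnomalyRequest fields."""
    timestamp: datetime
    device_name: str
    namespace: str
    property_name: str
    property_value: str


class ReportStore:
    """Keeps the most recent report per (namespace, device, property)."""

    def __init__(self) -> None:
        self._latest: Dict[Tuple[str, str, str], PropertyReport] = {}

    def upload(self, report: PropertyReport) -> Dict[str, object]:
        """Store a report and return a dict shaped like AnomalyResponse."""
        if not report.device_name or not report.property_name:
            # errorCode 1 is an arbitrary value chosen for this sketch.
            return {"success": False, "errorCode": 1,
                    "message": "deviceName and propertyName are required"}
        key = (report.namespace, report.device_name, report.property_name)
        current = self._latest.get(key)
        if current is None or report.timestamp >= current.timestamp:
            self._latest[key] = report
        return {"success": True, "errorCode": 0, "message": "stored"}

    def latest_value(self, namespace: str, device_name: str,
                     property_name: str) -> Optional[str]:
        report = self._latest.get((namespace, device_name, property_name))
        return report.property_value if report else None


store = ReportStore()
now = datetime.now(timezone.utc)
response = store.upload(PropertyReport(now, "light-1", "default", "working_state", "0"))
print(response["success"], store.latest_value("default", "light-1", "working_state"))
```

Keying on namespace, device, and property lets a periodic detection pass read one consistent snapshot of all devices, which fits the asynchronous detection described in the review comment above.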
37 changes: 37 additions & 0 deletions device-anomaly-detection-demo/docker-compose.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
version: '3.8'

services:
  room-simulator:
    image: huajuan6848/room-simulator:1.0.0
    container_name: room-simulator
    ports:
      - "8000:8000"
    networks:
      - device-anomaly-detection-demo
    restart: unless-stopped

  external-anomaly-detector:
    image: huajuan6848/external-anomaly-detector:1.0.0
    container_name: external-anomaly-detector
    ports:
      - "6804:6804"
    volumes:
      - ~/.kube/config:/root/.kube/config:ro
medium

Mounting ~/.kube/config directly into the container can be a security risk as it exposes the host's Kubernetes credentials. While acceptable for a demo, consider more secure methods for production deployments, such as using Kubernetes Secrets or service accounts, or limiting the scope of the mounted config.

    networks:
      - device-anomaly-detection-demo
    restart: unless-stopped

  me-restful-mapper:
    image: huajuan6848/me-restful-mapper:1.0.0
    container_name: me-restful-mapper
    volumes:
      - /etc/kubeedge:/etc/kubeedge
      - ./me-restful-mapper/config.yaml:/kubeedge/config.yaml
    networks:
      - device-anomaly-detection-demo
    command: ["/kubeedge/main", "--config-file", "/kubeedge/config.yaml"]
    restart: unless-stopped

networks:
  device-anomaly-detection-demo:
    driver: bridge
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
.venv
34 changes: 34 additions & 0 deletions device-anomaly-detection-demo/external-anomaly-detector/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# If you prefer the allow list template instead of the deny list, see community template:
# https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore
#
# Binaries for programs and plugins
*.exe
*.exe~
*.dll
*.so
*.dylib

# Test binary, built with `go test -c`
*.test

# Code coverage profiles and other test artifacts
*.out
coverage.*
*.coverprofile
profile.cov

# Dependency directories (remove the comment below to include it)
# vendor/

# Go workspace file
go.work
go.work.sum

# env file
.env

# Editor/IDE
# .idea/
# .vscode/

build
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
3.10
12 changes: 12 additions & 0 deletions device-anomaly-detection-demo/external-anomaly-detector/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
# syntax=docker/dockerfile:1
FROM ghcr.io/astral-sh/uv:python3.10-bookworm-slim

WORKDIR /app

# Copy project files
COPY . .

# Install dependencies using uv
RUN uv sync --frozen --no-cache

CMD ["uv", "run", "python", "main.py"]
Empty file.
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
#!/bin/bash
# Build Docker image for external-anomaly-detector
set -e
IMAGE_NAME="huajuan6848/external-anomaly-detector:1.0.0"
docker build -t "$IMAGE_NAME" .
echo "Build complete: $IMAGE_NAME"