EdgeAI Swift Package

A Swift 6.1 package for deploying AI models on edge devices with support for TensorRT and ONNX Runtime.

Features

  • Swift 6 Concurrency: Built with Swift 6.1, leveraging modern concurrency features
  • Multiple Runtime Support: TensorRT and ONNX Runtime backends
  • Apple Silicon Optimized: Automatic Neural Engine and Metal acceleration
  • Type-Safe API: Strongly typed interfaces with Sendable conformance
  • Performance Monitoring: Built-in performance tracking and profiling
  • Platform Support: iOS 16+, macOS 13+, tvOS 16+, watchOS 9+, visionOS 1+

Installation

Add this package to your Xcode project or Package.swift:

dependencies: [
    .package(url: "https://github.com/yourusername/EdgeAI.git", from: "1.0.0")
]
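
Then add the products you need as dependencies of your target. The product names below are assumed to match the module names listed under Module Structure (EdgeAI, ONNXRuntime, TensorRT); check the package manifest for the exact names.

.target(
    name: "YourApp",
    dependencies: [
        .product(name: "EdgeAI", package: "edgeai"),
        .product(name: "ONNXRuntime", package: "edgeai"),
        // Only needed if you use the TensorRT backend
        .product(name: "TensorRT", package: "edgeai")
    ]
)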

Usage

Basic Example with ONNX Runtime

import EdgeAI
import ONNXRuntime

// Create and load model
let modelPath = Bundle.main.url(forResource: "model", withExtension: "onnx")!
let model = ONNXModel(modelPath: modelPath)
try await model.loadModel()

// Prepare input
let inputTensor = ONNXTensor(
    data: inputData,
    shape: [1, 224, 224, 3],
    dataType: .float32
)
let input = ONNXInput(tensors: ["input": inputTensor])

// Run inference
let output = try await model.predict(input)
print("Inference time: \(output.inferenceTime)s")

// Clean up
try await model.unloadModel()
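
The calls above can be wrapped in a small helper that loads the model, runs inference, and unloads it afterwards. This is a minimal sketch using only the API shown in this example; the [Float] input type, tensor shape, and "input" tensor name are assumptions to replace with your own model's details.

import Foundation
import EdgeAI
import ONNXRuntime

func runInference(_ inputData: [Float], modelPath: URL) async throws {
    let model = ONNXModel(modelPath: modelPath)
    try await model.loadModel()

    // Shape and tensor name are placeholders for your own model
    let tensor = ONNXTensor(data: inputData, shape: [1, 224, 224, 3], dataType: .float32)
    let input = ONNXInput(tensors: ["input": tensor])

    do {
        let output = try await model.predict(input)
        print("Inference time: \(output.inferenceTime)s")
    } catch {
        print("Inference failed: \(error)")
    }

    try await model.unloadModel()
}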

TensorRT Example

import EdgeAI
import TensorRT

// Create TensorRT model
let model = TensorRTModel(modelPath: enginePath)
try await model.loadModel()

// Prepare input
let input = TensorRTInput(
    data: inputData,
    shape: [1, 3, 224, 224]
)

// Run inference
let output = try await model.predict(input)

Performance Monitoring

let monitor = PerformanceMonitor()

// Record inference metrics
await monitor.recordInference(
    modelName: "ResNet50",
    inferenceTime: output.inferenceTime,
    preprocessingTime: 0.01,
    postprocessingTime: 0.005
)

// Get average performance
let avgTime = await monitor.getAverageInferenceTime(for: "ResNet50")
print("Average inference time: \(avgTime ?? 0)s")

Model Loading with Options

// Automatic optimization for your platform
let model = ONNXModelWithOptions(
    modelPath: modelPath,
    options: .performanceOptions  // Auto-detects best providers
)

// Or specifically for Apple Silicon
let model = ONNXModelWithOptions(
    modelPath: modelPath,
    options: .appleSiliconOptimized  // M1/M2/M3 optimized
)

// Or prioritize Neural Engine
let model = ONNXModelWithOptions(
    modelPath: modelPath,
    options: .neuralEngineOptimized  // Maximum efficiency
)
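
If you build for several platforms, one way to pick between these presets is a compile-time check. The mapping below is an assumption, not a package recommendation; adjust it to your own targets.

#if os(macOS) && arch(arm64)
let model = ONNXModelWithOptions(modelPath: modelPath, options: .appleSiliconOptimized)
#elseif os(iOS) || os(visionOS)
let model = ONNXModelWithOptions(modelPath: modelPath, options: .neuralEngineOptimized)
#else
let model = ONNXModelWithOptions(modelPath: modelPath, options: .performanceOptions)
#endif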

Module Structure

  • EdgeAI: Core module with base protocols and types
  • TensorRT: NVIDIA TensorRT integration for optimized inference
  • ONNXRuntime: Microsoft ONNX Runtime integration

TensorRT Setup

Prerequisites

To use TensorRT functionality, you need:

  1. NVIDIA GPU with CUDA support
  2. CUDA Toolkit (11.0 or later)
  3. TensorRT (8.0 or later)

Installation on Linux/WSL

# Install CUDA (if not already installed)
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda

# Install TensorRT
# Download TensorRT from NVIDIA Developer (requires account)
# https://developer.nvidia.com/tensorrt

# Install TensorRT packages
sudo dpkg -i nvinfer*.deb
sudo apt-get update
sudo apt-get install -y libnvinfer-dev libnvonnxparser-dev libnvinfer-plugin-dev

Environment Setup

When using TensorRT with this package, you need to set the following environment variables:

# Add to your .bashrc or .zshrc
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda

Or set them before running your Swift application:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
swift run YourApp

Building with TensorRT

The package automatically links against the TensorRT libraries when they are installed. If TensorRT is not available, the package still builds, but TensorRT functionality returns appropriate errors at runtime.
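
In practice that means a load attempt on a machine without TensorRT should be wrapped in error handling. A minimal sketch, using only the API shown earlier (the concrete error type is not specified here, so it catches generically):

import EdgeAI
import TensorRT

let model = TensorRTModel(modelPath: enginePath)
do {
    try await model.loadModel()
} catch {
    // Expected on machines without the TensorRT/CUDA libraries;
    // fall back to another backend (e.g. ONNX Runtime) or surface the error.
    print("TensorRT unavailable: \(error)")
}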

Converting Models to TensorRT

import TensorRT

// Convert ONNX model to TensorRT engine
let optimizer = TensorRTOptimizer()
let config = TensorRTOptimizer.OptimizationConfig(
    precision: .fp16,              // Use FP16 for better performance
    maxBatchSize: 8,              // Maximum batch size
    maxWorkspaceSize: 1 << 30     // 1GB workspace
)

try await optimizer.optimizeModel(
    inputPath: URL(fileURLWithPath: "model.onnx"),
    outputPath: URL(fileURLWithPath: "model.trt"),
    config: config
)
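
The resulting engine can then be loaded with the TensorRT backend exactly as in the earlier inference example:

// Load the engine produced above
let engineURL = URL(fileURLWithPath: "model.trt")
let model = TensorRTModel(modelPath: engineURL)
try await model.loadModel()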

Requirements

  • Swift 6.1+
  • Xcode 16.0+ (for iOS/macOS development)
  • iOS 16.0+ / macOS 13.0+ / tvOS 16.0+ / watchOS 9.0+ / visionOS 1.0+

Optional Requirements

  • For TensorRT: NVIDIA GPU, CUDA 11.0+, TensorRT 8.0+
  • For ONNX Runtime: No additional requirements (CPU inference supported)

Troubleshooting

TensorRT Not Found

If you get errors about TensorRT libraries not being found:

  1. Verify TensorRT is installed:

    ldconfig -p | grep nvinfer
  2. Check library paths:

    ls /usr/lib/x86_64-linux-gnu/libnvinfer*
  3. Update library cache:

    sudo ldconfig

Build Errors

If the package fails to build:

  1. For Linux/WSL, ensure you have the development packages:

    sudo apt-get install libnvinfer-dev libnvonnxparser-dev
  2. The package includes fallback mock implementations, so it should build even without TensorRT installed.

License

This project is available under the MIT license.
