A Swift 6.1 package for deploying AI models on edge devices with support for TensorRT and ONNX Runtime.
- Swift 6 Concurrency: Built with Swift 6.1, leveraging modern concurrency features
- Multiple Runtime Support: TensorRT and ONNX Runtime backends
- Apple Silicon Optimized: Automatic Neural Engine and Metal acceleration
- Type-Safe API: Strongly typed interfaces with Sendable conformance
- Performance Monitoring: Built-in performance tracking and profiling
- Platform Support: iOS 16+, macOS 13+, tvOS 16+, watchOS 9+, visionOS 1+
Add this package to your Xcode project, or declare it as a dependency in your Package.swift:
dependencies: [
.package(url: "https://github.com/yourusername/EdgeAI.git", from: "1.0.0")
]
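If you use Package.swift directly, the library products also need to be added to your target's dependencies. A minimal sketch, assuming the products are named after the modules (EdgeAI, ONNXRuntime, TensorRT); adjust the names to whatever the package manifest actually declares:
// swift-tools-version:6.1
import PackageDescription

let package = Package(
    name: "YourApp",
    platforms: [.iOS(.v16), .macOS(.v13)],
    dependencies: [
        .package(url: "https://github.com/yourusername/EdgeAI.git", from: "1.0.0")
    ],
    targets: [
        .executableTarget(
            name: "YourApp",
            dependencies: [
                // Product names are assumed here; check the package manifest.
                .product(name: "EdgeAI", package: "EdgeAI"),
                .product(name: "ONNXRuntime", package: "EdgeAI")
            ]
        )
    ]
)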
import EdgeAI
import ONNXRuntime
// Create and load model
let modelPath = Bundle.main.url(forResource: "model", withExtension: "onnx")!
let model = ONNXModel(modelPath: modelPath)
try await model.loadModel()
// Prepare input
let inputTensor = ONNXTensor(
data: inputData,
shape: [1, 224, 224, 3],
dataType: .float32
)
let input = ONNXInput(tensors: ["input": inputTensor])
// Run inference
let output = try await model.predict(input)
print("Inference time: \(output.inferenceTime)s")
// Clean up
try await model.unloadModel()
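The snippet above assumes inputData is already a preprocessed, flattened Float32 buffer matching the [1, 224, 224, 3] shape. As a sketch, the whole round trip can be wrapped in a small helper using only the calls shown above (the tensor name "input" and the [Float]/TimeInterval types are assumptions about your model and the package's API):
import Foundation
import EdgeAI
import ONNXRuntime

/// Loads the model, runs one prediction, unloads it, and returns the reported inference time.
/// Assumes `inputData` is a preprocessed, flattened Float32 buffer.
func runOnce(modelPath: URL, inputData: [Float]) async throws -> TimeInterval {
    let model = ONNXModel(modelPath: modelPath)
    try await model.loadModel()

    let tensor = ONNXTensor(data: inputData, shape: [1, 224, 224, 3], dataType: .float32)
    let input = ONNXInput(tensors: ["input": tensor]) // "input" must match your model's input name

    let output = try await model.predict(input)
    try await model.unloadModel()
    return output.inferenceTime
}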
import EdgeAI
import TensorRT
// Create TensorRT model
let model = TensorRTModel(modelPath: enginePath)
try await model.loadModel()
// Prepare input
let input = TensorRTInput(
data: inputData,
shape: [1, 3, 224, 224]
)
// Run inference
let output = try await model.predict(input)
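TensorRT engines usually need a few warm-up runs before latencies stabilize. A minimal sketch that reuses only the calls shown above (the inferenceTime field is the one used in the monitoring example below):
import EdgeAI
import TensorRT

/// Rough latency check: a few warm-up runs, then the mean reported inference time.
func measureMeanLatency(model: TensorRTModel, input: TensorRTInput,
                        warmup: Int = 3, runs: Int = 20) async throws -> Double {
    for _ in 0..<warmup { _ = try await model.predict(input) } // discard warm-up runs

    var total = 0.0
    for _ in 0..<runs {
        let output = try await model.predict(input)
        total += output.inferenceTime
    }
    return total / Double(runs)
}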
let monitor = PerformanceMonitor()
// Record inference metrics
await monitor.recordInference(
modelName: "ResNet50",
inferenceTime: output.inferenceTime,
preprocessingTime: 0.01,
postprocessingTime: 0.005
)
// Get average performance
let avgTime = await monitor.getAverageInferenceTime(for: "ResNet50")
print("Average inference time: \(avgTime ?? 0)s")
// Automatic optimization for your platform
let model = ONNXModelWithOptions(
modelPath: modelPath,
options: .performanceOptions // Auto-detects best providers
)
// Or specifically for Apple Silicon
let model = ONNXModelWithOptions(
modelPath: modelPath,
options: .appleSiliconOptimized // M1/M2/M3 optimized
)
// Or prioritize Neural Engine
let model = ONNXModelWithOptions(
modelPath: modelPath,
options: .neuralEngineOptimized // Maximum efficiency
)
- EdgeAI: Core module with base protocols and types
- TensorRT: NVIDIA TensorRT integration for optimized inference
- ONNXRuntime: Microsoft ONNX Runtime integration
To use TensorRT functionality, you need:
- NVIDIA GPU with CUDA support
- CUDA Toolkit (11.0 or later)
- TensorRT (8.0 or later)
# Install CUDA (if not already installed)
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
# Install TensorRT
# Download TensorRT from NVIDIA Developer (requires account)
# https://developer.nvidia.com/tensorrt
# Install TensorRT packages
sudo dpkg -i nvinfer*.deb
sudo apt-get update
sudo apt-get install -y libnvinfer-dev libnvonnxparser-dev libnvinfer-plugin-dev
When using TensorRT with this package, you need to set the following environment variables:
# Add to your .bashrc or .zshrc
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
Or set them before running your Swift application:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
swift run YourApp
The package will automatically link to TensorRT libraries if they're installed. If TensorRT is not available, the package will still build but TensorRT functionality will return appropriate errors at runtime.
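In practice that means a TensorRT load can be wrapped in a plain do/catch and routed to another backend when the native libraries are missing. A sketch; the fallback policy is just an example, and the exact error thrown is not specified here:
import EdgeAI
import TensorRT

let tensorRTModel = TensorRTModel(modelPath: enginePath)
do {
    try await tensorRTModel.loadModel()
} catch {
    // TensorRT libraries not available, or the engine failed to load:
    // fall back to another backend, e.g. the ONNXModel shown earlier.
    print("TensorRT unavailable, falling back to ONNX Runtime: \(error)")
}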
import TensorRT
// Convert ONNX model to TensorRT engine
let optimizer = TensorRTOptimizer()
let config = TensorRTOptimizer.OptimizationConfig(
precision: .fp16, // Use FP16 for better performance
maxBatchSize: 8, // Maximum batch size
maxWorkspaceSize: 1 << 30 // 1GB workspace
)
try await optimizer.optimizeModel(
inputPath: URL(fileURLWithPath: "model.onnx"),
outputPath: URL(fileURLWithPath: "model.trt"),
config: config
)
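Once the engine file has been written, it can be loaded with the TensorRTModel API shown earlier; a brief usage sketch (inputData is again an illustrative placeholder):
// Load the freshly built engine and run it like any other TensorRT model.
let engine = TensorRTModel(modelPath: URL(fileURLWithPath: "model.trt"))
try await engine.loadModel()
let result = try await engine.predict(TensorRTInput(data: inputData, shape: [1, 3, 224, 224]))
print("Engine inference time: \(result.inferenceTime)s")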
- Swift 6.1+
- Xcode 16.0+ (for iOS/macOS development)
- iOS 16.0+ / macOS 13.0+ / tvOS 16.0+ / watchOS 9.0+ / visionOS 1.0+
- For TensorRT: NVIDIA GPU, CUDA 11.0+, TensorRT 8.0+
- For ONNX Runtime: No additional requirements (CPU inference supported)
If you get errors about TensorRT libraries not being found:
- Verify TensorRT is installed: ldconfig -p | grep nvinfer
- Check the library paths: ls /usr/lib/x86_64-linux-gnu/libnvinfer*
- Update the library cache: sudo ldconfig
If the package fails to build:
- On Linux/WSL, make sure the development packages are installed: sudo apt-get install libnvinfer-dev libnvonnxparser-dev
- The package includes fallback mock implementations, so it should build even without TensorRT installed.
This project is available under the MIT license.