Claude Code Plugins

Community-maintained marketplace


Use when deploying ANY machine learning model on-device, converting models to CoreML, compressing models, or implementing speech-to-text. Covers CoreML conversion, MLTensor, model compression (quantization/palettization/pruning), stateful models, KV-cache, multi-function models, async prediction, SpeechAnalyzer, SpeechTranscriber.

Install Skill

1. Download skill
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reviewing its instructions before using it.

SKILL.md

name: axiom-ios-ml
description: Use when deploying ANY machine learning model on-device, converting models to CoreML, compressing models, or implementing speech-to-text. Covers CoreML conversion, MLTensor, model compression (quantization/palettization/pruning), stateful models, KV-cache, multi-function models, async prediction, SpeechAnalyzer, SpeechTranscriber.

iOS Machine Learning Router

You MUST use this skill for ANY on-device machine learning or speech-to-text work.

When to Use

Use this router when:

  • Converting PyTorch/TensorFlow models to CoreML
  • Deploying ML models on-device
  • Compressing models (quantization, palettization, pruning)
  • Working with large language models (LLMs)
  • Implementing KV-cache for transformers
  • Using MLTensor for model stitching
  • Building speech-to-text features
  • Transcribing audio (live or recorded)

Routing Logic

CoreML Work

Implementation patterns → /skill coreml

  • Model conversion workflow
  • MLTensor for model stitching
  • Stateful models with KV-cache
  • Multi-function models (adapters/LoRA)
  • Async prediction patterns
  • Compute unit selection (both shown in the Swift sketch after this list)
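
As a reference point for the last two items, here is a minimal Swift sketch assuming iOS 17+ async prediction and a model already compiled into the app bundle; the "MyModel" file and "image" feature name are hypothetical placeholders.

```swift
import CoreML

// Minimal sketch: choose compute units, load, then run an async prediction.
// "MyModel" and the "image" feature name are hypothetical placeholders.
func runOnce() async throws -> MLFeatureProvider {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine  // Core ML falls back per-op when needed

    let url = Bundle.main.url(forResource: "MyModel", withExtension: "mlmodelc")!
    let model = try await MLModel.load(contentsOf: url, configuration: config)

    // Async prediction (iOS 17+) is safe to call concurrently from multiple tasks.
    let pixels = try MLMultiArray(shape: [1, 3, 224, 224], dataType: .float16)
    let input = try MLDictionaryFeatureProvider(dictionary: ["image": pixels])
    return try await model.prediction(from: input, options: MLPredictionOptions())
}
```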

API reference → /skill coreml-ref

  • CoreML Tools Python API
  • MLModel lifecycle
  • MLTensor operations (sketch below)
  • MLComputeDevice availability
  • State management APIs
  • Performance reports
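
To make the MLTensor items concrete, a sketch assuming the iOS 18+ Swift MLTensor API (lazy execution, async readback); verify the exact initializers against the current SDK.

```swift
import CoreML

// Sketch: out-of-model tensor math, the kind used to stitch two models.
func stitchStep() async {
    let a = MLTensor(shape: [2, 3], scalars: (0..<6).map(Float.init), scalarType: Float.self)
    let b = MLTensor(shape: [3, 2], scalars: (0..<6).map(Float.init), scalarType: Float.self)
    let product = a.matmul(b)   // shape [2, 2]
    let scaled = product * 0.5  // elementwise scalar arithmetic
    // Readback is async: operations are queued and executed on an ML device.
    let values = await scaled.shapedArray(of: Float.self)
    print(values)
}
```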

Diagnostics → /skill coreml-diag

  • Model won't load, or loads slowly on first launch (see the pre-compile sketch below)
  • Slow inference
  • Memory issues
  • Compression accuracy loss
  • Compute unit problems
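
For the load cases specifically, a small sketch that separates on-device compilation from the load itself; compileModel(at:) (iOS 16+) returns a temporary .mlmodelc URL, and persisting it under Caches is left to the caller.

```swift
import CoreML

// Sketch: pre-compile an .mlpackage once, then time the actual model load.
// A slow first launch is usually the compile step, not the load.
func diagnoseLoad(packageURL: URL) async throws -> MLModel {
    let compiledURL = try await MLModel.compileModel(at: packageURL)
    let clock = ContinuousClock()
    let start = clock.now
    let model = try await MLModel.load(contentsOf: compiledURL,
                                       configuration: MLModelConfiguration())
    print("Load took \(clock.now - start)")
    return model
}
```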

Speech Work

Implementation patterns → /skill speech

  • SpeechAnalyzer setup (iOS 26+; sketch after this list)
  • SpeechTranscriber configuration
  • Live transcription
  • File transcription
  • Volatile vs finalized results
  • Model asset management
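
A minimal sketch of that flow, assuming the iOS 26 SpeechAnalyzer API as presented at WWDC25; treat the initializer and option names as assumptions and check them against the shipping SDK.

```swift
import Speech

// Sketch (iOS 26+): SpeechTranscriber reporting volatile + finalized results.
func liveTranscription() async throws {
    let transcriber = SpeechTranscriber(locale: Locale(identifier: "en-US"),
                                        transcriptionOptions: [],
                                        reportingOptions: [.volatileResults],
                                        attributeOptions: [])
    // Download on-device model assets for this locale if they are missing.
    if let request = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
        try await request.downloadAndInstall()
    }
    let analyzer = SpeechAnalyzer(modules: [transcriber])
    let (inputSequence, inputBuilder) = AsyncStream.makeStream(of: AnalyzerInput.self)
    try await analyzer.start(inputSequence: inputSequence)
    // Feed audio from your tap: inputBuilder.yield(AnalyzerInput(buffer: pcmBuffer))
    for try await result in transcriber.results {
        print(result.isFinal ? "final: \(result.text)" : "volatile: \(result.text)")
    }
}
```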

Decision Tree

User asks about on-device ML or speech
  ├─ Machine learning?
  │   ├─ Implementing/converting? → coreml
  │   ├─ Need API reference? → coreml-ref
  │   └─ Debugging issues? → coreml-diag
  └─ Speech-to-text?
      └─ Any speech work → speech

Critical Patterns

coreml:

  • Model conversion (PyTorch → CoreML)
  • Compression (palettization, quantization, pruning)
  • Stateful KV-cache for LLMs (MLState sketch below)
  • Multi-function models for adapters
  • MLTensor for pipeline stitching
  • Async concurrent prediction
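
For the stateful KV-cache pattern, a sketch assuming an iOS 18+ model converted with state tensors; the "token" and "logits" feature names are hypothetical and depend on how the model was converted.

```swift
import CoreML

// Sketch: a decode loop where the KV-cache lives in an MLState that
// Core ML updates in place across prediction calls (iOS 18+).
func decode(model: MLModel, tokens: [Int32]) async throws {
    let state = model.makeState()
    for token in tokens {
        let array = try MLMultiArray(shape: [1, 1], dataType: .int32)
        array[0] = NSNumber(value: token)
        let input = try MLDictionaryFeatureProvider(dictionary: ["token": array])
        // Reusing the same state keeps the cache warm between steps.
        let output = try await model.prediction(from: input, using: state,
                                                options: MLPredictionOptions())
        _ = output.featureValue(for: "logits")  // hypothetical output name
    }
}
```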

coreml-diag:

  • Load failures and caching
  • Inference performance issues
  • Memory pressure from models
  • Accuracy degradation from compression

speech:

  • SpeechAnalyzer + SpeechTranscriber setup
  • AssetInventory model management
  • Live transcription with volatile results
  • Audio format conversion (AVAudioConverter sketch below)
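
For the format-conversion item, a sketch built on AVAudioConverter; the target format is an assumption, and in practice you would query the analyzer for its preferred input format first.

```swift
import AVFoundation

// Sketch: resample/reformat a captured buffer into the transcriber's format.
func convert(buffer: AVAudioPCMBuffer, to targetFormat: AVAudioFormat) -> AVAudioPCMBuffer? {
    guard let converter = AVAudioConverter(from: buffer.format, to: targetFormat) else { return nil }
    let ratio = targetFormat.sampleRate / buffer.format.sampleRate
    let capacity = AVAudioFrameCount((Double(buffer.frameLength) * ratio).rounded(.up))
    guard let out = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: capacity) else { return nil }
    var provided = false
    var conversionError: NSError?
    converter.convert(to: out, error: &conversionError) { _, outStatus in
        if provided {
            outStatus.pointee = .noDataNow  // supply each input buffer exactly once
            return nil
        }
        provided = true
        outStatus.pointee = .haveData
        return buffer
    }
    return conversionError == nil ? out : nil
}
```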

Example Invocations

User: "How do I convert a PyTorch model to CoreML?" → Invoke: /skill coreml

User: "Compress my model to fit on iPhone" → Invoke: /skill coreml

User: "Implement KV-cache for my language model" → Invoke: /skill coreml

User: "Model loads slowly on first launch" → Invoke: /skill coreml-diag

User: "My compressed model has bad accuracy" → Invoke: /skill coreml-diag

User: "Add live transcription to my app" → Invoke: /skill speech

User: "Transcribe audio files with SpeechAnalyzer" → Invoke: /skill speech

User: "What's MLTensor and how do I use it?" → Invoke: /skill coreml-ref