SKILL.md

name: langchain
description: LangChain framework for LLM applications. Covers model wrappers (HuggingFace, Ollama), prompt templates, few-shot learning, output parsing, and chaining techniques for building sophisticated LLM workflows.

LangChain Framework

Overview

LangChain is a framework for building LLM applications. It provides abstractions for prompts, models, chains, and output parsing that work with both local models (HuggingFace, Ollama) and cloud APIs (OpenAI, Anthropic).
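
A minimal sketch of how these pieces compose, assuming an llm built as in the Model Wrappers section below:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# prompt -> model -> parser, composed with the pipe operator (LCEL)
prompt = ChatPromptTemplate.from_messages([("human", "Explain {topic} in one sentence.")])
chain = prompt | llm | StrOutputParser()
print(chain.invoke({"topic": "vector databases"}))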

Quick Reference

Component             Purpose
ChatOpenAI            Connect to any OpenAI-compatible API (including Ollama)
HuggingFacePipeline   Wrap local HuggingFace models
ChatHuggingFace       Chat interface for HuggingFace models
PromptTemplate        Single-string prompt formatting
ChatPromptTemplate    Multi-message prompt formatting
PydanticOutputParser  Structured output parsing

Model Wrappers

Ollama via OpenAI-Compatible API

import os
from langchain_openai import ChatOpenAI

OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://ollama:11434")
MODEL = "hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M"

llm = ChatOpenAI(
    base_url=f"{OLLAMA_HOST}/v1",
    api_key="ollama",  # Required by library, ignored by Ollama
    model=MODEL,
    temperature=0.7,
    max_tokens=150
)

response = llm.invoke("What is Python?")
print(response.content)

HuggingFace Local Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
from langchain_huggingface import HuggingFacePipeline, ChatHuggingFace

HF_MODEL = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"

# 4-bit quantization for memory efficiency
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(HF_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    HF_MODEL,
    device_map="auto",
    quantization_config=quantization_config
)

# Create pipeline
text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=150,
    return_full_text=False
)

# Wrap for LangChain
llm = HuggingFacePipeline(pipeline=text_pipeline)
chat_llm = ChatHuggingFace(llm=llm)
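
A quick usage check for the chat wrapper, which applies the model's chat template before generation (prompt text is illustrative):

from langchain_core.messages import HumanMessage

response = chat_llm.invoke([HumanMessage(content="What is LangChain?")])
print(response.content)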

LLM Methods

invoke() - Single Input

response = llm.invoke("Tell me a fact about Mars.")
print(response)  # string LLMs return str; chat models return a message, so print response.content

batch() - Multiple Inputs

prompts = ["Tell me a joke", "Translate to German: Hello!"]
results = llm.batch(prompts)

for prompt, result in zip(prompts, results):
    print(f"Prompt: {prompt}")
    print(f"Response: {result}\n")

generate() - With Metadata

results = llm.generate(["Where should I go for a Safari?"])

for gen in results.generations:
    print(gen[0].text)

# Access token counts
print(results.llm_output)

stream() - Token Streaming

# String LLMs stream plain text; chat models stream AIMessageChunk (print chunk.content)
for chunk in llm.stream("Tell me a story about a cat."):
    print(chunk, end="", flush=True)

Prompt Templates

Basic PromptTemplate

from langchain_core.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in simple terms."
)

formatted = template.format(topic="quantum computing")
response = llm.invoke(formatted)
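
from_template infers input_variables from the braces, so the same template can be written more compactly:

# Equivalent shorthand; {topic} is detected automatically
template = PromptTemplate.from_template("Explain {topic} in simple terms.")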

ChatPromptTemplate

from langchain_core.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful legal translator."),
    ("human", "Simplify this legal text: {legal_text}")
])

messages = chat_prompt.format_messages(legal_text="...")
response = chat_llm.invoke(messages)
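
For multi-turn prompts, MessagesPlaceholder reserves a slot for prior history; a sketch with an illustrative history list:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful legal translator."),
    MessagesPlaceholder(variable_name="history"),  # prior turns spliced in here
    ("human", "{question}")
])

messages = prompt.format_messages(
    history=[HumanMessage(content="Hi"), AIMessage(content="Hello!")],
    question="Simplify this clause for me."
)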

Few-Shot Learning

from langchain_core.prompts import ChatPromptTemplate

# Define examples
examples = [
    {"input": "Legal term 1", "output": "Plain explanation 1"},
    {"input": "Legal term 2", "output": "Plain explanation 2"}
]

# Build few-shot prompt
messages = [
    ("system", "Translate legal terms to plain language.")
]
for ex in examples:
    messages.append(("human", ex["input"]))
    messages.append(("assistant", ex["output"]))
messages.append(("human", "{new_input}"))

few_shot_prompt = ChatPromptTemplate.from_messages(messages)
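
langchain_core also ships FewShotChatMessagePromptTemplate, which builds the same structure declaratively; a sketch reusing the examples list above:

from langchain_core.prompts import FewShotChatMessagePromptTemplate

example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}")
])

few_shot = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples
)

final_prompt = ChatPromptTemplate.from_messages([
    ("system", "Translate legal terms to plain language."),
    few_shot,
    ("human", "{new_input}")
])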

Output Parsing

Pydantic Parser

from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser

class LegalClause(BaseModel):
    parties: list[str] = Field(description="Parties involved")
    obligations: str = Field(description="Main obligations")
    conditions: str = Field(description="Key conditions")

parser = PydanticOutputParser(pydantic_object=LegalClause)

prompt = PromptTemplate(
    input_variables=["clause"],
    template="Parse this legal clause:\n{clause}\n\n{format_instructions}",
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

formatted = prompt.format(clause="...")
response = llm.invoke(formatted)
parsed = parser.parse(response.content)  # for string LLMs, use parser.parse(response)

print(parsed.parties)
print(parsed.obligations)
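
Many chat models can bind the schema directly instead, skipping manual parsing; with_structured_output exists on ChatOpenAI, though backend support varies (Ollama models need tool/JSON support), so treat this as a sketch:

# Returns a LegalClause instance directly instead of raw text
structured_llm = llm.with_structured_output(LegalClause)
parsed = structured_llm.invoke("Parse this legal clause: ...")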

Chaining

Sequential Chain (Pipe Syntax)

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define chains; StrOutputParser turns chat-model messages into plain strings
template1 = "Give a bullet point outline for a blog about {topic}"
template2 = "Write a blog post from this outline:\n{outline}"

chain1 = PromptTemplate.from_template(template1) | llm | StrOutputParser()
chain2 = PromptTemplate.from_template(template2) | llm | StrOutputParser()

# Compose: chain1's string output is coerced into chain2's single input variable
full_chain = chain1 | chain2

result = full_chain.invoke({"topic": "AI"})

Multi-Step Processing

template1 = "Summarize this review:\n{review}"
template2 = "Identify weaknesses:\n{summary}"
template3 = "Create improvement plan:\n{weaknesses}"

chain_1 = PromptTemplate.from_template(template1) | llm | StrOutputParser()
chain_2 = PromptTemplate.from_template(template2) | llm | StrOutputParser()
chain_3 = PromptTemplate.from_template(template3) | llm | StrOutputParser()

full_chain = chain_1 | chain_2 | chain_3

employee_review = "..."  # raw review text to summarize
result = full_chain.invoke({"review": employee_review})

Router Chain

from langchain.chains.router import MultiPromptChain  # legacy chain; see the LCEL sketch below

beginner_template = "Explain {input} simply for a child."
expert_template = "Explain {input} technically for an expert."

prompt_infos = [
    {"name": "beginner", "description": "For simple questions", "prompt_template": beginner_template},
    {"name": "expert", "description": "For technical questions", "prompt_template": expert_template}
]

chain = MultiPromptChain.from_prompts(llm, prompt_infos, verbose=True)
result = chain.invoke("How do Feynman diagrams work?")
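
In releases where MultiPromptChain is deprecated, the same routing can be built from LCEL primitives; a sketch that classifies the question first, then dispatches (prompt wording is illustrative):

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableBranch, RunnableLambda

beginner_chain = PromptTemplate.from_template(beginner_template) | llm | StrOutputParser()
expert_chain = PromptTemplate.from_template(expert_template) | llm | StrOutputParser()

# Ask the model to classify the question, then branch on its answer
classifier = (
    PromptTemplate.from_template(
        "Is this question 'beginner' or 'expert'? Answer with one word.\n{input}"
    )
    | llm
    | StrOutputParser()
)

branch = RunnableBranch(
    (lambda x: "expert" in x["category"].lower(), expert_chain),
    beginner_chain,  # default branch
)

router = RunnableLambda(
    lambda q: {"input": q, "category": classifier.invoke({"input": q})}
) | branch

result = router.invoke("How do Feynman diagrams work?")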

Caching

from langchain_community.cache import SQLiteCache
from langchain_core.globals import set_llm_cache

# Persist LLM responses in a local SQLite file
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

# First call - hits LLM
response1 = llm.invoke("What is Python?")

# Second call - uses cache (instant)
response2 = llm.invoke("What is Python?")
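
For throwaway sessions, an in-memory cache avoids the file; same global hook:

from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache

set_llm_cache(InMemoryCache())  # process-local; cleared when the process exits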

Messages

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is 2+2?"),
    AIMessage(content="4"),
    HumanMessage(content="And times 3?")
]

response = chat_llm.invoke(messages)
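
Since invoke returns an AIMessage, appending it continues the conversation; a minimal follow-up turn:

# Carry the conversation forward by appending each turn
messages.append(response)
messages.append(HumanMessage(content="Now divide that by 6."))
follow_up = chat_llm.invoke(messages)
print(follow_up.content)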

When to Use This Skill

Use when:

  • Building LLM applications with structured workflows
  • Templating prompts with variable substitution
  • Chaining multiple LLM calls together
  • Parsing structured output from LLMs
  • Working with both local and cloud models

Cross-References

  • bazzite-ai-jupyter:rag - RAG pipelines using LangChain
  • bazzite-ai-jupyter:evaluation - LLM evaluation
  • bazzite-ai-ollama:openai - Ollama OpenAI compatibility
  • bazzite-ai-ollama:python - Native Ollama Python library