Claude Code Plugins

Community-maintained marketplace

Feedback

Naive Bayes classifier for probabilistic classification

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name td-naive-bayes
description Naive Bayes classifier for probabilistic classification

Teradata Naive Bayes Classifier

Skill Name Teradata Naive Bayes Classifier
Description Naive Bayes classifier for probabilistic classification
Category Classification Analytics
Function TD_NaiveBayes

Core Capabilities

  • Complete analytical workflow from data exploration to model deployment
  • Automated preprocessing including scaling, encoding, and train-test splitting
  • Advanced TD_NaiveBayes implementation with parameter optimization
  • Comprehensive evaluation metrics and model validation
  • Production-ready SQL generation with proper table management
  • Error handling and data quality checks throughout the pipeline
  • Business-focused interpretation of analytical results

Table Analysis Workflow

This skill automatically analyzes your provided table to generate optimized SQL workflows. Here's how it works:

1. Table Structure Analysis

  • Column Detection: Automatically identifies all columns and their data types
  • Data Type Classification: Distinguishes between numeric, categorical, and text columns
  • Primary Key Identification: Detects unique identifier columns
  • Missing Value Assessment: Analyzes data completeness

2. Feature Engineering Recommendations

  • Numeric Features: Identifies columns suitable for scaling and normalization
  • Categorical Features: Detects columns requiring encoding (one-hot, label encoding)
  • Target Variable: Helps identify the dependent variable for modeling
  • Feature Selection: Recommends relevant features based on data types

3. SQL Generation Process

  • Dynamic Column Lists: Generates column lists based on your table structure
  • Parameterized Queries: Creates flexible SQL templates using your table schema
  • Table Name Integration: Replaces placeholders with your actual table names
  • Database Context: Adapts to your database and schema naming conventions

How to Use This Skill

  1. Provide Your Table Information:

    "Analyze table: database_name.table_name"
    or
    "Use table: my_data with target column: target_var"
    
  2. The Skill Will:

    • Query your table structure using SHOW COLUMNS FROM table_name
    • Analyze data types and suggest appropriate preprocessing
    • Generate complete SQL workflow with your specific column names
    • Provide optimized parameters based on your data characteristics

Input Requirements

Data Requirements

  • Source table: Teradata table with analytical data
  • Target column: Dependent variable for classification analysis
  • Feature columns: Independent variables (numeric and categorical)
  • ID column: Unique identifier for record tracking
  • Minimum sample size: 100+ records for reliable classification modeling

Technical Requirements

  • Teradata Vantage with ClearScape Analytics enabled
  • Database permissions: CREATE, DROP, SELECT on working database
  • Function access: TD_NaiveBayes, TD_NaiveBayesPredict

Output Formats

Generated Tables

  • Preprocessed data tables with proper scaling and encoding
  • Train/test split tables for model validation
  • Model table containing trained TD_NaiveBayes parameters
  • Prediction results with confidence metrics
  • Evaluation metrics table with performance statistics

SQL Scripts

  • Complete workflow scripts ready for execution
  • Parameterized queries for different datasets
  • Table management with proper cleanup procedures

Classification Use Cases Supported

  1. Text classification: Comprehensive analysis workflow
  2. Probabilistic prediction: Comprehensive analysis workflow
  3. Feature independence: Comprehensive analysis workflow

Best Practices Applied

  • Data validation before analysis execution
  • Proper feature scaling and categorical encoding
  • Train-test splitting with stratification when appropriate
  • Cross-validation for robust model evaluation
  • Parameter optimization using systematic approaches
  • Residual analysis and diagnostic checks
  • Business interpretation of statistical results
  • Documentation of methodology and assumptions

Example Usage

-- Example workflow for Teradata Naive Bayes Classifier
-- Replace 'your_table' with actual table name

-- 1. Data exploration and validation
SELECT COUNT(*),
       COUNT(DISTINCT your_id_column),
       AVG(your_target_column),
       STDDEV(your_target_column)
FROM your_database.your_table;

-- 2. Execute complete classification workflow
-- (Detailed SQL provided by the skill)

Scripts Included

Core Analytics Scripts

  • preprocessing.sql: Data preparation and feature engineering
  • table_analysis.sql: Automatic table structure analysis
  • complete_workflow_template.sql: End-to-end workflow template
  • model_training.sql: TD_NaiveBayes training procedures
  • prediction.sql: TD_NaiveBayesPredict execution
  • evaluation.sql: Model validation and metrics calculation

Utility Scripts

  • data_quality_checks.sql: Comprehensive data validation
  • parameter_tuning.sql: Systematic parameter optimization
  • diagnostic_queries.sql: Model diagnostics and interpretation

Limitations and Disclaimers

  • Data quality: Results depend on input data quality and completeness
  • Sample size: Minimum sample size requirements for reliable results
  • Feature selection: Manual feature engineering may be required
  • Computational resources: Large datasets may require optimization
  • Business context: Statistical results require domain expertise for interpretation
  • Model assumptions: Understand underlying mathematical assumptions

Quality Checks

Automated Validations

  • Data completeness verification before analysis
  • Statistical assumptions testing where applicable
  • Model convergence monitoring during training
  • Prediction quality assessment using validation data
  • Performance metrics calculation and interpretation

Manual Review Points

  • Feature selection appropriateness for business problem
  • Model interpretation alignment with domain knowledge
  • Results validation against business expectations
  • Documentation completeness for reproducibility

Updates and Maintenance

  • Version compatibility: Tested with latest Teradata Vantage releases
  • Performance optimization: Regular query performance reviews
  • Best practices: Updated based on analytics community feedback
  • Documentation: Maintained with latest ClearScape Analytics features
  • Examples: Updated with real-world use cases and scenarios

This skill provides production-ready classification analytics using Teradata ClearScape Analytics TD_NaiveBayes with comprehensive data science best practices.