| name | adf-master |
| description | Comprehensive Azure Data Factory knowledge base with official documentation sources, CI/CD methods, deployment patterns, and troubleshooting resources |
Azure Data Factory Master Knowledge Base
🚨 CRITICAL GUIDELINES
Windows File Path Requirements
MANDATORY: Always Use Backslashes on Windows for File Paths
When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).
Examples:
- ❌ WRONG:
D:/repos/project/file.tsx - ✅ CORRECT:
D:\repos\project\file.tsx
This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems
Documentation Guidelines
NEVER create new documentation files unless explicitly requested by the user.
- Priority: Update existing README.md files rather than creating new documentation
- Repository cleanliness: Keep repository root clean - only README.md unless user requests otherwise
- Style: Documentation should be concise, direct, and professional - avoid AI-generated tone
- User preference: Only create additional .md files when user specifically asks for documentation
This skill provides comprehensive reference information about Azure Data Factory, including official documentation sources, CI/CD deployment methods, and troubleshooting resources. Use this to access detailed ADF knowledge on-demand.
🚨 CRITICAL 2025 UPDATE: Deprecated Features### Apache Airflow Workflow Orchestration Manager - DEPRECATEDStatus: Available only for existing customers as of early 2025Retirement Date: Not yet announced, but feature is officially deprecatedImpact: New customers cannot provision Apache Airflow in Azure Data FactoryOfficial Deprecation Notice:- Apache Airflow Workflow Orchestration Manager is deprecated with no retirement date set- Only existing deployments can continue using this feature- No new Airflow integrations can be created in ADFMigration Path:- Recommended: Migrate to Fabric Data Factory with native Airflow support- Alternative: Use standalone Apache Airflow deployments (Azure Container Instances, AKS, or VM-based)- Alternative: Migrate orchestration logic to native ADF pipelines with control flow activitiesWhy Deprecated:- Microsoft focus shifted to Fabric Data Factory as the unified data integration platform- Fabric provides modern orchestration capabilities superseding Airflow integration- Limited adoption and maintenance burden for standalone Airflow feature in ADFAction Required:- If using Airflow in ADF: Plan migration within 12-18 months- For new projects: Do NOT use Airflow in ADF - use Fabric or native ADF patterns- Monitor Microsoft announcements for official retirement timelineReference:- Microsoft Roadmap: https://www.directionsonmicrosoft.com/roadmaps/ref/azure-data-factory-roadmap/## 🆕 2025 Feature Updates### Microsoft Fabric Integration (GA June 2025)ADF Mounting in Fabric:- Bring existing ADF pipelines into Fabric workspaces without rebuilding- General Availability as of June 2025- Seamless integration enables hybrid ADF + Fabric workflowsCross-Workspace Pipeline Orchestration:- New Invoke Pipeline activity supports cross-platform calls- Invoke pipelines across Fabric, Azure Data Factory, and Synapse- Managed VNet support for secure cross-workspace communicationVariable Libraries:- Environment-specific variables for CI/CD automation- Automatic value substitution during workspace promotion- Eliminates separate parameter files per environmentConnector Enhancements:- ServiceNow V2 (V1 End of Support)- Enhanced PostgreSQL and Snowflake connectors- Native OneLake connectivity for zero-copy integration### Node.js 20.x Requirement for CI/CDCRITICAL: As of 2025, npm package @microsoft/azure-data-factory-utilities requires Node.js 20.xBreaking Change:- Older Node.js versions (14.x, 16.x, 18.x) may cause package incompatibility errors- Update CI/CD pipelines to use Node.js 20.x or compatible versionsGitHub Actions:yaml- name: Setup Node.js uses: actions/setup-node@v4 with: node-version: '20.x'Azure DevOps:yaml- task: UseNode@1 inputs: version: '20.x'
🚨 CRITICAL 2025 UPDATE: Deprecated Features
Apache Airflow Workflow Orchestration Manager - DEPRECATED
Status: Available only for existing customers as of early 2025 Retirement Date: Not yet announced, but feature is officially deprecated Impact: New customers cannot provision Apache Airflow in Azure Data Factory
Official Deprecation Notice:
- Apache Airflow Workflow Orchestration Manager is deprecated with no retirement date set
- Only existing deployments can continue using this feature
- No new Airflow integrations can be created in ADF
Migration Path:
- Recommended: Migrate to Fabric Data Factory with native Airflow support
- Alternative: Use standalone Apache Airflow deployments (Azure Container Instances, AKS, or VM-based)
- Alternative: Migrate orchestration logic to native ADF pipelines with control flow activities
Why Deprecated:
- Microsoft focus shifted to Fabric Data Factory as the unified data integration platform
- Fabric provides modern orchestration capabilities superseding Airflow integration
- Limited adoption and maintenance burden for standalone Airflow feature in ADF
Action Required:
- If using Airflow in ADF: Plan migration within 12-18 months
- For new projects: Do NOT use Airflow in ADF - use Fabric or native ADF patterns
- Monitor Microsoft announcements for official retirement timeline
Reference:
🆕 2025 Feature Updates
Microsoft Fabric Integration (GA June 2025)
ADF Mounting in Fabric:
- Bring existing ADF pipelines into Fabric workspaces without rebuilding
- General Availability as of June 2025
- Seamless integration enables hybrid ADF + Fabric workflows
Cross-Workspace Pipeline Orchestration:
- New Invoke Pipeline activity supports cross-platform calls
- Invoke pipelines across Fabric, Azure Data Factory, and Synapse
- Managed VNet support for secure cross-workspace communication
Variable Libraries:
- Environment-specific variables for CI/CD automation
- Automatic value substitution during workspace promotion
- Eliminates separate parameter files per environment
Connector Enhancements:
- ServiceNow V2 (V1 End of Support)
- Enhanced PostgreSQL and Snowflake connectors
- Native OneLake connectivity for zero-copy integration
Node.js 20.x Requirement for CI/CD
CRITICAL: As of 2025, npm package @microsoft/azure-data-factory-utilities requires Node.js 20.x
Breaking Change:
- Older Node.js versions (14.x, 16.x, 18.x) may cause package incompatibility errors
- Update CI/CD pipelines to use Node.js 20.x or compatible versions
GitHub Actions:
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '20.x'
Azure DevOps:
- task: UseNode@1
inputs:
version: '20.x'
Official Documentation Sources
Primary Microsoft Learn Resources
Main Documentation Hub:
- URL: https://learn.microsoft.com/en-us/azure/data-factory/
- Last Updated: February 2025
- Coverage: Complete ADF documentation including tutorials, concepts, how-to guides, and reference materials
- Key Topics: Pipelines, datasets, triggers, linked services, data flows, integration runtimes, monitoring
Introduction to Azure Data Factory:
- URL: https://learn.microsoft.com/en-us/azure/data-factory/introduction
- Summary: Managed cloud service for complex hybrid ETL, ELT, and data integration projects
- Key Features: 90+ built-in connectors, serverless architecture, code-free UI, single-pane monitoring
Context7 Library Documentation
Library ID: /websites/learn_microsoft_en-us_azure_data-factory
- Trust Score: 7.5
- Code Snippets: 10,839
- Topics: CI/CD, ARM templates, pipeline patterns, data flows, monitoring, troubleshooting
How to Access:
Use Context7 MCP tool to fetch latest documentation:
mcp__context7__get-library-docs:
- context7CompatibleLibraryID: /websites/learn_microsoft_en-us_azure_data-factory
- topic: "CI/CD continuous integration deployment pipelines ARM templates"
- tokens: 8000
CI/CD Deployment Methods
Modern Automated Approach (Recommended)
npm Package: @microsoft/azure-data-factory-utilities
- Latest Version: 1.0.3+ (check npm for current version)
- npm URL: https://www.npmjs.com/package/@microsoft/azure-data-factory-utilities
- Node.js Requirement: Version 20.x or compatible
Key Features:
- Validates ADF resources independently of service
- Generates ARM templates programmatically
- Enables true CI/CD without manual publish button
- Supports preview mode for selective trigger management
package.json Configuration:
{
"scripts": {
"build": "node node_modules/@microsoft/azure-data-factory-utilities/lib/index",
"build-preview": "node node_modules/@microsoft/azure-data-factory-utilities/lib/index --preview"
},
"dependencies": {
"@microsoft/azure-data-factory-utilities": "^1.0.3"
}
}
Commands:
# Validate resources
npm run build validate <rootFolder> <factoryId>
# Generate ARM templates
npm run build export <rootFolder> <factoryId> [outputFolder]
# Preview mode (only stop/start modified triggers)
npm run build-preview export <rootFolder> <factoryId> [outputFolder]
Official Documentation:
- URL: https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-delivery-improvements
- Last Updated: January 2025
- Topics: Setup, configuration, build commands, CI/CD integration
Traditional Manual Approach (Legacy)
Method: Git integration + Publish button
Process:
- Configure Git integration in ADF UI (Dev environment only)
- Make changes in ADF Studio
- Click "Publish" button to generate ARM templates
- Templates saved to
adf_publishbranch - Release pipelines deploy from
adf_publishbranch
When to Use:
- Migrating from existing setup
- No build pipeline infrastructure
- Simple deployments without validation
Limitations:
- Requires manual publish action
- No validation until publish
- Not true CI/CD (manual step required)
- Can't validate on pull requests
Migration Path: Modern approach recommended for new implementations
ARM Template Deployment
PowerShell Deployment
Primary Command: New-AzResourceGroupDeployment
Syntax:
New-AzResourceGroupDeployment `
-ResourceGroupName "<resource-group-name>" `
-TemplateFile "ARMTemplateForFactory.json" `
-TemplateParameterFile "ARMTemplateParametersForFactory.<environment>.json" `
-factoryName "<factory-name>" `
-Mode Incremental `
-Verbose
Validation:
Test-AzResourceGroupDeployment `
-ResourceGroupName "<resource-group-name>" `
-TemplateFile "ARMTemplateForFactory.json" `
-TemplateParameterFile "ARMTemplateParametersForFactory.<environment>.json" `
-factoryName "<factory-name>"
What-If Analysis:
New-AzResourceGroupDeployment `
-ResourceGroupName "<resource-group-name>" `
-TemplateFile "ARMTemplateForFactory.json" `
-TemplateParameterFile "ARMTemplateParametersForFactory.<environment>.json" `
-factoryName "<factory-name>" `
-WhatIf
Azure CLI Deployment
Primary Command: az deployment group create
Syntax:
az deployment group create \
--resource-group <resource-group-name> \
--template-file ARMTemplateForFactory.json \
--parameters ARMTemplateParametersForFactory.<environment>.json \
--parameters factoryName=<factory-name> \
--mode Incremental
Validation:
az deployment group validate \
--resource-group <resource-group-name> \
--template-file ARMTemplateForFactory.json \
--parameters ARMTemplateParametersForFactory.<environment>.json \
--parameters factoryName=<factory-name>
What-If Analysis:
az deployment group what-if \
--resource-group <resource-group-name> \
--template-file ARMTemplateForFactory.json \
--parameters ARMTemplateParametersForFactory.<environment>.json \
--parameters factoryName=<factory-name>
PrePostDeploymentScript
Current Version: Ver2
Key Improvement in Ver2:
- Turns off/on ONLY triggers that have been modified
- Ver1 stopped/started ALL triggers (slower, more disruptive)
- Compares trigger payloads to determine changes
Download Command:
# Linux/macOS/Git Bash
curl -o PrePostDeploymentScript.Ver2.ps1 https://raw.githubusercontent.com/Azure/Azure-DataFactory/main/SamplesV2/ContinuousIntegrationAndDelivery/PrePostDeploymentScript.Ver2.ps1
# PowerShell
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Azure/Azure-DataFactory/main/SamplesV2/ContinuousIntegrationAndDelivery/PrePostDeploymentScript.Ver2.ps1" -OutFile "PrePostDeploymentScript.Ver2.ps1"
Parameters
Pre-Deployment (Stop Triggers):
./PrePostDeploymentScript.Ver2.ps1 `
-armTemplate "<path-to-ARMTemplateForFactory.json>" `
-ResourceGroupName "<resource-group-name>" `
-DataFactoryName "<factory-name>" `
-predeployment $true `
-deleteDeployment $false
Post-Deployment (Start Triggers & Cleanup):
./PrePostDeploymentScript.Ver2.ps1 `
-armTemplate "<path-to-ARMTemplateForFactory.json>" `
-ResourceGroupName "<resource-group-name>" `
-DataFactoryName "<factory-name>" `
-predeployment $false `
-deleteDeployment $true
PowerShell Requirements
Version: PowerShell Core (7.0+) recommended
- Azure DevOps: Use
pwsh: truein AzurePowerShell@5 task - Locally: Use
pwshcommand, notpowershell
Modules Required:
- Az.DataFactory
- Az.Resources
Official Documentation:
- URL: https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-delivery-sample-script
- Last Updated: January 2025
GitHub Actions CI/CD
Official Resources
Medium Article (Recent 2025):
- URL: https://medium.com/microsoftazure/azure-data-factory-build-and-deploy-with-new-ci-cd-flow-using-github-actions-cd46c95054e0
- Author: Jared Zagelbaum (Microsoft Azure)
- Topics: Modern CI/CD flow, npm package usage, GitHub Actions setup
Microsoft Community Hub:
- URL: https://techcommunity.microsoft.com/blog/fasttrackforazureblog/azure-data-factory-cicd-with-github-actions/3768493
- Topics: End-to-end GitHub Actions setup, workload identity federation
Community Blog (February 2025):
- URL: https://linusdata.blog/2025/03/14/automating-azure-data-factory-deployments-with-github-actions/
- Topics: Practical implementation guide, troubleshooting tips
Key GitHub Actions
Essential Actions:
actions/checkout@v4- Checkout repositoryactions/setup-node@v4- Setup Node.jsactions/upload-artifact@v4- Publish ARM templatesactions/download-artifact@v4- Download ARM templates in deploy workflowazure/login@v2- Authenticate to Azureazure/arm-deploy@v2- Deploy ARM templatesazure/powershell@v2- Run PrePostDeploymentScript
Authentication Methods
Service Principal (JSON credentials):
{
"clientId": "<GUID>",
"clientSecret": "<STRING>",
"subscriptionId": "<GUID>",
"tenantId": "<GUID>"
}
Store in GitHub secret: AZURE_CREDENTIALS
Workload Identity Federation (More secure):
- No secrets stored
- Uses OIDC (OpenID Connect)
- Recommended for production
- Setup: https://learn.microsoft.com/en-us/azure/developer/github/connect-from-azure
Azure DevOps CI/CD
Official Resources
Microsoft Learn:
- URL: https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-delivery-automate-azure-pipelines
- Topics: Build pipeline, release pipeline, service connections, variable groups
Community Guides:
- Adam Marczak Blog: https://marczak.io/posts/2023/02/quick-cicd-for-data-factory/
- Topics: Quick setup, best practices, folder structure
Towards Data Science:
- URL: https://towardsdatascience.com/azure-data-factory-ci-cd-made-simple-building-and-deploying-your-arm-templates-with-azure-devops-30c30595afa5
- Topics: ARM template build and deployment workflow
Key Azure DevOps Tasks
Build Pipeline Tasks:
UseNode@1- Install Node.jsNpm@1- Install packages, run build commandsPublishPipelineArtifact@1- Publish ARM templates
Release Pipeline Tasks:
DownloadPipelineArtifact@2- Download ARM templatesAzurePowerShell@5- Run PrePostDeploymentScriptAzureResourceManagerTemplateDeployment@3- Deploy ARM template
Service Connection Requirements
Permissions Needed:
- Data Factory Contributor (on all Data Factories)
- Contributor (on Resource Groups)
- Key Vault access policies (if using secrets)
Configuration:
- Project Settings → Service connections → New service connection
- Type: Azure Resource Manager
- Authentication: Service Principal (recommended) or Managed Identity
Troubleshooting Resources
Official Troubleshooting Guide
URL: https://learn.microsoft.com/en-us/azure/data-factory/ci-cd-github-troubleshoot-guide Last Updated: January 2025
Common Issues Covered:
- Template parameter validation errors
- Integration Runtime type cannot be changed
- ARM template size exceeds 4MB limit
- Git connection problems
- Authentication failures
- Deployment errors
Diagnostic Logs
Enable Diagnostic Settings:
Azure Portal → Data Factory → Diagnostic settings → Add diagnostic setting
Send to: Log Analytics workspace
Logs to Enable:
- PipelineRuns
- TriggerRuns
- ActivityRuns
- SandboxPipelineRuns
- SandboxActivityRuns
Kusto Queries for Troubleshooting:
// Failed pipeline runs in last 24 hours
ADFPipelineRun
| where Status == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, PipelineName, RunId, Status, ErrorMessage, Parameters
| order by TimeGenerated desc
// Failed CI/CD deployments
ADFActivityRun
| where ActivityType == "ExecutePipeline"
| where Status == "Failed"
| where TimeGenerated > ago(7d)
| project TimeGenerated, PipelineName, ActivityName, ErrorCode, ErrorMessage
| order by TimeGenerated desc
// Performance analysis
ADFActivityRun
| where TimeGenerated > ago(7d)
| extend DurationMinutes = datetime_diff('minute', End, Start)
| summarize AvgDuration = avg(DurationMinutes) by ActivityType, ActivityName
| where AvgDuration > 10
| order by AvgDuration desc
Common Error Patterns
Error: "Template parameters are not valid"
- Cause: Deleted triggers still referenced in parameters
- Solution: Regenerate ARM template or use PrePostDeploymentScript cleanup
Error: "Updating property type is not supported"
- Cause: Trying to change Integration Runtime type
- Solution: Delete and recreate IR (not in-place update)
Error: "Operation timed out"
- Cause: Network connectivity, large data volume, insufficient compute
- Solution: Increase timeout, optimize query, increase DIUs
Error: "Authentication failed"
- Cause: Service principal expired, missing permissions, wrong credentials
- Solution: Verify credentials, check role assignments, renew if expired
Best Practices
Repository Structure
Recommended Folder Layout:
repository-root/
├── adf-resources/ # ADF JSON files (if using npm approach)
│ ├── dataset/
│ ├── pipeline/
│ ├── trigger/
│ ├── linkedService/
│ └── integrationRuntime/
├── .github/
│ └── workflows/ # GitHub Actions workflows
│ ├── adf-build.yml
│ └── adf-deploy.yml
├── azure-pipelines/ # Azure DevOps pipelines
│ ├── build.yml
│ └── release.yml
├── parameters/ # Environment-specific parameters
│ ├── ARMTemplateParametersForFactory.dev.json
│ ├── ARMTemplateParametersForFactory.test.json
│ └── ARMTemplateParametersForFactory.prod.json
├── package.json # npm configuration
└── README.md
Git Configuration
Only Configure Git on Development ADF:
- Development: Git-integrated for source control
- Test: CI/CD deployment only (no Git)
- Production: CI/CD deployment only (no Git)
Rationale: Prevents accidental manual changes in higher environments
Multi-Environment Strategy
Environment Flow:
Dev (Git) → Build → Test → Approval → Production
↓
ARM Templates
Parameter Management:
- Separate parameter file per environment
- Store secrets in Azure Key Vault
- Reference Key Vault in parameter files
- Never commit secrets to source control
Monitoring and Alerting
Set up alerts for:
- Build pipeline failures
- Deployment failures
- Pipeline run failures
- Performance degradation
- Cost anomalies
Recommended Tools:
- Azure Monitor (Metrics and Alerts)
- Log Analytics (Kusto queries)
- Application Insights (for custom logging)
- Azure Advisor (optimization recommendations)
Additional Resources
GitHub Repositories
Official Azure Data Factory Samples:
- URL: https://github.com/Azure/Azure-DataFactory
- Path: SamplesV2/ContinuousIntegrationAndDelivery/
- Contents: PrePostDeploymentScript.Ver2.ps1, example pipelines, documentation
Community Examples:
- Search GitHub for "azure-data-factory-cicd" for real-world examples
- Many organizations publish their CI/CD patterns as reference
Community Support
Microsoft Q&A:
- URL: https://learn.microsoft.com/en-us/answers/tags/130/azure-data-factory
- Active community, Microsoft employees respond
Stack Overflow:
- Tag:
azure-data-factory - Large knowledge base of resolved issues
Azure Status:
- URL: https://status.azure.com
- Check for service outages and incidents
When to Fetch Latest Information
Situations requiring current documentation:
- npm package version updates
- New ADF features or activities
- Changes to ARM template schema
- Updates to PrePostDeploymentScript
- New GitHub Actions or Azure DevOps tasks
- Breaking changes or deprecations
How to Fetch:
- Use WebFetch for Microsoft Learn articles
- Check npm for latest package version
- Use Context7 for comprehensive topic coverage
- Review Azure Data Factory GitHub for script updates
This knowledge base should be your starting point for all Azure Data Factory questions. Always verify critical information with the latest official documentation when making production decisions.