Regex Tester
Test and debug regular expressions with detailed match visualization, plain-English explanations, and pattern generation from examples.
Quick Start
from scripts.regex_tester import RegexTester
# Test pattern
tester = RegexTester()
result = tester.test(r"\d{3}-\d{4}", "Call 555-1234 today")
print(result['matches']) # ['555-1234']
# Explain pattern
explanation = tester.explain(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b")
print(explanation)
# Generate pattern from examples
pattern = tester.generate_pattern(["555-1234", "123-4567", "999-0000"])
print(pattern) # r"\d{3}-\d{4}"
Features
- Pattern Testing: Test against text with detailed results
- Match Visualization: See exactly what matched and where
- Pattern Explanation: Plain-English breakdown of regex
- Pattern Generation: Create patterns from example matches
- Find & Replace: Test substitution patterns
- Common Patterns: Library of pre-built patterns
- Validation: Check pattern syntax before use
API Reference
Testing
tester = RegexTester()
# Basic test
result = tester.test(r"\d+", "There are 42 items")
# {
# 'pattern': r'\d+',
# 'text': 'There are 42 items',
# 'matches': ['42'],
# 'match_count': 1,
# 'positions': [(10, 12)], # (start, end)
# 'groups': []
# }
# With flags
result = tester.test(r"hello", "Hello World", ignore_case=True)
# All matches with groups
result = tester.test(r"(\d{3})-(\d{4})", "555-1234 and 999-0000")
# matches: ['555-1234', '999-0000']
# groups: [('555', '1234'), ('999', '0000')]
Explanation
explanation = tester.explain(r"\b\w+@\w+\.\w{2,}\b")
# Returns:
# \b - Word boundary
# \w+ - One or more word characters
# @ - Literal '@'
# \w+ - One or more word characters
# \. - Literal '.'
# \w{2,} - Two or more word characters
# \b - Word boundary
Pattern Generation
# Generate pattern from examples
examples = ["user@test.com", "admin@site.org", "info@company.net"]
pattern = tester.generate_pattern(examples)
# Suggests: r"[a-z]+@[a-z]+\.(com|org|net)"
# With negative examples (what NOT to match)
pattern = tester.generate_pattern(
positive=["555-1234", "999-0000"],
negative=["555-12", "12345"]
)
Find & Replace
result = tester.replace(
pattern=r"(\d{3})-(\d{4})",
replacement=r"(\1) \2",
text="Call 555-1234"
)
# "Call (555) 1234"
Validation
# Check if pattern is valid
is_valid, error = tester.validate(r"[invalid")
# is_valid: False
# error: "unterminated character set"
Common Patterns
# Get pre-built patterns
email = tester.patterns['email']
phone = tester.patterns['phone']
url = tester.patterns['url']
ip = tester.patterns['ipv4']
CLI Usage
# Test pattern
python regex_tester.py --pattern "\d+" --text "There are 42 items"
# Test from file
python regex_tester.py --pattern "\w+@\w+\.\w+" --file emails.txt
# Explain pattern
python regex_tester.py --explain "\b[A-Z][a-z]+\b"
# Find and replace
python regex_tester.py --pattern "(\d+)" --replace "[$1]" --text "Item 42"
# Generate pattern
python regex_tester.py --generate "555-1234,999-0000,123-4567"
# Interactive mode
python regex_tester.py --interactive
CLI Arguments
| Argument |
Description |
Default |
--pattern |
Regex pattern |
- |
--text |
Text to test against |
- |
--file |
File to test against |
- |
--explain |
Explain pattern |
False |
--replace |
Replacement pattern |
- |
--generate |
Generate from examples |
- |
--ignore-case |
Case insensitive |
False |
--multiline |
Multiline mode |
False |
--interactive |
Interactive mode |
False |
Examples
Email Validation
tester = RegexTester()
email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
test_emails = [
"user@example.com",
"invalid.email",
"user@domain",
"test.user+tag@sub.domain.org"
]
for email in test_emails:
result = tester.test(email_pattern, email)
status = "Valid" if result['matches'] else "Invalid"
print(f"{email}: {status}")
Extract Data from Log
tester = RegexTester()
log_line = '[2024-01-15 14:30:22] ERROR: Connection timeout (192.168.1.100)'
pattern = r'\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.+) \((\d+\.\d+\.\d+\.\d+)\)'
result = tester.test(pattern, log_line)
if result['groups']:
timestamp, level, message, ip = result['groups'][0]
print(f"Time: {timestamp}")
print(f"Level: {level}")
print(f"Message: {message}")
print(f"IP: {ip}")
Phone Number Formatter
tester = RegexTester()
# Normalize various phone formats
phones = ["5551234567", "(555) 123-4567", "555.123.4567"]
pattern = r"[\(\)\.\s-]?"
for phone in phones:
# Remove all formatting
clean = tester.replace(r"[\(\)\.\s-]", "", phone)
# Format consistently
formatted = tester.replace(
r"(\d{3})(\d{3})(\d{4})",
r"(\1) \2-\3",
clean
)
print(f"{phone} -> {formatted}")
Understand Complex Pattern
tester = RegexTester()
# Explain a complex pattern
pattern = r"(?:https?://)?(?:www\.)?([a-zA-Z0-9-]+)\.([a-zA-Z]{2,})(?:/\S*)?"
explanation = tester.explain(pattern)
print(explanation)
# (?:https?://)? - Optional non-capturing group: 'http://' or 'https://'
# (?:www\.)? - Optional non-capturing group: 'www.'
# ([a-zA-Z0-9-]+) - Capturing group 1: domain name
# \. - Literal '.'
# ([a-zA-Z]{2,}) - Capturing group 2: TLD
# (?:/\S*)? - Optional non-capturing group: path
Common Patterns Library
| Name |
Pattern |
Matches |
email |
Complex |
user@domain.com |
phone_us |
\d{3}[-.]?\d{3}[-.]?\d{4} |
555-123-4567 |
url |
Complex |
https://example.com |
ipv4 |
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} |
192.168.1.1 |
date_iso |
\d{4}-\d{2}-\d{2} |
2024-01-15 |
time_24h |
\d{2}:\d{2}(:\d{2})? |
14:30:00 |
hex_color |
#[0-9A-Fa-f]{6} |
#FF5733 |
zipcode_us |
\d{5}(-\d{4})? |
12345-6789 |
Dependencies
(No external dependencies - uses Python standard library re module)
Limitations
- Pattern generation is heuristic (may not be optimal)
- Some advanced regex features are Python-specific
- Explanation works best with common patterns
- Very long patterns may have simplified explanations