Module: SchemaGraphy::RegexpUtils
Defined in: lib/schemagraphy/regexp_utils.rb
Overview
A utility module for robustly parsing and using regular expressions.
It handles various formats, including literals and plain strings,
and provides helpers for extracting captured content.
Class Method Summary
- .create_regexp(pattern, flags = '') ⇒ Regexp
  Create a Regexp object from a pattern string and explicit flags.
- .extract_all_captures(text, pattern_info) ⇒ Hash, ...
  Extract all named capture groups as a hash or positional captures as an array.
- .extract_capture(text, pattern_info, capture_name = nil) ⇒ String?
  Extract content using named or positional capture groups.
- .extract_flags_from_regexp(regexp) ⇒ String
  Extract a flags string from a compiled Regexp object.
- .flags_to_options(flags) ⇒ Integer
  Convert a flags string (ex: "im") to a Regexp options integer.
- .parse_and_extract(text, pattern_input, capture_name = nil, default_flags = '') ⇒ String?
  A convenience method that combines parsing and a single extraction.
- .parse_and_extract_all(text, pattern_input, default_flags = '') ⇒ Hash, ...
  A convenience method that combines parsing and extraction of all captures.
- .parse_pattern(input, default_flags = '') ⇒ Hash?
  Parse a regex pattern string using the `to_regexp` gem for robust parsing.
- .parse_structured_pattern(pattern_hash) ⇒ Object
  Future enhancement to parse structured pattern definitions from a Hash.
- .parse_tagged_pattern(tagged_input, tag_type) ⇒ Object
  Future enhancement to parse custom YAML tags for regular expressions.
Class Method Details
.create_regexp(pattern, flags = '') ⇒ Regexp
Create a Regexp object from a pattern string and explicit flags.
    # File 'lib/schemagraphy/regexp_utils.rb', line 137
    def create_regexp pattern, flags = ''
      options = flags_to_options(flags)
      Regexp.new(pattern, options)
    end
.extract_all_captures(text, pattern_info) ⇒ Hash, ...
Extract all named capture groups as a hash or positional captures as an array.
    # File 'lib/schemagraphy/regexp_utils.rb', line 173
    def extract_all_captures text, pattern_info
      return nil unless text && pattern_info

      regexp = pattern_info[:regexp]
      match = text.match(regexp)

      return nil unless match

      if match.names.any?
        # Return hash of named captures
        match.names.each_with_object({}) do |name, captures|
          captures[name] = match[name]
        end
      else
        # Return array of positional captures
        match.captures
      end
    end
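The return shape depends on whether the pattern uses named groups. The following is a minimal sketch of that branching in plain Ruby, assuming only the standard `MatchData` API; `all_captures` is a hypothetical stand-in that takes a Regexp directly rather than the `pattern_info` hash, so it can run without the library loaded.

```ruby
# Hypothetical stand-in mirroring the documented branching of
# extract_all_captures; it is not part of SchemaGraphy itself.
def all_captures(text, regexp)
  match = text.match(regexp)
  return nil unless match

  if match.names.any?
    # Named groups: build a Hash keyed by group name (String keys)
    match.names.each_with_object({}) { |name, captures| captures[name] = match[name] }
  else
    # No named groups: return the positional captures as an Array
    match.captures
  end
end

named      = all_captures('v1.2', /v(?<major>\d+)\.(?<minor>\d+)/)
positional = all_captures('v1.2', /v(\d+)\.(\d+)/)
no_match   = all_captures('none', /v(\d+)/)

p named      # {"major"=>"1", "minor"=>"2"}
p positional # ["1", "2"]
p no_match   # nil
```

Note that the hash keys are Strings, not Symbols, because `MatchData#names` returns strings.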
.extract_capture(text, pattern_info, capture_name = nil) ⇒ String?
Extract content using named or positional capture groups.
    # File 'lib/schemagraphy/regexp_utils.rb', line 148
    def extract_capture text, pattern_info, capture_name = nil
      return nil unless text && pattern_info

      regexp = pattern_info[:regexp]
      match = text.match(regexp)

      return nil unless match

      if capture_name && match.names.include?(capture_name.to_s)
        # Extract named capture group
        match[capture_name.to_s]
      elsif match.captures.any?
        # Extract first capture group
        match[1]
      else
        # Return the entire match
        match[0]
      end
    end
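The fallback order in this method (named group, then first group, then whole match) is easiest to see side by side. This sketch reproduces the same three branches in plain Ruby; `first_capture` is a hypothetical helper that accepts a Regexp directly instead of a `pattern_info` hash.

```ruby
# Hypothetical stand-in mirroring the fallback order of extract_capture;
# it is not part of SchemaGraphy itself.
def first_capture(text, regexp, capture_name = nil)
  match = text.match(regexp)
  return nil unless match

  if capture_name && match.names.include?(capture_name.to_s)
    match[capture_name.to_s] # 1. requested named group
  elsif match.captures.any?
    match[1]                 # 2. first capture group
  else
    match[0]                 # 3. entire match
  end
end

by_name      = first_capture('key=value', /(?<k>\w+)=(?<v>\w+)/, :v)
first_group  = first_capture('key=value', /(\w+)=(\w+)/)
whole_match  = first_capture('key=value', /\w+=\w+/)
unknown_name = first_capture('key=value', /(?<k>\w+)=(?<v>\w+)/, :missing)

p by_name      # "value"
p first_group  # "key"
p whole_match  # "key=value"
p unknown_name # "key"
```

The last call shows a consequence worth knowing: an unrecognized capture name does not return nil but silently falls through to the first group.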
.extract_flags_from_regexp(regexp) ⇒ String
Extract a flags string from a compiled Regexp object.
    # File 'lib/schemagraphy/regexp_utils.rb', line 124
    def extract_flags_from_regexp regexp
      flags = ''
      flags += 'i' if regexp.options.anybits?(Regexp::IGNORECASE)
      flags += 'm' if regexp.options.anybits?(Regexp::MULTILINE)
      flags += 'x' if regexp.options.anybits?(Regexp::EXTENDED)
      flags
    end
.flags_to_options(flags) ⇒ Integer
Convert a flags string (ex: "im") to a Regexp options integer.
    # File 'lib/schemagraphy/regexp_utils.rb', line 106
    def flags_to_options flags
      options = 0
      flags = flags.to_s

      options |= Regexp::IGNORECASE if flags.include?('i')
      options |= Regexp::MULTILINE if flags.include?('m')
      options |= Regexp::EXTENDED if flags.include?('x')

      # NOTE: 'g' (global) and 'o' (once) are not standard Ruby flags
      # encoding flags ('n', 'e', 's', 'u') are handled by to_regexp

      options
    end
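The two flag helpers are inverses of each other for the letters `i`, `m`, and `x`. As a rough illustration, the sketch below round-trips a flags string through an options integer and back using only core Ruby; `to_options` and `to_flags` are hypothetical local names chosen so the example runs without the library.

```ruby
# Hypothetical stand-ins for flags_to_options and extract_flags_from_regexp.
def to_options(flags)
  options = 0
  flags = flags.to_s
  options |= Regexp::IGNORECASE if flags.include?('i')
  options |= Regexp::MULTILINE  if flags.include?('m')
  options |= Regexp::EXTENDED   if flags.include?('x')
  options
end

def to_flags(regexp)
  flags = ''
  flags += 'i' if regexp.options.anybits?(Regexp::IGNORECASE)
  flags += 'm' if regexp.options.anybits?(Regexp::MULTILINE)
  flags += 'x' if regexp.options.anybits?(Regexp::EXTENDED)
  flags
end

options = to_options('im')            # bitwise OR of the two constants
regexp  = Regexp.new('^end$', options)
flags   = to_flags(regexp)
ignored = to_options('g')             # unknown letters contribute nothing

p flags   # "im"
p ignored # 0
```

Because the output letters are always emitted in the fixed order `i`, `m`, `x`, an input such as `'mi'` would come back as `'im'`.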
.parse_and_extract(text, pattern_input, capture_name = nil, default_flags = '') ⇒ String?
A convenience method that combines parsing and a single extraction.
    # File 'lib/schemagraphy/regexp_utils.rb', line 199
    def parse_and_extract text, pattern_input, capture_name = nil, default_flags = ''
      pattern_info = parse_pattern(pattern_input, default_flags)
      extract_capture(text, pattern_info, capture_name)
    end
.parse_and_extract_all(text, pattern_input, default_flags = '') ⇒ Hash, ...
A convenience method that combines parsing and extraction of all captures.
    # File 'lib/schemagraphy/regexp_utils.rb', line 210
    def parse_and_extract_all text, pattern_input, default_flags = ''
      pattern_info = parse_pattern(pattern_input, default_flags)
      extract_all_captures(text, pattern_info)
    end
.parse_pattern(input, default_flags = '') ⇒ Hash?
Parse a regex pattern string using the `to_regexp` gem for robust parsing.
Handles `/pattern/flags`, `%r{pattern}flags`, and plain text formats.
    # File 'lib/schemagraphy/regexp_utils.rb', line 30
    def parse_pattern input, default_flags = ''
      return nil if input.nil? || input.to_s.strip.empty?

      input_str = input.to_s.strip

      # Remove surrounding quotes that might come from YAML parsing
      clean_input = input_str.gsub(/^["']|["']$/, '')

      # Heuristic to detect if it's a Regexp literal
      is_literal = (clean_input.start_with?('/') && clean_input.rindex('/').positive?) ||
                   clean_input.start_with?('%r{')

      if is_literal
        # Try to parse as regex literal using to_regexp
        begin
          regexp_obj = clean_input.to_regexp(detect: true)

          # Extract pattern and flags from the compiled regexp
          pattern_str = regexp_obj.source
          flags_str = extract_flags_from_regexp(regexp_obj)

          {
            pattern: pattern_str,
            flags: flags_str,
            regexp: regexp_obj,
            options: regexp_obj.options
          }
        rescue RegexpError => e
          # Malformed literal is an error
          raise RegexpError, "Invalid regex literal '#{input}': #{e.message}"
        end
      else
        # Treat as plain pattern string with default flags
        flags_str = default_flags.to_s
        options = flags_to_options(flags_str)

        begin
          regexp_obj = Regexp.new(clean_input, options)

          {
            pattern: clean_input,
            flags: flags_str,
            regexp: regexp_obj,
            options: options
          }
        rescue RegexpError => e
          raise RegexpError, "Invalid regex pattern '#{input}': #{e.message}"
        end
      end
    end
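Which branch an input takes depends entirely on the literal-detection heuristic, so it helps to try it against a few inputs. The sketch below isolates just the quote stripping and the heuristic in plain Ruby; `literal_like?` is a hypothetical helper name, and the `to_regexp` parsing step is deliberately left out so the example has no gem dependency.

```ruby
# Hypothetical helper isolating the literal-detection step of parse_pattern.
def literal_like?(input)
  # Strip whitespace and a leading/trailing quote, as parse_pattern does
  clean = input.to_s.strip.gsub(/^["']|["']$/, '')

  # A literal starts with '/' and has a later closing '/', or starts with '%r{'
  (clean.start_with?('/') && clean.rindex('/').positive?) || clean.start_with?('%r{')
end

p literal_like?('/foo/i')    # true
p literal_like?('%r{foo}x')  # true
p literal_like?('"/foo/"')   # true  (surrounding quotes are removed first)
p literal_like?('foo.*bar')  # false (plain pattern, default flags apply)
p literal_like?('/foo')      # false (no closing slash)
```

One design consequence: `default_flags` only affects inputs in the `false` group, since literal inputs carry their own flags.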
.parse_structured_pattern(pattern_hash) ⇒ Object
Note:
Future enhancement to parse structured pattern definitions from a Hash.
Not yet implemented.
    # File 'lib/schemagraphy/regexp_utils.rb', line 84
    def parse_structured_pattern pattern_hash
      # TODO: Implement structured pattern parsing
      # pattern_hash should have 'pattern' and 'flags' keys
      # flags can be string or array
      raise NotImplementedError, 'Structured pattern parsing not yet implemented'
    end
.parse_tagged_pattern(tagged_input, tag_type) ⇒ Object
Note:
Future enhancement to parse custom YAML tags for regular expressions.
Not yet implemented.
    # File 'lib/schemagraphy/regexp_utils.rb', line 96
    def parse_tagged_pattern tagged_input, tag_type
      # TODO: Implement custom YAML tag parsing
      # tag_type would be :literal or :pattern
      raise NotImplementedError, 'Tagged pattern parsing not yet implemented'
    end