Python/regex

From SchnallIchNet
  1. "." Any single character except a newline, for example:
    1. letters, 'a' through 'z' and 'A' through 'Z'
    2. any numbers and symbols
    3. tab ('\t')
  2. "^" The start of the string.
    This is not the first character of the string but the invisible boundary which precedes the string.
    So, in the string 'cartwheel', the term '^' would match the location immediately before the 'c'.
  3. "$" The end of the string, or just before a newline at the end of the string.
    This is not the last character of the string but the invisible boundary which follows the string.
    So, in the string 'cartwheel', the term '$' would match the location immediately following the 'l'.
  4. "*" 0 or more instances of the pattern
    'cart.*' would match 'cartwheel', 'cartridge', 'cart567', and any other string that begins
    with the four characters 'cart'.
    '.*wheel' would match 'cartwheel', 'backwheel', 'frontwheel', '4-wheel', and any other string that ends with the five characters of 'wheel'.
    'c.*l' would match 'cartwheel', 'control', 'cancel', 'c5a-f67l', and any other string that begins with 'c' and ends with 'l'.
  5. "+" 1 or more instances of the pattern. This is often used in conjunction with square brackets.
    'c[art]+' matches 'c' followed by one or more instances of either 'a', 'r', or 't'.
  6. "?" 0 or 1 instances of the preceding pattern.
    'ca?t' matches only 'ct' and 'cat': the '?' makes the 'a' optional. To match 'cart', 'cast', 'cat', and any other string in which the first two places are 'ca', the last is 't' and the string is at least 3 and no more than 4 places long, use 'ca.?t'.
  7. "*?", "+?", "??" Match as few repetitions of the term preceding '?' as possible (non-greedy).
    The plain forms '*', '+', and '?' are greedy: they try to match as many repetitions as possible.
  8. "{m}" Specifies that exactly m instances of the preceding regex should be matched
  9. "{m,n}" Specifies that from m to n instances of the preceding regex should be matched, matching as many as possible
  10. "{m,n}?" Specifies that from m to n instances of the preceding regex should be matched, matching as few as possible
  11. "\" Escapes special characters or signals a special sequence (like octal if the next character is 0).
    newline character: '\n'
    tab character: '\t'
  12. "[]" Indicates a set of characters for a single position in the regex
    'd[aou]' matches 'da', 'do', and 'du'.
    '200[0-9]' matches all numbers from '2000' through '2009'. Note that this matches them as a string literal, not as an integer.
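  13. "|" Matches either the value on the left of the pipe or the value on the right
    '(d|c)og' matches 'dog' or 'cog'. Inside square brackets the pipe is a literal character, so '[d|c]og' would also match '|og'; the equivalent set is '[dc]og'.
    '(d|c)(a|o)(g|t)' matches any of the following: dog, dag, dot, dat, cog, cag, cot, and cat.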
  14. "(...)" Indicates a grouping for the regex.
    (ca[rtp]) matches car, cat, and cap. The text matched by the group is also saved and can be retrieved after the match or referenced later in the regex (for example with '\1'), saving one the effort of repeating it.
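  15. "(?iLmsux)" Each letter sets a flag for the whole regex: 'i' ignore case, 'L' locale dependent, 'm' multi-line, 's' dot matches all (including newline), 'u' Unicode, 'x' verbose.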
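  16. "(?:...)" Non-capturing group: groups the regex like '(...)', but the matched text is not saved
  17. "(?P<name>...)" Like '(...)', but the text matched by the group can also be accessed later by the name 'name'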
  18. "(?P=name)" Recalls the text matched by the regex named 'name'
  19. "(?#...)" A comment/remark. The parentheses and their contents are ignored.
  20. "(?=...)" Lookahead. Matches if ... matches next, without consuming any of the string. 'cart(?=wheel)' matches 'cart' only when it is followed by 'wheel'.
  21. "(?!...)" Negative lookahead. Matches if ... does not match next. 'cart(?!wheel)' matches 'cart' only when it is not followed by 'wheel'.
  22. "(?<=...)" Lookbehind. Matches if the current position is preceded by a match for ... '(?<=cart)wheel' matches 'wheel' only when it is preceded by 'cart'.
  23. "(?<!...)" Negative lookbehind. Matches if the current position is not preceded by a match for ... '(?<!cart)wheel' matches 'wheel' only when it is not preceded by 'cart'.
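
Several of the constructs listed above are easier to follow when run. The following is a minimal sketch using Python 3's standard re module; the sample strings and variable names are made up for illustration and are not part of the original list.

```python
import re

# '?' makes only the preceding 'a' optional, so 'ca?t' accepts 'ct' and 'cat'.
candidates = ["ct", "cat", "cart", "cast"]
optional_a = [w for w in candidates if re.fullmatch(r"ca?t", w)]

# Greedy '.*' grabs as much as possible; non-greedy '.*?' as little as possible.
tagged = "<b>bold</b>"
greedy = re.match(r"<.*>", tagged).group()
lazy = re.match(r"<.*?>", tagged).group()

# '{m,n}' bounds the repeat count: 4 digits fit into {2,4}, 5 digits do not.
four_ok = re.fullmatch(r"[0-9]{2,4}", "2009") is not None
five_ok = re.fullmatch(r"[0-9]{2,4}", "20090") is not None

# Inside '[]' the pipe is a literal character; inside '()' it means "or".
animals = ["dog", "cog", "|og"]
set_hits = [w for w in animals if re.fullmatch(r"[d|c]og", w)]
alt_hits = [w for w in animals if re.fullmatch(r"(d|c)og", w)]

# A named group saves the matched text; '(?P=word)' demands the same text again.
doubled = re.fullmatch(r"(?P<word>ca[rtp])(?P=word)", "catcat")
mixed = re.fullmatch(r"(?P<word>ca[rtp])(?P=word)", "catcar")

# Lookarounds test the neighbouring text without consuming it.
sample = "cartwheel cartridge backwheel"
ahead = re.search(r"cart(?=wheel)", sample).span()
not_ahead = re.search(r"cart(?!wheel)", sample).span()
behind = re.search(r"(?<=cart)wheel", sample).span()
not_behind = re.search(r"(?<!cart)wheel", sample).span()

print(optional_a, greedy, lazy, four_ok, five_ok)
print(set_hits, alt_hits, doubled.group("word"), mixed)
print(ahead, not_ahead, behind, not_behind)
```

re.fullmatch is used so that a pattern has to cover the whole string; with re.match or re.search, a pattern such as 'ca?t' would also be found inside longer strings.
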
  1. "\A" Matches only at the start of the string. This is similar to '^', above, but is not affected by the multi-line flag.
  2. "\b" Matches the empty string that forms the boundary at the beginning or end of a word.
    '\bwheel' will match 'wheel' but not 'cartwheel'.
  3. "\B" Matches the empty string only where it is not at the beginning or end of a word
  4. "\d" Matches any decimal digit: 0 through 9.
  5. "\D" Matches any character that is not a decimal digit.
  6. "\s" Matches any whitespace character: blank space, tab, newline, and the like.
  7. "\S" Matches any non-whitespace character. This is the inverse of '\s', above.
  8. "\w" Matches any alphanumeric character and the underscore: a through z, A through Z, 0 through 9, and '_'.
  9. "\W" Matches any character that is not alphanumeric or an underscore. Examples include '&', '$', and '@'.
  10. "\Z" Matches only at the end of the string. This is similar to '$', above, but does not match before a trailing newline.
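
The special sequences above can be tried out in the same way. Again a minimal sketch with Python 3's re module; the sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "first line\nsecond line\n"

# With MULTILINE, '^' matches at every line start; '\A' only at the string start.
caret_hits = re.findall(r"^\w+", text, re.MULTILINE)
start_hits = re.findall(r"\A\w+", text, re.MULTILINE)

# '$' also matches just before a trailing newline; '\Z' only at the very end.
dollar_hit = re.search(r"line$", text)
end_hit = re.search(r"line\Z", text)

# '\b' needs a word boundary; '\B' needs the absence of one.
samples = ["wheel", "cartwheel"]
boundary_hits = [w for w in samples if re.search(r"\bwheel", w)]
inner_hits = [w for w in samples if re.search(r"\Bwheel", w)]

# Character classes and their inverses.
digits = re.findall(r"\d+", "cart567 and 4-wheel")
non_digits = re.findall(r"\D+", "4-wheel")
word_runs = re.findall(r"\w+", "c5a-f67l & co_op")
symbols = re.findall(r"\W", "a&b$c@d")
fields = re.split(r"\s+", "a b\tc\nd")

print(caret_hits, start_hits, dollar_hit is not None, end_hit)
print(boundary_hits, inner_hits)
print(digits, non_digits, word_runs, symbols, fields)
```

The patterns are written as raw strings (r"...") so that Python passes sequences such as '\b' to the regex engine unchanged instead of interpreting them as string escapes.
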