[svlug] extended grep reg exp

William R Ward bill at wards.net
Tue Aug 12 13:30:25 PDT 2003


Robert Khachikyan writes:
>I've read the doc for grep extensively and google searched
>it and still couldn't find what i was looking for...on top
>of that, i left my reg exp book @ home....so here it is.
>
>I have a big file that has
>
>3918400 bla bla bla
>3918401 bla bla bla
>3918402 bla bla bla
>3918403 bla bla bla
>3918404 bla bla bla
>...
>3945785 bla bla bla
>3945786 bla bla bla
>3945787 bla bla bla
>...
>
>you get the idea. I want to grep a portion of it out.
>Let's say from 3918403 -> 3928404 (10001 lines).

That's not an easy problem to solve with a regexp.  If it were 3910000
to 3919999, however, it would be easy: 391[0-9]{4} (egrep syntax) or
391[0-9][0-9][0-9][0-9] (grep syntax).  Put a ^ in front to anchor the
match to the start of the line.
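
As a quick illustration of the easy case, here is a minimal sketch.  The
four sample lines are made up for the example, and the trailing space in
each pattern assumes the number is followed by a space, as in the file
described above.

```shell
# Hypothetical sample: two numbers inside 3910000-3919999, two outside.
sample=$(printf '%s\n' '3909999 bla' '3910000 bla' '3919999 bla' '3920000 bla')

# Extended syntax (egrep, or grep -E): {4} repeats the preceding item 4 times.
ere=$(printf '%s\n' "$sample" | grep -E '^391[0-9]{4} ')

# Basic grep syntax: spell out the four digit positions.
bre=$(printf '%s\n' "$sample" | grep '^391[0-9][0-9][0-9][0-9] ')

echo "$ere"
```

Both commands should select the same two lines, 3910000 and 3919999.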

>To my knowledge, grep's regular expression works with
>searching for the last character of the string(*[0-9]).
>This would return only 10 lines...what if I want to
>do a crazy grep like this?

It's not about the last character: each [] expression matches exactly
one character, wherever it appears in the pattern.  If the things
inside the [] are digits, then it matches a single digit.

>i thought 'egrep -E 39[18403-28404] file' would do, but
>it comes back with no match...

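Translation of [18403-28404] would be "1 or 8 or 4 or 0 or 3..2 or 8
or 4 or 0 or 4": a set of single characters, with a backwards range
(3 to 2) in the middle that is empty at best and rejected as an error
by some greps.  Grep doesn't understand numbers, only characters.
This is just like how [0-9a-fA-F] matches a single hexadecimal digit.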
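
To see the one-character behaviour in isolation, here is a small sketch
using only the non-range part of that bracket expression.  The
three-digit inputs are invented for the example.

```shell
# [18403] is just the set of characters 0, 1, 3, 4 and 8.
# It matches ONE character, so 39[18403] matches "39" plus one such digit.
matches=$(printf '%s\n' 390 391 392 393 394 395 398 | grep -E '^39[18403]$')

echo "$matches"
```

Only 390, 391, 393, 394 and 398 come back; 392 and 395 do not, because
2 and 5 are not in the set.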

>can anyone shine a light on this....thanks a mill

I *think* you can get what you want by using this egrep regexp:

^39(1840[3-9]|184[1-9][0-9]|18[5-9][0-9]{2}|19[0-9]{3}|2[0-7][0-9]{3}|28[0-3][0-9]{2}|2840[0-4])

(The same pattern works in Perl.  For plain grep, put a \ in front of
each (, |, ), { and }; note that \| is a GNU extension.)

But I haven't tested it, so I can't be 100% sure.  A more readable
translation of the above:

^39                  (at the start of the line)
 - Followed by one of these...
  1840[3-9]          (18403 to 18409)
   - or -
  184[1-9][0-9]      (18410 to 18499)
   - or -
  18[5-9][0-9]{2}    (18500 to 18999)
   - or -
  19[0-9]{3}         (19000 to 19999)
   - or -
  2[0-7][0-9]{3}     (20000 to 27999)
   - or -
  28[0-3][0-9]{2}    (28000 to 28399)
   - or -
  2840[0-4]          (28400 to 28404)
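
A pattern like this is easy to get subtly wrong, so it is worth checking
against a tool that does understand numbers.  The sketch below is a test
rig, not something from the original question: it generates made-up
lines numbered 3918000 to 3929000, runs one candidate decomposition of
18403..28404 through grep -E, and compares the result with a plain
numeric comparison in awk.  The names gen, pattern, by_regex and
by_number are placeholders.

```shell
# Candidate pattern in extended (grep -E) syntax.  Every number in the
# generated data has seven digits, so anchoring at the start is enough.
pattern='^39(1840[3-9]|184[1-9][0-9]|18[5-9][0-9]{2}|19[0-9]{3}|2[0-7][0-9]{3}|28[0-3][0-9]{2}|2840[0-4])'

# Generate hypothetical test lines that extend past both ends of the range.
gen() {
    awk 'BEGIN { for (i = 3918000; i <= 3929000; i++) print i, "bla bla bla" }'
}

by_regex=$(gen | grep -E "$pattern")

# awk compares the first field as a number, so no decomposition is needed.
by_number=$(gen | awk '$1 >= 3918403 && $1 <= 3928404')

if [ "$by_regex" = "$by_number" ]; then
    echo "regexp and numeric filter agree"
else
    echo "regexp and numeric filter DISAGREE"
fi
```

If the leading number is always the first field, the awk one-liner on
its own may be the simpler way to pull out the range.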

--Bill.

-- 
William R Ward            bill at wards.net          http://www.wards.net/~bill/
-----------------------------------------------------------------------------
           PROFESSIONAL PROGRAMMER, CLOSED COURSE.  DO NOT ATTEMPT.



