Jay Taylor's notes
regex - AWK: Access captured group from line pattern - Stack Overflow
If I have an awk command
pattern { ... }
and pattern uses a capturing group, how can I access the string so captured in the block?
FS) and pick what one would like to match with a $field. Preformatting the input could help too.
– Krzysztof Jabłoński
Jul 1 '15 at 17:06
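As a rough illustration of the field-separator idea in the comment above, the sketch below sets FS to "=" and prints the second field of lines shaped like key=value. The function name value_of and the sample input are made up for illustration, and the approach only holds when the delimiter cannot appear inside the part you want.

```shell
# Hypothetical helper: print the second "="-separated field of each line.
# Assumes input shaped like key=value, with no further "=" in the value.
value_of() {
  awk -F= 'NF >= 2 { print $2 }'
}

printf 'host=example\nport=8080\n' | value_of
```

This prints example and 8080 on separate lines. No regular expression is involved, so it is only an option when a single delimiter isolates the text of interest.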
gawk (since it uses gensub).
– rampion
Jul 8 '15 at 17:39
That was a stroll down memory lane...
I replaced awk by perl a long time ago.
Apparently the AWK regular expression engine does not capture its groups.
You might consider using something like:
perl -n -e'/test(\d+)/ && print $1'
The -n flag causes perl to loop over every line, as awk does.
gawk != awk. They're different tools and gawk isn't available by default in most places.
– Oli
Sep 4 '12 at 12:21
With gawk, you can use the match function to capture parenthesized groups.
gawk 'match($0, pattern, ary) {print ary[1]}'
Example:
echo "abcdef" | gawk 'match($0, /b(.*)e/, a) {print a[1]}'
outputs cd.
Note the specific use of gawk, which implements the feature in question.
For a portable alternative, you can achieve similar results with match() and substr.
Example:
echo "abcdef" | awk 'match($0, /b[^e]*/) {print substr($0, RSTART+1, RLENGTH-1)}'
outputs cd.
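The same match() and substr idea can stand in for a pattern such as /test([0-9]+)/ when the text around the group has a known length. The sketch below is one way to do that in POSIX awk. The function name digits_after_test and the sample line are assumptions for illustration, and the offset 4 is simply the length of the literal prefix "test".

```shell
# POSIX awk sketch: emulate /test([0-9]+)/ by matching the whole pattern,
# then skipping the 4-character literal prefix "test" by hand.
digits_after_test() {
  awk 'match($0, /test[0-9]+/) { print substr($0, RSTART + 4, RLENGTH - 4) }'
}

echo "run test123 done" | digits_after_test
```

This prints 123. The approach breaks down once the prefix or suffix has variable length, which is where gawk's three-argument match becomes the simpler tool.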
This is something I need all the time so I created a bash function for it. It's based on glenn jackman's answer.
Definition
Add this to your .bash_profile etc.
function regex { gawk 'match($0,/'"$1"'/, ary) {print ary['"${2:-0}"']}'; }
Usage
Capture regex for each line in file
$ cat filename | regex '.*'
Capture 1st regex capture group for each line in file
$ cat filename | regex '(.*)' 1
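Where gawk is not installed, a reduced version of this wrapper is still possible with POSIX awk, though it can only print the whole match (what the function above returns for index 0), not a numbered group. The following is a sketch under that limitation. The names regex_match and REGEX_PATTERN are hypothetical.

```shell
# Hypothetical portable cousin of the regex function: prints the whole match
# (the equivalent of group 0) using only POSIX awk features.
# The pattern travels through the environment so awk does not apply the
# escape processing it performs on -v assignments.
regex_match() {
  REGEX_PATTERN="$1" awk 'match($0, ENVIRON["REGEX_PATTERN"]) {
    print substr($0, RSTART, RLENGTH)
  }'
}

printf 'took 15ms\nno timing here\n' | regex_match '[0-9]+ms'
```

This prints 15ms. Lines that do not match produce no output, as with the gawk version.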
grep -o's.
– bfontaine
Mar 7 '18 at 17:16
You can use GNU awk:
$ cat hta
RewriteCond %{HTTP_HOST} !^www\.mysite\.net$
RewriteRule (.*) http://www.mysite.net/$1 [R=301,L]
$ gawk 'match($0, /.*(http.*?)\$/, m) { print m[1]; }' < hta
http://www.mysite.net/
awk 'match($0, /.*(http.*?)\$/) { print substr($0,RSTART,RLENGTH) }'
– Ed Morton
Nov 28 '12 at 4:43
RewriteRule (.*) http://www.mysite.net/$ for me, which is more than the subgroup.
– rampion
Nov 29 '12 at 13:02
You can simulate capturing in vanilla awk too, without extensions. It's not intuitive, though:
Step 1. Use gensub to surround matches with some character that doesn't appear in your string.
Step 2. Use split against the character.
Step 3. Every other element in the split array is your capture group.
$ echo 'ab cb ad' | awk '{ split(gensub(/a./,SUBSEP"&"SUBSEP,"g",$0),cap,SUBSEP); print cap[2]"|" cap[4] ; }'
ab|ad
gensub is a gawk-specific function. What do you get from your awk if you type awk --version ;-?). Good luck to all.
– shellter
Apr 13 '12 at 5:28
echo 'ab cb ad' | awk '{gsub(/a./,SUBSEP"&"SUBSEP);split($0,cap,SUBSEP);print cap[2]"|"cap[4]}'
– dubiousjim
Apr 19 '12 at 1:05
gawk --posix '{gensub(...)}'.
– dubiousjim
Apr 24 '12 at 0:08
gensub function, your example applies only to a very limited scenario: the whole pattern is grouped. It can't handle something like matching all key=(value) pairs when I want to extract only the value parts.
– Meow
Sep 24 '15 at 13:24
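For the key=(value) case raised in the comment above, one portable workaround is to call match() repeatedly and cut the value out of each matched pair by hand. The sketch below assumes pairs are separated by spaces and keys consist of letters and underscores. The function name all_values and the sample line are invented for illustration.

```shell
# POSIX awk sketch: print every value from key=value pairs on a line.
# Each pass matches one pair, prints the part after the first "=", then
# drops the consumed text so the next match() call sees the remainder.
all_values() {
  awk '{
    rest = $0
    while (match(rest, /[A-Za-z_]+=[^ ]+/)) {
      pair = substr(rest, RSTART, RLENGTH)
      print substr(pair, index(pair, "=") + 1)
      rest = substr(rest, RSTART + RLENGTH)
    }
  }'
}

echo "a=1 bb=22 note c=333" | all_values
```

This prints 1, 22 and 333 on separate lines, skipping the bare word note because it is not followed by "=".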
I struggled a bit with coming up with a bash function that wraps Peter Tillemans' answer but here's what I came up with:
function regex { perl -n -e "/$1/ && printf \"%s\n\", "'$1'; }
I found this worked better than opsb's awk-based bash function for the following regular expression argument, because I do not want the "ms" to be printed.
'([0-9]*)ms$'
$1
– Demis
Dec 19 '17 at 18:39
'([0-9]*)ms$' - is that supplied as an argument (and the string another argument)? And the output from perl -e is being inserted into bash's printf command then, to replace %s, is that right? Thanks, I am hoping to use this.
– Demis
Dec 20 '17 at 23:55
site design / logo © 2020 Stack Exchange Inc; user contributions licensed under cc by-sa 4.0 with attribution required. rev 2020.1.23.35883