regex

Hi all,

I’m new to regex and I’m trying a simple one to see how they work.

My simple regex is “(\w+@\w+\w+)”, which is meant to capture email addresses from a large block of text.

I’ve created a text file, put a big chunk of random text into it, and inserted an email address in 2 places.

When I match the file’s content against my regex, -match shows me the 2 lines that contain the email, but I get the whole line, not just the email I was looking for.

If I then look at $matches, it is empty.

Can someone please explain what I am doing wrong?

$file = get-content file.txt

$regex = "(\w+@\w+\w+)"

$file -match $regex

Thank you!

I think you want to go through the file one line at a time. -match works differently with an array: it returns the elements that match instead of True or False, and it doesn’t populate $matches.

foreach ($line in $file) {
  $line -match $regex   # outputs True or False for each line
  $matches              # holds the most recent successful match
}

True

Name                           Value
----                           -----
1                              js@powershell
0                              js@powershell
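
To make the difference concrete, here is a small sketch. The sample lines and addresses are made up for illustration, and the pattern is shortened to (\w+@\w+). With an array on the left, -match filters and hands back whole lines; with a single string on the left, it returns a Boolean and fills $matches, so the capture group can be read out.

```powershell
# Hypothetical sample data standing in for the lines of file.txt
$lines = @(
    'nothing to see here',
    'contact js@powershell for details',
    'another line with bob@example in it'
)
$regex = '(\w+@\w+)'

# Array on the left: -match acts as a filter and returns the matching lines
$filtered = $lines -match $regex

# Single string on the left: -match returns a Boolean and populates $matches
$found = foreach ($line in $lines) {
    if ($line -match $regex) {
        $matches[1]   # capture group 1 = the address only
    }
}
$found
```

Wrapping the test in an if also avoids printing a stale $matches for lines that don’t match.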

I would recommend using the [regex] type accelerator. It lets you call the .NET Regex class directly, and it’s faster and more efficient than looping through the data with native PS operators.

$text = Get-Content -Path C:\TEMP\PSdotorg_deleteMe.txt
$regex = "(\w+\@\w+\w+)"

[regex]::Matches($text,$regex)

Groups   : {0, 1}
Success  : True
Name     : 0
Captures : {0}
Index    : 0
Length   : 10
Value    : this@email

Groups   : {0, 1}
Success  : True
Name     : 0
Captures : {0}
Index    : 133
Length   : 13
Value    : another@email
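
If only the matched text is wanted rather than the full Match objects, the Value property can be read from each match. A minimal sketch, assuming the text is already in one string (for a file, Get-Content with the -Raw switch returns a single string); the sample text below is invented:

```powershell
# Invented sample text; with a real file, Get-Content -Raw would supply the string
$text = "this@email is first`nsome filler text`nanother@email is second"
$regex = '\w+@\w+'

# Matches() returns a MatchCollection; keep just the matched strings
$values = [regex]::Matches($text, $regex) | ForEach-Object { $_.Value }
$values
```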

Another recommendation is to use RegEx101.com to double check your syntax. I personally find it easier to test with than running through PS to get it right, but to each their own.

$regex = "(\w+@\w+\.\w+)"

That regex will return the entire email address, including the .xyz part.

Even that pattern won’t work on email addresses with more than one ‘.’ in the domain.

PS C:\users\js> 'js@www.powershell.org' -match "(\w+@\w+\.\w+)"; $matches
True

Name                           Value
----                           -----
1                              js@www.powershell
0                              js@www.powershell

You are correct, unconventional formatting in the domain hadn’t occurred to me! The expression below accounts for that, provided of course that the part before the @ contains only word characters (letters, digits, underscore) and the address is followed by whitespace.

$regex = '\w+@.+?(?=\s)'

[regex]::Matches($text,$regex)

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 0
Length   : 17
Value    : email@address.com

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 150
Length   : 27
Value    : another@email.different.com

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 222
Length   : 47
Value    : whatTheHeck@www.why.onearth.areyoudoingthis.com
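
One thing to watch with the lookahead version: (?=\s) requires whitespace after the address, so an address at the very end of the text is not matched. The sketch below compares it with one possible alternative that repeats the “.label” part of the domain instead. The sample text is invented, and the alternative is still far from a complete email matcher (it ignores hyphens and dots before the @, for example).

```powershell
# Invented sample: the address is the last thing in the text, followed by a period
$text = 'Write to whatTheHeck@www.why.onearth.com.'

# Lookahead version from above: needs whitespace after the address
$lookahead = [regex]::Matches($text, '\w+@.+?(?=\s)')

# Alternative: first domain label, then one or more ".label" parts
$labels = [regex]::Matches($text, '\w+@\w+(?:\.\w+)+')

$lookahead.Count    # number of matches found by the lookahead pattern
$labels[0].Value    # address found by the alternative pattern
```

Because the last label has to be made of word characters, the alternative also leaves the sentence’s trailing period out of the match.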

An email address might be the wrong choice for a regex beginner … even though most people think an email address is easy to recognize.

This site illustrates a little bit of what I mean: http://emailregex.com/.

OK, I will experiment with what you guys gave me, and I believe I’ll be good.

Thank you!

Here is a similar experiment, pulling URL strings out of text instead of email strings.

$UrlList = @'
this is the URL https://stackoverflow.com/&20%
http://stackoverflow.com
http://www.SomeSite.com this is oure main site
http://www.SomeSite.com
ftp://www.somesite.com
ftp://somesite.com
ftp\SomeSite.com
If you want the file go there: file://SomeSite.com
'@ 

[RegEx]::Matches($UrlList, '(ftp:|ftp|http:|https:|file:)(//.([^\s]+)|\\.([^\s]+))').value

https://stackoverflow.com/&20%
http://stackoverflow.com
http://www.SomeSite.com
http://www.SomeSite.com
ftp://www.somesite.com
ftp://somesite.com
ftp\SomeSite.com
file://SomeSite.com