Using Powershell to parse text

Hi all,

I would like to use Powershell to parse text.

As my first project, I have a text string which contains (among other things) dates formatted as dd/mm/yyyy

I want to build a script that will find the index location of each instance where these are found in the text, and just before that index location, insert a delimiter character.

I can find all of the dates using a regex

$Regex = [regex]"\d\d/\d\d/\d{4}"

This produced the output of groups from which I extracted the index location for each date substring. I then created an array variable with all of the index values.

I then converted those arrayed string values to numbers

[array]$c = foreach($number in $STRARRAY) {([int]::parse($number))}

Now I need one more loop to insert a delimiter character at the location of each index.

Can anyone suggest how that command would be formed?
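For reference, the index-based approach described above could be finished along these lines. This is only a sketch: $text, its sample content and the '#' delimiter are made up for illustration. Inserting from the highest index down keeps the remaining indexes valid, since each insert shifts everything after it by one character.

```powershell
# Hypothetical sample text; replace with your own string.
$text  = 'start 01/02/2020 middle 15/03/2021 end'
$regex = [regex]'\d\d/\d\d/\d{4}'

# Collect the start index of every date; these are already integers.
$indexes = @($regex.Matches($text) | ForEach-Object { $_.Index })

# Walk the indexes from last to first so earlier positions do not shift.
$result = $text
foreach ($i in ($indexes | Sort-Object -Descending)) {
    $result = $result.Insert($i, '#')
}

$result
```

Because the Index property of each match is already a number, the string-to-int conversion step is not needed in this version.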



Welcome to

Please read the first pinned post on top of the list of posts of this forum. Read Me Before Posting! You’ll be Glad You Did! When you post code you should format it as code.

As you are a PowerShell beginner, I’d recommend staying with the built-in cmdlets as long as possible. It makes the code easier to read, easier to debug, and easier to maintain or extend. And some .NET methods have slightly surprising side effects compared to standard PowerShell cmdlets.

I think you’re overcomplicating your task.

If I got you right, and assuming your input comes from a file called text.txt and your delimiter character is “#”, you can achieve your task like this:

Get-Content -Path 'D:\sample\text.txt' |
    ForEach-Object {
        $_ -replace '(?=\d{2}\/\d{2}\/\d{4})','#'
    }

You can pipe the result to whatever cmdlet or further steps as you like.

If you want to insert a delimiter, such as a semi-colon, you can do this with a pretty simple process:

$string = "this is a test 01/01/2020 testing 01/02/2020 "

# Day first, then month, to match dd/mm/yyyy
$pattern = '(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\d\d'

#Place a semicolon in front of match 0
$string -replace $pattern, ';$0'

Running $string -match $pattern fills the automatic $Matches variable with the first match and its capture groups:

PS C:\Users\rasim> $Matches

Name                           Value
----                           -----
3                              20
2                              01
1                              01
0                              01/01/2020

$0 refers to match 0, the whole matched date, so we get the following output:

this is a test ;01/01/2020 testing ;01/02/2020
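One thing that may be worth spelling out: $Matches is filled in by the -match operator, not by -replace, and it only holds the first match. The sketch below shows the difference, using a simplified pattern ($simple is just an illustrative name) and [regex]::Matches to list every date.

```powershell
$string = 'this is a test 01/01/2020 testing 01/02/2020 '
# Simplified illustrative pattern with three capture groups.
$simple = '(\d\d)/(\d\d)/(\d{4})'

# -match returns $true/$false and populates $Matches with the FIRST match only.
$found = $string -match $simple
$first = $Matches[0]    # the whole first match
$year  = $Matches[3]    # third capture group of the first match

# [regex]::Matches returns every match in the string.
$allDates = @([regex]::Matches($string, $simple) | ForEach-Object { $_.Value })

# -replace works on every match and does not need $Matches at all.
$delimited = $string -replace $simple, ';$0'
```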

This discusses it more:

These are wonderful!

I thought the -replace operator would simply remove the pattern string and replace it with a replacement string.

But Rob’s example inserted the delimiter character nicely without removing the date values.

Thank you both very much!

Hey, I finally get to use one of those little known -replace codes for the second argument:

'hi there 01/01/1999 hi there' -replace '\d\d/\d\d/\d{4}','#$&'

hi there #01/01/1999 hi there

$& - substitutes a copy of the whole match
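A few of these substitution codes side by side, as a rough illustration (the variable names are just for the example). Note the single quotes around the replacement: in double quotes PowerShell would try to expand things like $0 as its own variables before the regex engine ever sees them.

```powershell
$text = 'hi there 01/01/1999 hi there'

# $& and $0 both stand for the whole match.
$withAmp  = $text -replace '\d\d/\d\d/\d{4}', '#$&'
$withZero = $text -replace '\d\d/\d\d/\d{4}', '#$0'

# $1, $2, $3 stand for the numbered capture groups, so they can be reordered.
$reordered = $text -replace '(\d\d)/(\d\d)/(\d{4})', '$3-$2-$1'

$withAmp
$withZero
$reordered
```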

Info is buried on this page that has no connection to the “about_comparison_operators” help:

“But Rob’s example inserted the delimiter character nicely without removing the date values.”

… and my example snippet didn’t???

Sorry Olaf, yes - your example did also.

I’m trying to figure out which option in the replace command makes it work this way.

Most of the time, a replace command removes the selected string and replaces it entirely.

I was looking for the Get-Help on this and didn’t get very far in discovering it.

Got any suggestions where I would look for any/all examples of its use?

Thank you again!

Great. It worked here in my tests, so I was worried. Thanks. ;)

The trick in my code snippet is in the regex pattern. It’s called a look-ahead, and it checks that the current position is “followed by something particular” without consuming any characters. So it actually matches the empty position just before the pattern you provide (or just after it when you use a look-behind), and -replace inserts the delimiter there … which is what you wanted.
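To make that a bit more tangible, here is a small sketch with a made-up sample string. The look-ahead (?=...) puts the delimiter in front of each date, and the look-behind (?<=...) puts it right after each date. In both cases the date itself is never part of the match, so nothing gets removed.

```powershell
# Hypothetical sample text for illustration.
$text = 'paid 01/02/2020 and 15/03/2021'

# Look-ahead: matches the empty position that is FOLLOWED by a date.
$before = $text -replace '(?=\d{2}/\d{2}/\d{4})', '#'

# Look-behind: matches the empty position that is PRECEDED by a date.
$after = $text -replace '(?<=\d{2}/\d{2}/\d{4})', '#'

$before
$after
```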


Thank you for explaining that - and Look-aheads have been something I’ve had difficulty getting my brain around. I can see I’m going to have to dig into that and get a solid understanding of it.

Thanks again - I appreciate it!