I have a regex that I use in VSC to remove subtitle duplication; it can be seen working here. With the help of Bing AI (as I have not formally learnt PowerShell), I came up with this function to do what I’ve done manually before:
I kept iterating on this code via Bing AI but nothing worked, neither (?msi) nor (?s). I don’t know what I am doing wrong. Is VSC’s implementation of regex different from PowerShell’s?
File names with spaces are still split at the spaces, so only the first word is passed to whisp. I don’t know about character escaping in pwsh, so I used Bing AI to help.
I mentioned it in my first answer. On Windows line breaks often consist of two characters … \r\n. I added that extension to your regex.
Using an AI to produce code requires the user to know enough of the target programming or scripting language. Since you cannot trust the output of an AI, you have to validate it yourself.
I know about \r and \n. That is not why I asked. Besides adding \r and \n, you made the code larger: you have 3 groups instead of just the 2 in mine. That is what I am asking about.
I have made lots of pwsh functions using just Bing AI (and sometimes the help of others), so that approach does work. This time, though, because it was regex-related, I suspected it was probably something to do with the regex implementation.
And because you actually cannot validate it correctly you end up with clunky, cumbersome and inefficient code.
If you’re looking for a single new line with \n, it’s fine to extend it to \r?\n so that a \r (carriage return), if present, is matched as well. But if you’re looking for one or more consecutive new lines with \n+, you need to group \r? and \n together, because on Windows a \r sits between the consecutive \n characters and \n+ alone cannot span them. So you use a non-capturing group like this: (?:\r?\n)+
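To make the difference concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not taken from the function discussed in this thread.

```powershell
# Hypothetical sample: two text lines separated by three Windows (CRLF) line breaks.
$text = "line one`r`n`r`n`r`nline two"

# \n+ alone: a \r sits between the \n characters, so each \n is its own match.
$naive = [regex]::Matches($text, '\n+').Count

# (?:\r?\n)+ repeats the whole "optional CR plus LF" unit, so the run is one match.
$grouped = [regex]::Matches($text, '(?:\r?\n)+').Count

# Collapse every run of line breaks into a single CRLF.
$collapsed = $text -replace '(?:\r?\n)+', "`r`n"

"naive matches:   $naive"
"grouped matches: $grouped"
```

With this sample, the plain \n+ pattern finds three separate matches, while the grouped pattern finds a single match covering the whole run. The extra group is non-capturing, so it does not shift the numbering of any capture groups you reference in the replacement.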