Help removing random characters from multiple strings

Hi,
I get a file sent to me by our staff in Accounting with information for users who are no longer employed. My eventual goal is to be able to import the file into a script and have it confirm that the users are disabled. However, I’m stuck on cleaning up the file that gets exported from the program that is used to keep track of employees, and am wondering if there is a way in Powershell to clean this up a bit. An example line in the text that is exported is below (confidential information is obviously cleaned up):

 1 000000       0000000000000000000000000NAME               0   SURNAME                   BOX 111                            SOMEWHERE, OUTTHERE                  
        A1A 1A1                           000000000000000000000000JOBTITLE - 00.00                        U        0000000000.00000000.00000000.00000000.00
000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.0000000.0000
0000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.000000
00.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.000000000000     
      000                         0000000000     000000.00                                                                                                 
                                                                                                                                                           
       000         

Everything that is a 0 in that line is a number or letter other than 0, otherwise this would be incredibly simple. Each line also contains the exact same amount of characters, with the capital variables being the main variance in length.
I’ve tried using Trim, Trimstart, and Trimend, but it seems that they need specific characters, not a numbered amount. I’ve tried Replace, but it also seems to need specific characters (unless I’m doing it wrong).

Please help!

Thanks,
Matt

EDIT: Out of that string, I need NAME, SURNAME, BOX (address), SOMEWHERE,OUTTHERE (city and province), A1A 1A1 (postal code), and JOBTITLE.

Without seeing the original data, it’s kinda hard to interpret what this data is and what is needed. Any way you can give an example of actual data (fake actual data of course)

Matt, I haven’t tried with your data , but I’ve had success in the past with the Substring Method. Here’s a link to an article ( the relevant portion is near the end) https://technet.microsoft.com/en-us/library/ee692804.aspx

I’m not sure I understand the question. Do you know exactly how many characters you need to remove from the beginning and end of the string, but don’t know what those characters are? If that’s the case, the Substring method of the string class would probably help.

Beyond that, I’d probably need to know more about the data and what you want to get out of it. In your example line, what do you want it to look like after you’ve manipulated it?

I have attached the file and sanitized it. Please ignore the goofy names - it’s Friday :slight_smile:
Also, it comes out as a .blk, not .txt - however I have changed its extension so it can be uploaded.

Matt

Looks like the first, middle, and last names always start on the same index location. If that is the case, you could just strip out what you don’t need.

Looks to me like this is just a fixed-width data file. (Not sure what the ~0100 on the first line represents; you’d probably want to skip that.) Extracting data from a fixed-width file is fairly straightforward. You can use Substring() as I mentioned earlier (with an example below), or you can use ADO with a Schema.ini file (details here: https://msdn.microsoft.com/en-us/library/ms974559.aspx , though that’s an old article with VBScript code that you’d need to convert to PowerShell.)

Example with Substring:

# first data line from your sample file
$line = '1 000000       000000000RP00000000000000NAME                    SURNAME                     STREET ADDRESS                     CITY, PROVINCE                      A1A 1A1                           000000000000000000000000JOB TITLE - 00.00                       U        0000000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00ACCOUNTANT           LN                          5555555555     000000.00                                                                                                                                                                                                                                                                   EES         '

$name = $line.Substring(40,20).Trim()
$initial = $line.Substring(60,4).Trim()
$surname = $line.Substring(64, 28).Trim()
# etc

Extracting other bits of data would amount to the same thing (Substring(index, length).Trim()); you just need to figure out the starting index and length of each column.

Awesome. Thanks for the help!

Matt