I need to extract certain sections of multiple Word docs (from within tables) and paste them into (differently formatted) tables in a single created output document. I was initially trying to do this with Power Automate, but it seems like a PowerShell script might be a better solution. I’ve tried to write a suitable script with the help of ChatGPT, which suggests two options based on Open XML SDK and Open-Xml-PowerTools. I’m running into issues installing both, and both seem to be quite old packages. Is there a better way?
What’s wrong with Power Automate? I have no experience with it, but I’d expect it to fit this task quite well.
That only works if you’re already proficient in PowerShell, since you need to check and validate what ChatGPT suggests.
Issues?
If they are still supported it does not matter how old they are.
Do not use Word docs as the source for the data you want to collect, and do not use them as the target for the data you want to output. Word documents are not meant to be processed automatically. Especially when I read “(from within tables)”, I’d ask: why not use CSV files or Excel sheets instead?
What’s wrong with Power Automate? I have no experience with it, but I’d expect it to fit this task quite well.
It’s probably possible. This page shows the basics of how to extract text from a Word doc, but it’s quite cumbersome, and a flow that identifies all the right sections and pastes them into a new doc (which has specific formatting requirements) would be very long. It’s also tied to a specific user, which is a nuisance.
That only works if you’re already proficient in PowerShell, since you need to check and validate what ChatGPT suggests.
Indeed.
Issues?
Organisational permissions. They’re probably surmountable, but I’m trying to determine whether this is the best method first. On further exploration, it appears possible to write this without either package.
Do not use Word docs as the source for the data you want to collect, and do not use them as the target for the data you want to output. Word documents are not meant to be processed automatically. Especially when I read “(from within tables)”, I’d ask: why not use CSV files or Excel sheets instead?
Indeed. We use Word docs because (a) that’s how it’s always been done and parts of the organisation are resistant to change and (b) there are multiple paragraphs of text in there, and writing that in Excel isn’t a great experience. I might see if I can get away with introducing Excel anyway…
Since Office documents are actually zipped XML files, it should even be possible without Word installed. But that does not mean it will be easier. I’d actually expect it to be harder than using a specialized module or tool, or the Word object model.
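To make the “zipped XML” point concrete, here is a minimal sketch of reading table cells straight out of a .docx with nothing but a ZIP reader and an XML parser. It is written in Python only because the standard library keeps it short; the same idea should carry over to PowerShell through .NET’s `System.IO.Compression.ZipFile` and an XML parser, though I have not tested that here. The function names `read_tables` and `make_sample_docx` are made up for illustration, and the sample file is a stripped-down stand-in that Word itself would not open. It ignores nested tables, content controls and all formatting, which is exactly where the “harder than a specialized module” part comes in.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_tables(docx):
    """Return every table as a list of rows, each row a list of cell strings.

    `docx` is a path or a binary file-like object holding a .docx file.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            cells = []
            for tc in tr.findall(W + "tc"):
                # A cell holds one or more paragraphs; join them with newlines.
                paragraphs = [
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.findall(W + "p")
                ]
                cells.append("\n".join(paragraphs))
            rows.append(cells)
        tables.append(rows)
    return tables


def make_sample_docx():
    """Build a hypothetical, minimal .docx in memory (not a complete Word file)."""
    ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    body = (
        f'<w:document xmlns:w="{ns}"><w:body><w:tbl>'
        "<w:tr><w:tc><w:p><w:r><w:t>Title</w:t></w:r></w:p></w:tc>"
        "<w:tc><w:p><w:r><w:t>First para</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second para</w:t></w:r></w:p></w:tc></w:tr>"
        "</w:tbl></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for table in read_tables(make_sample_docx()):
        for row in table:
            print(row)
```

Reading is the easy half. Writing the extracted text into a differently formatted table in a new document means generating valid WordprocessingML yourself, which this sketch does not attempt.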
If that were a valid argument, we’d all still be living in caves.
You might need to sell this automation better.
Since I don’t actually know what this is about, I’d recommend trying to change the process in general so that the task can be automated.
You might start where these Word docs are created … is that already an automated process?
You can search for examples online, but don’t use ChatGPT. There are examples out there on Stack Overflow, Reddit or maybe even here.
Edit:
BTW: When you crosspost the same question to different forums at the same time, you should at least post links to the other forums along with your question, so that people willing to help you don’t end up wasting their time.