Help with parsing JSON data efficiently


I have written a function that writes information to a logfile in JSON format. Each log entry has lots of information that could help the person who reads the logfile. For instance,

{
    "id": 1,
    "date": "the date",
    "Message": "Creating user xyz",
    "CmdletUsed": "Get-ADUser -Identity xyz",
    "Module": "ActiveDirectory",
    "CmdletToReverseChange": "Remove-ADUser -Identity xyz"
}

So I call "Write-LogFileJson" whenever I want to log information to the logfile. The reason I am using JSON is that it is easier for someone else to read the information.

The problem I have is that it is incredibly inefficient to read the entire JSON file just to append one new log entry (obviously).

Is there a more efficient way to do this? That is, a way to append to the JSON file without reading it first every time the function is called.

I don’t want the script to slow down just because the logfile happened to be really large. I mean I could use runspaces/jobs for that but then I will most likely encounter locked file issues.

I just need some ideas on how to do this a bit more efficiently. Maybe I just need to rethink the whole concept.

As far as I know, no.
If you google it, the same question pops up no matter the language (Java, Python, C#, etc.), and the answer is always the same:
load the file back into objects, add whatever object you create, and convert back to JSON.
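In PowerShell, that round trip looks something like this (just a sketch; the path and entry properties are placeholders based on the example in the original post):

```powershell
# Read the whole file, convert, append one entry, convert back, rewrite.
# Cost grows with file size on every single log call.
$logPath = 'C:\Logs\activity.json'   # hypothetical path

$entries = @(Get-Content -Path $logPath -Raw | ConvertFrom-Json)
$entries += [pscustomobject]@{
    id      = $entries.Count + 1
    date    = (Get-Date).ToString('s')
    Message = 'Creating user xyz'
}
$entries | ConvertTo-Json | Set-Content -Path $logPath
```

This is exactly the overhead the original post wants to avoid.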

I guess you could try, but you would constantly need to add an extra ',' to the last/previous entry, and you would need to check for the closing ']' and '}' and so forth.
So it might work, but most likely you'll break the formatting sooner or later.

I would log to .CSV or a custom Windows event log, rather than JSON.

CSV will be tricky with that array property. You may need to reconsider what needs to be stored and how to represent it in the data file if you want to have a readily-appendable file.

You could also consider using a database rather than a file to centralize the logs as well. Then you are just inserting a record.
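As a sketch of the database route, using the SqlServer module's Invoke-Sqlcmd (server, database, and table names here are all made up for illustration):

```powershell
# Insert a single log record - no file read/rewrite involved.
# Assumes the SqlServer module is installed and a dbo.ScriptLog table exists.
$query = @"
INSERT INTO dbo.ScriptLog (LogDate, Message, Module)
VALUES (GETDATE(), 'Creating user xyz', 'ActiveDirectory')
"@
Invoke-Sqlcmd -ServerInstance 'SQL01' -Database 'Logging' -Query $query
```

Each log call is then a constant-cost insert, regardless of how many entries already exist.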

Yeah, CSV will be tricky. The data in my post is just an example; there will be more properties. The end goal is to centralize the logs for ease of management as well, so the event log approach could become cumbersome (unless we forward the logs to some central log system).

I have considered the database approach, but I was going for simplicity (where I am working currently, the knowledge of these things is quite low). I could also write some type of wrapper that displays these logs in a readable way, I suppose.

Thanks for all the suggestions! I have some thinking to do :slight_smile:

Yes, if doing it via CSV the log format would need to be reconsidered.
On the other hand, it's appendable and easy to import into either PowerShell or e.g. Excel.
You could use some other delimiter for the array, e.g. ';', so that it stays in one column.
Maybe not the prettiest solution for the array, but it might work.
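Something along these lines, say (a sketch; the path and property names are invented for the example):

```powershell
# Join any array property with ';' so it stays in one CSV column,
# then append the row without touching the rest of the file.
$entry = [pscustomobject]@{
    Date     = (Get-Date).ToString('s')
    Message  = 'Creating user xyz'
    Commands = ('Get-ADUser -Identity xyz', 'Remove-ADUser -Identity xyz') -join ';'
}
$entry | Export-Csv -Path 'C:\Logs\activity.csv' -Append -NoTypeInformation

# Reading it back, split the column again:
# (Import-Csv 'C:\Logs\activity.csv')[0].Commands -split ';'
```

Export-Csv -Append only writes the new row, so log size no longer affects write time.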

As for this…

I have written a function that writes information to a logfile in JSON format.

… any reason why you chose a text file vs creating a custom event log with the same info and granting access for whoever needs to read it from there? That way it's still just your text, in a central location.

Just curious.

Articles and Examples:

Powershell Script To Create Custom Event Logs and Register App Sources

This script creates a custom event log and registers a source in this log. If the log already exists, the script just registers the source to it. This allows the custom code events to be logged to this custom log. To run the script, just change the path of the config file in the
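The basic shape of that approach is just two cmdlets (log and source names below are made up):

```powershell
# Create the custom log and register a source once (requires elevation)
New-EventLog -LogName 'ScriptLogs' -Source 'UserProvisioning'

# Then each log call is a single append-style write
Write-EventLog -LogName 'ScriptLogs' -Source 'UserProvisioning' `
    -EventId 1000 -EntryType Information -Message 'Creating user xyz'
```

Writes are constant-cost, and the OS handles rotation and size limits for you.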

Hi postanote,

Thanks for your reply.

The reason I am using a text file is that it is easier to consume and manage right now. An event log usually requires someone to have some sort of permissions on the machine where the events are written, which is not something we would want in all cases.

When it comes to centralizing EventLogs you usually have to have some software that can parse the logs and display them for the user. It is the end goal but we are not there yet, so the centralizing of logfiles is much easier.

Come to think of it, what you could do is use YAML instead of JSON.
If you want to convert it to objects to work with in PowerShell, there is a module in the PSGallery to do it.
There are three modules, and I tested one of them (powershell-yaml).
That module isn't great at ConvertTo-Yaml followed by writing the output to a file,
since it creates output that doesn't round-trip through ConvertFrom-Yaml.
It has a switch to output a 'JsonCompatible' format, but that's basically one long string with '\r\n' mixed into it.

Perhaps one of the other modules is better at this, but as long as you don't need to import, manipulate and then export the data, it should be fine.

But using the standard YAML syntax works, so if you just append to the file in that format it should be pretty good.
E.g. this example from the Ansible documentation:

$yaml = @"
- martin:
    name: Martin D'vloper
    job: Developer
    skills:
      - python
      - perl
      - pascal
- tabitha:
    name: Tabitha Bitumen
    job: Developer
    skills:
      - lisp
      - fortran
      - erlang
"@

$obj = $yaml | ConvertFrom-Yaml # powershell-yaml module needs to be installed

Only gotcha: if you use that format in the file, you need to use the -Raw switch with Get-Content when you import it.
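The appending itself then needs no parsing at all; something like this (path and entry values invented for illustration):

```powershell
# Appending a new YAML entry is just Add-Content - the existing file is never read
$newEntry = @"
- john:
    name: John Smith
    job: Sysadmin
"@
Add-Content -Path 'C:\Logs\activity.yaml' -Value $newEntry

# Later, read the whole file as one string and convert
# (powershell-yaml module assumed to be installed)
$obj = Get-Content -Path 'C:\Logs\activity.yaml' -Raw | ConvertFrom-Yaml
```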

Have no idea where my post went but I can’t see it anymore, maybe it will turn up :slight_smile:

Anyway, the main idea is that you could use YAML instead of JSON.
It should be easier to append information in that format.
There is no native parser in PowerShell for it, but there are modules in the PSGallery.

I guess there are quite a few things to look at here before even proceeding with the code:

  1. Management of the information
    At some point the data in the logfile/data file is going to be stale unless there is some form of removal/archival of the data. How do you plan to keep the information relevant and not just balloon in size?

  2. Access to the information
    Are all log file(s) going to be stored on a central share, or locally on the system itself? Centralization wherever possible is best

  3. Presentation of the information
    Are you talking about the user just opening the logfile and doing a search? Reason I’m asking is that if you detach it to a separate presentation layer, like via webservices, it’s not going to lock you down so much if you have to change the code at a later date. Additionally, it also makes it feasible to consider having the script use HTTP requests for reading, searching, and writing of information. It’s really quite simple to do and is much more efficient and scalable.
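To illustrate point 3, writing a log entry over HTTP could look something like this (the URI and endpoint are hypothetical; any web service that accepts JSON would do):

```powershell
# Post a log entry to a hypothetical central logging web service.
# The service, not the script, owns storage, rotation, and search.
$body = @{
    Date    = (Get-Date).ToString('s')
    Message = 'Creating user xyz'
    Module  = 'ActiveDirectory'
} | ConvertTo-Json

Invoke-RestMethod -Uri 'https://logserver.example.com/api/logs' `
    -Method Post -ContentType 'application/json' -Body $body
```

The script stays a thin client, so the storage backend can change later without touching the logging calls.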

If you just want to do appending of data to a json file, I guess there are several options:

  1. Read it as a text file without converting from JSON (i.e. Get-Content), drop the closing bracket so you can append the new info, then re-add the closing bracket. It's not ideal, but it avoids any conversion: you're still reading in a text file, just without the overhead of converting to objects. A bit of regex should make that relatively simple.
  2. Go one step further and use filestreams to read and write the data to a new file. Add the final new data you have, then the closing brackets. Once complete, remove/archive the old file and rename the new file to the standard name you are using.
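Option 1 could be sketched like this (a rough sketch assuming the file holds a non-empty JSON array; an empty array would need the leading comma handled separately):

```powershell
# Strip the closing ']' and splice the new entry in. Still rewrites the
# file, but skips the ConvertFrom-Json/ConvertTo-Json object round trip.
$logPath  = 'C:\Logs\activity.json'   # hypothetical path
$newEntry = [pscustomobject]@{ id = 2; Message = 'Removing user xyz' } |
    ConvertTo-Json

$text = (Get-Content -Path $logPath -Raw) -replace '\]\s*$', ''
Set-Content -Path $logPath -Value ($text.TrimEnd() + ",`n" + $newEntry + "`n]")
```

As noted above, this kind of text surgery is fragile, so it's worth weighing against the CSV, YAML, or database options.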