ArrayList - C# vs PowerShell - Is it the same thing or not?

Hello,

My question comes up because I have heard different descriptions of the ArrayList in C# and in PowerShell, and I would like to understand the concept a little better.

In PowerShell an array can be declared as follows:

$Array = @()

And you can add elements to this array by using the += operator

$Array+=1
$Array+=2
$Array+=3

I used to use this method a lot, but then I heard that it is costly: under the hood, PowerShell copies all the existing elements plus the new one into a new array and then discards the original array, which adds overhead. It was suggested that I use an ArrayList with the .Add() method instead, because this simply adds the new element without creating a new array, copying the elements, and discarding the old array.

$ArrayList = New-Object System.Collections.ArrayList
$ArrayList.Add(1)
$ArrayList.Add(2)
$ArrayList.Add(3)
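
One way to see the difference described above is to keep a second reference to the collection before adding to it. This is only a minimal sketch; the variable names are made up for illustration. If += really builds a new array, the older reference should still show the old contents, while an ArrayList changed through .Add() should show the new element through both references.

```powershell
# Plain array: += builds a new array, so an older reference keeps the old contents.
$array  = @(1, 2)
$before = $array            # second reference to the same array object
$array += 3                 # $array now points at a new, longer array

$arrayResult = [PSCustomObject]@{
    OldReferenceCount = $before.Count       # 2
    NewArrayCount     = $array.Count        # 3
    IsFixedSize       = $array.IsFixedSize  # True: arrays cannot grow in place
}

# ArrayList: Add() modifies the existing object, so both references see the change.
$list       = New-Object System.Collections.ArrayList
$beforeList = $list         # second reference to the same ArrayList
[void]$list.Add(1)

$listResult = [PSCustomObject]@{
    OldReferenceCount = $beforeList.Count   # 1
    ListCount         = $list.Count         # 1
}

$arrayResult
$listResult
```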

Up until here we’re cool.

My confusion comes when I hear that in C# the ArrayList does exactly the thing that adds the overhead.
I heard that in C# the ArrayList will create a new array, copy all the elements plus your new one, and then delete the old array.

using System;
using System.Collections;

class Program
{
    static void Main()
    {
        ArrayList list = new ArrayList();
        list.Add(1);
        list.Add(2);
        list.Add(3);
    }
}

And for C# I hear that instead we should use Lists, and the lists will add elements without this overhead.

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        List<int> list = new List<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
    }
}

So finally my question is: did I get bad info somewhere? It seems like these things should be the same, but I got two different definitions for C# and PowerShell. Does anyone know what actually happens in both scenarios?

The documentation for the ArrayList.Add method states:

If Count is less than Capacity, this method is an O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
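
The quoted behaviour can be observed from PowerShell through the ArrayList's Capacity property. The following is a rough sketch (the counter names are hypothetical): it adds 1000 items and counts how often Capacity changes. Each change corresponds to one of the O(n) additions from the quote, and every other Add is the O(1) case. The exact growth policy is an implementation detail of the runtime, so the numbers may differ between versions.

```powershell
$list         = New-Object System.Collections.ArrayList
$lastCapacity = $list.Capacity
$resizes      = 0

foreach ($i in 1..1000) {
    [void]$list.Add($i)
    if ($list.Capacity -ne $lastCapacity) {
        # Capacity changed: the internal array was reallocated and copied.
        $resizes++
        $lastCapacity = $list.Capacity
    }
}

[PSCustomObject]@{
    Count         = $list.Count
    Resizes       = $resizes
    FinalCapacity = $list.Capacity
}
```

If the resize count comes out far smaller than the number of additions, that is why repeated .Add() calls stay cheap on average even though an individual call occasionally copies the whole internal array.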

Well… thanks, I guess? It's not really the answer I was looking for, but I guess you are implying that they are the same thing in C# and in PowerShell. It's kind of hard to tell, you know, without any words of your own.

Also, the link you are referencing uses a C# example and doesn't mention PowerShell at all.

At least it does confirm that in C# the .Add method has the overhead, so that helps a little: it confirms that what I heard about C# is true. Now I need to figure out whether what I heard about PowerShell is also true or misinformation. Does anyone here know?

Edit:
I found that other people online recommend using Lists in PowerShell instead of arrays for efficiency reasons, so I'm guessing this holds true in C# and PowerShell alike, and what I heard about ArrayLists in PowerShell is probably just misinformation. I appreciate the response though, thank you.

You could test this yourself in PowerShell using:

Measure-Command
Just write two pieces of code, one for each method of adding items to an array. In each piece of code, repeat the operation some 10 to 20 times (so the difference becomes more obvious) and measure the time each method takes to complete.
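
A minimal sketch of that kind of timing comparison, using Measure-Command (the cmdlet that times a script block). The item count of 5000 and the variable names are arbitrary choices for illustration, and the timings will vary with the machine and the PowerShell version.

```powershell
$count = 5000

# Time appending to a plain array with +=
$arrayTime = Measure-Command {
    $array = @()
    foreach ($i in 1..$count) { $array += $i }
}

# Time appending to an ArrayList with .Add()
$listTime = Measure-Command {
    $list = New-Object System.Collections.ArrayList
    foreach ($i in 1..$count) { [void]$list.Add($i) }
}

[PSCustomObject]@{
    Items              = $count
    ArrayMilliseconds  = [Math]::Round($arrayTime.TotalMilliseconds, 2)
    ListMilliseconds   = [Math]::Round($listTime.TotalMilliseconds, 2)
}
```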

Try these two functions … the difference in performance is quite significant.

Test-Array:

function Test-Array {
    Param(
        $Iterations  = 100000
        ,
        $OutputEvery = 1000
    )

    $StopWatch = [System.Diagnostics.Stopwatch]::StartNew()

    $Array = @()
    for ($i = 1 ; $i -le $Iterations ; $i++)
    {
        if (-not ($i % $OutputEvery)) { "{0}`t{1}" -f 'Records processed:', $i }
    
        $Array += $i

    }

    $StopWatch.Stop()

    [PSCustomObject]@{
        ArrayCount      = $Array.Count
        DurationSeconds = [Math]::Round($StopWatch.Elapsed.TotalSeconds, 2)
        MemoryUsageMB   = [Math]::Round((Get-Process -Id $pid).WorkingSet / 1MB, 2)
    }
}

Test-ArrayList:

function Test-ArrayList {
    Param(
        $Iterations  = 100000
        ,
        $OutputEvery = 1000
    )

    $StopWatch = [System.Diagnostics.Stopwatch]::StartNew()

    $Array = New-Object System.Collections.ArrayList
    for ($i = 1 ; $i -le $Iterations ; $i++)
    {
        if (-not ($i % $OutputEvery)) { "{0}`t{1}" -f 'Records processed:', $i }
    
        [void]($Array.Add($i))

    }

    $StopWatch.Stop()

    [PSCustomObject]@{
        ArrayCount      = $Array.Count
        DurationSeconds = [Math]::Round($StopWatch.Elapsed.TotalSeconds, 2)
        MemoryUsageMB   = [Math]::Round((Get-Process -Id $pid).WorkingSet / 1MB, 2)
    }
}

My own test results, running each function in a fresh PowerShell session with no profile loaded.

Test-Array:

ArrayCount DurationSeconds MemoryUsageMB
---------- --------------- -------------
    100000          409,37         65,61

Test-ArrayList:

ArrayCount DurationSeconds MemoryUsageMB
---------- --------------- -------------
    100000            0,35         76,84

Just one thing: the truth about the C# ArrayList is somewhere in between :)

If you want fast Add for a large number of objects, use Capacity preallocation or System.Collections.ObjectModel.Collection.

For example, comparing @Christian's original Test-ArrayList with a modified one that preallocates Capacity ( $Array = New-Object System.Collections.ArrayList $Iterations ), the result is:

ArrayCount DurationSeconds MemoryUsageMB UseCapacity
---------- --------------- ------------- -----------
    100000            0,13        318,88 No
    100000            0,04        323,46 Yes

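By the way, for Array:

ArrayCount DurationSeconds MemoryUsageMB
---------- --------------- -------------
    100000          418,89        280,92

and for Collection:

ArrayCount DurationSeconds MemoryUsageMB
---------- --------------- -------------
    100000            0,06        366,05

(Collection creation trick used: $array = {}.Invoke(), which I use in my scripts instead of arrays.)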
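
To make the two techniques concrete, here is a small sketch of what each one produces. The variable names are hypothetical. As far as I know, {}.Invoke() returns a Collection of PSObject, which is what the type check below assumes.

```powershell
$iterations = 100000

# Capacity preallocation: the internal array is sized once, up front.
$preallocated = New-Object System.Collections.ArrayList $iterations
$default      = New-Object System.Collections.ArrayList

# The {}.Invoke() trick: invoking an empty script block yields an empty,
# growable collection.
$collection = {}.Invoke()
$collection.Add(1)
$collection.Add(2)

[PSCustomObject]@{
    PreallocatedCapacity = $preallocated.Capacity   # 100000
    DefaultCapacity      = $default.Capacity
    IsPSObjectCollection = $collection -is [System.Collections.ObjectModel.Collection[psobject]]
    CollectionCount      = $collection.Count        # 2
}
```

With the capacity set up front, no reallocation is needed until Count exceeds that capacity, which is one plausible reason the preallocated run in the table above is the fastest.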

Cool, thanks to Christian for providing the test results for Array vs ArrayList; there is a noticeable improvement.

I also did my own test with the generic List, and it gave me the same results as using ArrayList.

Doing the capacity preallocation did not seem to make a difference in performance, so I think this means that the ArrayList is probably using an efficient method of adding new elements to the array.

$Array = New-Object System.Collections.Generic.List[System.Object]
                      ArrayCount                 DurationSeconds                   MemoryUsageMB
                      ----------                 ---------------                   -------------
                          100000                            0.11                          190.84
$Array = New-Object System.Collections.ArrayList
                      ArrayCount                 DurationSeconds                   MemoryUsageMB
                      ----------                 ---------------                   -------------
                          100000                            0.12                          186.79