I have a script that pings a list of servers every 15 minutes. If any are down, I get an email. If the same server is still down 15 minutes later, the script skips the notification until at least 55 minutes have passed. The problem is that if a server is down for days, I get a notification every hour until it’s back up. I’d like to modify this so that if the server is still down after an hour I get notified, but if it’s still down an hour after that, I don’t get notified again for four hours. If it’s still down after that, I get another notification every four hours. Can someone help? Here’s my code:
$NewDate = Get-Date
$LogFile = "c:\scripts\logs\MonitorServers.csv"
$state = $false
$servers = "Server1", "Server2", "Server3"
## Test if a server ping failed within the last hour. If so, exit the script
if (Test-Path -Path $LogFile -NewerThan $NewDate.AddMinutes(-55)) {
    exit
}
I imagine this isn’t the full code, since I don’t see anything about pings or email notifications. You want to track some notification state for each server.
# pseudocode
if $DOWN
    if not $immediate_notification
        # send immediate down email
        # set immediate down state to true
    else if not $15_min_notification
        # send 15 min notification
        # set 15 min notification to true
    # and so forth
if $UP
    # reset other states
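The flag idea above can be sketched in PowerShell roughly like this. The function name Update-NotificationState and the flag names Immediate and FifteenMin are made up for illustration, and in a real script the flags would have to be saved between runs (in a file, for example), since each scheduled run starts fresh.

```powershell
# Hypothetical helper: decides which notification (if any) is due for one
# server and updates that server's flags in place.
function Update-NotificationState {
    param(
        [hashtable]$State,   # e.g. @{ Immediate = $false; FifteenMin = $false }
        [bool]$IsUp
    )
    if ($IsUp) {
        # Server is back: reset every flag so the next outage starts fresh.
        # @(...) copies the keys first so the hashtable can be changed in the loop.
        foreach ($key in @($State.Keys)) { $State[$key] = $false }
        return $null
    }
    if (-not $State.Immediate) {
        $State.Immediate = $true
        return 'Immediate'
    }
    if (-not $State.FifteenMin) {
        $State.FifteenMin = $true
        return 'FifteenMin'
    }
    # Every notification for this outage has already been sent.
    return $null
}

# Usage: one flag table per server, three failed pings in a row.
$flags = @{ Immediate = $false; FifteenMin = $false }
$first  = Update-NotificationState -State $flags -IsUp $false   # 'Immediate'
$second = Update-NotificationState -State $flags -IsUp $false   # 'FifteenMin'
$third  = Update-NotificationState -State $flags -IsUp $false   # $null, nothing left to send
```

The caller would send the matching email whenever the function returns a name and do nothing when it returns $null.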
From mobile, sorry:
I think an add-on that might help is a companion file to the script, maybe a .csv.
Even if it was just two columns, “Server” and “Online”.
The script would import the CSV to get its list of servers to target, but it would also have a reference to the “Online” property. I think you’ll need that in the logic, along with time, to determine whether to email or not. Otherwise, if Server3 goes down at 11am and sends an email, how would you build the logic to email again after an hour, but then not again for 4? It’s gotta be based on time and state, right?
Something like
# pseudocode
if CurrentState -eq False
    if PreviousState -eq False -and Timestamp -ge FirstThreshold
        Send email
I’ll try to get to a computer and think about this a bit more.
At the end of script execution the current objects would be written over the .csv file again for use in the next iteration.
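A minimal sketch of that read, update, write-back cycle, assuming the two-column layout described above. The temp-folder path is a hypothetical stand-in for the real companion file, and the "ping" is faked by flipping the state so the example runs on its own.

```powershell
# Hypothetical demo path; the real script would point at its own companion file.
$StatePath = Join-Path ([System.IO.Path]::GetTempPath()) 'MonitorServers-demo.csv'

# Seed the file once with the two columns.
@(
    [pscustomobject]@{ Server = 'Server1'; Online = $true }
    [pscustomobject]@{ Server = 'Server3'; Online = $false }
) | Export-Csv -Path $StatePath -NoTypeInformation

# Next run: read the previous state back in.
$Servers = @(Import-Csv -Path $StatePath)
foreach ($s in $Servers) {
    # Import-Csv returns every field as a string, and [bool]'False' is $true,
    # so compare against the text 'True' instead of casting.
    $wasOnline = $s.Online -eq 'True'

    # A real script would ping here; the demo just flips the state.
    $s.Online = -not $wasOnline
}

# Write the updated objects back over the file for the next iteration.
$Servers | Export-Csv -Path $StatePath -NoTypeInformation
```

The string comparison is the part that is easy to miss: a "False" read back from the CSV still counts as true if it is used directly in an if statement.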
EDIT: got a computer
Alright, this is a very rough draft, but I think I got my idea across.
#Example CSV data
Server  Status TimeDown
------  ------ --------
Server1 True
Server2 True
Server3 False  6/21/2024 10:22 AM
Server4 True
Server5 True
$CSV = Import-Csv "c:\scripts\MonitorServers.csv"
#Email info
$User = "MyEmail@Outlook.com"
$cred = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, (Get-Content c:\scripts\secrethash.txt | ConvertTo-SecureString)
$EmailTo = "MyEmail@Outlook.com"
$EmailFrom = "MyEmail@Outlook.com"
$SMTPServer = "smtp.office365.com"
Foreach ($Server in $CSV) {
    $ServerName = $Server.Server
    # Import-Csv returns every field as a string, so compare against the text 'True'
    $PreviousState = $Server.Status -eq 'True'
    if (Test-Connection -ComputerName $ServerName -Quiet -Count 1) {
        # server is up!
        $Server.Status = $true
        if (-not $PreviousState) {
            # it was down last run, so wipe out the timestamp field
            $Server.TimeDown = $null
        }
    } else {
        # server is down; record that so the next run sees the previous state
        $Server.Status = $false
        # the logic for alerting
        if ($PreviousState) {
            # first time it's gone down: stamp the time, then fall through to the email
            $Server.TimeDown = Get-Date -Format g
        } elseif ((New-TimeSpan -Start $Server.TimeDown -End (Get-Date)).TotalHours -gt 4) {
            # it's been down for more than 4 hours, don't send email
            continue
        }
        # first failure, or still within 4 hours of going down: send the email
        $Subject = "Server $ServerName is down"
        $Body = "Server $ServerName has been down since $($Server.TimeDown)"
        $SMTPMessage = New-Object System.Net.Mail.MailMessage($EmailFrom,$EmailTo,$Subject,$Body)
        #$attachment = New-Object System.Net.Mail.Attachment($filenameAndPath)
        #$SMTPMessage.Attachments.Add($attachment)
        $SMTPClient = New-Object Net.Mail.SmtpClient($SmtpServer, 587)
        $SMTPClient.EnableSsl = $true
        $SMTPClient.Credentials = New-Object System.Net.NetworkCredential($cred.UserName, $cred.Password)
        $SMTPClient.Send($SMTPMessage)
    }
}
#overwrite our CSV data
$CSV | Export-Csv -Path "c:\scripts\MonitorServers.csv" -NoTypeInformation
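For the schedule in the original question (an email when the server goes down, another after an hour, then one every four hours), the script would also need to remember how many emails it has sent for the current outage and when the last one went out. Here is a minimal sketch of just that decision, assuming two extra pieces of state that are not in the draft, NotifyCount and LastNotified, and a hypothetical helper named Test-NotificationDue. The second half simulates a 15-minute schedule instead of pinging anything.

```powershell
# Hypothetical helper: returns $true when another "server down" email is due.
function Test-NotificationDue {
    param(
        [int]$NotifyCount,         # emails already sent for this outage
        [datetime]$LastNotified,   # when the most recent email was sent
        [datetime]$Now = (Get-Date)
    )
    # Nothing sent yet: the first email goes out immediately.
    if ($NotifyCount -eq 0) { return $true }

    # Second email after 1 hour, every later one after 4 hours.
    $waitHours = if ($NotifyCount -eq 1) { 1 } else { 4 }
    return ($Now - $LastNotified).TotalHours -ge $waitHours
}

# Simulate a server that goes down at 10:00 and stays down for 10 hours,
# with the script running every 15 minutes.
$down  = Get-Date '2024-06-21T10:00:00'
$count = 0
$last  = $down
$sent  = @()
for ($m = 0; $m -le 600; $m += 15) {
    $now = $down.AddMinutes($m)
    if (Test-NotificationDue -NotifyCount $count -LastNotified $last -Now $now) {
        $sent += $m      # minutes after the outage started
        $count++
        $last = $now
    }
}
$sent -join ','   # → 0,60,300,540
```

In the real script both values would be reset (count to 0, timestamp cleared) when the server pings again. Since a scheduled task rarely fires at exactly the same second, it may be worth using slightly shorter waits, in the same spirit as the 55-minute check in the original script.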
Maybe not the response you’re looking for, but look into SCOM. We’ve had a great experience with it and with its reporting when there are any issues (including things like uptime). It can notify you in a variety of ways if, for example, the heartbeat goes down.
I had to work on something similar, so what I did was add another item to the project logic: an exclusion TXT file. If a server goes offline, send an email and add the server to the exclusion TXT file. Import the TXT file in the script logic, check that it is not null, and then check for a match. If the server matches, send no email; otherwise, send the email. Something like that may help you out.
If you want to, you can add something that checks the last write time on the TXT file, and if it has not been written to in X amount of time, delete the file and recreate an empty one.
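A rough sketch of the exclusion-file idea, with the stale-file reset included. The function name Test-ShouldNotify, the temp-folder path, and the 4-hour value standing in for "X amount of time" are all assumptions for illustration.

```powershell
# Hypothetical demo path and reset window; adjust both for a real script.
$ExclusionFile   = Join-Path ([System.IO.Path]::GetTempPath()) 'ServerExclusions-demo.txt'
$ResetAfterHours = 4

# Returns $true if an email should be sent for this server, and records the
# server in the exclusion file so later runs stay quiet.
function Test-ShouldNotify {
    param([string]$Server, [string]$Path, [int]$ResetAfterHours)

    # If the file has not been written to for a while, delete it so that
    # servers that are still down get reported again.
    if ((Test-Path -Path $Path) -and
        ((Get-Date) - (Get-Item -Path $Path).LastWriteTime).TotalHours -ge $ResetAfterHours) {
        Remove-Item -Path $Path
    }
    # Recreate an empty file if it is missing.
    if (-not (Test-Path -Path $Path)) {
        New-Item -Path $Path -ItemType File | Out-Null
    }

    # An empty file yields nothing, so wrap in @() to always get an array.
    $excluded = @(Get-Content -Path $Path)
    if ($excluded -contains $Server) { return $false }

    Add-Content -Path $Path -Value $Server
    return $true
}

# Usage: start the demo from a clean slate, then report the same server twice.
Remove-Item -Path $ExclusionFile -ErrorAction SilentlyContinue
$firstRun  = Test-ShouldNotify -Server 'Server3' -Path $ExclusionFile -ResetAfterHours $ResetAfterHours   # $true, send email
$secondRun = Test-ShouldNotify -Server 'Server3' -Path $ExclusionFile -ResetAfterHours $ResetAfterHours   # $false, already listed
```

One thing this sketch does not do is take a server off the list when it comes back up, so a real script would probably want that as well.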