Upgrade to 24.3.1 has been a failure

BT-customer · 2025-01-09T15:14:27+00:00

Here is my list of problems:

- The new agent, running as sra-pin.exe, has been deployed as of v24.3.1, but the old bomgar-scc.exe and its remnants, including the service, are still installed on upgraded systems.
- The upgrade removed the ability to uninstall the old agent from Add/Remove Programs, so removing it now requires manual registry scrubbing and file deletion.
- The new agent installs to different locations depending on whether it was upgraded or newly installed, making it almost impossible to detect via automation tools like InTune.
- The old process bomgar-scc.exe is still used when the agent connects to a jump client, even though it does not exist anywhere as an executable before the connection is made.
- The new agent service stops running and does not resume, even after reboots. However, the sra-pin.exe process can be found running under the user context. The device shows offline regardless.
- Reinstalling the agent does not resolve this long term; after a day or so the agent goes back offline.
- Most devices are now showing offline, and some are in locations where IT support is not available, creating a very difficult situation.
- A complete uninstall and reinstall is not possible because of our company’s global span and many jump groups.
- The case I opened has gone nowhere. The rep is offering a downgrade to 24.2.4 or waiting for the new release, without addressing how the offline agents will be handled.

Here is a screenshot I sent recently to support that covers most of the problems listed above.
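For anyone trying to confirm the leftover-service and install-location problems on their own endpoints, here is a minimal sketch that filters a service list down to the old and new agent services and shows where each binary lives. The name patterns (bomgar-*, sra-pin*) are taken from this thread; the function name Select-AgentService is made up for illustration, and the patterns may need adjusting for your deployment.

```powershell
# Sketch: pick out old (bomgar-*) and new (sra-pin*) agent services and show their binary paths.
# The patterns come from this thread; Select-AgentService is a hypothetical helper name.
function Select-AgentService {
    param(
        [Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Services,
        [string[]] $Patterns = @('bomgar-*', 'sra-pin*')
    )
    foreach ($svc in $Services) {
        foreach ($pattern in $Patterns) {
            if ($svc.Name -like $pattern) {
                # Emit one summary object per matching service.
                [pscustomobject]@{
                    Name     = $svc.Name
                    State    = $svc.State
                    PathName = $svc.PathName
                }
                break   # stop checking further patterns for this service
            }
        }
    }
}

# On a Windows endpoint, feed it the real service list:
# Select-AgentService -Services (Get-CimInstance -ClassName Win32_Service) | Format-List
```

Win32_Service exposes Name, State, and PathName, so the same output can be compared between an upgraded machine and a freshly installed one to see which paths differ.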

Sorry to hear about the issues, and thanks for the heads-up about that version. We are running 24.2.3.

We have had services stopping and not recovering for the past couple of versions, resulting in clients showing offline in the Rep console. I have had an open support ticket for about three months now. In the meantime, I have a scheduled task on our endpoints that starts the bomgar-ps-* service if it is stopped. Maybe you can use it as inspiration to build your own band-aid until a full solution is available.

Trigger: Event ID 10000 for Microsoft-Windows-NetworkProfile/Operational. Notes: This can trigger often and produce duplicate events depending on your environment, but I couldn’t find a better trigger. I get around this by 1) adding a 20-second sleep to my action and 2) not starting a new instance of the task if it is already running.

Action: powershell.exe -command "Start-Sleep -Seconds 20; if(($Service = Get-Service 'bomgar-ps-*' -ErrorAction SilentlyContinue) -and $Service.Status -eq 'Stopped'){ $Service | Start-Service }"

Hope that helps alleviate some of your pain.
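If you would rather script the task than click through Task Scheduler, the trigger and action above could be registered with schtasks.exe roughly like this. This is only a sketch: the task name and script path are hypothetical, it assumes the one-line action has been saved to a .ps1 file at a path without spaces, and you should confirm in the task's settings that it is set to not start a new instance while one is running.

```powershell
# Sketch: build schtasks.exe arguments for an event-triggered task.
# $TaskName and $ScriptPath are hypothetical; the channel and event ID are from the post above.
function Get-RestartTaskArguments {
    param(
        [string] $TaskName   = 'Restart-BomgarService',
        [string] $ScriptPath = 'C:\Scripts\Start-BomgarService.ps1'   # assumed to contain no spaces
    )
    # XPath filter matching Event ID 10000 in the chosen channel.
    $query  = '*[System[(EventID=10000)]]'
    $action = "powershell.exe -NoProfile -ExecutionPolicy Bypass -File $ScriptPath"
    return @(
        '/Create', '/F',
        '/TN', $TaskName,
        '/SC', 'ONEVENT',
        '/EC', 'Microsoft-Windows-NetworkProfile/Operational',
        '/MO', $query,
        '/TR', $action,
        '/RU', 'SYSTEM'
    )
}

# From an elevated prompt on Windows:
# $taskArgs = Get-RestartTaskArguments
# & schtasks.exe @taskArgs
```

Building the argument list in a function keeps the quoting in one place and lets you inspect exactly what will be passed before anything is created.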

Hello, I appreciate the info. I will test it out as an InTune remediation script. The problem is that the service is designed to restart if it fails, but it does not. These are the default recovery settings. I am not sure why the fail count is set to reset after 1 day instead of 0, or why the service does not start after 3 tries.

Do you have debug logging turned on? I found that the service wasn’t failing when it stops; it was actually gracefully stopping itself which would not trigger that recovery logic in your screenshot.
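One way to check this on an affected machine, as a rough sketch: the Service Control Manager logs Event ID 7031 or 7034 in the System log when a service terminates unexpectedly, and the recovery actions only apply to that kind of failure. If the service is stopped and there are no such events for it, a graceful stop is the more likely explanation. The helper name and the display-name pattern below are assumptions; I don't know the exact display name the agent service uses, so adjust the pattern to match yours.

```powershell
# Sketch: keep only "terminated unexpectedly" events (7031/7034) that mention a given service.
# Get-UnexpectedStopEvent and the '*BeyondTrust*' pattern used below are hypothetical.
function Get-UnexpectedStopEvent {
    param(
        [AllowEmptyCollection()] [object[]] $Events = @(),
        [Parameter(Mandatory)] [string] $NamePattern
    )
    $Events | Where-Object {
        ($_.Id -in @(7031, 7034)) -and ($_.Message -like $NamePattern)
    }
}

# On a Windows endpoint:
# $scmEvents = Get-WinEvent -FilterHashtable @{
#     LogName      = 'System'
#     ProviderName = 'Service Control Manager'
#     Id           = 7031, 7034
# } -ErrorAction SilentlyContinue
# Get-UnexpectedStopEvent -Events $scmEvents -NamePattern '*BeyondTrust*'
```

An empty result while the service sits in the Stopped state would be consistent with the service stopping itself rather than crashing.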

Yes, BLOG.INI is in the root of the C drive, and I have provided the logs to support. Their only solution is to downgrade to 24.2.4 or wait for the new release. I am going to see if I can set up a detect/remediate pair of InTune scripts to make sure the service is running.

I am going to try these 2 scripts

# BeyondTrust-Service-Detection.ps1
#
# InTune Detect Script. Needs to run under system context.

#Detect if BeyondTrust Remote Support service is running or not.
$serviceName = Get-Service | Where-Object { $_.Name -like "sra-pin*" }

if ($serviceName.Status -eq 'Stopped') {
    Write-Output "Service is stopped."
    exit 1
} else {
    Write-Output "Service is running."
    exit 0
}

# BeyondTrust-Service-Remediation.ps1
#
# InTune Remediation Script. Needs to run under system context.

# Start the BeyondTrust Remote Support service if it exists and is stopped.
$serviceName = Get-Service | Where-Object { $_.Name -like "sra-pin*" }

if ($null -eq $serviceName) {
    # Nothing to start; avoid reporting a missing service as "running".
    Write-Output "Service does not exist."
} elseif ($serviceName.Status -eq 'Stopped') {
    Start-Service -Name $serviceName.Name
    Write-Output "Service started."
} else {
    Write-Output "Service is already running."
}

InTune portal

Modified the detection script to add try/catch, and account for a situation where service does not exist.

#InTune Detect Script. Needs to run under system context.

#Detect if BeyondTrust Remote Support service is running or not.

try {
    $serviceName = Get-Service | Where-Object { $_.Name -like "sra-pin*" }

    switch ($serviceName) {
        { $_ -eq $null } {
            Write-Output "Service does not exist."
            exit 0
        }
        { $_.Status -eq 'Stopped' } {
            Write-Output "Service is stopped."
            exit 1
        }
        { $_.Status -eq 'Running' } {
            Write-Output "Service is running."
            exit 0
        }
        default {
            Write-Output "Unable to make a decision."
            exit 0
        }
    }
} catch {
    Write-Output "An error occurred: $_"
    exit 1
}

Seems like the InTune scripts are working to address the shortcomings. I will leave this here if anyone else has problems. Shout-out to @mjhall for his work and feedback on the issues faced!

Just chiming in to say that this update destroyed our jump clients as well. Support doesn’t seem to have enough time to perform proper troubleshooting so we’re heavily leaning toward just redeploying everything after downgrading back to 23.2. My critical concern is that we will still need to perform this upgrade again at some point and I’ve no idea how we’re meant to test it to make sure we don’t end up in the same situation. It’s just a complete mess.

No Test/UAT is really concerning. I agree that support tries to solve this via email conversations rather than in real time. This drags out the troubleshooting, creates frustration, and gets people upset. My biggest nightmare scenario is that we end up with a breach on the BT side that compromises all their clients, like SolarWinds. We have their agent deployed on every computer. If their security is as lax as their support, we are in trouble.

We are running two on-prem appliances in a failover relationship which allows us to upgrade asynchronously and do acceptance testing on the secondary / non-prod environment before applying the updates in production.

Asynchronous Upgrade: Two B Series Appliances Set Up for Failover

If there are show-stoppers in testing, however, I’m not sure how I’d roll back those updates on the appliance (if it’s even possible). Then you might have to run without failover configured until it’s resolved, or revert to a snapshot of the appliance taken before the upgrade.

I’ve heard some people say you can’t run an updated appliance without also updating the jump clients. I didn’t see this in my environment, but maybe that is the case for some updates (such as going from 23.x to 24.x).

We are a cloud customer, so no such access.

Moving to the cloud comes with trade-offs. It might reduce the workload for managing infrastructure, but it also means relying on the vendor for certain controls and access.

What are the security implications of having downgraded back to 24.2? I thought 24.3 was meant to patch a fairly serious security vulnerability. Would be great to get input from the BeyondTrust team.

Hello @ChristofferH - All of the current SRA software versions have BT24-10 and BT24-11 patches available to address the latest security CVEs you are referring to. After any upgrade (or downgrade), these patches need to be installed as soon as possible.

I hope this helps with your queries!

Thanks - will this be done autonomously on your side for cloud appliances? We were downgraded back to 24.2 but I’ve no idea if any patches were applied after that.

Hello @ChristofferH - while I am fairly certain you would have been automatically patched, being on a Cloud instance, it may be best to log a Support Case with our Support Team so they can confirm 100% for you. Always best to be sure!

Thanks!

We are seeing the same issue on this upgraded version. Has anyone had any update from support on the status of a patch to fix this issue?

I have been told that it is planned to be fixed in 24.3.2. There's no concrete ETA for its release at the moment though.

I had not seen this post until now, but I had already deployed my own remediation for this very issue, as it was driving me crazy that Jumps would go offline for no reason. I mentioned this in a recent service review with my account manager and technical account manager, and I was told this was news to them. I will see if I can push them to get more details and reference this post too.

Hello All - just to advise that PRA 24.3.2 has now been released, as per the Release Notes: https://docs.beyondtrust.com/pra/changelog/privileged-remote-access-24-3-2. I believe the RS release will be soon.
