Change proposal: NugetPackageManager restore functionality

This is a summary change discussion/proposal based on findings related to:
FR: Nuget feed in OpenFlow
(Bug?) Updating dependency leaves loaded dlls unchanged

I’ll try to lay down the train of thoughts, so it’s easier to see why the conclusions are as they are.

Please find below a short list of the main issues, from a user/deployer perspective:

Issue 1:
OpenRPA loads all dll’s from the extensions folder on application start.
This leads to extensions dll’s being locked for update (changing the dependency version in a project does not change the actual dll in extensions), which leads to the update not doing what it’s supposed to and leads to invalid workflows.
Note: this does not occur when a dependency has been added in the same run of OpenRPA, as it can then actually remove the file (process still has ownership of it).

Issue 2:
OpenRPA does not ensure that dependency projects are actually present and loaded for available projects that have a dependency change (it does on project manual import).
This leads to missing dependencies when remotely deploying projects with package dependencies.

Issue 3:
OpenRPA Package Manager only allows for changing dependencies of the projects, but not installing the dependencies on the local machine, and all dependencies show as Installed.
This means that if a package is added to a project as a dependency, Package Manager shows it as Installed, even though it’s completely missing from the machine. To actually install it, the user has to either install or update the package (and thus alter the project), it does not give the possibility to do an equivalent of a nuget restore for available projects.

Issue synthesis:
In summary, the only way to actually install the extension on your machine is to either manually put correct dll’s into the extensions when OpenRPA is closed, or “update” the project dependencies (requires update rights to the project).

Solution approach:
Since reworking OpenRPA to be multi-process or multi-appdomain is not worth the gain just for extensions (it’s a niche use case :frowning: ), and “external” solutions are at best bandaid hacks, the solution should most likely center around NugetPackageManager and how (when) the extensions are loaded.

Addition 1:
Add a nuget restore equivalent
All of the functionality on the NugetPackageManager side is already there, it should only require adding a new call path to install the packages without altering the projects.
If I’m reading the code correctly, it could work similar to this (a bit hacky, but ehh):

private bool InstallAllProjectDependencies()
{
	var allDependencies = new Dictionary<string, string>();

	foreach (var project in availableProjects
							.Where(p => p.dependencies != null && project.dependencies.Count > 0))
	{
		Log.Debug($"Found dependencies on project: {project.name}");
		foreach (var newDependency in project.dependencies)
		{
			if (allDependencies.ContainsKey(newDependency.Key))
			{
				if (allDependencies[newDependency.Key] != newDependency.Value)
				{
					// could have a better errMsg
					string errMsg = $"Dependency clash for {newDependency.Key}: {newDependency.Value}.";
					Log.Error(errMsg);
					return false;
				}
				
			} else {
				Log.Debug($"Added dependency to load: {newDependency.Key}: {newDependency.Value}");
				allDependencies[newDependency.Key] = newDependency.Value;
			}
		}
	}

	foreach (var kvp in allDependencies)
	{
		// yanked from Project.cs
		 foreach (var jp in allDependencies)
		{
			var ver_range = VersionRange.Parse(jp.Value);
			if (ver_range.IsMinInclusive)
			{
				Log.Information("DownloadAndInstall " + jp.Key);
				var target_ver = NuGet.Versioning.NuGetVersion.Parse(ver_range.MinVersion.ToString());
				await NuGetPackageManager.Instance.DownloadAndInstall(null, new NuGet.Packaging.Core.PackageIdentity(jp.Key, target_ver), true); // can call with project = null as it's not used there? Otherwise create a dummy throwaway project before the loop
			}
		}
	}
	return true;
}

When to call it:

Potentially this could be called instead of the current extension assembly loading in app.xaml, although I’m not sure if the projects are available at that point.
Or, the LoadDll’s parameter could be changed to false and call it before the extensions loading. That way it would do an auto-restore before loading the extensions (config flag default false? → set to true for end users/robots?).

Aside of that, I think it would be good to add a call like that (with true for load) on user demand (either on project level or for all, the code is very similar). That way, the user (dev) would have good control of what’s happening when.

Summary:
I think adding an auto-restore on startup would “fix” most of the extensions version management issues related to this.

So… thoughts?

If this sounds reasonable, I’ll be happy to proceed with seeing if it actually works as intended (at least the manual restore part, not sure about the auto-restore part on startup).
As in - I can add it and PR it in if it works, so you don’t have to worry about this for now :slight_smile: I just need an opinion if this is a change that you’re open to before diving into code.

For now, something like this manual restore + the PackageRestore workflow from the OpenFlow Nuget attempt should be a good workaround.

I’m not sure I agree on issue 3.
When OpenRPA starts, it does a NuGet install for all projects, AND it installs packages as you “add” them, both locally and remotely.

foreach (var _id in updatePackages)
{
    try
    {
        var p = Projects.FindById(_id);
        await p.InstallDependencies(true);
    }
    catch (Exception ex)
    {
        Log.Error(ex.ToString());
    }
}

inside LoadServerData in RobotInstance.cs.
I have done a ton of tests on this a few years ago, but I trust you if you tell me it’s broken now, but please check again just to be safe?

I have a tendency to “misread/understand,” and then after I reply and come back later, I get a “doh” moment. So bear with me if I misunderstood:

The reason I load all DLLs no matter what is to get the toolbox updated in case a DLL contains activities. I guess we could add some “caching” on what DLLs have activities and what has not, and then only load those on startup and the rest from CurrentDomain.AssemblyResolve, but I kind of assume you have a dependency because you intend on using it, hence it would still be locked on the machines you actually want to update, so it would not solve the main issue.

Oh, I just had my first “doh” moment…
You mean, if it loads the DLLs at startup and then gets an updated project from the server, the DLLs are already locked? True… In that case, it makes sense to not load the DLLs right away, but I’m not sure how to tackle “push loading the toolbox” to later? But for most robots, I would assume they are always online and therefore have an up-to-date project definition.

Setup:
Nuget source set:
<add key="OpenRPAPackages" value="C:\Users\{myUser}\AppData\Roaming\OpenRPA\packages"/
image

Available project:
image

Dependencies state (nothing installed):

Starting OpenRPA…

Opening a workflow:
image

XAML relevant parts:
image
image

Folder states (still empty):

After running the workflow I sent you yesterday the things install and work just fine.
So - either there is one more place that keeps track of the “installed” state that I missed, and that’s why it doesn’t restore, or it just doesn’t restore.
The thing is if I missed the place that also needs to be cleaned up (btw - checked GAC, these packages are not there), then just clearing the extensions folder will lead to that state.

try adding a breakpoint to DownloadAndInstall in NuGetPackageManager.cs
does it fail on GetNearest check ?
if not, does it hit the “InstallFile” foreach ?

I’ll check in the evening on another machine that I’m not messing with so much.

Today popped up a potential workaround for the dll lock as well (just putting it here so I don’t forget).
Since OpenRPA installs into and loads from the same folder (and that causes the lock-on-startup-load), what if it didn’t?
So the installations and so on would work the same, but not the loading on-start. It could do an extensions-runtime-cache copy (in the openrpa folder) at startup, and load from there. And on close (or also at startup in case of hard crashes) - clean it up.
Since the issue does not surface when you’re in the same session (it still has file ownership and can do whatever with them) and it can’t unload from the domain anyway (so if you load a different version of the same thing… I’m not really sure how that interacts under the hood) so a restart is still needed, then maybe?
Sure, it would increase the startup time a tiny bit, but it shouldn’t be even noticeable unless it’s running on a really slow HDD and there’s waaaay too many extensions.

I’ve set these up as below (the first one just to check if debugger attaches correctly, disabling on first hit):

It hit none of them on startup :confused: in app when opening the package manager it hits the initialize breakpoint, so for the whole procedure nothing happened there.

Maybe we miss understood eachother
either you installed a package locally, using the package manager ui
This will then call the install function

or you installed it on a remote machine, and your local openrpa needs to “catch” it …

  • if it’s running it does so in the onWatchEvent inside RobotInstance.cs that calls InstallDependencies
  • if it was offline, then doing startup, it “catches” the updated project inside LoadServerData that keeps track on update projects and call InstallDependencies for each of them

simply restarting will not do anything

1 Like

and you might say, but what about a “new” robot …
well, when that starts, the local database is empty, so LoadServerData will see all projects as “new” and call InstallDependencies on those.

so if you dont’ want to update a project to test, you can also just close openrpa, delete the localdatabase and start it again

1 Like

Yeah, I just realized I was going on a completely wrong path. That’s a bit embarrassing :smiley:

After making sure that it had a project version mismatch between machines:
It does try to install, if the project versions are different, and then it hits the locked dll issue since it already loaded them.

It works fine on a “new” robot, as you said, but after a restart there (so the dll’s lock out), it’s back to square one.

I’ll try to debug some more later, I think I know how to sidestep that (with the extensions-runtime-cache idea). If it works, we could also enable back the Update button, since that should work too. And then extensions should be safe for use, meaning that custom models and so on are pretty much risk free (as long as you can give nuget access to OpenRPA the packages they need).

Sorry for the confusion, I should have been more rigorous when testing on a multi-machine setup :confused: so easy to miss a spot.

Once we’ll get this working, I promise to make a video tutorial on how to make use of the extensions for data modeling - and after our last talk about different stack pieces, with a class model that works for both .net framework 4.6.2 (OpenRPA) and .net6 (agents). That’s one way to future-proof potential moving of OpenRPA parts to agents later for scalability/fine coding control, and keep the data strongly typed on both ends on the same model.

I’m just happy you are willing to put in some time to this.
I know we cannot make it perfect, due to OpenRPA being a one-process thing right now, but any help to make it better is always welcome.

Small update and I’m out for today:
The good:
extensions-runtime-cache sidestep of the dll-lock seems to be working :slight_smile: this is really nice, and allows for turning the Update back

But it’s a western, unfortunately :frowning:

The bad:
When 2+ projects have a dependency overlap, the newest version will be used. This is fine, and as it should be, but since it’s completely invisible to the user, it can lead to an invalid xaml (or behaviour change) if the versions have breaking changes between them. This is for both direct dependencies, and further down the dependency tree.

The ugly:
When 2+ projects have a dependency overlap, and you uninstall (remove) the dependency from one of them, this triggers a dependency uninstall, which will leave the other project “broken” for execution (missing dependency). Looking through the code, this could even happen if one project has A->B->C dependency, and the other has a direct B or C, so it’s really hard to find at first as a user.

So… I’m working now on a dependency resolver that will take into account all available projects, traverse the dependecy trees of them (using all available nuget sources), then flatten the dependencies to find the “highest minimum version” that fits all of them, then a) produce a recommendation to the user to update dependencies for conflicting projects, or b) triggers a install for the packages (if there’s no conflict).

So far I have the flattened dependency “highest minimum version”, and could implement triggering an install, but it doesn’t check for conflicts yet.
Ahh, the good ol’ dependency hell mitigations :slight_smile:

For now I’m adding it as a side-functionality, not changing anything in the existing NugetPackageManager, but since a lot of the functionality overlaps (especially the package finding), I think it could be deduplicated later.

More tomorrow, if time allows :slight_smile:

PS. If anyone wants to have a look :wink:

2 Likes

Time didn’t allow, so no progress, but found also a potential bug/unintended behavior around this line:

Since for my testing purposes I’ve put in OpenFlow just the versions I wanted, it wasn’t actually finding the package version I was expecting it to see due to (I think, can’t go to debug mode right now) a different feed having more versions of that package.

What seems to be potentially causing it is a little above the highlighted line - if I’m reading it correctly, the search uses always all feeds, regardless of which one is selected, and returns results from the one that has most versions (which might not include the version that is needed).

Truth be told, I have a working “workaround” for these issues (with the custom workflows, and adding the extension-runtime-cache), so I’m not sure if this should be pursued further, as it seems (and this might be overshooting) that most of the NugetPackageManager (including the UI bindings) would need changes.
So I’m not really sure if it’s worth it (still would be fun though, but time and so on).

I guess we could just pull in the changes related to the folder cache (so dll’s don’t lock, ergo update works as expected) and leave the rest for later?

1 Like