Hi!
We are seeing an issue affecting several different robots.
These robots are running on separate Windows virtual machines on the same server.
Occasionally, a robot will become unresponsive (“lazy”):
- It remains connected to OpenFlow
- But it stops picking up new WorkItems
- It also stops responding to `invoke workflow` calls from Node-RED
  - We are able to invoke a workflow manually within the OpenRPA client, just not automatically via Node-RED
- Restarting the OpenRPA application immediately fixes the problem. However, since this happens overnight, we only notice it the next working day, which is a bit annoying.
This has been happening for quite some time, but we only started looking more into it now.
We have now enabled more logging, most importantly network logging. On a robot that has become lazy, we see that it does receive queuemessages regarding workflows from Node-RED, but it simply ignores them.
However, we also found an exception being thrown for a specific message:
[08:27:30.225][Network] Send (1) queuemessage / {"striptoken":true,"queuename":"unknown.ogcsqrmkg","data":{"command":"error","workflowid":null,"flowid":null,"nodeId":null,"detectorid":null,"killexisting":false,"killallexisting":false,"traceId":null,"spanId":null,"data":{"ClassName":"System.NullReferenceException","Message":"Object reference not set to an instance of an object.","Data":null,"InnerException":null,"HelpURL":null,"StackTraceString":" at OpenRPA.RobotInstance.<WebSocketClient_OnQueueMessage>d__85.MoveNext()","RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":"8\nMoveNext\nOpenRPA, Version=1.4.57.7, Culture=neutral, PublicKeyToken=null\nOpenRPA.RobotInstance+<WebSocketClient_OnQueueMessage>d__85\nVoid MoveNext()","HResult":-2147467261,"Source":"OpenRPA","WatsonBuckets":null}},"correlationId":"1a6ad1f847b72de1","replyto":null,"expiration":0,"consumerTag":null,"routingkey":null,"exchange":null,"traceId":null,"spanId":null,"error":null,"jwt":null}
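Since the log is very noisy, here is a rough sketch of how we could filter it for these error replies once it is in a file. It assumes every network line has the same shape as the one example above (`[time][Category] Direction (n) command / {json}`), which I have not verified beyond that single line. The function name `parse_error_line` and the returned field names are my own and not part of OpenRPA.

```python
import json
import re

# Assumed line shape, inferred only from the single example above:
#   [HH:MM:SS.mmm][Category] Direction (n) command / {json payload}
LINE_RE = re.compile(
    r"^\[(?P<time>[\d:.]+)\]"
    r"\[(?P<category>\w+)\]\s+"
    r"(?P<direction>\w+)\s+\(\d+\)\s+"
    r"(?P<command>\w+)\s+/\s+"
    r"(?P<payload>\{.*\})\s*$"
)


def parse_error_line(line):
    """Return a small summary dict if the line is an error queuemessage, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None

    # Text copied from a forum may carry typographic quotes; normalize them
    # so the payload parses as JSON. A real log file should not need this.
    raw = match.group("payload").replace("\u201c", '"').replace("\u201d", '"')
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None

    data = payload.get("data")
    if not isinstance(data, dict) or data.get("command") != "error":
        return None

    exc = data.get("data")
    if not isinstance(exc, dict):
        exc = {}

    return {
        "time": match.group("time"),
        "queuename": payload.get("queuename"),
        "correlationId": payload.get("correlationId"),
        "exception": exc.get("ClassName"),
        "message": exc.get("Message"),
        "stack": exc.get("StackTraceString"),
    }
```

Running this over every line of a saved log and keeping the non-`None` results would at least show when the `NullReferenceException` first appears relative to the robot going lazy.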
So my questions are:
- Does this have anything to do with us not using the newest version? We use 1.4.57.7.
- Are there more ways to see what the actual issue is, such as additional things we can log? Also, right now we only have the log from the last hour at most, because so much is printed. Is it possible to pipe it to a file? I see that I can enable `log_to_file` in settings, but where is this log then written?
- Do you have any other suggestions for fixing this?
Thanks in advance