Chinese encoding errors (unreadable �)

Hi, Allan,
Greetings!
I have long been troubled by Chinese encoding anomalies. Specifically, variable values and tool names that contain Chinese text automatically change to the abnormal character �. As shown in the figures below, this happens with “sequence” and “variables”.

I have tried many solutions, such as:
(1) Changing the MongoDB database encoding
(2) Changing the local computer character encoding
(3) Adding UTF-8 encoding to the open.exe.config and settings.json configuration files
(4) Adding UTF-8 encoding to the xaml file
(5) Changing the culture of the xaml or the language of the UI

None of the above methods worked, so I would appreciate any help from experts in this field.

In addition, I noticed two typical phenomena that may help solve this problem:
(1) In offline mode, this problem does not seem to occur
(2) This problem occurs automatically without modifying any process content

The “�” you’re seeing is the Unicode replacement character. That only shows up when some part of the round-trip decoded the bytes with the wrong encoding. The thing that makes this hard to debug is that you are only seeing this for a few characters and not everything.
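To illustrate how the replacement character appears, here is a minimal Python sketch. It is not OpenRPA code; the sample text and the choice of GBK as the "wrong" encoding are assumptions made only for the demonstration. It shows two ways a round trip can go wrong: bytes written in one encoding and read in another, and a UTF-8 byte sequence that is cut in the middle of a character.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the character rendered as �

original = "变量"  # sample text, chosen only for illustration

# Case 1: a correct round trip. Encode and decode with the same codec.
utf8_bytes = original.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Case 2: bytes written in a legacy encoding (GBK here, as an assumption)
# but decoded as UTF-8. Invalid byte sequences become U+FFFD.
gbk_bytes = original.encode("gbk")
decoded_wrong = gbk_bytes.decode("utf-8", errors="replace")

# Case 3: valid UTF-8 that is cut in the middle of a multi-byte character.
# The first character survives; the damaged tail becomes U+FFFD.
truncated = utf8_bytes[:-1].decode("utf-8", errors="replace")

print(round_trip == original)        # the clean round trip is lossless
print(REPLACEMENT in decoded_wrong)  # wrong codec introduces U+FFFD
print(truncated.endswith(REPLACEMENT))
```

Case 3 is one possible reason why only some characters are affected while the rest of the text stays readable. Once the replacement character has been written back, the original bytes are gone, so the text has to be retyped rather than re-decoded.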

In offline mode, OpenRPA saves the data in a LiteDB database. When in “normal” mode, it saves all data both into a local LiteDB database and in MongoDB in OpenCore.

If you open the workflow that has this problem in OpenCore, do you also see these characters (to check if this is a UI display issue or if the data is actually corrupted)?

If you open a workflow with and without the problem, and export it from OpenRPA, do you see the corrupted signs in any of those?

Does this only happen with certain characters (e.g., punctuation like “、”, “。”, quotes), or also basic CJK characters?

Are the strings typed directly, or pasted from Word/WeChat/Excel? (Those sometimes introduce legacy encodings.)
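If it helps with the export check, here is a small Python sketch that walks an exported JSON document and lists every string containing the replacement character. The field names in the sample (`name`, `Xaml`) and the helper names are assumptions for illustration and may not match the real export format. Parsing the JSON first means the check catches both a literal � and the escaped form `\ufffd`.

```python
import json
from pathlib import Path

REPLACEMENT = "\ufffd"  # U+FFFD, the character rendered as �


def find_corrupted_strings(node, path="$"):
    """Return (json_path, value) pairs for every string containing U+FFFD."""
    hits = []
    if isinstance(node, str):
        if REPLACEMENT in node:
            hits.append((path, node))
    elif isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_corrupted_strings(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_corrupted_strings(value, f"{path}[{index}]"))
    return hits


def scan_export(file_path):
    """Hypothetical helper: read an exported file as UTF-8 and scan it."""
    text = Path(file_path).read_text(encoding="utf-8")
    return find_corrupted_strings(json.loads(text))


# Self-contained demo on an in-memory sample (field names are assumed).
sample = json.loads(
    '{"name": "流程\\ufffd", "Xaml": "<Sequence DisplayName=\\"序列\\" />"}'
)
hits = find_corrupted_strings(sample)
print(len(hits), [path for path, _ in hits])
```

Running `scan_export` on the export of the broken workflow and on the export of a good one would show which fields differ, which narrows down where the corruption is introduced.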

(1) In OpenCore, no “�” was found.

(2) The abnormal flow was exported in JSON format, and the corrupted characters were found in it. After the normal flow (ApiWF.json) was exported, no “�” was found.

(3) The “�” also appears when using basic CJK characters.

(4) The CJK characters were typed manually.

If both workflows look fine inside OpenCore, but one of them shows the issue when exported from OpenRPA, this indicates a problem in OpenRPA. In your first post, you mentioned this happens even when you do not do anything. There should only be two cases where OpenRPA updates a workflow without you pressing save:

  1. If you import a workflow that was exported from another OpenRPA with images embedded, it will save those as separate files and save the workflow with the links to those files.
  2. If you open a workflow that uses a different local setting (culture) than what your computer has, it updates the workflow to use what you have, to avoid specifically this problem.

Are you changing local settings (culture) when this problem occurs?
Are you able to create a workflow that does not have the issue and then force the problem to occur? If you are, did the culture setting on the workflow in OpenRPA change? And did you change your local culture settings as part of that process?

Sorry for the late reply.
By “This problem occurs automatically without modifying any process content” I mean that I only expanded the sequence in the process, without modifying any content. At that point the editor indicated that there were modifications. I saved them, and then the symbol “�” was introduced. Sometimes the symbol “�” even appeared while the flow was running, without any action on my part. I tried to correct the “�” by using “Edit XAML”: I copied the XAML content into VS Code, replaced “�” with the correct CJK characters, and pasted the corrected XAML back into the “Edit XAML” box. After I saved the modifications, the symbol “�” was still there.

(1) Actually, the workflow was created by copying another workflow, and then I modified the name. In the source workflow, no image was used, and the CJK characters were also typed manually.

(2) The culture of the workflow was set to Chinese. The operating system of the local PC is Windows 10. The region was set to Chinese, and the setting “Beta: Use Unicode UTF-8 for worldwide language support” was enabled.

I cannot reproduce the problem on demand. It occurs with high probability, but it is not 100% reproducible.

If possible, I can send the source JSON file of the workflow to your email, so that you can debug the problem locally.

To me, it still sounds like you are not using the same locale/regional settings that the workflow was created with. When that is the case, this will happen. The reason we have a region field is to make sure that when we EXECUTE the script/workflow, it uses the correct locale; it does not mean you can EDIT using different locales. So if you open it, you will change it.

OK, thanks a lot! I will try to solve the problem this way.

I have now solved this problem by switching the culture to “Chinese (China)”. I tried the following options in turn: “Chinese”, “Chinese (Simplified)”, and “Chinese (China)”. Only “Chinese (China)” worked. Therefore, an improper culture setting may be what causes the encoding error.

Nice. Thank you for sharing the solution.

Thank you for your guidance and help; without it I would not have been able to solve this problem so smoothly.

A more accurate description of the steps to modify the culture:
(1) First, uncheck “serializable (save state)” if it is checked.
(2) Then, choose the culture option that matches the language used in the workflow.
(3) Finally, check “serializable (save state)” again.
After these steps, the changes will be applied; otherwise the change will fail to save.
