Using Dict to Translate Large Data Sets

hongji

New Member
Joined
Jul 13, 2023
Messages
3
Office Version
  1. 2021
Platform
  1. Windows
I am tasked with triaging and translating large sets of data from Chinese to English. I want to use a macro to take care of as many of the repeat terms as I possibly can. My issue is, for words that I see a lot but contain another word that is in the dictionary already in the string, it only translates the shorter word and then stops. For example, 'Name', 'Legal Name', and 'Registered Name', all appear in the data sets, however the macro only translates the Chinese 姓名, and leaves the other 3 characters in the longer word. Any help would be greatly appreciated. I'm also open to doing this in a completely different way if what I'm doing isn't the most efficient. Here is what I have;

Sub Translate()
Dim rng As Range
Dim cellArray As Variant
Dim replacements As Object
Dim key As Variant
Dim i As Long, j As Long

' Define the replacement rules using a Dictionary object
Set replacements = CreateObject("Scripting.Dictionary")
replacements.CompareMode = vbTextCompare
With replacements
.Add "性别", "Gender"
.Add "联系方式", "Contact Method"
.Add "身份证姓名", "Legal Name"
.Add "原上报姓名", "Registered Name"
.Add "姓名", "Name"
.Add "求教", "Teaching Assistant"
.Add "求教电话", "Teaching Assistant Phone Number"
End With

' Check if any cells are selected
If Selection.Cells.Count > 0 Then
' Set the range to the selected cells
Set rng = Selection

' Turn off screen updating and calculation
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' Transfer range values to an array for faster processing
cellArray = rng.Value

' Perform replacements directly on the array
For i = LBound(cellArray, 1) To UBound(cellArray, 1)
For j = LBound(cellArray, 2) To UBound(cellArray, 2)
For Each key In replacements
cellArray(i, j) = Replace(cellArray(i, j), key, replacements(key), , , vbTextCompare)
Next key
Next j
Next i

' Write the modified array back to the range
rng.Value = cellArray

' Turn on screen updating and calculation
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End If
End Sub
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Hello @hongji.
Welcome to the MrExcel forum. Please accept my warmest greetings and sincere hope that all is well.

Check if the following idea makes sense to you:

1. Assuming that the data to be replaced is in column B.​
2. In three columns that you have available, put the text in Chinese, in English and the Long function.​
(see my example in columns E to G).
3. Sort columns E through G by column G from greatest to least.​
(see my example in columns E to G).

Dante Amor
ABCDEFG
1ChineseChineseEnglishLen
2性别求教电话Teaching Assistant Phone Number31
3姓名求教Teaching Assistant18
4性别原上报姓名Registered Name15
5联系方式联系方式Contact Method14
6身份证姓名电话Phone Number12
7求教电话身份证姓名Legal Name10
8求教性别Gender6
9原上报姓名Number6
10联系方式姓名Name4
11电话
12身份证姓名
13性别
14
15姓名
16
Hoja11
Cell Formulas
RangeFormula
G2:G10G2=LEN(F2)


4. Run the following macro:​

VBA Code:
Sub traslate_v3()
  Dim c As Range
 
  For Each c In Range("E2", Range("E" & Rows.Count).End(3))
    Range("B:B").Replace c.Value, c.Offset(, 1).Value, xlPart, , False
  Next
End Sub
5. Change range("B:B") to your selection or to the range of cells you need.​


6. After the macro:
Dante Amor
ABCDEFG
1ChineseChineseEnglishLen
2Gender求教电话Teaching Assistant Phone Number31
3Name求教Teaching Assistant18
4Gender原上报姓名Registered Name15
5Contact Method联系方式Contact Method14
6Legal Name电话Phone Number12
7Teaching Assistant Phone Number身份证姓名Legal Name10
8Teaching Assistant性别Gender6
9Registered NameNumber6
10Contact Method姓名Name4
11Phone Number
12Legal Name
13Gender
14Number
15Name
16
Hoja11
Cell Formulas
RangeFormula
G2:G10G2=LEN(F2)


--------------
Let me know the result and I'll get back to you as soon as I can.
Respectfully
Dante Amor
--------------
 
Upvote 0
Hi Dante,

This looks great, and is exactly what I need. Am I able to save this in my personal workbook in VBA or some other tool in order to apply this to different excel sheets with the same terms. I have about 600 excel sheets I want to just power through, so if I could make this a ribbon or something that would be the most optimal. Thanks!
 
Upvote 0
I have about 600 excel sheets
To do on all sheets, do the following:

1. Create a sheet and name it "Translator"​
2. In column A the sentence in Chinese, in B the sentence in English and in C the Len function, example:​
Dante Amor
ABC
1ChineseEnglishLen
2求教电话Teaching Assistant Phone Number31
3求教Teaching Assistant18
4原上报姓名Registered Name15
5联系方式Contact Method14
6电话Phone Number12
7身份证姓名Legal Name10
8性别Gender6
9Number6
10姓名Name4
11
12
13
Translator
Cell Formulas
RangeFormula
C2:C10C2=LEN(B2)



3. Sort the data on the "Translator" sheet from largest to smallest by column C.​
4. Adjust the range of cells in the macro, I am putting column B: sh.Range("B:B")

You can put a range of columns: sh.Range("A:E")

Or the whole sheet:sh.Cells

5. Adjust the range in the part highlighted in blue and run the macro:​
Rich (BB code):
Sub Translate_v1()
  Dim shT As Worksheet, sh As Worksheet
  Dim c As Range
  
  Set shT = Sheets("Translator")
  
  For Each sh In Sheets
    If sh.Name <> shT.Name Then
      For Each c In shT.Range("A2", shT.Range("A" & Rows.Count).End(3))
        sh.Range("B:B").Replace c.Value, c.Offset(, 1).Value, xlPart, , False
      Next
    End If
  Next
End Sub

6. You can put a button on the "Translator" sheet and assign the macro to the button.​



--------------
Let me know the result and I'll get back to you as soon as I can.
Respectfully
Dante Amor
--------------​
 
Upvote 0
translating large sets of data from Chinese to English.
Just one question to @DanteAmor & @hongji

Since the translation is from Chinese to English, shouldn't that sorting of items based on length be done on the length of the Chinese data? Otherwise it may still be possible that a shorter subset of a longer Chinese set of characters could get replaced first? I don't know Chinese at all so I don't know if that circumstance could arise but thought it might be more certain to always get it right.
 
Upvote 0
To do on all sheets, do the following:

1. Create a sheet and name it "Translator"​
2. In column A the sentence in Chinese, in B the sentence in English and in C the Len function, example:​
Dante Amor
ABC
1ChineseEnglishLen
2求教电话Teaching Assistant Phone Number31
3求教Teaching Assistant18
4原上报姓名Registered Name15
5联系方式Contact Method14
6电话Phone Number12
7身份证姓名Legal Name10
8性别Gender6
9Number6
10姓名Name4
11
12
13
Translator
Cell Formulas
RangeFormula
C2:C10C2=LEN(B2)



3. Sort the data on the "Translator" sheet from largest to smallest by column C.​
4. Adjust the range of cells in the macro, I am putting column B: sh.Range("B:B")

You can put a range of columns: sh.Range("A:E")

Or the whole sheet:sh.Cells

5. Adjust the range in the part highlighted in blue and run the macro:​
Rich (BB code):
Sub Translate_v1()
  Dim shT As Worksheet, sh As Worksheet
  Dim c As Range
 
  Set shT = Sheets("Translator")
 
  For Each sh In Sheets
    If sh.Name <> shT.Name Then
      For Each c In shT.Range("A2", shT.Range("A" & Rows.Count).End(3))
        sh.Range("B:B").Replace c.Value, c.Offset(, 1).Value, xlPart, , False
      Next
    End If
  Next
End Sub

6. You can put a button on the "Translator" sheet and assign the macro to the button.​



--------------
Let me know the result and I'll get back to you as soon as I can.
Respectfully
Dante Amor
--------------​
Sorry, I worded that poorly. It's around 600 separate excel documents, so I want to be able to add a button to my ribbon that I can use across different excel documents.
 
Upvote 0
Check this:





 
Upvote 0
Follow the next steps:

1. In a new book, put the following information in columns A through C.
Dante Amor
ABC
1ChineseEnglishLen
2求教电话Teaching Assistant Phone Number31
3求教Teaching Assistant18
4原上报姓名Registered Name15
5联系方式Contact Method14
6电话Phone Number12
7身份证姓名Legal Name10
8性别Gender6
9Number6
10姓名Name4
11
Translator
Cell Formulas
RangeFormula
C2:C10C2=LEN(B2)

You know, you put the formula and sort by column C.

2. The name of the sheet must be: Translator
3. Put the following macro in that workbook
VBA Code:
Sub translate_v1()
  Dim sh As Worksheet
  Dim c As Range
  Set sh = ThisWorkbook.Sheets("Translator")
  For Each c In sh.Range("A2", sh.Range("A" & Rows.Count).End(3))
    Range("B:B").Replace c.Value, c.Offset(, 1).Value, xlPart, , False
  Next
End Sub
Adjust the column or columns you need.

4. Save the new workbook as macro-enabled excel, with the name: translator_v1
Just to have the backup
5. Now save the workbook again but as an Excel add-in with the name translator_v1
6. Close the book.
7. Open an excel workbook to translate.
8. Click the File tab, click Options, and then click the Add-Ins category.
9. In the Manage box, click Excel Add-ins, and then click Go.
10. The Add-Ins dialog box appears.
11. Press Browse botton
12. Select the translator_v1 add-in file
13. Press Ok
14. Press Ok
15. To make your favorite commands easily accessible, you can add your own tab to the Excel ribbon. Here's how:
16. In the Customize the Ribbon window, under the list of tabs, click the New Tabbutton.

Create a new tab for the ribbon.

17. This adds a custom tab with a custom group because commands can only be added to custom groups.
18. Select the newly created tab, named New Tab (Custom), and click the Rename… button to give your tab an appropriate name. In the same manner, change the default name given by Excel to a custom group.
19. Top left in the Choose command from: box
20. Select Macros
21. Select translate_v1
1689619667864.png

22. Press Add button to add the macro to the new group

1689619717180.png


When done, click OK to save the changes.

23. Press Ok
24. Select from new tap and press button translate_v1
1689619842505.png



Result:
1689619867241.png


------------------
🫡 😅

------------------
Another idea is that a macro opens one by one the 600 files and applies the translation to the column or columns you need, saves the file, closes the file and opens the next file.
;)
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,284
Members
452,630
Latest member
OdubiYouth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top