chillzen
New Member
- Joined
- Feb 1, 2022
- Messages
- 8
- Office Version
- 2016
- Platform
- Windows
- MacOS
- Mobile
- Web
Hello,
I am stuck trying to solve what may be a very simple problem. I understand logically what needs to be done I just don't have the knowledge for the best way to handle this much data and not make excel crash. I am currently limited with no access to utilize Pandas to solve this even though that seems to be the best option. If this can be done in Excel possibly using VBA that would be great. I am creating a 2-column term base for two languages (A and B).
Notes:
Columns A:N = Lang A
Columns O:AP = Lang B
Each row is a set of a word in LangA and its synonyms with LangB and its grammatical case variations
~26K rows total
Total of 67,431 LangA terms | 285,974 Lang B
Not all Language A terms have Language B equivalents and vise versa but still need to be returned (to fill in later with the translations)
Average of 2.5 LangA terms in each row and avg 10.5 LangB
Output needs to be 2 columns (may be split into multiple sheets/files if needed since it will obviously be an extremely large number of rows generated)
Below is the best sample I can provide that should help work towards a solution:
Raw Data
Desired Output
I am stuck trying to solve what may be a very simple problem. I understand logically what needs to be done I just don't have the knowledge for the best way to handle this much data and not make excel crash. I am currently limited with no access to utilize Pandas to solve this even though that seems to be the best option. If this can be done in Excel possibly using VBA that would be great. I am creating a 2-column term base for two languages (A and B).
Notes:
Columns A:N = Lang A
Columns O:AP = Lang B
Each row is a set of a word in LangA and its synonyms with LangB and its grammatical case variations
~26K rows total
Total of 67,431 LangA terms | 285,974 Lang B
Not all Language A terms have Language B equivalents and vise versa but still need to be returned (to fill in later with the translations)
Average of 2.5 LangA terms in each row and avg 10.5 LangB
Output needs to be 2 columns (may be split into multiple sheets/files if needed since it will obviously be an extremely large number of rows generated)
Below is the best sample I can provide that should help work towards a solution:
Raw Data
A Terms | A Terms | B Terms | B Terms |
A1 | A2 | B1 | B2 |
A3 | A4 | B3 | |
A5 | B4 | B5 | |
A6 | B6 | ||
A7 | |||
B7 | B8 | ||
B9 |
Desired Output
Term A | Term B |
A1 | B1 |
A1 | B2 |
A2 | B1 |
A2 | B2 |
A3 | B3 |
A4 | B3 |
A5 | B4 |
A5 | B5 |
A6 | B6 |
A7 | |
B7 | |
B8 | |
B9 |