GraH
Well-known Member
- Joined
- Mar 22, 2020
- Messages
- 1,577
- Office Version
- 365
- Platform
- Windows
Hi all,
My data is structured in 2 Excel tables: tComments and tKeyTerms. Both are loaded in Power Query and I converted the tKeyTerms to a list {tKeyTerms}.
The goal is to detect if any of the terms, which are not single words but a group of words, are used in any of the comments. When there is a match I need to extract the key term from the list.
In the real life situation there are going to be 200K comments and potentially 100 to 150 key terms. The code I have so far works fast enough for 15K rows.
Currently I'm only able to detect if there is a matching value with the following M-code, but I can't figure out how I can extract or find the position from the list of the matching value:
Any guidance in the right direction is highly appreciated. Do bare in mind I'm not a programmer/code writer as 99% of what I do with PQ is done using the UI. So I also welcome some notes on how the code works.
Sample data looks like this:
My data is structured in 2 Excel tables: tComments and tKeyTerms. Both are loaded in Power Query and I converted the tKeyTerms to a list {tKeyTerms}.
The goal is to detect if any of the terms, which are not single words but a group of words, are used in any of the comments. When there is a match I need to extract the key term from the list.
In the real life situation there are going to be 200K comments and potentially 100 to 150 key terms. The code I have so far works fast enough for 15K rows.
Currently I'm only able to detect if there is a matching value with the following M-code, but I can't figure out how I can extract or find the position from the list of the matching value:
Code:
let
Source = Excel.CurrentWorkbook(){[Name="tComments"]}[Content],
ListKeyTerms = List.Buffer(tKeyTerms),
AddCol_AddListToSource = Table.AddColumn(Source,"KeyTerms",each ListKeyTerms),
AddColAsFx_TextContainsTerms = Table.AddColumn(AddCol_AddListToSource, "MatchKeyTerm", (C) => List.AnyTrue(List.Transform(C[KeyTerms], each Text.Contains(C[Comment], _))))
in
AddColAsFx_TextContainsTerms
Any guidance in the right direction is highly appreciated. Do bare in mind I'm not a programmer/code writer as 99% of what I do with PQ is done using the UI. So I also welcome some notes on how the code works.
Sample data looks like this:
Book1 (version 1).xlsb | ||||||
---|---|---|---|---|---|---|
A | B | C | D | |||
1 | Comment | KeyTerms | ||||
2 | er staan wat inleidende woorden voor dit zijn mijn sleuteltermen die dan weer gevolgd worden door een hoop overbodige woorden. | dit zijn mijn sleuteltermen | ||||
3 | in this phase there are a few words preceding what I'm actually looking for, my personal key term of interest and then again there are words, words and more words no-one cares about. | key term of interest | ||||
4 | Ceci est un texte qui ne vaut rien du tout. | |||||
Sheet1 |