recognise format of code in Power Query to extract

ooptennoort

Board Regular
Joined
Mar 29, 2021
Messages
67
Office Version
  1. 365
Platform
  1. Windows
>10k of rows of short sentences have within then, usually, a code. I need to extract this code, but how do I get PQ to recognise its format, which is:

230F-115-04
301F-115-05
30A-115-26
30F-1180-02
30F-118-03
30F-1252-04
130H-1252-05
30F-1252-18
30F-126-01
40G-127-01
30F-108-01
31F-299-91
30F-130-01
30F-131-06
30F-132-01

I've tried with text.before and between and after, but have never gotten just -and all of- the code. Very challenging. Help is very much appreciated (if want I want is possible at all ;) ).
 

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
Hi,
So you want to lookup any of these codes in a column of text?
There is no direct function for this, at least not that I know off. However you can write it. Try this tutorial.
Problem is, I don't have the codes. There may be millions. So I can't lookup. All I can do is recognise the code format/structure. Maybe if I can use your suggestion by looking for excel wildcards, somehow... but the variation ???-???-?? or ??-??-?? or ???-??-?? or ??-???-?? make it difficult...
 
Upvote 0
John McDougal wrote a post I use to bulk replace data in tables and I am guessing the code could be adjusted to extract codes (rather than replace them) if you have them in a table that can be referred to.

Thank you both, btw! Like I said to Grah, problem is, I don't have the codes. There may be millions. So I can't lookup. All I can do is recognise the code format/structure. Maybe if I can use your suggestion by looking for excel wildcards, somehow... but the variation ???-???-?? or ??-??-?? or ???-??-?? or ??-???-?? make it difficult...
 
Upvote 0
Problem is, I don't have the codes. There may be millions. So I can't lookup. All I can do is recognise the code format/structure. Maybe if I can use your suggestion by looking for excel wildcards, somehow... but the variation ???-???-?? or ??-??-?? or ???-??-?? or ??-???-?? make it difficult...
I see. Could you upload a small, but representative sample of data via xl2BB?
PQ does offer example from column (much like flash fill), but I doubt it will work.
 
Upvote 0
I see. Could you upload a small, but representative sample of data via xl2BB?
PQ does offer example from column (much like flash fill), but I doubt it will work.
Using Column from examples is the first thing I tried and it worked surprisingly well, but I think because the sample size (999 records?) was too small... and >999 of the first cells contained my code, that batch was 100% perfect, but where code was missing Column from examples just provided the whole cell content (problem description) instead of empty cell. Column from examples also missed the (rare) other formats (???-???-?? and ???-????-??).
Using xl2BB is something I will try asap (but I am not sure if it is necessary... will have to try it, not familiar with it)! Thank you!
 
Upvote 0
I ask for the sample, just to have a view on the patterns. I doubt it is ok to look for "-" and build around it. Last space before and first space after and stuff like that. Samples make it easier to assess if we are on a good track or not.
 
Upvote 0
Hi @ooptennoort ,
Would this do the trick for you?
Book1
AB
1CommentGetCodes
2bla blabla bla 230F-115-04 blablablabla blablabla blablablabla blablabla blabla230F-115-04
3blabla blablablablablabla blablabla bla 301F-115-05 blabla bla301F-115-05
4bla bla bla bla bla bla bla blablablablablablablablablabla 30A-115-26 blablablablablablablabla blablablablablabla blabla30A-115-26
5blablablablablabla 30F-1180-02 blablablablablablablablablablablablabla30F-1180-02
6blablablablablablablablablabla 30F-118-03 blablablabla 130H-1252-0530F-118-03, 130H-1252-05
7blablablablablablablablablablablabla 30F-1252-04 blablablablablablablablablabla30F-1252-04
8blablablabla400 blablablablabla 130H-1252-05 blablabla130H-1252-05
95000 blablablablablablablablablablablablablablablablablablabla 30F-1252-18 blablablablablablablablablablablablablablablablabla30F-1252-18
10blabla 30F-126-01 blablablablablablablablabla30F-126-01
11blablablablablablablablablablablablabla 40G-127-01 blabla40G-127-01
12blablablablablablablablablablablablabla 30F-108-01 blablablabla30F-108-01
13blablabla 31F-299-91 blablablablablablablablablablablablablablablablabla31F-299-91
14blablablablablablablablablablablablablablablablablablabla 30F-130-01 blablablablablablablablablablablablablablabla30F-130-01
15blablablablablablablablablablablablablablablabla 30F-131-06 blablablablablablablablabla30F-131-06
16301F-115-05 blabla 30F-132-01 blablablablablablablablablablablabla 105301F-115-05, 30F-132-01
1730F-130-01 blablabla30F-130-01
18blablabla 30F-130-0130F-130-01
Sheet2

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Comment", type text}}),
    GetCodes = Table.AddColumn(#"Changed Type", "GetCodes", each let Words = Text.Split([Comment]," "),
Codes = List.Select(Words, each List.AllTrue({ Text.Contains(_, "-"), 
Text.PositionOfAny(_, {"0".."9"} ) <> -1,
Text.PositionOfAny(_, {"A".."Z"} ) <> -1})),
Combine = Text.Combine(Codes, ", ") 
in 
Combine)
in
    GetCodes
 
Upvote 0
Use data from GraH

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Regex = Table.AddColumn(Source, "Regex", each  Text.Combine(Json.Document(Web.Page(
    "<meta http-equiv='X-UA-Compatible' content='IE=edge'> <script>
    document.write(JSON.stringify('"& [Comment] &"'.match(/\d+[A-Z]-\d+-\d+/g)));
    </script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}),", "))
in
    Regex


Power Query Regex.xlsx
AB
1CommentRegex
2bla blabla bla 230F-115-04 blablablabla blablabla blablablabla blablabla blabla230F-115-04
3blabla blablablablablabla blablabla bla 301F-115-05 blabla bla301F-115-05
4bla bla bla bla bla bla bla blablablablablablablablablabla 30A-115-26 blablablablablablablabla blablablablablabla blabla30A-115-26
5blablablablablabla 30F-1180-02 blablablablablablablablablablablablabla30F-1180-02
6blablablablablablablablablabla 30F-118-03 blablablabla 130H-1252-0530F-118-03, 130H-1252-05
7blablablablablablablablablablablabla 30F-1252-04 blablablablablablablablablabla30F-1252-04
8blablablabla400 blablablablabla 130H-1252-05 blablabla130H-1252-05
95000 blablablablablablablablablablablablablablablablablablabla 30F-1252-18 blablablablablablablablablablablablablablablablabla30F-1252-18
10blabla 30F-126-01 blablablablablablablablabla30F-126-01
11blablablablablablablablablablablablabla 40G-127-01 blabla40G-127-01
12blablablablablablablablablablablablabla 30F-108-01 blablablabla30F-108-01
13blablabla 31F-299-91 blablablablablablablablablablablablablablablablabla31F-299-91
14blablablablablablablablablablablablablablablablablablabla 30F-130-01 blablablablablablablablablablablablablablabla30F-130-01
15blablablablablablablablablablablablablablablabla 30F-131-06 blablablablablablablablabla30F-131-06
16301F-115-05 blabla 30F-132-01 blablablablablablablablablablablabla 105301F-115-05, 30F-132-01
1730F-130-01 blablabla30F-130-01
18blablabla 30F-130-0130F-130-01
Sheet1
 
Upvote 0
Solution

Forum statistics

Threads
1,224,739
Messages
6,180,674
Members
452,993
Latest member
FDARYABEE

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top