PQ: Searching for a phrase in unstructured data

pvr928

Well-known Member
Joined
Oct 21, 2002
Messages
790
Hi

This is my first post in the PQ section! I've been using Excel and VBA for a long time but Power Pivot/PQ for only a short time. From what I've seen, read and experienced - it is awesome!

I've got multiple Excel files containing unstructured data (converted from PDF files). Each of these files contain similar phrases.

I'm wondering if PQ can search for a particular phrase and then extract data to the right of it, ie:

"Number of days in quarter: 42"

Where the phrase is "Number of days in quarter" and what I want to extract is 42.

I can write some vba to search and extract this data into a new sheet and table, and then use PQ to then extract the data from this new table, but if I can cut this step out it would save a lot of time (I've got files in the multiple 100,000).

I'm not looking for a complete solution, just whether it is possible and then some tips on how to get there.

Thanks so much for your help.

Cheers

Pvr928
 
No problem, it's a new POC in my M-toolbox now :-)

Binary should definitely work. How did you try to open it?
Try: Text.FromBinary(File.Contents(Filepath))


Thanks, I'll try that.

I was opening it via the menu - Get Data, From File (I think that's what it's called) - Excel.

It goes through the motions and then says the connection failed and then suggests it's an Access thing - this is with.xlsb. With.xlsx it is no problem.

Cheers

pvr928
 
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Hi ImkeF

I'm more familiar with PQ now, but still working on my vba idea rather than using PQ - for now at least.

I haven't tried your suggested fix, but did come across this which talks about the same error that I get when I use PQ on a .xlsb file. I'll try it out and revert with my result.

https://support.office.com/en-us/ar...380-aaf1-fca4e81563a3?ui=en-US&rs=en-US&ad=US

Apparently it only affects 2013 users.

Cheers

pvr928
 
Upvote 0
Unfortunately .XLSB files are notorious for error'ing out in PQ in my experience. There's obviously awesome benefits to saving a file as a .XLSB but because of PQ being a pain I tend to only use it as a last resort, i.e. the file size gets too big.
 
Upvote 0
Hi SimonNU

Yes - file size was exactly why I started using them as well, and now my default is .xlsb.

However, now that I'm a PQ convert I'll have to judiciously choose which type. A pity I deliberately converted about 100,000 files to .xlsb which I'll probably now have to convert to .xlsx - but it's all learning !!

cheers

pvr928
 
Upvote 0

Forum statistics

Threads
1,224,158
Messages
6,176,745
Members
452,741
Latest member
Muhammad Nasir Mahmood

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top