Power Query: imported text data with no spaces

nascaline

Hi all,

I am new to Power Query and am currently having a problem importing data from PDF files: the imported text has lost all of its spacing, so each line comes out as one continuous string.

What I get from import:
| Description
| PLASTERINGANDPAVING;Applyingwhitepaint; NipponPaintZeroTecPaint;
| green; NN1307-4 Gysum;asspecifiedindrawingnr.

My desired result:
| Description
| PLASTERING AND PAVING ; Applying white paint; Nippon Paint Zero Tec Paint;
| green; NN1307-4 Gysum ; as specified in drawing nr.

Is there any way to import the text so that it keeps its spaces, as in the desired result above?
Many thanks.

A sample PDF file can be downloaded from the link below:
 

I had the same issue when I tried to import your file. I don't think you are going to like the only solution I could find.
The only way around it that I could find is to open the PDF in MS Word, save it as HTM, and then import that file into Power Query as a text file.
I got that method from here: Import Tabular Data from PDF using Power Query - Excelerator BI, under the subheading "Import into Power Query".
They got the best result saving it as MHT, but on your PDF I got a better result saving it as HTM.

If you can't find a better option and decide to use this, follow the instructions at the link I provided, or if they are not clear, let me know and I will provide more details.
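
Before building the query, it can help to check that the HTM file Word produced really does keep the spaces. The following is a minimal Python sketch of that check, not part of the method from the linked article. It assumes the description text ends up inside ordinary HTML table cells (`<td>`/`<th>`), which may not hold for every PDF. The names `TableTextParser`, `extract_rows`, and the sample markup are hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse line breaks and non-breaking spaces into single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html_text):
    """Return the table rows found in html_text as lists of cell strings."""
    parser = TableTextParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical fragment standing in for a Word HTM export (assumption).
sample = """
<table>
 <tr><td><p class=MsoNormal>Description</p></td></tr>
 <tr><td><p class=MsoNormal>PLASTERING AND PAVING;
 Applying white&nbsp;paint;</p></td></tr>
</table>
"""

rows = extract_rows(sample)
for row in rows:
    print(row)
```

To run it against a real export, read the saved file into a string and pass it to `extract_rows`; the right text encoding depends on how Word saved the file. If the cell text still comes out without spaces here, the spacing was already lost in the Word conversion and the Power Query import step will not bring it back.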
 
Alex, thanks for the help! I just tried it myself and the result looks fine to me. :)
 