brawnystaff
Board Regular
- Joined
- Aug 9, 2012
- Messages
- 109
- Office Version
- 365
I OCR'ed a PDF containing a printout of ledger data and put it into an Excel worksheet. However, the OCR process puts the data in different columns depending on the page, as there was a lot of text in it that I do not need.
The PDF ledger contained about 10 columns, and the data I want is either a date or a number (sometimes with a decimal, sometimes not) in its respective cell. I do not need the other cells, which contain words or other text with letters.
Is there a way in Power Query to extract only the cells with dates or numbers for each row? The PQ editor is showing 18 columns, as the "junk text" is intermixed in each row depending on the page. I'm looking to extract only the cells with dates or numbers so that I end up with just the 10 columns. Any ideas? Thanks.
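To make the goal concrete, here is a minimal sketch in M of the kind of row-wise filter being described: keep only the cells in each row that convert to a number or a date, then rebuild the table from what is left. The sample table, the `IsWanted` helper, and the `Value1`, `Value2`, ... column names are all hypothetical stand-ins, not the real OCR output. Whether a given text cell converts to a date also depends on the workbook's locale, so treat this as a starting point rather than a finished solution.

```powerquery
let
    // Hypothetical sample standing in for the OCR'ed worksheet
    Source = Table.FromRows(
        {
            {"Page 1", "01/05/2023", "junk", "1200", "45.50"},
            {"03/02/2023", "text", "300", "abc", "7.25"}
        },
        {"Column1", "Column2", "Column3", "Column4", "Column5"}
    ),

    // Hypothetical helper: true when a cell converts to a number or a date
    IsWanted = (v as any) as logical =>
        v <> null and (
            (try Number.From(v) otherwise null) <> null
            or (try Date.From(v) otherwise null) <> null
        ),

    // For each row, keep only the cells that pass the test, in their original order
    FilteredRows = List.Transform(Table.ToRows(Source), each List.Select(_, IsWanted)),

    // Pad shorter rows with nulls so every row has the same number of cells
    MaxCount = List.Max(List.Transform(FilteredRows, List.Count)),
    PaddedRows = List.Transform(
        FilteredRows,
        each _ & List.Repeat({null}, MaxCount - List.Count(_))
    ),

    // Generic column names; rename them to the real ledger headings afterwards
    Names = List.Transform({1..MaxCount}, each "Value" & Text.From(_)),
    Result = Table.FromRows(PaddedRows, Names)
in
    Result
```

One caveat with this approach: the kept values are packed to the left, so if a wanted cell is blank or misread on some rows, the later values in that row will shift one column over and would need to be checked by hand.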