Problem
Working with various in-congruent data tables of 100+ columns & 200-500k records, "Clearing Contents" of 'blank cells' takes*:
"Blank" Cells Defined
- A1:B10; C15:E25; BB2200:BC300000 Ranges contain "blank" cells that Excel counts in Pivot Table summations.
- LEN(ranges) = 0
Various Workarounds attempted
I) Power Query:
IA) 95%+ of all columns have these 'random' blank cells littered throughout.
IB) Transforming "data type" from string-to-decimal/whole number--which does replace these "blank" cells with (null) values--isn't an option & instead results in "ERROR".
IC) "Replace Values" does not work; nor does adding a Prefix/Suffix & then replacing those with null/blank
ID) Deleting entire records (e.g., "Delete Errors") isn't an option.
II) VBA: On a 1-10 scale of competent working knowledge on the topic, I'd score myself (generously) at a 2; maybe a 4 if including VBA plagiarism abilities.
III) Power Pivot Table: recreating the entire table into Power Pivot (standard Pivot doesn't allow for "Distinct Count" summation) & recreating a new flattened table from this.
IV) Excel Table - Seek n' Destroy Method:
#1 ) 20-minutes plus to complete the search & return the cells in the "Find & Replace" window. (During this step, Excel's name box displays progress as it cycles COL+ROW, i.e., A2, BB200, CC20000, etc).
#2 ) Once #1 completes, another 20-minutes to select all results in window via ctrl-a
#3) After #2 completes, another 20-minutes to "clear contents" via right-click context menu.
* - during each one of these steps, excel 'appears' to be locked-up (e.g., red "not responding" in resource mgr). 50%+ of the time, 100% lockup occurs at step 2 and/or 3 and/or exceeds 60+ minutes, eventually requiring a forced term.
Misc info
A) Resources: During each #1 -#3 steps, 25% Avg CPU; 85-115 threads; elevated; 7% of total 16MB memory allocated in working set.
B) System: Win7 Enterprise SP1 (x64); Excel 2016 (x64); Power BI Desktop (x64) [haven't attempted in PBI yet]
C) Size Matters: From what I could find on the topic, mostly, the Q&A surrounds one column or a small range.
D) M/DAX: Granted, there is likely a M or DAX solution but haven't yet stumbled across it...
E) Why: PQ data types <> decimal/whole numbers
Working with various in-congruent data tables of 100+ columns & 200-500k records, "Clearing Contents" of 'blank cells' takes*:
"Blank" Cells Defined
- A1:B10; C15:E25; BB2200:BC300000 Ranges contain "blank" cells that Excel counts in Pivot Table summations.
- LEN(ranges) = 0
Various Workarounds attempted
I) Power Query:
IA) 95%+ of all columns have these 'random' blank cells littered throughout.
IB) Transforming "data type" from string-to-decimal/whole number--which does replace these "blank" cells with (null) values--isn't an option & instead results in "ERROR".
IC) "Replace Values" does not work; nor does adding a Prefix/Suffix & then replacing those with null/blank
ID) Deleting entire records (e.g., "Delete Errors") isn't an option.
II) VBA: On a 1-10 scale of competent working knowledge on the topic, I'd score myself (generously) at a 2; maybe a 4 if including VBA plagiarism abilities.
III) Power Pivot Table: recreating the entire table into Power Pivot (standard Pivot doesn't allow for "Distinct Count" summation) & recreating a new flattened table from this.
IV) Excel Table - Seek n' Destroy Method:
#1 ) 20-minutes plus to complete the search & return the cells in the "Find & Replace" window. (During this step, Excel's name box displays progress as it cycles COL+ROW, i.e., A2, BB200, CC20000, etc).
#2 ) Once #1 completes, another 20-minutes to select all results in window via ctrl-a
#3) After #2 completes, another 20-minutes to "clear contents" via right-click context menu.
* - during each one of these steps, excel 'appears' to be locked-up (e.g., red "not responding" in resource mgr). 50%+ of the time, 100% lockup occurs at step 2 and/or 3 and/or exceeds 60+ minutes, eventually requiring a forced term.
Misc info
A) Resources: During each #1 -#3 steps, 25% Avg CPU; 85-115 threads; elevated; 7% of total 16MB memory allocated in working set.
B) System: Win7 Enterprise SP1 (x64); Excel 2016 (x64); Power BI Desktop (x64) [haven't attempted in PBI yet]
C) Size Matters: From what I could find on the topic, mostly, the Q&A surrounds one column or a small range.
D) M/DAX: Granted, there is likely a M or DAX solution but haven't yet stumbled across it...
E) Why: PQ data types <> decimal/whole numbers