Compressing large amount of rows but adding totals?

Stildawn · Oct 9, 2024

Hi All

I'm working with a large amount of data; the largest set I've seen in real life so far was over 28k rows.

What I need to do is compress it down to fewer rows by removing rows that are duplicated across a few columns, while adding up certain column values into a total.

I have created this test file: VBA Test Data.xlsx

In it we have the sheet "Original Data". This is essentially the raw data that I have; however, it's typically around 8k to 28k rows.

The sheet "Converted Data" is what it needs to compress down to.

Basically I need to do the following:

Remove duplicate rows based on columns A / B / C / F.
However, the one remaining row for each set of duplicates needs to hold the total of columns E & G across all of the removed duplicates.
End up with totals from columns E & G for each unique invoice / product / code etc. (like the example in sheet "Converted Data").

The original data can be in any order, and is often not in a nice grouped order like the example spreadsheet.

This all needs to be done in VBA as part of a bunch of other code to get a final product.

So I'm basically just after your thoughts on the most efficient way to tackle this, since with this amount of data it can take a while to process.

Thanks

Sergius · Oct 10, 2024

Hello! Are you satisfied with this result?

VBA Test Data.xlsx

|   | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| 1 | Invoice # | Product | Code | Description | QTY | COO | Price |
| 2 | INV123 | ABC123 | 6109.10. | Product 1 | 27 | CN | 85,59 |
| 3 | INV123 | XYZ987 | 6201.20. | Product 9 | 23 | KH | 73,6 |
| 4 | INV798 | EFG234 | 4202.10. | Product 2 | 24 | TW | 48,25 |
| 5 | INV798 | LMN567 | 6307.10. | Product 5 | 26 | AU | 88,66 |

Table1

Made with Power Query.

Power Query:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Invoice #", type text}, {"Product", type text}, {"Code", type text}, {"Description", type text}, {"QTY", Int64.Type}, {"COO", type text}, {"Price", type number}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Invoice #", "Product", "Code", "Description", "COO"}, {{"QTY", each List.Sum([QTY]), type nullable number}, {"Price", each List.Sum([Price]), type nullable number}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Grouped Rows",{"Invoice #", "Product", "Code", "Description", "QTY", "COO", "Price"})
in
    #"Reordered Columns"

Stildawn · Oct 10, 2024

Can I do this within VBA? This is just one step in a lot of other code.

jkpieterse · Oct 10, 2024

Create a pivot table from your data:

Format your range as a table (Ctrl+T)
Click "Summarize with PivotTable"
Choose where to place the PT
Drag these fields to the Rows area: Invoice #, Product, Code, Description, COO
Drag the QTY and Price fields to the Values area.
Done.

I know the layout is slightly different, but the result is identical.

Stildawn · Oct 10, 2024

jkpieterse said:
Create a pivot table from your data:

Format your range as a table (Ctrl+T)
Click "Summarize with PivotTable"
Choose where to place the PT
Drag these fields to the Rows area: Invoice #, Product, Code, Description, COO
Drag the QTY and Price fields to the Values area.
Done.

I know the layout is slightly different, but the result is identical.

I need to do it all in VBA.

Peter_SSs · Oct 10, 2024

For a VBA approach, try this with a copy of your workbook. I have assumed that the worksheet with the original data is active; the results are written to columns I:O.

VBA Code:

Sub Consolidate_Data()
  Dim dQ As Object, dP As Object
  Dim a As Variant
  Dim i As Long
  Dim s As String
 
  ' One dictionary for the QTY totals, one for the Price totals
  Set dQ = CreateObject("Scripting.Dictionary")
  Set dP = CreateObject("Scripting.Dictionary")
 
  With Range("A2", Range("G" & Rows.Count).End(xlUp))
    ' Read the data into an array so the loop does not touch the sheet
    a = .Value
    For i = 1 To UBound(a)
      ' Key = columns A:D and F joined with ";" (the ";;" leaves an empty slot for QTY)
      s = Join(Application.Index(a, i, Array(1, 2, 3, 4)), ";") & ";;" & a(i, 6)
      dQ(s) = dQ(s) + a(i, 5)
      dP(s) = dP(s) + a(i, 7)
    Next i
    ' Copy the header row to I1:O1
    .Rows(0).Copy Destination:=.Cells(0, 9)
    With .Offset(, 8).Resize(dQ.Count, 1)
      ' Write the keys to column I, then split them across I:N
      .Value = Application.Transpose(dQ.Keys)
      .TextToColumns DataType:=xlDelimited, Semicolon:=True, Comma:=False, Space:=False, Other:=False
      ' Fill in the totals: QTY in column M, Price in column O
      .Offset(, 4).Value = Application.Transpose(dQ.items)
      .Offset(, 6).Value = Application.Transpose(dP.items)
    End With
  End With
End Sub

My sample data and code results:

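Stildawn VBA Test Data.xlsm, sheet "Original Data"

Columns A:G (original data):

|    | A | B | C | D | E | F | G |
|----|---|---|---|---|---|---|---|
| 1  | Invoice # | Product | Code | Description | QTY | COO | Price |
| 2  | INV123 | ABC123 | 6109.10. | Product 1 | 4 | CN | 12.68 |
| 3  | INV123 | ABC123 | 6109.10. | Product 1 | 6 | CN | 19.02 |
| 4  | INV123 | ABC123 | 6109.10. | Product 1 | 6 | CN | 19.02 |
| 5  | INV123 | ABC123 | 6109.10. | Product 1 | 4 | CN | 12.68 |
| 6  | INV123 | ABC123 | 6109.10. | Product 1 | 3 | CN | 9.51 |
| 7  | INV123 | ABC123 | 6109.10. | Product 1 | 2 | CN | 6.34 |
| 8  | INV123 | ABC123 | 6109.10. | Product 1 | 2 | CN | 6.34 |
| 9  | INV123 | XYZ987 | 6201.20. | Product 9 | 4 | KH | 12.8 |
| 10 | INV123 | XYZ987 | 6201.20. | Product 9 | 6 | KH | 19.2 |
| 11 | INV123 | XYZ987 | 6201.20. | Product 9 | 6 | KH | 19.2 |
| 12 | INV123 | XYZ987 | 6201.20. | Product 9 | 4 | KH | 12.8 |
| 13 | INV123 | XYZ987 | 6201.20. | Product 9 | 3 | KH | 9.6 |
| 14 | INV798 | EFG234 | 4202.10. | Product 2 | 5 | TW | 5.6 |
| 15 | INV798 | EFG234 | 4202.10. | Product 2 | 4 | TW | 12.9 |
| 16 | INV798 | EFG234 | 4202.10. | Product 2 | 5 | TW | 15.6 |
| 17 | INV798 | EFG234 | 4202.10. | Product 2 | 4 | TW | 10.6 |
| 18 | INV798 | EFG234 | 4202.10. | Product 2 | 6 | TW | 3.55 |
| 19 | INV798 | LMN567 | 6307.10. | Product 5 | 4 | AU | 15.65 |
| 20 | INV798 | LMN567 | 6307.10. | Product 5 | 7 | AU | 12.85 |
| 21 | INV798 | LMN567 | 6307.10. | Product 5 | 3 | AU | 1.5 |
| 22 | INV798 | LMN567 | 6307.10. | Product 5 | 2 | AU | 6.5 |
| 23 | INV798 | LMN567 | 6307.10. | Product 5 | 1 | AU | 18.95 |
| 24 | INV798 | LMN567 | 6307.10. | Product 5 | 4 | AU | 14.66 |
| 25 | INV798 | LMN567 | 6307.10. | Product 5 | 5 | AU | 18.55 |

Columns I:O (code results; column H is left blank):

|   | I | J | K | L | M | N | O |
|---|---|---|---|---|---|---|---|
| 1 | Invoice # | Product | Code | Description | QTY | COO | Price |
| 2 | INV123 | ABC123 | 6109.10. | Product 1 | 27 | CN | 85.59 |
| 3 | INV123 | XYZ987 | 6201.20. | Product 9 | 23 | KH | 73.6 |
| 4 | INV798 | EFG234 | 4202.10. | Product 2 | 24 | TW | 48.25 |
| 5 | INV798 | LMN567 | 6307.10. | Product 5 | 26 | AU | 88.66 |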
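
If splitting the keys back out with TextToColumns is a concern on larger data, one possible variation on the same dictionary idea is to store an output row number per key and build the results in an array. This is only a sketch under assumptions: the name Consolidate_Data_Array is made up for illustration, the data sheet is assumed to be active with headers in row 1, the key uses only columns A / B / C / F as in the original question, and the Description is taken from the first row seen for each key. It has not been run against the real data.

```vba
Sub Consolidate_Data_Array()
  ' Hypothetical variant: totals are accumulated directly in an output array
  Dim d As Object
  Dim a As Variant, res() As Variant
  Dim i As Long, c As Long, r As Long, n As Long
  Dim k As String

  Set d = CreateObject("Scripting.Dictionary")

  ' Assumes headers in row 1 and data in columns A:G of the active sheet
  a = Range("A1", Range("G" & Rows.Count).End(xlUp)).Value
  ReDim res(1 To UBound(a, 1), 1 To 7)

  ' Copy the header row into the first output row
  For c = 1 To 7
    res(1, c) = a(1, c)
  Next c
  n = 1

  For i = 2 To UBound(a, 1)
    ' Key on Invoice #, Product, Code and COO (columns A, B, C, F)
    k = a(i, 1) & vbTab & a(i, 2) & vbTab & a(i, 3) & vbTab & a(i, 6)
    If d.Exists(k) Then
      ' Seen before: add QTY and Price to the stored output row
      r = d(k)
      res(r, 5) = res(r, 5) + a(i, 5)
      res(r, 7) = res(r, 7) + a(i, 7)
    Else
      ' First occurrence: start a new output row and remember its index
      n = n + 1
      d(k) = n
      For c = 1 To 7
        res(n, c) = a(i, c)
      Next c
    End If
  Next i

  ' Write only the rows that were filled, starting at I1
  Range("I1").Resize(n, 7).Value = res
End Sub
```

The sheet is read once and written once, so the loop itself only touches arrays and the dictionary.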

jkpieterse · Oct 10, 2024

What about this (needs a reference to the ADO library):

VBA Code:

Sub SummarizeData()
    Dim sql As String
    Dim cn As ADODB.Connection
    Dim Rst As ADODB.Recordset
    Dim wb As String
    ' Change this to match the name of the sheet that holds the source data
    Const sheetName As String = "Original_Data"
    wb = ThisWorkbook.FullName
    'Invoice #   Product Code    Description QTY COO Price
    sql = "SELECT [" & sheetName & "$].[Invoice #], [" & sheetName & "$].Product, [" & sheetName & "$].Code, [" & sheetName & "$].Description, Sum([" & sheetName & "$].QTY) AS SumOfQTY, [" & sheetName & "$].COO, Sum([" & sheetName & "$].Price) AS SumOfPrice"
    sql = sql & " FROM [" & sheetName & "$]"
    sql = sql & " GROUP BY [" & sheetName & "$].[Invoice #], [" & sheetName & "$].Product, [" & sheetName & "$].Code, [" & sheetName & "$].Description, [" & sheetName & "$].COO;"

    ' Open an ADO connection to this workbook via the ACE OLE DB provider
    Set cn = New ADODB.Connection
    With cn
        .Provider = "Microsoft.ACE.OLEDB.12.0"
        .ConnectionString = "Data Source=" & wb & ";" & "Extended Properties=""Excel 12.0 Xml;HDR=YES"";"
        .Open
    End With

    ' Run the GROUP BY query and write the rows (without field names) starting at A10
    Set Rst = cn.Execute(sql)
    Worksheets("Converted Data").Range("A10").CopyFromRecordset Rst

    ' Release the recordset and the connection
    Rst.Close
    cn.Close
End Sub
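
Note that CopyFromRecordset writes only the records, not the field names. If a header row is wanted as well, a small helper along these lines could be used. It is a sketch only; the name WriteRecordsetWithHeaders and its parameters are made up for illustration.

```vba
Sub WriteRecordsetWithHeaders(ByVal Rst As Object, ByVal topLeft As Range)
    ' Hypothetical helper: writes the field names across the first row,
    ' then the records directly below them.
    Dim i As Long
    For i = 0 To Rst.Fields.Count - 1
        topLeft.Offset(0, i).Value = Rst.Fields(i).Name
    Next i
    topLeft.Offset(1, 0).CopyFromRecordset Rst
End Sub
```

It would be called in place of the CopyFromRecordset line, e.g. WriteRecordsetWithHeaders Rst, Worksheets("Converted Data").Range("A10"). With the query above, the two summed columns would come through under their aliases, SumOfQTY and SumOfPrice.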
