Hi guys,
I am looking for some advice on whether I am approaching this from the right direction. This is a bit complicated so bear with me.
First declaration: I am doing something that really sticks in my craw: taking data from an SQL Data warehouse via an Access application to Excel then loading it into Access. By the time we are finished this data will have been through 5 transformations JUST to get it back to its original format
Production->SQL warehouse->access2003->excel->access2010 data import->access2010 transformation macro
I have a reporting application which provides me with a results set which contains:
8 fields which describe an asset (including AssetID, CostCentre)
3 fields which describe a payment scheme (type, method, etc..)
80 fields which descibe payment elements for that asset.
on each record the 8 asset fields and 3 payment scheme fields are completed, the asset data may be repeated because it may have more than one payment scheme
The payment elements will either be empty (NULL), have zero (cancelled charge) or a currency value.
I need to get this data into 3rd Normal form.
I have created separate tables for the assets and cost centres by doing maketable queries with the DISTINCT keyword and these tables are indexed
Asset = AssetID
CostCentre = CostCentreID
The next challenge is to create a list of chargecodes. I have done this by taking the column headings and transposing them in Excel, adding extra data columns and them importing them into an Access table (ChargeCode: PK = ID)
I now need to create a CostCentre-ChargeCode cross reference table (i'll call it C4 from now on) for every charge code that has a value against ANY asset in that charge code. I then need to create an Asset-C4 cross reference to indicate that the particular asset is attached to the charge code
For this last one I have written a VBA macro that follows this logic
open a recordset for the import file (rsIMP)
for each record in the recordset, loop through the charge fields (13-93)
If the field has a value
open a recordset for the ChargeCode (rsCHG)where the description matches the rsIMP.fields(fieldloop).name. This should ALWAYS give me one record
open a recordset for the C4 (rsC4) where CostCentre = rsIMP.fields("CostCentreID") and ChargeCodeID = rsCHG.fields("ID"). This will give me 0 or 1 records.
if rsC4.EOF = true then prepare an SQL command to insert the record and then run DoCmd.runSQL(SQLcommand) and re-run the query to get the new ID
prepare an SQL command to insert the Asset-C4 table with the AssetID and C4.ID
get next rsIMP field
get next rsImp record
This is working but it is slow. I have 57K import records with 80 fields (=4.5m loops) and probably 500K insert statments. E.g It has been running for an hour and is only 8000 records in - so it will take about 6 hours to complete.
Is there a better approach to re-normalise a flattened data structure
I am using Access2010 over Ctirix 12 using WYSE terminals so this is all running on the SuperPC cluster over a T100 ethernet network
Actual code is below
Table Structures (irrelevant fields omitted):
Asset: AssetID (PK), Address, InboundPostcode (FK to CostCentre),CostCentre (FK to CostCentre)
CostCentre: CostCentre(PK), InboundPostCode(PK)
ChargeCode:ID(PK autonumber),Description,ParentChargeCodeID(FK to ChargeCode.ID) codes are hierachical. Data Entry codes feed into summary reporting codes. Currently there is one DE code for each REP code but in maintenance mode, the users will create additional DE codes as the asset ratios may be different.
CostCentreChargeCode: ID (PK autonumber),CostCentreID (FK to CostCentre),CostCentreInboundPostCode (FK to CostCentre),ChargeCodeID (FK to ChargeCode)
AssetC4: ID (PK autonumber), AssetID (FK to Asset), C4ID (FK to CostCentreChargeCode)
CostC4:ID (PK autonumber), C4ID (FK to CostCentreChargeCode) this is not used in the code above. This will be used to assign costs to the Cost Centre and then apportion them to the assets via the AssetC4 table
I am looking for some advice on whether I am approaching this from the right direction. This is a bit complicated so bear with me.
First declaration: I am doing something that really sticks in my craw: taking data from an SQL Data warehouse via an Access application to Excel then loading it into Access. By the time we are finished this data will have been through 5 transformations JUST to get it back to its original format
data:image/s3,"s3://crabby-images/cf35b/cf35bdb5b0d2dee8d5dfe1d6ade350bd9dec0f93" alt="ROFL :rofl: :rofl:"
I have a reporting application which provides me with a results set which contains:
8 fields which describe an asset (including AssetID, CostCentre)
3 fields which describe a payment scheme (type, method, etc..)
80 fields which descibe payment elements for that asset.
on each record the 8 asset fields and 3 payment scheme fields are completed, the asset data may be repeated because it may have more than one payment scheme
The payment elements will either be empty (NULL), have zero (cancelled charge) or a currency value.
I need to get this data into 3rd Normal form.
I have created separate tables for the assets and cost centres by doing maketable queries with the DISTINCT keyword and these tables are indexed
Asset = AssetID
CostCentre = CostCentreID
The next challenge is to create a list of chargecodes. I have done this by taking the column headings and transposing them in Excel, adding extra data columns and them importing them into an Access table (ChargeCode: PK = ID)
I now need to create a CostCentre-ChargeCode cross reference table (i'll call it C4 from now on) for every charge code that has a value against ANY asset in that charge code. I then need to create an Asset-C4 cross reference to indicate that the particular asset is attached to the charge code
For this last one I have written a VBA macro that follows this logic
open a recordset for the import file (rsIMP)
for each record in the recordset, loop through the charge fields (13-93)
If the field has a value
open a recordset for the ChargeCode (rsCHG)where the description matches the rsIMP.fields(fieldloop).name. This should ALWAYS give me one record
open a recordset for the C4 (rsC4) where CostCentre = rsIMP.fields("CostCentreID") and ChargeCodeID = rsCHG.fields("ID"). This will give me 0 or 1 records.
if rsC4.EOF = true then prepare an SQL command to insert the record and then run DoCmd.runSQL(SQLcommand) and re-run the query to get the new ID
prepare an SQL command to insert the Asset-C4 table with the AssetID and C4.ID
get next rsIMP field
get next rsImp record
This is working but it is slow. I have 57K import records with 80 fields (=4.5m loops) and probably 500K insert statments. E.g It has been running for an hour and is only 8000 records in - so it will take about 6 hours to complete.
Is there a better approach to re-normalise a flattened data structure
I am using Access2010 over Ctirix 12 using WYSE terminals so this is all running on the SuperPC cluster over a T100 ethernet network
Actual code is below
Code:
Sub aaron()
Dim db As Database
Dim rsImport As Recordset
Dim rsWorking As Recordset
Dim SQL As String
Dim y As Long
Dim ID As Long
Set db = CurrentDb()
SQL = "SELECT * FROM ImportActiveH"
Set rsImport = db.OpenRecordset(SQL, dbOpenDynaset)
Do While Not rsImport.EOF
For y = 0 To rsImport.Fields.Count - 1
If Left(rsImport(y).Name, 1) = "U" And Not IsNull(rsImport(y).Value) Then
'* we are in the cost codes and have a value and need to create the additional records
'* Get the chargecode ID for the chargeable ChargeCode. At this stage there is only one bottom code per
'* ActiveH code.
SQL = "SELECT ID FROM ChargeCode CC WHERE CC.Description = """ & rsImport(y).Name & """"
'MsgBox SQL
Set rsWorking = db.OpenRecordset(SQL, dbOpenDynaset)
If Not rsWorking.EOF Then '* the charge code exists
ID = rsWorking.Fields("ID").Value
'* get the bottom level ID
SQL = "SELECT Top 1 ID from q_BottomLevelChargeCodes as CC where CC.ParentChargeCodeID = " & ID & " order by ID"
Set rsWorking = Nothing
Set rsWorking = db.OpenRecordset(SQL, dbOpenDynaset)
If Not rsWorking.EOF Then '* the bottom level record exists
ID = rsWorking.Fields("ID").Value
'* see if the CostCentreChargeCode exists
If InStr(rsImport.Fields("Postcode").Value, " ") <> 0 Then
SQL = "SELECT * FROM CostCentreChargeCode as C4 WHERE C4.CostCentreID = """ & rsImport.Fields("CostCentre").Value & """" _
& " AND C4.CostCentreInboundPostcode = """ & Left(rsImport.Fields("Postcode").Value, InStr(rsImport.Fields("Postcode").Value, " ") - 1) & """" _
& " AND C4.ChargeCodeID = " & ID
Else '* there is a problem with some of the postcodes having a hard space instead of a soft space....
SQL = "SELECT * FROM CostCentreChargeCode as C4 WHERE C4.CostCentreID = """ & rsImport.Fields("CostCentre").Value & """" _
& " AND C4.CostCentreInboundPostcode = """ & Left(rsImport.Fields("Postcode").Value, 4) & """" _
& " AND C4.ChargeCodeID = " & ID
End If
Set rsWorking = Nothing
Set rsWorking = db.OpenRecordset(SQL, dbOpenDynaset)
If rsWorking.EOF Then
'* we need to create the costcentreChargeCode record
SQL = "INSERT INTO CostCentreChargeCode (CostCentreID,CostCentreInboundPostCode,ChargeCodeID) VALUES (""" & rsImport.Fields("CostCentre").Value & """" _
& ",""" & Left(rsImport.Fields("Postcode").Value, InStr(rsImport.Fields("Postcode").Value, " ") - 1) & """" _
& "," & ID & ")"
'MsgBox SQL
DoCmd.SetWarnings False
DoCmd.RunSQL SQL
Else
'* Capture the ID because we need it to create the AssetC4 record
ID = rsWorking.Fields("ID")
End If
Else
'* we have a problem
MsgBox "Missing Bottom level Charge Code for " & rsImport(y).Name & " for asset ID " & rsImport.Fields("U100AssetRef").Value
End If
'* create the AssetC4
If rsWorking.RecordCount = 0 Then '* we just inserted a new C4 record
If InStr(rsImport.Fields("Postcode").Value, " ") <> 0 Then
SQL = "SELECT * FROM CostCentreChargeCode as C4 WHERE C4.CostCentreID = """ & rsImport.Fields("CostCentre").Value & """" _
& " AND C4.CostCentreInboundPostcode = """ & Left(rsImport.Fields("Postcode").Value, InStr(rsImport.Fields("Postcode").Value, " ") - 1) & """" _
& " AND C4.ChargeCodeID = " & ID
Else '* there is a problem with some of the postcodes having a hard space instead of a soft space....
SQL = "SELECT * FROM CostCentreChargeCode as C4 WHERE C4.CostCentreID = """ & rsImport.Fields("CostCentre").Value & """" _
& " AND C4.CostCentreInboundPostcode = """ & Left(rsImport.Fields("Postcode").Value, 4) & """" _
& " AND C4.ChargeCodeID = " & ID
End If
Set rsWorking = Nothing
Set rsWorking = db.OpenRecordset(SQL, dbOpenDynaset)
ID = rsWorking.Fields("ID").Value
End If
DoCmd.SetWarnings False
DoCmd.RunSQL ("INSERT into AssetC4 (AssetID,C4ID,Ratio) VALUES(""" & rsImport.Fields("U001AssetRef").Value & """," & ID & ",1)")
End If
End If
Next y
lvarStatus = SysCmd(acSysCmdSetStatus, rsImport.AbsolutePosition)
rsImport.MoveNext
Loop
End Sub
Table Structures (irrelevant fields omitted):
Asset: AssetID (PK), Address, InboundPostcode (FK to CostCentre),CostCentre (FK to CostCentre)
CostCentre: CostCentre(PK), InboundPostCode(PK)
ChargeCode:ID(PK autonumber),Description,ParentChargeCodeID(FK to ChargeCode.ID) codes are hierachical. Data Entry codes feed into summary reporting codes. Currently there is one DE code for each REP code but in maintenance mode, the users will create additional DE codes as the asset ratios may be different.
CostCentreChargeCode: ID (PK autonumber),CostCentreID (FK to CostCentre),CostCentreInboundPostCode (FK to CostCentre),ChargeCodeID (FK to ChargeCode)
AssetC4: ID (PK autonumber), AssetID (FK to Asset), C4ID (FK to CostCentreChargeCode)
CostC4:ID (PK autonumber), C4ID (FK to CostCentreChargeCode) this is not used in the code above. This will be used to assign costs to the Cost Centre and then apportion them to the assets via the AssetC4 table