Speeding up a stochastic calculation

MoshiM

Office Version: 2016
Platform: Windows
I'm trying to calculate a stochastic index, (current - min of period) / (max of period - min of period), but the calculation takes upwards of 5 minutes for 1,000 records. Is there any way to optimize this, or should I do the calculations in VBA?
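As a quick illustration of the formula for a single period, here is a minimal sketch in M. The values are made up for the example and are not taken from the actual table.

```powerquery
let
    // Hypothetical values for one look-back period; the last one is the current reading.
    periodValues = {10, 30, 20, 40, 25},
    current = List.Last(periodValues),
    periodMin = List.Min(periodValues),
    periodMax = List.Max(periodValues),
    // (current - min of period) / (max of period - min of period), scaled to 0-100
    stochasticIndex = (current - periodMin) / (periodMax - periodMin) * 100
in
    stochasticIndex
```

With these hypothetical values the expression works out to (25 - 10) / (40 - 10) * 100 = 50.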
Power Query:
let 
    Source = Sql.Database(".\SQLEXPRESS01", "DatabaseName"),
    wantedTable = Source{[Schema="dbo",Item="L_Futures_Only"]}[Data],
    contractFilter = Table.StopFolding(Table.SelectRows(wantedTable, each [codeID] = "020601")), 
    selectedColumns = Table.SelectColumns(contractFilter,{"recordDate","comm_positions_long_all","comm_positions_short_all","contract_units","codeID"},MissingField.Ignore),
    dateSorted = Table.Sort(selectedColumns, {"recordDate", Order.Descending}),
    retrievedQuantity = Table.AddColumn(dateSorted, "Quantity", each Number.FromText(Text.Select([contract_units],{"0".."9","."} ) ) ),
    comm_NET = Table.AddColumn(retrievedQuantity, "comm_NET", each [comm_positions_long_all] - [comm_positions_short_all]),

    STOCH = (tbl as table , monthsInPast as number, currentValue as number, currentDate as date) as nullable number =>
    let
        // tbl consists of 2 columns. Dates in column 1 and values in column 2.
        wantedName = Table.ColumnNames(tbl){1},
        dateInPast = Date.AddMonths(currentDate, monthsInPast),
        valueList = Table.Column(Table.SelectRows(tbl, each [recordDate] <= currentDate and [recordDate] >= dateInPast), wantedName),
        minValue = List.Min(valueList),
        output = Number.Round( ((currentValue - minValue) / (List.Max(valueList) - minValue)) * 100, 0)
    in
        output,
    netCommSelection = Table.SelectColumns(comm_NET,{"recordDate","comm_NET"}),
    commThreeYear = Table.AddColumn(comm_NET, "Commercial 3YI", each STOCH(netCommSelection, -36, [comm_NET], [recordDate])),
    commOneYear = Table.AddColumn(commThreeYear, "Commercial 1YI", each STOCH(netCommSelection, -12, [comm_NET], [recordDate]))
in 
    commOneYear
 

I'd try the following:
1. Sort netCommSelection by date and buffer it.
2. Inside STOCH, use Table.RemoveFirstN and Table.RemoveLastN with a date condition to select the rows, then get a list of values (Table.Column) and buffer it.
3. Calculate the max and min values and pass them to the "output" calculation.
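The three steps above could look roughly like the sketch below. It is only an illustration under assumptions: the sample table, its made-up values, the "Index 1M" column name, and the guard against a zero range (max = min) are not from the original query.

```powerquery
let
    // Hypothetical sample data standing in for netCommSelection (weekly dates, made-up values).
    sample = Table.FromRecords({
        [recordDate = #date(2020, 1, 7),  comm_NET = 10],
        [recordDate = #date(2020, 1, 14), comm_NET = 30],
        [recordDate = #date(2020, 1, 21), comm_NET = 20],
        [recordDate = #date(2020, 1, 28), comm_NET = 40]
    }),
    // Step 1: sort ascending by date and buffer, so the table is evaluated once.
    sortedBuffered = Table.Buffer(Table.Sort(sample, {{"recordDate", Order.Ascending}})),

    STOCH = (tbl as table, monthsInPast as number, currentValue as number, currentDate as date) as nullable number =>
        let
            dateInPast = Date.AddMonths(currentDate, monthsInPast),
            // Step 2: trim rows outside the window from both ends of the sorted table.
            withoutOlder = Table.RemoveFirstN(tbl, each [recordDate] < dateInPast),
            window = Table.RemoveLastN(withoutOlder, each [recordDate] > currentDate),
            valueList = List.Buffer(Table.Column(window, "comm_NET")),
            // Step 3: compute min and max once and reuse them in the output.
            minValue = List.Min(valueList),
            maxValue = List.Max(valueList),
            // Assumption: return null when the window has no range, to avoid dividing by zero.
            output = if maxValue = minValue then null
                     else Number.Round((currentValue - minValue) / (maxValue - minValue) * 100, 0)
        in
            output,

    result = Table.AddColumn(sortedBuffered, "Index 1M", each STOCH(sortedBuffered, -1, [comm_NET], [recordDate]))
in
    result
```

The condition forms of Table.RemoveFirstN and Table.RemoveLastN only work as a window selector here because the table is sorted by date first; on unsorted data they would stop at the first row that fails the condition.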
 
Thank you!! Calculation times went from 5 minutes down to 30 seconds. Do you have any other suggestions?
 
Without additional info about your data, sadly, no. Suppose you have one measurement per day. I simulated such data with 10,000 consecutive dates and random values (sorted by date in ascending order). This code runs in less than 15 seconds on my PC:
Power Query:
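let
    // `data` is the source table: columns record_date and value, sorted by date ascending.
    // Buffer the columns as lists so each one is evaluated once.
    dates = List.Buffer(data[record_date]),
    values = List.Buffer(data[value]),
    positions = List.Buffer(List.Positions(dates)),
    // x is the position of the current row; m is the look-back in months (negative).
    stoch = (x as number, m as number) =>
        [current_date = dates{x},
        date_in_past = Date.AddMonths(current_date, m),
        // Start of the look-back window: intended as the first date on or after date_in_past.
        past_position = List.PositionOf(dates, date_in_past, Occurrence.First, (c, v) => c >= v),
        // Values from the window start up to and including the current row.
        val = List.Range(values, past_position, x - past_position + 1),
        current = values{x},
        min = List.Min(val),
        max = List.Max(val),
        output = Number.Round( (current - min) / (max - min) * 100, 0)][output],
    result = Table.FromColumns(
        {
            dates,
            values,
            List.Transform(positions, (x) => stoch(x, -36)),
            List.Transform(positions, (x) => stoch(x, -12))
        },
        {"date", "value", "36", "12"}
    )
in
    result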
 
This brought it down to <2 seconds. Do you know why passing a table or list to the function doesn't have the same efficiency?

My data source, once filtered, was just 1.2k rows of weekly data, so I used the following:
Power Query:
    stoch = (currentPosition as number, monthsInPast as number) => 
        [      
        //targetName = Table.ColumnNames(tbl){1},
        currentDate = dates{currentPosition},
        dateInPast = Date.AddMonths(currentDate, monthsInPast),
        expectedWeekCount = Number.IntegerDivide(Duration.Days(currentDate-dateInPast), 7),
        currentValue = values{currentPosition},        
        lowerBoundPosition = currentPosition - (expectedWeekCount - 1),
        calculationPermitted = if lowerBoundPosition >= 0 then dates{lowerBoundPosition} >= dateInPast else false,
        valueList = if calculationPermitted then List.Buffer(List.Range(values, lowerBoundPosition, expectedWeekCount)) else null,
        minValue = if calculationPermitted then List.Min(valueList) else null,
        output = if calculationPermitted then Number.Round(((currentValue - minValue) / ( List.Max(valueList) - minValue)) * 100, 0) else null][output],
 
Do you know why passing a table or list to the function doesn't have the same efficiency?
I'm not sure I understand the question. In general, I replaced Table.SelectRows (which goes over the whole table on every call) first with Table.RemoveFirstN/Table.RemoveLastN plus buffering, and then with buffering plus List.PositionOf and List.Range. List functions are fast if you use them properly. Try to avoid operations that scan your whole table many times (be creative to avoid that). If it's unavoidable, buffer your table.
 