# DAX Linear Interpolation



## joeshu26 (Aug 27, 2015)

Hello,

One table is a "lookup" table that contains probabilities and scores.  The other is a "fact" table with scores and a column for probabilities.  I am looking for a DAX measure that will return the correct probability by linearly interpolating the score in the "fact" table to the "lookup" table...



| Probability | Score |
|-------------|-------|
| .01 | 14.58 |
| .02 | 12.12 |
| .03 | 10.75 |
| .04 | 9.83 |
| .05 | 9.16 |
| .06 | 8.65 |
| .07 | 8.23 |
| .08 | 7.88 |
| .09 | 7.59 |
| .10 | 7.34 |
| .11 | 7.12 |
| .12 | 6.93 |

| ID | Group | Size | Score |
|----|-------|------|-------|
| 1 | A | 10000 | 9.5 |
| 2 | A | 8400 | 7.25 |
| 3 | A | 6000 | 8.499 |
| 4 | A | 3000 | 9.2124 |
| 5 | B | 4500 | 12.43 |
| 6 | B | 2200 | 11.689 |
| 7 | C | 4800 | 7.231 |
| 8 | C | 3300 | 8.82 |
| 9 | C | 8600 | 10.3 |
| 10 | C | 5000 | 9.9930 |
| 11 | D | 1100 | 10.6510 |
| 12 | D | 1800 | 13.4 |


----------



## joeshu26 (Aug 27, 2015)

I currently have this measure for the size-weighted average score, so I can see the weighted score by group; I'd like the same thing for the interpolated probability:

Wght Score:=DIVIDE(SUMX(factScore,factScore[Size]*factScore[Score]),SUM(factScore[Size]))
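For readers following along outside DAX, here is a minimal Python sketch of the same SUMX/DIVIDE pattern (the rows are a small illustrative subset shaped like the sample fact table):

```python
# A few (Group, Size, Score) rows shaped like the sample fact table.
rows = [
    ("A", 10000, 9.5), ("A", 8400, 7.25),
    ("B", 4500, 12.43), ("B", 2200, 11.689),
]

def weighted_score(rows, group):
    # SUMX(Size * Score) over the group, divided by SUM(Size).
    subset = [(size, score) for g, size, score in rows if g == group]
    total_size = sum(size for size, _ in subset)
    if total_size == 0:
        return None  # DIVIDE returns blank on a zero denominator
    return sum(size * score for size, score in subset) / total_size
```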


----------



## Ozeroth (Aug 28, 2015)

I would create a series of intermediate measures to calculate the Scores above/below the Score of interest, the corresponding probabilities, the interpolation % and finally the interpolated probability.

I'm assuming [Wght Score] is the Score we want the interpolated probability for, and the lookup table is called 'Probability' and is hidden so it can't be filtered.



Get the max Score that is <= [Wght Score]


```
Score Lower :=
CALCULATE (
    MAX ( 'Probability'[Score] ),
    FILTER (
        'Probability',
        'Probability'[Score] <= [Wght Score]
    )
)
```

Get the min Score that is >= [Wght Score]


```
Score Upper :=
CALCULATE (
    MIN ( 'Probability'[Score] ),
    FILTER (
        'Probability',
        'Probability'[Score] >= [Wght Score]
    )
)
```

Get the Probability corresponding to Score Lower


```
Probability Lower :=
CALCULATE (
    MIN ( 'Probability'[Probability] ),
    FILTER (
        'Probability',
        'Probability'[Score] = [Score Lower]
    )
)
```

Get the Probability corresponding to Score Upper


```
Probability Upper :=
CALCULATE (
    MAX ( 'Probability'[Probability] ),
    FILTER (
        'Probability',
        'Probability'[Score] = [Score Upper]
    )
)
```

Calculate the interpolation fraction


```
Interpolation Fraction :=
DIVIDE (
    [Wght Score] - [Score Lower],
    [Score Upper] - [Score Lower]
)
```

Finally calculate the interpolated Probability


```
Probability Interpolated :=
[Probability Lower]
    + [Interpolation Fraction]
    * ( [Probability Upper] - [Probability Lower] )
```
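Pulled out of DAX, the whole measure chain reduces to a few lines. A Python sketch using the lookup table from the first post (the explicit exact-hit branch stands in for DIVIDE returning blank when the bracketing scores coincide):

```python
# (score, probability) pairs from the lookup table, scores descending.
lookup = [
    (14.58, 0.01), (12.12, 0.02), (10.75, 0.03), (9.83, 0.04),
    (9.16, 0.05), (8.65, 0.06), (8.23, 0.07), (7.88, 0.08),
    (7.59, 0.09), (7.34, 0.10), (7.12, 0.11), (6.93, 0.12),
]

def interpolate(wght_score):
    # Score Lower / Score Upper: bracketing scores around wght_score.
    lower = [s for s, _ in lookup if s <= wght_score]
    upper = [s for s, _ in lookup if s >= wght_score]
    if not lower or not upper:
        return None  # outside the table's range
    score_lower, score_upper = max(lower), min(upper)
    # Probability Lower / Probability Upper: look up the matching rows.
    prob_lower = next(p for s, p in lookup if s == score_lower)
    prob_upper = next(p for s, p in lookup if s == score_upper)
    if score_upper == score_lower:
        return prob_lower  # exact hit: nothing to interpolate
    # Interpolation Fraction, then the final linear blend.
    frac = (wght_score - score_lower) / (score_upper - score_lower)
    return prob_lower + frac * (prob_upper - prob_lower)
```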


----------



## joeshu26 (Aug 28, 2015)

Works like a charm, thank you very much


----------



## joeshu26 (Sep 24, 2015)

Hi Guys,

I am looking to optimize the above measures to run quicker and reduce query/CPU duration. Does anybody have some optimizations for them?
Thanks!


----------



## Ozeroth (Sep 28, 2015)

Here is a better method than my original version, which should run more quickly (I've tested it myself with DAX Studio).

*1.* Add a column to your Probability table as shown below. I have called it "Probability Delta per Score Delta". This column is the 'slope' from the current row to the next row, defined as (Change in Probability)/(Change in Score).

For example the value in the first row is (0.02-0.01)/(12.12-14.58) = -0.004065.

I would suggest doing this in Power Query, but it can also be done with a calculated column in the PowerPivot table. Having these values pre-computed saves on calculation in the PowerPivot measure.



| Probability | Score | Probability Delta per Score Delta |
|-------------|-------|-----------------------------------|
| 0.01 | 14.58 | -0.004065 |
| 0.02 | 12.12 | -0.007299 |
| 0.03 | 10.75 | -0.010870 |
| 0.04 | 9.83 | -0.014925 |
| 0.05 | 9.16 | -0.019608 |
| 0.06 | 8.65 | -0.023810 |
| 0.07 | 8.23 | -0.028571 |
| 0.08 | 7.88 | -0.034483 |
| 0.09 | 7.59 | -0.040000 |
| 0.1 | 7.34 | -0.045455 |
| 0.11 | 7.12 | -0.052632 |
| 0.12 | 6.93 | |
*2.* Define a new *Probability Interpolated* measure as follows:

```
Probability Interpolated :=
CALCULATE (
    VALUES ( 'Probability'[Probability] )
        + VALUES ( 'Probability'[Probability Delta per Score Delta] )
            * ( [Wght Score] - VALUES ( 'Probability'[Score] ) ),
    FIRSTNONBLANK (
        VALUES ( 'Probability'[Score] ),
        IF ( 'Probability'[Score] >= [Wght Score], 1 )
    )
)
```
The 2nd argument of CALCULATE uses FIRSTNONBLANK to find the smallest 'Probability'[Score] that is >= [Wght Score]. This defines the row of the 'Probability' table we want to use.
The 1st argument of CALCULATE evaluates the linear interpolation using this row of 'Probability'.


----------



## joeshu26 (Sep 28, 2015)

Ozeroth, thank you for this DAX wizardry!! I would love to download DAX Studio if IT would only let me download it.

I decided to create the calculated column 'Probability'[Probability Delta per Score Delta] in a SQL query, though Power Query would be an effective alternative.

This took query time from 30+ seconds to under 4 seconds!!!


----------



## Ozeroth (Sep 28, 2015)

You're welcome! That's great. I wasn't sure how much faster it would be when applied to your real data.


----------



## joeshu26 (Jan 6, 2016)

Would it be possible to move the " FIRSTNONBLANK( VALUES('Probability'[Score]), IF('Probability'[Score] >= [Wght Score],1) ) " filter argument to the storage engine?

For example, replacing the FIRSTNONBLANK with a combination of FILTER with MIN(X) of 'Probability'[Score] >= [Wght Score] ?


----------



## Ozeroth (Jan 7, 2016)

Hmm interesting - I don't know enough about the ins & outs of storage vs formula engine to tell you for sure, but here are some possibilities. The 3rd one actually uses MIN.
Would be interested in relative performance in your full model - not sure if you have DAX Studio yet?

*1. Keep FIRSTNONBLANK but use FILTER*

```
=
CALCULATE (
    VALUES ( 'Probability'[Probability] )
        + VALUES ( 'Probability'[Probability Delta per Score Delta] )
            * ( [Wght Score] - VALUES ( 'Probability'[Score] ) ),
    FIRSTNONBLANK (
        FILTER ( VALUES ( 'Probability'[Score] ), 'Probability'[Score] >= [Wght Score] ),
        1
    )
)
```

*2. Change FIRSTNONBLANK to TOPN and again use FILTER*

```
=
CALCULATE (
    VALUES ( 'Probability'[Probability] )
        + VALUES ( 'Probability'[Probability Delta per Score Delta] )
            * ( [Wght Score] - VALUES ( 'Probability'[Score] ) ),
    TOPN (
        1,
        FILTER ( VALUES ( 'Probability'[Score] ), 'Probability'[Score] >= [Wght Score] ),
        'Probability'[Score], 1
    )
)
```

*3. Use FILTER and MIN function to determine min value*
(Note that MIN returns a scalar so can't be directly used as a filter argument for CALCULATE like TOPN/FIRSTNONBLANK can. There could be a better way of doing this with MIN though, not sure.)

```
=
CALCULATE (
    VALUES ( 'Probability'[Probability] )
        + VALUES ( 'Probability'[Probability Delta per Score Delta] )
            * ( [Wght Score] - VALUES ( 'Probability'[Score] ) ),
    FILTER (
        VALUES ( 'Probability'[Score] ),
        'Probability'[Score]
            = CALCULATE (
                MIN ( 'Probability'[Score] ),
                FILTER ( VALUES ( 'Probability'[Score] ), 'Probability'[Score] >= [Wght Score] )
            )
    )
)
```
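Whichever variant the engine likes best, the step all three share is "find the smallest Score that is >= [Wght Score]", and over a sorted column that is just a binary search. A Python sketch with the standard-library bisect module (scores from the thread's lookup table, sorted ascending as bisect requires):

```python
import bisect

# Score values from the lookup table, sorted ascending for bisect.
scores_asc = sorted([14.58, 12.12, 10.75, 9.83, 9.16, 8.65, 8.23,
                     7.88, 7.59, 7.34, 7.12, 6.93])

def smallest_score_ge(target):
    # bisect_left returns the first index whose value is >= target.
    i = bisect.bisect_left(scores_asc, target)
    return scores_asc[i] if i < len(scores_asc) else None
```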


----------






## joeshu26 (Jan 7, 2016)

This is great Ozeroth!  Yeah I was getting stuck on using MIN as filter argument and was also mixing up syntax when I tried to use combinations of TOPN and FILTER with FIRSTNONBLANK.  All 3 options you provided increased performance for me by significant amounts.  I've been using DAX Studio as well as SQL Server Profiler to analyze query results.  Though I'm still a newbie when it comes to reading and optimizing DAX query plans.  Here are DAX Studio timings:

Previous Solution:
      Total: 296ms , SE CPU: 125ms x0.9 , FE: 156ms 52.7% , SE: 140ms 47.3% , SE Queries: 10 , SE Cache: 2 20.0%

Updated Solutions:
1.   Total: 203ms , SE CPU: 110ms x0.7 , FE: 46ms 22.7% , SE: 157ms 77.3% , SE Queries: 10 , SE Cache: 2 20.0%
2.   Total: 203ms , SE CPU: 125ms x1.0 , FE: 78ms 38.4% , SE: 125ms 61.6% , SE Queries: 10 , SE Cache: 2 20.0%
3.   Total: 218ms , SE CPU: 126ms x0.8 , FE: 61ms 28.0% , SE: 157ms 72.0% , SE Queries: 12 , SE Cache: 3 20.0%

Note my fact table is around 3 million rows, so I'm not sure if that's why the storage engine is not running in parallel across cores (SE CPU tops out around x1.0, maybe because 3 million rows fit within a single 8-million-row segment), or if it's down to the slow server hardware I'm running it on.


----------

