Handling Outliers

Madybecka

New Member
Joined
May 15, 2020
Messages
2
Office Version
  1. 365
Platform
  1. Windows
Hey Everyone,

Im still super new at this VBA stuff. But I am trying to devise a way to automatically pull data from my worksheet and exclude any outliers that is inputted by the user as "Yes" in another section of the worksheet then be able to use a basic STDEV function for the rest of the data.

My thoughts are either a loop system in which it checks each row of given range of data for "Yes" and then stores the Value associated with that row in a designated list or array. Then If does not contain "Yes" it stores in a separate list or array that I can then use a STDEV function on it.

My second thought is to build a function of STDEV with a capability of ignoring any "yes" values.


Thank you in advance
 

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
You could use VBA. But you do not have to resort to VBA, if you wish to avoid it.

Of course, the first question is: how do you identify "outliers"?

However you do it, you probably derive values that are called lower and upper "fences". Suppose they are in F1 and F2 respectively.

And suppose your data are in A1:A1000.

Then you can array-enter (press ctrl+shift+Enter instead of just Enter) the following formula to exclude "outliers":

=STDEV(IF(A1:A1000>=F1, IF(A1:A1000<=F2, A1:A1000)))

PS.... For the mean, you can use AVERAGEIFS and normally-enter (just press Enter as usual) the following:

=AVERAGEIFS(A1:A1000, A1:A1000, ">=" & F1, A1:A1000, "<=" & F2)

Similarly for MAXIFS and MINIFS, if you have recent versions of Excel. (I don't.)

But most other statistical functions would be array-entered in a manner similar to STDEV.
 
Upvote 0
PS....

1. Consider whether you should use STDEV or STDEVP. STDEVP is the true std dev of the data. Use STDEV to estimate a "population" std dev based on the "sample" data.

2. If you want a VBA solution, you might study the following.

VBA Code:
Option Explicit

Function mySTDEV(data As Range, f1 As Double, f2 As Double) As Variant
' f1 is outlier lower fence
' f2 is outlier upper fence
Dim v1 As Variant
Dim k As Long, r As Long, c As Long
' better to pull data into VBA array than to reference each data(i)
v1 = data
ReDim v2(1 To data.Count) As Double
k = 0
For r = 1 To UBound(v1, 1): For c = 1 To UBound(v1, 2)
    If f1 <= v1(r, c) And v1(r, c) <= f2 Then
        k = k + 1
        v2(k) = v1(r, c)
    End If
Next c, r
If k < 2 Then
    mySTDEV = CVErr(xlErrValue)
Else
    ReDim Preserve v2(1 To k) As Double
    mySTDEV = WorksheetFunction.StDev(v2)
End If
End Function
 
Upvote 0
(Joe - OP said outliers were indicated by the user typing "Yes" somewhere.)

So I don't know how the data is, but here's how you can get your value without VBA.

In my example, I have values in B2:B26, and I put "Yes" into A2:A26 next to any value I consider an outlier.

This array formula* will compute the STDEV you want:

{=STDEV(IF(A2:A26<>"Yes",B2:B26,""))}

* It's an array formula. Don't type the curly braces. Instead, type the formula, then hold Ctrl+Shift while you press Enter. Excel will put the braces around the formula if it is entered properly.

If you're on Office 365, Monthly Release, you've got Dynamic Arrays, which means you don't need to hold Ctrl+Shift while you press Enter.
 
Upvote 0
Thank you guys so much!!!

Sorry it took some time to test all these out and then respond back. I used a mixture of both comments for outside of VBA.

Here's what I used:

{=STDEV(IF(G18:G306 <> "Yes", E18:E306, " "))}

{=AVERAGE(IF(G18:G306 <> "Yes", E18:E306, " "))}

Some background to this, I'm creating a template where I can just paste all my raw data and it gets automatically pulled into the correct areas and plotted on a chart. But what I do is esoteric and research based, so it really needs to be based on user input to decide if certain data should be excluded not only from the STD calculations but also be changed on my charts.

Writing VBA became a little complicated for me when using input from user AND also being able to take on data that is variable, but I did attempt it.

Thank you all for your help!!!
 
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,196
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top