Cleaning data

ZochSteveo

New Member
Joined
Dec 21, 2024
Messages
18
Office Version
  1. 2024
Platform
  1. MacOS
Hi there,

I have been trialling a method to determine which are outliers in my data set.

Currently, I have been using Z-scores to do this. I then see if this absolute value is greater than 3 to give an output of whether or not it is an outlier from the rest of the population.

I also want to return that it is an outlier if its Z score is less than -3.

My current formula, using an example, looks like this:

=IF(ABS(AA2)>3,"YES","NO")

What would I change this to so that it also includes if ABS(AA2)<-3 gives an output?

Many thanks,

Zach
 
Excel Formula:
ABS(AA2)<-3
can't be less than -3,
Excel Formula:
ABS(AA2)
it can only be a positive number
 
Upvote 0
Excel Formula:
ABS(AA2)<-3
can't be less than -3,
Excel Formula:
ABS(AA2)
it can only be a positive number
Ok thank you, is there an alternative then for what I could do? Cause I do also want to eliminate the small values, below -3
 
Upvote 0
You just remove the ABS and put both formulas in an OR statement
Excel Formula:
=IF(OR(AA2<-3,ABS(AA2)>3),"YES","NO")
 
Upvote 0
You just remove the ABS and put both formulas in an OR statement
Excel Formula:
=IF(OR(AA2<-3,ABS(AA2)>3),"YES","NO")
Does the ABS function need to be removed from both? I have trialled it and it has come through as 'YES' when I was expecting it to be 'NO'
 
Upvote 0
The ABS can be removed as it only changes negative numbers to positive so has no effect on positive numbers, the formula gives the results I expect

Book1
AAABAC
23NONO
34YESYES
4-2NONO
5-3NONO
6-4YESYES
7-7YESYES
8-1NONO
92NONO
107YESYES
Sheet3
Cell Formulas
RangeFormula
AB2:AB10AB2=IF(OR($AA2<-3,$AA2>3),"YES","NO")
AC2:AC10AC2=IF(OR($AA2<-3,ABS($AA2)>3),"YES","NO")


Edited just because of how the XL2BB formula range was showing
 
Last edited:
Upvote 0
Being honest I don't see the difference in results to your original ABS formula (and there shouldn't be a difference either), perhaps it would be better if you posted the results you expect based on the data in the table that I posted then we'll take it from there.

Book1
AAABACAD
1Original formula
23NONONO
34YESYESYES
4-2NONONO
5-3NONONO
6-4YESYESYES
7-7YESYESYES
8-1NONONO
92NONONO
107YESYESYES
Sheet3
Cell Formulas
RangeFormula
AB2:AB10AB2=IF(OR($AA2<-3,$AA2>3),"YES","NO")
AC2:AC10AC2=IF(OR($AA2<-3,ABS($AA2)>3),"YES","NO")
AD2:AD10AD2=IF(ABS(AA2)>3,"YES","NO")
 
Upvote 0
Being honest I don't see the difference in results to your original ABS formula (and there shouldn't be a difference either), perhaps it would be better if you posted the results you expect based on the data in the table that I posted then we'll take it from there.

Book1
AAABACAD
1Original formula
23NONONO
34YESYESYES
4-2NONONO
5-3NONONO
6-4YESYESYES
7-7YESYESYES
8-1NONONO
92NONONO
107YESYESYES
Sheet3
Cell Formulas
RangeFormula
AB2:AB10AB2=IF(OR($AA2<-3,$AA2>3),"YES","NO")
AC2:AC10AC2=IF(OR($AA2<-3,ABS($AA2)>3),"YES","NO")
AD2:AD10AD2=IF(ABS(AA2)>3,"YES","NO")
Ok so basically, I just want to find the outliers using the Z value that are greater than 3, and less than -3. I will attach a screenshot image of what I mean. I am not actually sure how to upload an Excel script onto here like you have done, so I will explain what is going on here. In column A, I have the population of the data that was given to me. In Column D, I have the mean (average) of the population in cell D2, and the standard deviation of the population in D3. In column B, I have used the following formula, and then copied it down: =(A2-$D$2)/$D$3 so it is in column C that I need the help as this will then determine whether or not the data in each row is an outlier or not. And it will be considered an outlier if it is <-3 or >3 so less than -3 or greater than 3.

I hope that makes sense. I will try and work out how to attach an Excel
 

Attachments

  • Screenshot 2025-02-10 at 11.44.46.png
    Screenshot 2025-02-10 at 11.44.46.png
    96.3 KB · Views: 3
Upvote 0
Can you do what I asked in my previous post and with the table that I posted put your expected results, the results I posted with all the formula give a Yes for the results <-3 or >3 and so I really do not know what you are looking for especially as everything you have shown in column B in your image is >-3 and <3
 
Upvote 0
Can you do what I asked in my previous post and with the table that I posted put your expected results, the results I posted with all the formula give a Yes for the results <-3 or >3 and so I really do not know what you are looking for especially as everything you have shown in column B in your image is >-3 and <3
Perfect, that works. Is there a way to get the cells which are "YES" to highlight a certain colour?
 
Upvote 0

Forum statistics

Threads
1,226,831
Messages
6,193,206
Members
453,779
Latest member
C_Rules

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top