We can pass various parameters in the count function. It's output changes as per the parameters passed in it.
We will discuss the difference in the output of count(*), count(1) and count(col_name).
Recommended SQL Courses:
Let's consider a table as shown below:
Table created: STUDENTS
■ COUNT(*) :
Let's check count(*) operation on the above table: STUDENTS.Output:
count(*) output = Total number of records in the table including null values.
■ COUNT(1) :
Let's check count(1) operation on the above table.Output:
count(1) output = Total number of records in the table including null values.
NOTE :
▪ The output of count(*) and count(1) is same but the difference is in the time taken to execute the query.
▪ count(1) is faster/optimized than count(*) because:
▯ count(*) has to iterate through all the columns,
But count(1) iterates through only one column.
▯ Check the time difference between count(*) and count(1) on big data-set.
▯ count(*) has to iterate through all the columns,
But count(1) iterates through only one column.
▯ Check the time difference between count(*) and count(1) on big data-set.
▪ Always try to use count(1) instead of count(*). Since, count(1) performs better and saves computation effort & time.
COMMON CONFUSION :
▪ Many people think that the number "1" in the count(1) indicates the first column of the table.
But it is not correct.
▪ 1 is just a hardcoded value. You can use any other hardcoded number or string instead of 1.
The output will be the same.
The output will be the same.
■ COUNT(col_name) :
Let's check count(col_name) operation on the above table.On first column : ID
Output:
count(ID) output = Total number of entries in the column "roll_no" excluding null values.
On the second column : MARKS
Output:
count(MARKS) output = Total number of entries in the column "Marks" excluding null values.
■ VIDEO :
■ SUMMARY :
- count(*) :
output = total number of records in the table including null values. - count(1) :
output = total number of records in the table including null values.
[ Faster than count(*) ] - count(col_name) :
output = total number of entries in the column "col_name" excluding null values.
--------------------------------------------------------------------------------
Click here to see solutions for all Machine Learning Coursera Assignments.
&
Click here to see more codes for Raspberry Pi 3 and similar Family.
&
Click here to see more codes for NodeMCU ESP8266 and similar Family.
&
Click here to see more codes for Arduino Mega (ATMega 2560) and similar Family.
Feel free to ask doubts in the comment section. I will try my best to answer it.
If you find this helpful by any mean like, comment and share the post.
This is the simplest way to encourage me to keep doing such work.
Thanks & Regards,
-Akshay P Daga
I was just testing this on a table with 68,689,336 rows of data on a SQL Server 2008 R2 instance and COUNT(*) and COUNT(1) perform basically the same. There is a 30 ms difference, but with repeated runs, the difference seems to switch as to which one performs better. I used the hint MAXDOP 1 to ensure that I wouldn't be hit by parallel processing causing performance differences.
ReplyDeleteHow large does the data set need to be for COUNT(*) and COUNT(1) to give a noticeable performance difference?
Basically, the above mentioned difference between COUNT(*) and COUNT(1) is correct for Big data platform like Hadoop - Hive, Google - Big Query, etc.
DeleteI am not sure about SQL server. Never tested it on SQL server.
I did some testing with SQL Server and it generates the exact same execution plan and the query optimizer even switches the COUNT(1) over to a COUNT(*) on the back-end. So with SQL Server, they are identical.
Delete