Recent Posts

HackerRank: [SQL Advanced Join] (5/5) 15 DAYS OF LEARNING SQL | Sub-Query, datediff, distinct, count in SQL

HackerRank: [SQL Advanced Join] (5/5) 15 DAYS OF LEARNING SQL | Sub-Query, datediff, distinct, count in SQL
I started studying SQL from a very famous site - HackerRank. Here I will try to provide multiple approaches & solutions to the same problem. It will help you learn and understand SQL in a better way.

Please make use of my blog posts for learning purpose only and feel free to ask your questions in the comment box below in case of any doubt.

Click Here for the previous blog-post in the series.


SQL Problem Statement:

Julia conducted a 15 days of learning SQL contest. The start date of the contest was March 01, 2016 and the end date was March 15, 2016.

Write a query to print total number of unique hackers who made at least submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made maximum number of submissions each day. If more than one such hacker has a maximum number of submissions, print the lowest hacker_id . The query should print this information for each day of the contest, sorted by the date.



Input Format:

The following tables hold contest data:

  • Hackers: The hacker_id is the id of the hacker, and name is the name of the hacker.

  • Submissions: The submission_date is the date of the submission, submission_id is the id of the submission, hacker_id is the id of the hacker who made the submission, and score is the score of the submission.




Sample Input:

For the following sample input, assume that the end date of the contest was March 06, 2016.
  • Hackers Table:

  • Submissions Table:
Sample Output:

2016-03-01 4 20703 Angela
2016-03-02 2 79722 Michael
2016-03-03 2 20703 Angela
2016-03-04 2 20703 Angela
2016-03-05 1 36396 Frank
2016-03-06 1 20703 Angela



Explanation:
  • On March 01, 2016 hackers 20703, 36396, 53473, and 79722 made submissions. There are 4 unique hackers who made at least one submission each day. As each hacker made one submission, 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

  • On March 02, 2016 hackers 15758, 20703, and 79722 made submissions. Now 20703 and 79722 were the only ones to submit every day, so there are unique hackers who made at least one submission each day. 79722 made 2 submissions, and name of the hacker is Michael.

  • On March 03, 2016 hackers 20703, 36396, and 79722 made submissions. Now 20703 and 79722 were the only ones , so there are 2 unique hackers who made at least one submission each day. As each hacker made one submission so 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

  • On March 04, 2016 hackers 20703, 44065, 53473, and 79722 made submissions. Now 20703 and 79722 only submitted each day, so there are 2 unique hackers who made at least one submission each day. As each hacker made one submission so 20703 is considered to be the hacker who made maximum number of submissions on this day. The name of the hacker is Angela.

  • On March 05, 2016 hackers 20703, 36396, 38389 and 62529 made submissions. Now only submitted each day, so there is only 1 unique hacker who made at least one submission each day. 36396 made 2 submissions and name of the hacker is Frank.

  • On March 06, 2016 only 20703 made submission, so there is only 1 unique hacker who made at least one submission each day. 20703 made 1 submission and name of the hacker is Angela.

Solution-1: USING SUB-QUERY, DATEDIFF (MySQL Query):

SELECT SUBMISSION_DATE,
       (SELECT Count(DISTINCT hacker_id) as no_of_unique_hacker_id FROM submissions S2
        WHERE  S2.submission_date = S1.submission_date
        AND (SELECT Count(DISTINCT S3.submission_date)
             FROM   submissions S3
             WHERE  S3.hacker_id = S2.hacker_id
             AND S3.submission_date < S1.submission_date
	    ) = Datediff(S1.submission_date, '2016-03-01')
       ) AS NO_OF_UNIQUE_HACKERS,
       (SELECT hacker_id FROM submissions S2
        WHERE  S2.submission_date = S1.submission_date
        GROUP  BY hacker_id
        ORDER  BY Count(submission_id) DESC, hacker_id ASC LIMIT  1
       ) AS MAX_SUB_HACKER_ID,
       (SELECT name FROM hackers
        WHERE  hacker_id = MAX_SUB_HACKER_ID
       ) AS NAME
FROM   (SELECT DISTINCT submission_date FROM submissions) S1
GROUP BY submission_date;



NOTE: 
  • SUBMISSION_DATEMAX_SUB_HACKER_IDNAME are easy to understand from the above code itself.

  • The complex part is to get the NO_OF_UNIQUE_HACKERS, If Count(DISTINCT S3.submission_date) is matched with Datediff(S1.submission_date, '2016-03-01') then we can say, the hacker has made the submission each day.

  • DISTINCT is used to get the unique records from the given column.

  • DATEDIFF is used to calculate the difference between 2 dates.

Expected Output:

2016-03-01 112 81314 Denise
2016-03-02 59 39091 Ruby
2016-03-03 51 18105 Roy
2016-03-04 49 533 Patrick
2016-03-05 49 7891 Stephanie
2016-03-06 49 84307 Evelyn
2016-03-07 35 80682 Deborah
2016-03-08 35 10985 Timothy
2016-03-09 35 31221 Susan
2016-03-10 35 43192 Bobby
2016-03-11 35 3178 Melissa
2016-03-12 35 54967 Kenneth
2016-03-13 35 30061 Julia



--------------------------------------------------------------------------------
Click here to see solutions for all Machine Learning Coursera Assignments.
&
Click here to see more codes for Raspberry Pi 3 and similar Family.
&
Click here to see more codes for NodeMCU ESP8266 and similar Family.
&
Click here to see more codes for Arduino Mega (ATMega 2560) and similar Family.
Feel free to ask doubts in the comment section. I will try my best to answer it.
If you find this helpful by any mean like, comment and share the post.
This is the simplest way to encourage me to keep doing such work.

Thanks & Regards,
-Akshay P Daga

No comments