
I started studying SQL on HackerRank, a well-known practice site. Here I will try to provide multiple approaches and solutions to the same problem, which should help you learn and understand SQL better.

Please use my blog posts for learning purposes only, and feel free to ask your questions in the comment box below if anything is unclear.

SQL Problem Statement:

You are given a table, Projects, containing three columns: Task_ID, Start_Date and End_Date. It is guaranteed that the difference between the End_Date and the Start_Date is equal to 1 day for each row in the table.

 Table: Projects

If the End_Dates of the tasks are consecutive, then they are part of the same project. Samantha is interested in finding the total number of different projects completed.

Write a query to output the start and end dates of projects, listed by the number of days it took to complete each project in ascending order. If more than one project has the same number of completion days, then order by the start date of the project.

Sample Input:

Sample Output:

```
2015-10-28 2015-10-29
2015-10-30 2015-10-31
2015-10-13 2015-10-15
2015-10-01 2015-10-04
```

Explanation:

The example describes the following four projects:
• Project 1: Tasks 1, 2 and 3 are completed on consecutive days, so they are part of the same project. Thus, the start date of the project is 2015-10-01 and the end date is 2015-10-04, so it took 3 days to complete the project.

• Project 2: Tasks 4 and 5 are completed on consecutive days, so they are part of the same project. Thus, the start date of the project is 2015-10-13 and the end date is 2015-10-15, so it took 2 days to complete the project.

• Project 3: Only task 6 is part of the project. Thus, the start date of the project is 2015-10-28 and the end date is 2015-10-29, so it took 1 day to complete the project.

• Project 4: Only task 7 is part of the project. Thus, the start date of the project is 2015-10-30 and the end date is 2015-10-31, so it took 1 day to complete the project.
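
To see why the sample output is listed in that order, here is a small sketch in plain Python (not SQL). It takes the four projects from the explanation above, computes each duration in days, and sorts by duration and then by start date. The variable names are only illustrative.

```python
from datetime import date

# The four projects (start, end) exactly as described in the explanation.
projects = [
    (date(2015, 10, 1), date(2015, 10, 4)),    # Project 1: 3 days
    (date(2015, 10, 13), date(2015, 10, 15)),  # Project 2: 2 days
    (date(2015, 10, 28), date(2015, 10, 29)),  # Project 3: 1 day
    (date(2015, 10, 30), date(2015, 10, 31)),  # Project 4: 1 day
]

# Sort by number of days first, then by start date to break ties.
ordered = sorted(projects, key=lambda p: ((p[1] - p[0]).days, p[0]))

for start, end in ordered:
    print(start, end)
```

Projects 3 and 4 both took 1 day, so the start date decides which one comes first.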

### Solution-1: Using implicit CROSS JOIN (comma syntax), DATEDIFF & SUB-QUERY (MySQL Query):

```
-- Pair every project start date with the nearest project end date after it.
SELECT s.Proj_Start_Date, MIN(e.Proj_End_Date) AS Real_Proj_End_Date
FROM
    -- Project starts: start dates that are not the end date of any task.
    (SELECT Start_Date AS Proj_Start_Date FROM Projects
     WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) s,
    -- Project ends: end dates that are not the start date of any task.
    (SELECT End_Date AS Proj_End_Date FROM Projects
     WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) e
WHERE s.Proj_Start_Date < e.Proj_End_Date
GROUP BY s.Proj_Start_Date
ORDER BY DATEDIFF(MIN(e.Proj_End_Date), s.Proj_Start_Date) ASC, s.Proj_Start_Date ASC;
```

NOTE:
1. Derived table s returns the start date of every project.
LOGIC: Any Start_Date that does not appear in the End_Date column is the start date of a project.

2. Derived table e returns the end date of every project.
LOGIC: Any End_Date that does not appear in the Start_Date column is the end date of a project.

3. Cross joining s and e produces every possible combination of Proj_Start_Date and Proj_End_Date. The WHERE clause keeps only the pairs in which the end date is later than the start date, so each Proj_Start_Date can still be paired with several Proj_End_Dates.

4. We select min(Proj_End_Date) as the valid end date of the project and discard all other end dates for that start date.
Reason: As stated in the question, tasks belong to the same project only if their End_Dates are consecutive, so a project ends at the first project end date that follows its start date.

5. GROUP BY Proj_Start_Date is needed so that min(Proj_End_Date) is computed per project.

6. Use the DATEDIFF function to calculate the number of days between the project end date and the project start date.

7. Apply ORDER BY on that DATEDIFF value and then on Proj_Start_Date.
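
Steps 1 to 4 above can be checked on the sample data. This is only a minimal sketch under a few assumptions: it runs the sub-queries in SQLite through Python's sqlite3 module as a stand-in for MySQL, stores dates as text, and uses seven task rows reconstructed from the explanation of the sample (the original input table is not shown in this post).

```python
import sqlite3

# Assumption: sample tasks reconstructed from the explanation of the sample.
TASKS = [
    (1, "2015-10-01", "2015-10-02"),
    (2, "2015-10-02", "2015-10-03"),
    (3, "2015-10-03", "2015-10-04"),
    (4, "2015-10-13", "2015-10-14"),
    (5, "2015-10-14", "2015-10-15"),
    (6, "2015-10-28", "2015-10-29"),
    (7, "2015-10-30", "2015-10-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Projects (Task_ID INTEGER, Start_Date TEXT, End_Date TEXT)")
conn.executemany("INSERT INTO Projects VALUES (?, ?, ?)", TASKS)

# Step 1: derived table s (project start dates).
S_SQL = """SELECT Start_Date AS Proj_Start_Date FROM Projects
           WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)"""
# Step 2: derived table e (project end dates).
E_SQL = """SELECT End_Date AS Proj_End_Date FROM Projects
           WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)"""

starts = [row[0] for row in conn.execute(S_SQL + " ORDER BY 1")]
ends = [row[0] for row in conn.execute(E_SQL + " ORDER BY 1")]

# Step 3: cross join, keeping only end dates later than the start date.
pairs = conn.execute(
    f"""SELECT s.Proj_Start_Date, e.Proj_End_Date
        FROM ({S_SQL}) s CROSS JOIN ({E_SQL}) e
        WHERE s.Proj_Start_Date < e.Proj_End_Date
        ORDER BY 1, 2"""
).fetchall()

# Step 4: the smallest end date per start date is the real project end.
nearest = {}
for start, end in pairs:
    nearest.setdefault(start, end)  # pairs are sorted, so the first end is the minimum

print("starts:", starts)
print("ends:  ", ends)
print("pairs after WHERE:", len(pairs))
print("nearest end per start:", nearest)
```

The earliest start date is paired with all four project end dates, which is why the query needs min(Proj_End_Date) together with GROUP BY.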

### Solution-2: Using explicit CROSS JOIN, DATEDIFF & SUB-QUERY (MySQL Query):

```
SELECT s.Proj_Start_Date, MIN(e.Proj_End_Date) AS Real_Proj_End_Date
FROM
    -- Project starts: start dates that are not the end date of any task.
    (SELECT Start_Date AS Proj_Start_Date FROM Projects
     WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) s
CROSS JOIN
    -- Project ends: end dates that are not the start date of any task.
    (SELECT End_Date AS Proj_End_Date FROM Projects
     WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) e
WHERE s.Proj_Start_Date < e.Proj_End_Date
GROUP BY s.Proj_Start_Date
ORDER BY DATEDIFF(MIN(e.Proj_End_Date), s.Proj_Start_Date) ASC, s.Proj_Start_Date ASC;
```

NOTE:
The logic is the same as in Solution-1, so the seven notes given there apply here as well. The only difference is that this query uses the explicit CROSS JOIN keyword instead of a comma-separated list of derived tables in the FROM clause. Both forms produce the same combinations of Proj_Start_Date and Proj_End_Date.
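
As a quick check that the comma form and the CROSS JOIN form behave the same, the sketch below runs both on the sample data. The assumptions are the same as before and should be read as such: SQLite through Python's sqlite3 module stands in for MySQL, the task rows are reconstructed from the explanation of the sample, and because SQLite has no DATEDIFF function, the difference of two julianday() values is used in its place.

```python
import sqlite3

# Assumption: sample tasks reconstructed from the explanation of the sample.
TASKS = [
    (1, "2015-10-01", "2015-10-02"),
    (2, "2015-10-02", "2015-10-03"),
    (3, "2015-10-03", "2015-10-04"),
    (4, "2015-10-13", "2015-10-14"),
    (5, "2015-10-14", "2015-10-15"),
    (6, "2015-10-28", "2015-10-29"),
    (7, "2015-10-30", "2015-10-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Projects (Task_ID INTEGER, Start_Date TEXT, End_Date TEXT)")
conn.executemany("INSERT INTO Projects VALUES (?, ?, ?)", TASKS)

# Same query shape as in the post; {join} is either "," or "CROSS JOIN".
# julianday(a) - julianday(b) replaces MySQL's DATEDIFF(a, b) in SQLite.
QUERY = """
SELECT s.Proj_Start_Date, MIN(e.Proj_End_Date) AS Real_Proj_End_Date
FROM (SELECT Start_Date AS Proj_Start_Date FROM Projects
      WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) s
{join}
     (SELECT End_Date AS Proj_End_Date FROM Projects
      WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) e
WHERE s.Proj_Start_Date < e.Proj_End_Date
GROUP BY s.Proj_Start_Date
ORDER BY julianday(MIN(e.Proj_End_Date)) - julianday(s.Proj_Start_Date) ASC,
         s.Proj_Start_Date ASC
"""

comma_rows = conn.execute(QUERY.format(join=",")).fetchall()
cross_rows = conn.execute(QUERY.format(join="CROSS JOIN")).fetchall()

for start, end in cross_rows:
    print(start, end)
print("same result for both join styles:", comma_rows == cross_rows)
```

On this data both variants return the four rows of the Sample Output in the same order.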

### Expected Output:

```
2015-10-15 2015-10-16
2015-10-17 2015-10-18
2015-10-19 2015-10-20
2015-10-21 2015-10-22
2015-11-01 2015-11-02
2015-11-17 2015-11-18
2015-10-11 2015-10-13
2015-11-11 2015-11-13
2015-10-01 2015-10-05
2015-11-04 2015-11-08
2015-10-25 2015-10-31
```

Feel free to ask your doubts in the comment section. I will try my best to answer them.
If you find this helpful in any way, please like, comment and share the post.
This is the simplest way to encourage me to keep doing such work.

Thanks & Regards,
-Akshay P Daga