Mathematical logic does NOT hold in this SQL example, trying to understand why.
Hi, I was working through the SQL tutorial and noticed a very strange difference between my answers and the model answer. Mathematically, the two formulas below are the same. However, they produce significantly different results when run in SQL.
1. I used: acquisitions.acquired_at_cleaned <= (companies.founded_at_clean::timestamp + INTERVAL '5 years')
2. ModeAnalytics used: (acquisitions.acquired_at_cleaned - companies.founded_at_clean::timestamp) <= INTERVAL '5 years'
The example is the question from the "SQL Date Format" lesson from "Advanced SQL" section. As you can see both formulas are the same just moving the timestamp from one side of the equation to the other.
To investigate I extracted the data using:
SELECT companies.category_code, companies.permalink, companies.name, companies.founded_at_clean, acquisitions.acquired_at_cleaned, (acquisitions.acquired_at_cleaned - companies.founded_at_clean::timestamp) difference
FROM tutorial.crunchbase_companies_clean_date companies
JOIN tutorial.crunchbase_acquisitions_clean_date acquisitions
ON acquisitions.company_permalink = companies.permalink
WHERE acquisitions.acquired_at_cleaned - companies.founded_at_clean::timestamp <= INTERVAL '5 years'
AND founded_at_clean IS NOT NULL
AND companies.category_code ILIKE 'software'
ORDER BY 6 DESC
and then changed the WHERE condition in bold to my formula 1. above. There are 69 rows using 1. and 72 rows using 2. The differences are 3 rows:
|/company/hiwired||HiWired||01/01/2004||01/01/2009 00:00||1827 days|
|/company/pokelabo||Pokelabo||01/11/2007||25/10/2012 00:00||1820 days|
|/company/real-girls-media-network-inc||Real Girls Media Network||01/01/2006||13/12/2010 00:00||1807 days|
Can anyone explain why these differences exist and why formula 1. produces wrong results? There are 1826 to 1827 days in 5 years (depending on when the 29th of Feb year occurs) so formula 2 produces correct results while formula 1 produces wrong results. Logically, this difference should not exist.
Apologies for the delayed response! From the answer to the problem (see here: https://app.mode.com/tutorial/reports/3043c5879728/details/queries/92839f8dfc43), it looks like the formula is aligned with your solution. I've run a few tests and changed it to the other way around, it returns the same result. So these two formulas produce the same result. I hope that clarifies!