There is a table, here is an example:
email carrier ID DATE
email_1 car_1 1 2019-02-08
email_1 car_1 1 2019-02-09
email_1 car_1 2 2019-02-10
email_1 car_2 2 2019-02-11
email_2 car_3 3 2019-02-09
email_2 car_4 4 2019-02-09
With this table I need to choose unique entries.
The criterion of uniqueness will be the following:
If within one email
-a, the value in the carrier
field>is unique and the value in the ID
field>is unique, then such a record will be unique. The value of the Date
field should not affect the uniqueness of the recording. We can output the first when sorting by date a unique record by date.
The result must be like this:
email carrier ID DATE
email_1 car_1 1 2019-02-08
email_1 car_2 2 2019-02-11
email_2 car_3 3 2019-02-09
email_2 car_4 4 2019-02-09
Distinct
will not suit, since when it is used, any change in the column is the criterion of the uniqueness of the recording.
How to make such a request?
I will execute the request in BigQuery
, but the example on SQL
is suitable 🙂
Thank you
Answer 1, Authority 100%
with test (email, carrier, ID, DATE) AS (
SELECT 'Email_1', 'Car_1', 1, '2019-02-08'
Union All.
SELECT 'EMAIL_1', 'CAR_1', 1, '2019-02-09'
Union All.
SELECT 'Email_1', 'Car_1', 2, '2019-02-10'
Union All.
SELECT 'Email_1', 'Car_2', 2, '2019-02-11'
Union All.
SELECT 'Email_2', 'Car_3', 3, '2019-02-09'
Union All.
SELECT 'Email_2', 'Car_4', 4, '2019-02-09'
),
Data AS (
SELECT
email, carrier, ID, DATE,
row_number () over (partition by email, carrier, ID Order by Date Desc) as rownumber
From test.
)
SELECT Email, Carrier, ID, Date From Data WHERE ROWNUMBER = 1
Answer 2
select email, carrier, ID, min (d_date) as date
From test.
Group by email, carrier, ID