Situation:
I have two tables. #t1 has logins and emails. #t2 has country associated to each emails.
I would like to join the information from #t2 to #t1 without having to join it twice. Joining it only once either in the inner or outer query would break the cross apply
logic.
My current query uses a cross apply to get rolling information as such (fiddle data below):
SELECT DISTINCT CAST(logins AS DATE) AS Dates,
count(distinct d.email) AS DAU,
count(distinct m.MAU) AS MAU
FROM #t1 d
CROSS APPLY (
SELECT CAST(m.logins as date) as dates, m.email AS MAU
FROM #t1 m
WHERE m.logins BETWEEN d.logins and DATEADD(dd, 30, d.logins)
) m
group by CAST(logins as date)
The only way i found to join the two tables without having to break my cross apply was to inner join
it in both the outer and inner query which is probably wrong but at least the output is correct. I do that so i can add my second condition in my where statement in the inner query. when i apply the logic to my actual table, the performance is dreadful(fiddle data below):
SELECT distinct CASt(logins AS DATE) AS Dates,
#t2.country,
count(distinct d.email) AS DAU,
count(distinct m.MAU) AS MAU
FROM #t1 d
inner join #t2 on d.email=#t2.email
CROSS APPLY (
SELECT cast(m.logins as date) as dates, m.email AS MAU, country.country AS country
FROM #t1 m
inner join #t2 country on m.email=country.email
WHERE m.logins BETWEEN d.logins and DATEADD(dd, 30, d.logins)
and #t2.country = country.country
) m
group by cast(logins as date), #t2.country
+-------------+---------+-----+-----+
| Dates | country | DAU | MAU |
+-------------+---------+-----+-----+
| 2019-04-01 | france | 1 | 2 |
| 2019-04-02 | france | 1 | 2 |
| 2019-04-03 | france | 1 | 2 |
| 2019-04-10 | france | 1 | 1 |
| 2019-04-03 | italie | 2 | 2 |
+-------------+---------+-----+-----+
Objective:
How can i find a way to join information from one table to another without having to join it twice. (fiddle data below)
The result should look like this (output from the second query above):
Fiddle:
create table #t1 (email varchar(max), logins datetime)
insert into #t1 values
('[email protected]', '2019-04-01 00:00:00.000'),
('[email protected]', '2019-04-02 00:00:00.000'),
('[email protected]', '2019-04-03 00:00:00.000'),
('[email protected]', '2019-04-10 00:00:00.000'),
('[email protected]', '2019-04-03 00:00:00.000'),
('[email protected]', '2019-04-03 00:00:00.000'),
('[email protected]', '2019-04-03 00:00:00.000')
create table #t2 (country varchar(max), email varchar(max))
insert into #t2 values
('france', '[email protected]'),
('france', '[email protected]'),
('italie', '[email protected]'),
('italie', '[email protected]')
The following recommendations will help you in your SQL tuning process.
You'll find 3 sections below:
SELECT
DISTINCT CAST(logins AS DATE) AS Dates,
count(DISTINCT d.email) AS DAU,
count(DISTINCT m.MAU) AS MAU
FROM
#t1 d CROSS APPLY (SELECT
CAST(m.logins AS date) AS dates,
m.email AS MAU
FROM
#t1 m
WHERE
m.logins BETWEEN d.logins AND DATEADD(m.dd, 30, d.logins)) m
GROUP BY
CAST(logins AS date)
ORDER BY
NULL