T-SQL でユーザー年齢区分別のデータを抽出

 
MS SQL Server サンプルデータの AdventureWorksDW2019 スキーマを使って T-SQL を覚えるメモ。

ファクトテーブルの dbo.FactInternetSales と関連ディメンションテーブルを結合して、ユーザーの年齢区分別にデータを抽出する。

declare @startdate date = '2013-01-01';
declare @enddate date = '2013-12-31';
with t1 as (
    SELECT
        s.[ProductKey],
        s.[CustomerKey],
		s.[OrderDate],
		s.[OrderDateKey],
		c.[Gender],
		c.[BirthDate],
		convert(int, format(c.[BirthDate], 'yyyyMMdd')) as i_birthdate,
		floor((s.[OrderDateKey] - convert(int, format(c.[BirthDate], 'yyyyMMdd'))) / 10000) as age,
		c.[DateFirstPurchase],
		c.[GeographyKey],
		g.[CountryRegionCode],
        d.[FullDateAlternateKey],
        convert(varchar(7), format(d.[FullDateAlternateKey], 'yyyy-MM')) as year_month,
        s.[SalesOrderNumber],
        s.[SalesOrderLineNumber],
        s.[OrderQuantity] * s.[UnitPrice] as revenue,
        pc.[EnglishProductCategoryName] as category,
        psc.[EnglishProductSubcategoryName] as subcategory,
        p.[EnglishProductName] as productname
    FROM
        [dbo].[FactInternetSales] as s
        inner join [dbo].[DimDate] as d on s.[OrderDateKey] = d.[DateKey]
		inner join [dbo].[DimCustomer] as c on s.[CustomerKey] = c.[CustomerKey]
		inner join [dbo].[DimGeography] as g on c.[GeographyKey] = g.[GeographyKey]
        inner join [dbo].[DimProduct] as p on s.[ProductKey] = p.[ProductKey]
        left join [dbo].[DimProductSubcategory] as psc on p.[ProductSubcategoryKey] = psc.[ProductSubcategoryKey]
        left join [dbo].[DimProductCategory] as pc on psc.[ProductCategoryKey] = pc.[ProductCategoryKey]
    where
        1 = 1
        and s.[OrderDate] between @startdate and @enddate
),
t2 as (
    select
        t1.[SalesOrderNumber],
        t1.[CustomerKey],
        t1.[Gender],
        t1.[age],
        concat(
            case
                when t1.[age] >= 20 then t1.[gender]
                else ''
            end,
            case
                when t1.[age] between 4 and 12 then 'C'
                when t1.[age] between 13 and 19 then 'T'
                when t1.[age] between 20 and 34 then '1'
                when t1.[age] between 35 and 49 then '2'
                when t1.[age] >= 50 then '3'
            end
        ) as agegroup,
        t1.[subcategory]
    from
        t1
)
select
    t2.[subcategory],
    t2.[agegroup],
    count(distinct SalesOrderNumber) as order_cnt
from
    t2
group by
    t2.[subcategory],
    t2.[agegroup]
order by
    t2.[subcategory],
    t2.[agegroup];

 
結果はこうなる。

商品と年齢区分の集計

 
PowerBI で表示するとこうなる。

PowerBI でユーザー年齢区分毎に商品売上を可視化

 

参考図書

ビッグデータ分析・活用のためのSQLレシピ
SQL Server 2016の教科書 開発編