HBase Data Model In Depth: OLAP in a Real Business Environment


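This blog series summarizes and shares cases drawn from real business environments, and offers Spark source-code walkthroughs and practical guidance for commercial use. Please keep following the series. Copyright notice: this series on Spark source-code analysis and commercial practice belongs to the author (秦凯新). Reposting is prohibited; you are welcome to study it.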

1: Which apps a user has used and for how long (per user id, per day)

Usage-time statistics within a period

table-->behavior_user_app_201809
rowKey-->userid:20180904
columnfamily-->timeLen
column --> appName (one column per package name; there can be many)
value --> long value

// One table per month, e.g. behavior_user_app_201809.
String tableName = "behavior_user_app_" + DateUtils.getMonthByHour(model.getHour());
Table table = HBaseClient.getInstance(this.props).getTable(tableName);
// Row key: userid:yyyyMMdd.
String rowKey = model.getUserId()+":"+DateUtils.getDayByHour(model.getHour());
// Add this event's duration to the column named after the app's package.
table.incrementColumnValue(Bytes.toBytes(rowKey), Bytes.toBytes("timeLen"), Bytes.toBytes(model.getPackageName()), model.getTimeLen());
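The `DateUtils` helpers called above are the author's own and are not shown in this post. As a minimal sketch, assuming `model.getHour()` returns an hour stamp as a `yyyyMMddHH` string such as `2018090412` (the actual format is not stated), they might look roughly like this. The class name `HourStampSketch` is hypothetical, and returning the hour and day without zero padding is also an assumption.

```java
// Hypothetical stand-in for the author's DateUtils, which the post does not show.
// Assumption: the hour stamp is a 10-digit string in yyyyMMddHH form.
final class HourStampSketch {
    private HourStampSketch() {}

    /** "2018090412" -> "201809", used as the monthly table suffix. */
    static String getMonthByHour(String hour) {
        return check(hour).substring(0, 6);
    }

    /** "2018090412" -> "20180904", used as the day part of the row key. */
    static String getDayByHour(String hour) {
        return check(hour).substring(0, 8);
    }

    /** "2018090412" -> "12", used as an hourly column qualifier (no zero padding assumed). */
    static String getOnlyHourByHour(String hour) {
        return String.valueOf(Integer.parseInt(check(hour).substring(8, 10)));
    }

    /** "2018090412" -> "4", used as a day-of-month column qualifier (no zero padding assumed). */
    static String getOnlyDayByHour(String hour) {
        return String.valueOf(Integer.parseInt(check(hour).substring(6, 8)));
    }

    // Rejects anything that is not exactly ten digits.
    private static String check(String hour) {
        if (hour == null || !hour.matches("\\d{10}")) {
            throw new IllegalArgumentException("expected yyyyMMddHH, got: " + hour);
        }
        return hour;
    }

    public static void main(String[] args) {
        String hour = "2018090412";
        System.out.println("behavior_user_app_" + getMonthByHour(hour)); // behavior_user_app_201809
        System.out.println(1001L + ":" + getDayByHour(hour));            // 1001:20180904
    }
}
```

With helpers of this shape, the table name, the row key, and the column qualifier are all derived from the single hour stamp carried by each event.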

2: How long a user uses apps in each hour (per user id, per day)

Device-usage trend across the hours of a day

table-->behavior_user_hour_time_201809
rowKey-->userid:20180904
columnfamily-->timeLen
column --> 12 13 14 ... ... (one column per hour, 24 in total)
value --> long value

// One table per month, e.g. behavior_user_hour_time_201809.
String tableName = "behavior_user_hour_time_" + DateUtils.getMonthByHour(model.getHour());
Table table = HBaseClient.getInstance(this.props).getTable(tableName);
// Row key: userid:yyyyMMdd.
String rowKey = model.getUserId()+":"+DateUtils.getDayByHour(model.getHour());
// Add this event's duration to the column for its hour of the day.
table.incrementColumnValue(Bytes.toBytes(rowKey), Bytes.toBytes("timeLen"), Bytes.toBytes(DateUtils.getOnlyHourByHour(model.getHour())), model.getTimeLen());
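`incrementColumnValue` atomically adds the given amount to a cell and uses the amount as the initial value when the cell does not exist yet, so repeated events for the same user, day, and hour accumulate into a single counter. The in-memory sketch below imitates that behaviour with plain Java maps to show the effect; it is not HBase code, and `CounterTableSketch` and the sample values are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory imitation of a counter table with one column family (not HBase itself).
final class CounterTableSketch {
    // rowKey -> (column qualifier -> accumulated long value)
    private final Map<String, Map<String, Long>> rows = new ConcurrentHashMap<>();

    /** Adds amount to the cell, creating it if absent, and returns the new value. */
    long increment(String rowKey, String qualifier, long amount) {
        return rows
            .computeIfAbsent(rowKey, k -> new ConcurrentHashMap<String, Long>())
            .merge(qualifier, amount, Long::sum);
    }

    /** Returns the current value, or 0 when the cell has never been written. */
    long get(String rowKey, String qualifier) {
        Map<String, Long> row = rows.get(rowKey);
        return row == null ? 0L : row.getOrDefault(qualifier, 0L);
    }

    public static void main(String[] args) {
        CounterTableSketch table = new CounterTableSketch();
        // Two events in hour 12 and one in hour 13 for the same user and day.
        table.increment("1001:20180904", "12", 300L);
        table.increment("1001:20180904", "12", 120L);
        table.increment("1001:20180904", "13", 60L);
        System.out.println(table.get("1001:20180904", "12")); // 420
        System.out.println(table.get("1001:20180904", "13")); // 60
    }
}
```

Because the write is an increment rather than a put, the writer never needs to read the current value first, which keeps each event down to one round trip.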

3: How long a user uses the device each day (per user id)

Device-usage trend across the days of a month

table-->behavior_user_day_time_201809
rowKey-->userid
columnfamily-->timeLen
column --> 1 2 3 ... ... (one column per day of the month, up to 31)
value --> long value

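// One table per month, e.g. behavior_user_day_time_201809.
String tableName = "behavior_user_day_time_" + DateUtils.getMonthByHour(model.getHour());
Table table = HBaseClient.getInstance(this.props).getTable(tableName);
// Row key: the user id alone; each day of the month is a column.
String rowKey = String.valueOf(model.getUserId());
// Add this event's duration to the column for its day of the month.
table.incrementColumnValue(Bytes.toBytes(rowKey), Bytes.toBytes("timeLen"), Bytes.toBytes(DateUtils.getOnlyDayByHour(model.getHour())), model.getTimeLen());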
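A counter cell written this way holds an 8-byte big-endian long, which is the layout HBase's `Bytes.toLong` decodes. As a rough illustration of reading the model back, the pure-Java sketch below decodes such cells and sums one user's day columns into a monthly total. The class name `MonthlyTotalSketch` and the sample values are hypothetical, and the map stands in for the cells of a fetched row.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Pure-Java sketch: decode counter cells and sum one user's row into a monthly total.
final class MonthlyTotalSketch {
    private MonthlyTotalSketch() {}

    /** Encodes a long as 8 big-endian bytes, the layout of a counter cell. */
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Decodes an 8-byte big-endian cell back into a long. */
    static long decode(byte[] cell) {
        if (cell == null || cell.length != Long.BYTES) {
            throw new IllegalArgumentException("counter cell must be 8 bytes");
        }
        return ByteBuffer.wrap(cell).getLong();
    }

    /** Sums every day column (qualifiers "1" to "31") of one row. */
    static long monthlyTotal(Map<String, byte[]> dayColumns) {
        long total = 0L;
        for (byte[] cell : dayColumns.values()) {
            total += decode(cell);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the timeLen cells of one user's row.
        Map<String, byte[]> row = new LinkedHashMap<>();
        row.put("1", encode(3600L));
        row.put("2", encode(1800L));
        row.put("4", encode(420L));
        System.out.println(monthlyTotal(row)); // 5820
    }
}
```

Days on which the user was inactive simply have no column, so the sum only touches cells that exist.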

4: How long a user uses each app in each hour (per user id, day, and app)

table-->behavior_user_day_app_time_201809
rowKey-->userid:day:packageName
columnfamily-->timeLen
column --> 12 13 14 ... ... (one column per hour, 24 in total)
value --> long value

// One table per month, e.g. behavior_user_day_app_time_201809.
String tableName = "behavior_user_day_app_time_" + DateUtils.getMonthByHour(model.getHour());
Table table = HBaseClient.getInstance(this.props).getTable(tableName);
// Row key: userid:yyyyMMdd:packageName.
String rowKey = model.getUserId()+":"+DateUtils.getDayByHour(model.getHour())+":"+model.getPackageName();
// Add this event's duration to the column for its hour of the day.
table.incrementColumnValue(Bytes.toBytes(rowKey), Bytes.toBytes("timeLen"), Bytes.toBytes(DateUtils.getOnlyHourByHour(model.getHour())), model.getTimeLen());
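HBase keeps rows sorted by row key in byte-lexicographic order, so with a `userid:day:packageName` key all apps of one user on one day sit next to each other and can be read with a single prefix range. The sketch below imitates that with a sorted set of ASCII row keys; it is not HBase code, and `PrefixScanSketch` and the package names are hypothetical.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Sketch of a prefix range read over sorted row keys (ASCII keys assumed).
final class PrefixScanSketch {
    private PrefixScanSketch() {}

    /** Returns the keys that start with a non-empty prefix, using only sort order. */
    static NavigableSet<String> scanPrefix(NavigableSet<String> rowKeys, String prefix) {
        // Exclusive stop key: the prefix with its last character bumped by one,
        // which is the smallest string greater than every key with this prefix.
        int last = prefix.length() - 1;
        String stop = prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
        return rowKeys.subSet(prefix, true, stop, false);
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>(Arrays.asList(
            "100:20180904:com.example.chat",
            "1001:20180904:com.example.chat",
            "1001:20180904:com.example.video",
            "1001:20180905:com.example.chat",
            "10010:20180904:com.example.chat"));
        // Only the two rows of user 1001 on 20180904 are returned.
        for (String key : scanPrefix(keys, "1001:20180904:")) {
            System.out.println(key);
        }
    }
}
```

Ending the prefix with the `:` delimiter matters: without it, the prefix `1001` would also match user `10010`.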

5: How long a user uses each app on each day (per user id and app)

table-->behavior_user_day_app_time_201809
rowKey-->userid:packageName
columnfamily-->timeLen
column --> 1 2 3 ... ... (one column per day of the month, up to 31)
value --> long value


// One table per month, named after the month of the event.
String tableName = "behavior_user_day_app_time_" + DateUtils.getMonthByHour(model.getHour());
Table table = HBaseClient.getInstance(this.props).getTable(tableName);
// Row key: userid:packageName.
String rowKey = model.getUserId()+":"+model.getPackageName();
// Add this event's duration to the column for its day of the month.
table.incrementColumnValue(Bytes.toBytes(rowKey), Bytes.toBytes("timeLen"), Bytes.toBytes(DateUtils.getOnlyDayByHour(model.getHour())), model.getTimeLen());
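Taken together, the five models mean that one usage event is written to five counter cells, each with its own row key and column qualifier and all incremented by the same `timeLen`. The sketch below derives those five pairs from a single event. It assumes a numeric user id and a `yyyyMMddHH` hour stamp (neither is stated in the post), leaves table names out, and uses the hypothetical names `FanOutSketch` and `com.example.chat`.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: derive the row key and column qualifier of all five models from one event.
// Assumptions: numeric user id, hour stamp in yyyyMMddHH form, unpadded hour/day columns.
final class FanOutSketch {
    private FanOutSketch() {}

    /** Returns {rowKey, qualifier} pairs in the order of sections 1 to 5. */
    static List<String[]> cells(long userId, String packageName, String hourStamp) {
        String day = hourStamp.substring(0, 8);
        String dayOfMonth = String.valueOf(Integer.parseInt(hourStamp.substring(6, 8)));
        String hourOfDay = String.valueOf(Integer.parseInt(hourStamp.substring(8, 10)));

        List<String[]> out = new ArrayList<>();
        out.add(new String[] {userId + ":" + day, packageName});                   // 1: per app
        out.add(new String[] {userId + ":" + day, hourOfDay});                     // 2: per hour
        out.add(new String[] {String.valueOf(userId), dayOfMonth});                // 3: per day
        out.add(new String[] {userId + ":" + day + ":" + packageName, hourOfDay}); // 4: app per hour
        out.add(new String[] {userId + ":" + packageName, dayOfMonth});            // 5: app per day
        return out;
    }

    public static void main(String[] args) {
        for (String[] cell : cells(1001L, "com.example.chat", "2018090412")) {
            System.out.println("rowKey=" + cell[0] + " column=" + cell[1]);
        }
    }
}
```

Precomputing every aggregation at write time is what lets each query in these models be answered by reading one row or one contiguous range of rows.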

6: Summary

秦凯新, Shenzhen