Hands-on with iFLYTEK AIUI Intelligent Voice


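I recently evaluated AIUI, iFLYTEK's intelligent voice service, and built a fairly simple demo on top of the official one (Android -> AIUI -> post-processing server) to try out its features.

The official Android demo offers speech dictation, grammar recognition, semantic understanding, speech synthesis, voiceprint password, and other features. I mainly used semantic understanding and speech synthesis.

An earlier article of mine covered Amazon's Alexa, and AIUI seems to be aiming at something similar. For now, though, AIUI's features are limited and its developer documentation is mediocre: much of it walks through the web console, while many key points are left unclear. For features developers care about a lot, such as custom skills and custom entities, the docs only briefly show how to add one and never explain what a custom entity actually is. The official demos are also very basic. This is especially true of post-processing, which third-party developers need and for which there is very little material. The only official example is about as simple as it gets: it shows AIUI forwarding the semantic analysis result to the third-party post-processing service, and once the service has received it, the example simply ends. How to process the data after receiving it, what format to return it in, and how the returned data gets used are not mentioned at all o(╯□╰)o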

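Scenarios AIUI currently suits

Voice guidance: for example, a general-purpose feature like "weather" among the open skills, where the query does not need to be tied to user information and contains a predictable keyword such as "weather" (the so-called "custom entity").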

屏幕快照 2017-09-20 下午5.25.21.png

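What AIUI cannot handle

Cases where the corpus cannot be set up in advance, for example anything tied to a user account, or entering information by voice (such as a medical consultation app recording the user's name).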

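Features I wanted to build before the evaluation

  1. Voice guidance: similar to a self-service terminal such as an ATM, where the user interacts with the terminal by voice. For example, the terminal asks "Hello, how can I help?", the user answers "Take a photo", and the terminal enters its photo feature.

AIUI can currently meet this need; how to implement it is covered later in this article.

  2. Voice data entry: filling in information from the user's speech, for example recording the user's name, age, and medical conditions by voice.

AIUI is currently not suited to this scenario, because there are too many cases where the pronunciation is the same but the characters differ. For example, wangxiaoer could correspond to the name "王小二" or "王晓二". In such cases, entering information by voice costs more than it gains.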

Custom skills

image.png

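When the open skills do not include the feature you need, you have to implement a custom skill: semantic understanding extracts the keywords, and the post-processing server then performs some action.

For example, set the utterance "我尚" in a custom skill. When the user says "我尚", AIUI triggers this custom skill with "我尚" as the query. After the post-processing service receives this semantic result, it can perform some action and return data: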

{
    "category": "ISHANG.mHealth_demo:11.0",
    "intentType": "custom",
    "query": "我尚",
    "query_ws": "我/NP//  尚/ADD//",
    "rc": 0,
    "nlis": "true",
    "service": "ISHANG.mHealth_demo",
    "uuid": "atn000167dc@ch60f10d1d86686f2601",
    "vendor": "ISHANG",
    "version": "11.0",
    "semantic": [
        {
            "intent": "init",
            "score": 1,
            "slots": []
        }
    ],
    "sid": "atn000167dc@ch60f10d1d86686f2601",
    "text": "我尚"
}

Note the rc field here: 0 means semantic understanding succeeded. When it fails, the result looks like this (rc is 4):

{
    "rc": 4,
    "uuid": "atn00016bf3@ch60f10d1d86f86f2601",
    "sid": "atn00016bf3@ch60f10d1d86f86f2601",
    "text": "大王"
}
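
A minimal sketch of how a client or post-processing service might branch on rc, assuming the result arrives as a JSON string shaped like the two samples above. It is written in Python for brevity, the function name `is_understood` is hypothetical, and only the two rc values seen in this post (0 and 4) are considered.

```python
import json


def is_understood(result_json: str) -> bool:
    """Return True when the result reports a successful semantic parse (rc == 0)."""
    result = json.loads(result_json)
    # Treating a missing rc as a failure is an assumption of this sketch.
    return result.get("rc") == 0


# Trimmed copies of the two samples above.
ok = '{"rc": 0, "text": "我尚", "semantic": [{"intent": "init", "score": 1, "slots": []}]}'
failed = '{"rc": 4, "text": "大王"}'

print(is_understood(ok))      # True
print(is_understood(failed))  # False
```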

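Custom entities

If a custom skill uses an utterance such as "我是{name}" ("I am {name}"), {name} is a custom entity (think of it as a corpus). The open entities include provinces, cities, song titles, and so on. If none of them fits, define your own, for example one containing "张三". During semantic understanding, the matched value then appears in a semantic slot.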

image.png
{
    "category": "ISHANG.mHealth_demo:11.0",
    "intentType": "custom",
    "query": "我是张三",
    "query_ws": "我/NP//  是/V_SHI//  张三/NPP//",
    "rc": 0,
    "nlis": "true",
    "service": "ISHANG.mHealth_demo",
    "uuid": "atn00016f72@ch0de90d1d87956f2a01",
    "vendor": "ISHANG",
    "version": "11.0",
    "semantic": [
        {
            "intent": "input_name",
            "score": 1,
            "slots": [
                {
                    "begin": 2,
                    "end": 4,
                    "name": "name",
                    "normValue": "张三",
                    "value": "张三"
                }
            ]
        }
    ],
    "sid": "atn00016f72@ch0de90d1d87956f2a01",
    "text": "我是张三"
}

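Note that "张三" had already been added to the custom entity, and in the JSON it appears in the semantic[0].slots[0] field. This is the essence of semantic understanding: you add entries to the corpus in advance, and in the semantic result a matched entry is extracted on its own, so post-processing can use it as a business-logic parameter.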
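
To make the slot lookup concrete, here is a small Python sketch that pulls the slots out of a result shaped like the sample above. The helper name `extract_slots` is hypothetical, and preferring normValue over value is my own choice rather than anything the AIUI docs prescribe.

```python
import json


def extract_slots(result: dict) -> dict:
    """Map slot name -> value for the first semantic entry, or {} on failure."""
    if result.get("rc") != 0:
        return {}
    semantic = result.get("semantic") or []
    if not semantic:
        return {}
    slots = semantic[0].get("slots", [])
    # Prefer normValue; some slots in the samples carry only "value".
    return {s["name"]: s.get("normValue", s.get("value")) for s in slots}


# Trimmed copy of the "我是张三" sample above.
sample = json.loads("""
{
    "rc": 0,
    "text": "我是张三",
    "semantic": [
        {
            "intent": "input_name",
            "score": 1,
            "slots": [
                {"begin": 2, "end": 4, "name": "name",
                 "normValue": "张三", "value": "张三"}
            ]
        }
    ]
}
""")

print(extract_slots(sample))                          # {'name': '张三'}
print(extract_slots({"rc": 4, "text": "我是例子"}))    # {}
```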

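However, the corpus must be entered in advance, otherwise semantic understanding fails. Sometimes that is not practical: when recording names, for instance, understanding fails unless the corpus holds the name of every person in China. What then?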

image.png

For an entity that has not been added, the result is as follows; rc is 4, meaning the semantics could not be understood:

image.png
{
    "rc": 4,
    "uuid": "atn000176af@ch0de90d1d88736f2a01",
    "sid": "atn000176af@ch0de90d1d88736f2a01",
    "text": "我是例子"
}

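Post-processing

If post-processing is configured, the AIUI server forwards the semantic understanding result to the post-processing server. In the function (or method) on the post-processing server that receives AIUI's forwarded request via POST, we can extract the semantic result, run queries or other operations, and then return a response.

POST request data

The key data is stored in the Msg.Content field. Note that SessionParams and Msg.Content are Base64-encoded and must be decoded before use. After decoding, the complete request data looks like this: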

{"MsgId":"cid6f1c2494@ch00270d1c09d20100101","CreateTime":1505803732,"AppId":"59bf6334","UserId":"d3146084944","SessionParams":{"dsrc":"sdk","dts":"1","dtype":"audio","msc.lat":"39.895252","msc.lng":"116.343834","scene":"main","scity":"ch","sid":"cid6f1c2494@ch00270d1c09d2010010","stmid":"audio-16","ver_type":"mobile_phone","wake_id":"15058037304161d1c41c87ab7cd3c"},"UserParams":"","FromSub":"kc","Msg":{"ContentType":"json","Type":"text","Content":{"intent":{"data":{"result":[{"airData":40,"airQuality":"优","city":"北京","date":"2017-09-19","dateLong":1505750400,"exp":{"ct":{"expName":"穿衣指数","level":"热","prompt":"天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"}},"humidity":"20%","lastUpdateTime":"2017-09-19 11:39:20","pm25":"13","temp":29,"tempRange":"14℃~30℃","weather":"晴","weatherType":0,"wind":"西北风3-4级","windLevel":1},{"city":"北京","date":"2017-09-20","dateLong":1505836800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"14℃~27℃","weather":"晴","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-21","dateLong":1505923200,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"17℃~28℃","weather":"多云","weatherType":1,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-22","dateLong":1506009600,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"15℃~28℃","weather":"晴","weatherType":0,"wind":"西北风微风","windLevel":0},{"city":"北京","date":"2017-09-23","dateLong":1506096000,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"18℃~29℃","weather":"晴转多云","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-24","dateLong":1506182400,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"阴","weatherType":2,"wind":"东风微风","windLevel":0},{"city":"北京","date":"2017-09-25","dateLong":1506268800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"多云转阴","weatherType":1,"wind":"东南风微风","windLevel":0}]},"rc":0,"semantic":[{"intent":"QUERY","slots":[{"name":"location.city","value":"CURRENT_CITY","normValue":"CURRENT_CITY"},{"name":"location.poi","value":"CURRENT_POI","normValue":"CURRENT_POI"},{"name":"location.type","value":"LOC_POI","normValue":"LOC_POI"},{"name":"queryType","value":"内容"},{"name":"subfocus","value":"天气状态"}]}],"service":"weather","text":"天气","uuid":"atn00913a37@ch46b50d1c09d46f2a01","used_state":{"state_key":"fg::weather::default::default","state":"default"},"answer":{"text":"\"北京\"今天\"晴\",\"14℃~30℃\",\"西北风3-4级\""},"dialog_stat":"DataValid","sid":"cid6f1c2494@ch00270d1c09d2010010"}}}}
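
The decoding step can be sketched roughly as follows. It is in Python rather than the demo's Node.js, and it assumes each encoded field is standard Base64 over UTF-8 JSON text, which matches what I saw with ContentType "json". The helper names `decode_field`, `decode_request`, and `encode_field` are hypothetical, and the sample request is one I build myself rather than captured AIUI traffic.

```python
import base64
import json


def decode_field(encoded: str) -> dict:
    """Base64-decode one field and parse the bytes as UTF-8 JSON."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def decode_request(raw: dict) -> dict:
    """Return a copy of the request with SessionParams and Msg.Content decoded."""
    decoded = dict(raw)
    decoded["SessionParams"] = decode_field(raw["SessionParams"])
    msg = dict(raw["Msg"])
    msg["Content"] = decode_field(msg["Content"])
    decoded["Msg"] = msg
    return decoded


def encode_field(obj: dict) -> str:
    """Helper used only to fabricate a test request for this sketch."""
    text = json.dumps(obj, ensure_ascii=False)
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


raw = {
    "AppId": "59bf6334",
    "SessionParams": encode_field({"scene": "main"}),
    "Msg": {
        "ContentType": "json",
        "Type": "text",
        "Content": encode_field({"intent": {"rc": 0, "text": "天气"}}),
    },
}

request = decode_request(raw)
print(request["Msg"]["Content"]["intent"]["text"])  # 天气
```

The function returns a copy on purpose, so the original encoded request stays available for logging or debugging.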

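Response format

Follow the JSON returned by open skills such as "weather": put the result in data or in the intent's answer, and reuse the data from the POST request for the other fields. The data after semantic understanding of "weather" looks like this: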

{
  "data": {
    "result": [
      {
        "airData": 44,
        "airQuality": "优",
        "city": "北京",
        "date": "2017-09-20",
        "dateLong": 1505836800,
        "exp": {
          "ct": {
            "expName": "穿衣指数",
            "level": "热",
            "prompt": "天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"
          }
        },
        "humidity": "25%",
        "lastUpdateTime": "2017-09-20 11:07:03",
        "pm25": "10",
        "temp": 24,
        "tempRange": "14℃~27℃",
        "weather": "晴",
        "weatherType": 0,
        "wind": "北风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-21",
        "dateLong": 1505923200,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "17℃~29℃",
        "weather": "多云",
        "weatherType": 1,
        "wind": "南风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-22",
        "dateLong": 1506009600,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "13℃~27℃",
        "weather": "晴",
        "weatherType": 0,
        "wind": "西北风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-23",
        "dateLong": 1506096000,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "18℃~29℃",
        "weather": "晴转多云",
        "weatherType": 0,
        "wind": "南风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-24",
        "dateLong": 1506182400,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "19℃~28℃",
        "weather": "阴",
        "weatherType": 2,
        "wind": "东风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-25",
        "dateLong": 1506268800,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "19℃~28℃",
        "weather": "多云转阴",
        "weatherType": 1,
        "wind": "东南风微风",
        "windLevel": 0
      },
      {
        "city": "北京",
        "date": "2017-09-26",
        "dateLong": 1506355200,
        "lastUpdateTime": "2017-09-20 11:07:03",
        "tempRange": "13℃~25℃",
        "weather": "晴",
        "weatherType": 0,
        "wind": "西北风3-4级",
        "windLevel": 1
      }
    ]
  },
  "rc": 0,
  "semantic": [
    {
      "intent": "QUERY",
      "slots": [
        {
          "name": "location.city",
          "value": "CURRENT_CITY",
          "normValue": "CURRENT_CITY"
        },
        {
          "name": "location.poi",
          "value": "CURRENT_POI",
          "normValue": "CURRENT_POI"
        },
        {
          "name": "location.type",
          "value": "LOC_POI",
          "normValue": "LOC_POI"
        },
        {
          "name": "queryType",
          "value": "内容"
        },
        {
          "name": "subfocus",
          "value": "天气状态"
        }
      ]
    }
  ],
  "service": "weather",
  "text": "天气",
  "uuid": "atn00018593@ch60f10d1d8a556f2601",
  "used_state": {
    "state_key": "fg::weather::default::default",
    "state": "default"
  },
  "answer": {
    "text": "\"北京\"今天\"晴\",\"14℃~27℃\",\"北风微风\""
  },
  "dialog_stat": "DataValid",
  "sid": "atn00018593@ch60f10d1d8a556f2601"
}
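
Here is a rough Python sketch of the approach described above: copy the intent that came in with the POST request, then fill in answer.text and, optionally, data.result. The function name `build_reply` is hypothetical. Whether AIUI expects any extra wrapping or encoding around this object is something I could not confirm from the docs, so treat the shape as an assumption modeled on the "weather" sample.

```python
import copy


def build_reply(intent: dict, answer_text: str, result=None) -> dict:
    """Reuse the incoming intent and attach the answer (and optional data)."""
    reply = copy.deepcopy(intent)  # leave the incoming request untouched
    reply["answer"] = {"text": answer_text}
    if result is not None:
        reply["data"] = {"result": result}
    return reply


# Trimmed copy of the "我是张三" semantic result from earlier in this post.
incoming = {
    "rc": 0,
    "service": "ISHANG.mHealth_demo",
    "text": "我是张三",
    "semantic": [{"intent": "input_name", "score": 1, "slots": [
        {"name": "name", "normValue": "张三", "value": "张三"}]}],
    "sid": "atn00016f72@ch0de90d1d87956f2a01",
}

reply = build_reply(incoming, "您的名字是张三,请说出您的年龄")
print(reply["answer"]["text"])
```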

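DEMO

Demo link. The demo includes the post-processing server and an Android app.

Post-processing server

Implemented in Node.js

  • The GET request handler mainly implements AIUI's post-processing server verification
  • The POST request handler implements a very simple state machine that combines the semantic result sent by AIUI with a code variable to decide what data to return; the response format follows the open skill "weather"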
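
The state machine in the POST handler can be sketched along these lines. This is a Python illustration, not the demo's actual Node.js code. The class name `Dialog`, the meaning of each code value, the English reply strings, and the fallback reply are all assumptions of mine. In particular, I assume a bare number such as "28" arrives without a matching skill, so the sketch falls back to the raw text for the age step.

```python
class Dialog:
    """Three-step dialog driven by a single `code` variable."""

    def __init__(self):
        # 0: waiting for the opening phrase, 1: waiting for a name, 2: waiting for an age
        self.code = 0

    def handle(self, intent: dict) -> str:
        semantic = intent.get("semantic") or [{}]
        intent_name = semantic[0].get("intent")
        text = intent.get("text", "")

        if intent_name == "init":
            self.code = 1
            return "Welcome ... please say your name"

        if self.code == 1 and intent_name == "input_name":
            slots = semantic[0].get("slots", [])
            if slots:
                self.code = 2
                return f"Your name is {slots[0]['value']}, please say your age"

        if self.code == 2 and text.isdigit():
            self.code = 0  # conversation finished, start over
            return f"Your age is {text}, thank you, goodbye"

        return "Sorry, I did not understand that"


dialog = Dialog()
replies = [
    dialog.handle({"rc": 0, "text": "我尚",
                   "semantic": [{"intent": "init", "slots": []}]}),
    dialog.handle({"rc": 0, "text": "张三",
                   "semantic": [{"intent": "input_name",
                                 "slots": [{"name": "name", "value": "张三"}]}]}),
    dialog.handle({"rc": 4, "text": "28"}),
]
for reply in replies:
    print(reply)
```

A single shared code variable only works for one user at a time. A server handling several users would presumably need one state per UserId or sid.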

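Android App

Built on the official demo (after registering an Android app on AIUI, the project code can be downloaded from the app's settings page; sadly, it turns out to be an Eclipse project o(╯□╰)o)

  • In the semantic understanding demo, added spoken playback of the result, based on the speech synthesis feature
  • Say "我尚" and it replies "Welcome to ... please say your name"; then say "张三" and it replies "Your name is 张三, please say your age"; then say "28" and it replies "Your age is 28, thank you for using this, goodbye".

AIUI configuration for this demo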

App configuration
Custom skill init
Custom skill input_name
Custom entity