9. [小萌伴 Android] Robot Chat: Voice Features and Their Implementation


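The previous posts covered auxiliary features such as news, H5 games, jokes and funny pictures, native mini-games, and the flashlight app. Now let's turn to the core of the robot chat: the voice features and how they are implemented.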

Voice in 小萌伴

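The voice features in 小萌伴 are built on the Baidu Speech SDK and include voice input, voice playback, speech-to-text, text-to-speech, voice changing, offline speech and semantic recognition, and voice wake-up.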

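These features fall into three main parts: speech recognition, speech synthesis, and voice wake-up. (The SDK I use is fairly old, so the code below may no longer be compatible with the newer SDK; see the official Baidu Speech site for details.)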

ChatActivity implements the RecognitionListener and SpeechSynthesizerListener interfaces, which are the listeners for speech recognition and speech synthesis respectively.

Initialization

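Speech recognition and speech synthesis must be initialized once the Activity has been entered and torn down when the Activity is destroyed. Initialization looks like this; the recognizer is set up directly, while the synthesizer setup is wrapped in TtsUtils.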

    mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(ChatActivity.this, new ComponentName(ChatActivity.this, VoiceRecognitionService.class));
    // Register the listener
    mSpeechRecognizer.setRecognitionListener(ChatActivity.this);
    mSpeechSynthesizer = TtsUtils.getSpeechSynthesizer(ChatActivity.this, ChatActivity.this, getVoice(mRobot));
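
The teardown half of this lifecycle is not shown in the post. As a minimal sketch of the idea, and not the app's actual code, the holder below releases each engine at most once and then drops the reference. `Releasable` and `VoiceResources` are hypothetical names used only for illustration; in the app, the two release actions would presumably be `BdVoiceUtil.destroyASR(mSpeechRecognizer)` and `BdVoiceUtil.releaseTTS(mSpeechSynthesizer)`.

```java
/** Hypothetical stand-in for an SDK object that must be released when the screen closes. */
interface Releasable {
    void release();
}

/** Owns the recognizer and synthesizer for one screen and releases each at most once. */
final class VoiceResources {
    private Releasable recognizer;
    private Releasable synthesizer;

    /** Call from onCreate(): remember the engines created for this screen. */
    void init(Releasable recognizer, Releasable synthesizer) {
        this.recognizer = recognizer;
        this.synthesizer = synthesizer;
    }

    /** Call from onDestroy(): release both engines and drop the references. */
    void destroy() {
        if (recognizer != null) {
            recognizer.release();
            recognizer = null;
        }
        if (synthesizer != null) {
            synthesizer.release();
            synthesizer = null;
        }
    }
}
```

Clearing the references after release makes a second call to `destroy()` harmless, which helps if the teardown path can be reached more than once.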

Voice Input and Recognition

This part is wrapped in the BdVoiceUtil class. Calling the following method starts speech recognition:

    BdVoiceUtil.startASR(mSpeechRecognizer, mSpeechSynthesizer, true);

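The recognition results arrive in the RecognitionListener callbacks, both as real-time (partial) conversions and as the full (whole-sentence) conversion to text. In onResults or onPartialResults, the converted text is sent to the robot API; from there the logic is the same as in a normal text chat with the robot.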
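
The callback bodies are not shown in this post. As a hedged sketch, the helper below picks which candidate to send to the robot API. It assumes the candidates come from the `SpeechRecognizer.RESULTS_RECOGNITION` list in the callback's Bundle, which Android documents as having the most likely candidate first. `RecognitionResults` and `bestCandidate` are hypothetical names, not part of the app or the SDK.

```java
import java.util.List;

/** Hypothetical helper: chooses the text to send to the robot API. */
final class RecognitionResults {
    private RecognitionResults() {}

    /**
     * Returns the first non-blank candidate, trimmed, or null if there is none.
     * Assumes the list is ordered most-likely-first.
     */
    static String bestCandidate(List<String> candidates) {
        if (candidates == null) {
            return null;
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return null;
    }
}
```

In onResults, the list would be read with `results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)` and passed to `bestCandidate`. A null return means there is nothing to send.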

Speech Synthesis

This converts the text returned by the robot into speech. It is also wrapped in BdVoiceUtil:

    BdVoiceUtil.startTTS(mSpeechSynthesizer, msg.getContent());

The code above starts converting msg.getContent() into speech. Through SpeechSynthesizerListener you can monitor whether the conversion succeeded, the playback progress, and so on.

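The synthesizer's voice can also be selected with setParam (0: standard female voice, 1: standard male voice, 2: special male voice, 3: emotional male voice, 4: child voice):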

    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_SPEAKER, String.valueOf(getVoice(robot)));
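
The `getVoice(robot)` method is not shown in this post. One way to avoid passing bare numbers around is a small enum over the five speaker codes listed above. This is only a sketch: the `Speaker` enum, its constant names, and the fallback to the female voice are assumptions made for illustration.

```java
/** The speaker codes listed above, as accepted by SpeechSynthesizer.PARAM_SPEAKER. */
enum Speaker {
    FEMALE(0), MALE(1), SPECIAL_MALE(2), EMOTIONAL_MALE(3), CHILD(4);

    private final int code;

    Speaker(int code) {
        this.code = code;
    }

    /** The string form that setParam expects, e.g. "4" for CHILD. */
    String paramValue() {
        return String.valueOf(code);
    }

    /** Looks up a speaker by numeric code; unknown codes fall back to FEMALE (an assumption). */
    static Speaker fromCode(int code) {
        for (Speaker speaker : values()) {
            if (speaker.code == code) {
                return speaker;
            }
        }
        return FEMALE;
    }
}
```

With this in place, the call above could read `mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_SPEAKER, Speaker.CHILD.paramValue())`.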

Voice Wake-up

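Voice wake-up is not used here; it is mainly used by the "find my phone" feature covered in a later post. Wake-up itself, using speech recognition to achieve wake-up, and conflicts with voice input in WeChat and similar apps are all left for that post.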

BdVoiceUtil

A simple wrapper around the voice-related calls. There are others, such as TtsUtils, but the full code is too long to post here...

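import android.app.Activity;
import android.content.Intent;
import android.speech.SpeechRecognizer;
import android.util.AndroidRuntimeException;

import org.json.JSONException;
import org.json.JSONObject;

import java.util.HashMap;

// The Baidu SDK classes (SpeechSynthesizer, EventManager, EventManagerFactory, EventListener)
// and the app's own classes (R, TtsUtils, Logs, WpEventManagerUtil) must also be imported;
// their package names depend on the SDK version and the project, so they are omitted here.

public class BdVoiceUtil {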

    /**
     * Starts recognition (stops the SpeechSynthesizer first).
     * @param speechRecognizer the recognizer to start; ignored if null
     * @param mSpeechSynthesizer the synthesizer whose playback is stopped first
     * @param hintSound whether to play the prompt sounds
     */
    public static void startASR(SpeechRecognizer speechRecognizer, SpeechSynthesizer mSpeechSynthesizer, boolean hintSound) {
        stopTTS(mSpeechSynthesizer);
        Intent intent = new Intent();
        bindParams(intent, hintSound);
        if(speechRecognizer != null) {
            speechRecognizer.startListening(intent);
        }
    }

    public static void stopASR(SpeechRecognizer speechRecognizer) {
        // The user has finished speaking
        if(speechRecognizer != null) {
            speechRecognizer.stopListening();
        }
    }

    public static void cancelASR(SpeechRecognizer speechRecognizer) {
        // Cancel the current recognition
        if(speechRecognizer != null) {
            speechRecognizer.cancel();
        }
    }

    public static void destroyASR(SpeechRecognizer speechRecognizer) {
        cancelASR(speechRecognizer);
        if(speechRecognizer != null) {
            speechRecognizer.destroy();
        }
    }

    public static void bindParams(Intent intent, boolean hintSound) {
        if(hintSound) {
            // Set the recognition parameters (prompt sounds)
            intent.putExtra(TtsUtils.EXTRA_SOUND_START, R.raw.bdspeech_recognition_start);
            intent.putExtra(TtsUtils.EXTRA_SOUND_END, R.raw.bdspeech_speech_end);
            intent.putExtra(TtsUtils.EXTRA_SOUND_SUCCESS, R.raw.bdspeech_recognition_success);
            intent.putExtra(TtsUtils.EXTRA_SOUND_ERROR, R.raw.bdspeech_recognition_error);
            intent.putExtra(TtsUtils.EXTRA_SOUND_CANCEL, R.raw.bdspeech_recognition_cancel);
        }
        intent.putExtra("sample", 16000); // offline mode supports only a 16000 Hz sample rate
        intent.putExtra("language", "cmn-Hans-CN"); // offline mode supports only Mandarin Chinese
        intent.putExtra("prop", 20000); // input domain
        // Extra resource for voice input; replace the value with the actual path of the resource file
        // The offline package is too large, so it is not supported for now: intent.putExtra("lm-res-file-path", "/path/to/s_2_InputMethod");
    }

    public static EventManager initEventWakeUp(final Activity context) {
        // Steps to enable wake-up
        // 1) Create the wake-up event manager
        EventManager mWpEventManager = EventManagerFactory.create(context, "wp");
        // 2) Register the wake-up event listener
        mWpEventManager.registerListener(new EventListener() {
            @Override
            public void onEvent(String name, String params, byte[] data, int offset, int length) {
                try {
                    if(params == null) {
                        return;
                    }
                    JSONObject json = new JSONObject(params);
                    if ("wp.data".equals(name)) { // Each successful wake-up fires an event with name=wp.data; the triggered wake word is in the "word" field of params
                        String word = json.getString("word"); // the wake word
                        WpEventManagerUtil.doEvent(context, word);
                        if(Logs.isDebug()) {
                            Logs.logI(BdVoiceUtil.class.getSimpleName(), "Baidu voice wake-up: " + word);
                        }
                    } else if ("wp.exit".equals(name)) {
                        // Wake-up has stopped
                    }
                } catch (JSONException e) {
                    throw new AndroidRuntimeException(e);
                }
            }
        });
        return mWpEventManager;
    }

    public static EventManager eventWakeUp(final Activity context, EventManager mWpEventManager) {
        if(mWpEventManager == null) {
            mWpEventManager = initEventWakeUp(context);
        }
        // 3) Tell the wake-up manager to start listening for the wake word
        HashMap<String, Object> params = new HashMap<>();
        params.put("kws-file", "assets:///WakeUp.bin"); // Wake-up resource; evaluate and export it at http://yuyin.baidu.com/wake#m4
        mWpEventManager.send("wp.start", new JSONObject(params).toString(), null, 0, 0);
        return mWpEventManager;
    }

    public static void eventWekeUpStop(EventManager mWpEventManager) {
        if(mWpEventManager != null) {
            // Stop listening for the wake word
            mWpEventManager.send("wp.stop", null, null, 0, 0);
        }
    }


    public static void stopTTS(SpeechSynthesizer mSpeechSynthesizer) {
        if(mSpeechSynthesizer != null) {
            mSpeechSynthesizer.stop();
        }
    }

    public static void releaseTTS(SpeechSynthesizer mSpeechSynthesizer) {
        stopTTS(mSpeechSynthesizer);
        if(mSpeechSynthesizer != null) {
            mSpeechSynthesizer.release();
        }
    }

    public static void startTTS(SpeechSynthesizer mSpeechSynthesizer, String text) {
        stopTTS(mSpeechSynthesizer);
        if(mSpeechSynthesizer != null) {
            mSpeechSynthesizer.speak(text);
        }
    }
}
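
The wake-up listener above hands the recognized word to `WpEventManagerUtil.doEvent`, which is not shown in this post. Purely as an illustration of what such a dispatch step might look like, the sketch below maps each wake word to an action. `WakeWordDispatcher` and its methods are hypothetical and make no claim about how the app actually handles wake words.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical dispatcher: maps a recognized wake word to an action. */
final class WakeWordDispatcher {
    private final Map<String, Runnable> actions = new HashMap<>();

    /** Registers the action to run when the given wake word is recognized. */
    void register(String word, Runnable action) {
        actions.put(word, action);
    }

    /** Runs the action registered for the word; returns false if the word is unknown. */
    boolean dispatch(String word) {
        Runnable action = actions.get(word);
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }
}
```

Returning a boolean lets the caller log or ignore wake words that have no registered action instead of failing.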