Hello Support,
So, what I understand from your post is that I should define the grammar inline in the VXML itself rather than referencing it from the VXML.
I tried that as well and we got improved results, but still no more than 50%.
I think we need a little calibration/tuning of these property tags. For example, we found that increasing "speedvsaccuracy" to 0.9 and setting "confidencelevel" to 0.2 significantly increased the result from 20% to 50%; otherwise we were stuck most of the time on the NoInput tag when we called.
For example, below are the closest-to-accurate results for account IDs spoken over the phone using the script below.
545SSD - 100%
FAQ5203 - 90% (Q as K)
OBMAHTR - 60% (O as zero, B as P, cutoff after 4th character)
OH1O803 - O as zero (even though it is not mentioned in grammar)
AQQ9387 - 80% (3 as C, Q as 2)
I do not understand why the letter "O" is recognized as zero, which is not mentioned in the grammar at all. Is the platform interpreting the input itself and ignoring the grammar written in the VXML?
If the platform is interfering with the grammar, how can we make the letter O resolve to O rather than zero? We take the input and read it back to the caller; in between, can we write a condition so that an O is not interpreted as zero but kept as O and read back to the caller? Is there anything that can resolve this?
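To make the question concrete, here is a minimal sketch of the kind of condition I have in mind. It assumes the field is still named em, that the result comes back as a plain string, and that the grammar contains no digit zero, so any "0" in the result can only be a misrecognized letter O. I have not verified it on the platform.

```xml
<filled>
  <!-- Assumption: the grammar has no digit zero, so every "0" is really the letter O -->
  <if cond="em.indexOf('0') != -1">
    <!-- Replace each "0" with a lowercase "o", matching the lowercase literals in the grammar -->
    <assign name="em" expr="em.replace(/0/g, 'o')"/>
  </if>
  <prompt>you said <value expr="em"/>.</prompt>
</filled>
```

This would only be a stopgap: an ID such as OH1O803 contains a real zero as well as the letter O, and a blanket replacement like this would corrupt it.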
You can try the numbers listed above and test the results as well.
Can you please review the code below and tell us what the best values for these tags would be, so that they work together to increase the result from 50% to at least 80%?
For example, I saw a post which mentioned that setting "grammarmaxage" to "0" gave good results (but not in my case).
I think that if we set these property values so that they work well together when loaded, we can get much better results. Please help.
I also included a few phonetic spellings that sound like letters in the grammar to improve accuracy, like the code below.
<item> sea <tag>SWI_literal="c"</tag></item>
Also, is there any other property tag that we should use that is missing from the code?
Please review the code and give any pointers on what to tune and how.
<item repeat="1-8">: what I understand is that "the utterance must be spoken 1 to 8 times". But in our alphanumeric account IDs it is not necessary that every letter or number appears at least once. Why not use <item repeat="0-8">, meaning the utterance is not required but can be spoken up to 8 times? Would doing this help the ASR in any way?
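To make the comparison concrete, the two variants of the root rule I am asking about look like this. This is only a sketch: the rule ids alphanum_current and alphanum_optional are made-up names to tell the two apart, and the comments restate my reading of the repeat range, which may be wrong.

```xml
<!-- Current variant: the reference to #chars is repeated 1 to 8 times -->
<rule id="alphanum_current" scope="public">
  <item repeat="1-8">
    <ruleref uri="#chars"/>
  </item>
</rule>

<!-- Variant in question: 0 to 8 repetitions, so zero characters would also match -->
<rule id="alphanum_optional" scope="public">
  <item repeat="0-8">
    <ruleref uri="#chars"/>
  </item>
</rule>
```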
Code:
<form id="email">
  <!-- Grammar caching and recognizer tuning properties -->
  <property name="grammarmaxage" value="60s"/>
  <property name="grammarmaxstale" value="25s"/>
  <property name="sensitivity" value="0.5"/>
  <property name="confidencelevel" value="0.2"/>
  <property name="interdigittimeout" value="5s"/>
  <property name="speedvsaccuracy" value="0.9"/>
  <field name="em">
    <prompt bargein="false"> Please say a AlphaNumber char.</prompt>
    <grammar type="application/srgs+xml" mode="voice" root="alphanum">
      <!-- Root rule: the chars rule repeated 1 to 8 times -->
      <rule id="alphanum" scope="public">
        <one-of>
          <item repeat="1-8">
            <ruleref uri="#chars"/>
          </item>
        </one-of>
      </rule>
      <!-- Letters plus sound-alike spellings, each mapped to its literal -->
      <rule id="chars" scope="public">
        <one-of>
          <item> A <tag>SWI_literal="a"</tag></item>
          <item> eh <tag>SWI_literal="a"</tag></item>
          <item> ei <tag>SWI_literal="a"</tag></item>
          <item> ay <tag>SWI_literal="a"</tag></item>
          <item> B <tag>SWI_literal="b"</tag></item>
          <item> bi <tag>SWI_literal="b"</tag></item>
          <item> bee <tag>SWI_literal="b"</tag></item>
          <item> C <tag>SWI_literal="c"</tag></item>
          <item> see <tag>SWI_literal="c"</tag></item>
          <item> sea <tag>SWI_literal="c"</tag></item>
        </one-of>
      </rule>
    </grammar>
    <!-- Read the recognized value back to the caller -->
    <filled>
      <prompt>
        you said <value expr="em"/>.
      </prompt>
    </filled>
    <nomatch>
      Sorry, try again.
      <reprompt/>
    </nomatch>
  </field>
</form>
Thanks.