Pronunciation Feedback (Simple)
The Theta One Simple Pronunciation Feedback API evaluates student pronunciation using only the reference text — no native speaker audio required. A single API call returns word-level and phoneme-level pronunciation analysis.
[Student Audio + Reference Text] → /pronunciation-simple → Pronunciation Feedback
Differences from /pronunciation
| Aspect | /pronunciation | /pronunciation-simple |
|---|---|---|
| API calls required | 2 (/analyze-native + /pronunciation) | 1 |
| Native audio needed | Yes | No |
| Pronunciation score range | 50-70 | 0-100 |
| Stress/Pause/Speed feedback | Score + text | null (data still in user_speech_components) |
| Word/phoneme/syllable detail | Full | Full (identical) |
Even without native audio, user_speech_components still includes is_stressed on words/phonemes and is_long on pauses — these are measured from the student's own audio. Only the comparative feedback scores (stress, pause, speed) are null because there is no native reference to compare against.
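Because the stress and pause flags are still present without native audio, a client can read them straight from the response. The sketch below assumes the response has already been parsed into a Python dict shaped like the response example on this page. The helper name `stress_and_pause_summary` is hypothetical and not part of the API.

```python
from typing import Any


def stress_and_pause_summary(result: dict[str, Any]) -> dict[str, list]:
    """Collect stressed words and long pauses from a parsed response."""
    stressed_words = []
    long_pauses = []
    for comp in result.get("user_speech_components", []):
        if comp.get("type") == "word" and comp.get("is_stressed"):
            stressed_words.append(comp["word"])
        elif comp.get("type") == "pause" and comp.get("is_long"):
            # Keep the (start, end) timestamps so the pause can be located.
            long_pauses.append((comp["start"], comp["end"]))
    return {"stressed_words": stressed_words, "long_pauses": long_pauses}


# Trimmed-down sample that mirrors the response example on this page.
sample = {
    "user_speech_components": [
        {"type": "word", "word": "Wow", "is_stressed": True},
        {"type": "pause", "start": 1.02, "end": 1.45, "is_long": True},
        {"type": "word", "word": "check", "is_stressed": False},
    ]
}
summary = stress_and_pause_summary(sample)
print(summary)
```

The comparative scores stay null either way; this only surfaces what was measured from the student's own audio.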
Prerequisites
Using the API requires a valid API key and either prepaid credits or a postpaid billing contract. If you have not set these up yet, refer to the documents below.
Evaluate Student Pronunciation (/pronunciation-simple)
Upload a student's audio file along with the reference text. The API returns phoneme/syllable-level pronunciation analysis with a pronunciation score.
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | File (WAV) | Yes | Student audio file (.wav format) |
| options | JSON string | Yes | Evaluation options (see below) |
| x-api-key (Header) | string | Yes | API key (starting with sk-theta-) |
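Since the file parameter must be WAV audio, it can help to check the file locally before uploading. The following is a minimal sketch using Python's standard `wave` module. The helper name `is_readable_wav` is hypothetical, and the check is stricter than the API may be: the `wave` module only parses uncompressed PCM WAV files, so a file it rejects is not necessarily one the API would reject.

```python
import os
import tempfile
import wave


def is_readable_wav(path: str) -> bool:
    """Return True if the file parses as a PCM WAV file with at least one frame."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError):
        # wave.Error: not a RIFF/WAVE file or unsupported encoding.
        # EOFError: the file is truncated.
        return False


# Demo: write a tiny valid WAV and a non-WAV file, then check both.
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "student_audio.wav")
    with wave.open(good_path, "wb") as out:
        out.setnchannels(1)      # mono
        out.setsampwidth(2)      # 16-bit samples
        out.setframerate(16000)  # sample rate chosen for the demo only
        out.writeframes(b"\x00\x00" * 160)
    bad_path = os.path.join(tmp, "notes.wav")
    with open(bad_path, "wb") as out:
        out.write(b"this is not audio")
    good_ok = is_readable_wav(good_path)
    bad_ok = is_readable_wav(bad_path)

print(good_ok, bad_ok)
```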
options Fields
| Field | Type | Required | Description |
|---|---|---|---|
| gold_text | string | Yes | Reference text (e.g., "Wow, check out this castle.") |
| language | string | No | Language for feedback text (default: "ko"). Supported: ko, en |
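The options value is sent as a JSON string, so building it with a small helper avoids hand-written JSON. The sketch below is one way to do that. The function name `build_options` is hypothetical, and treating an empty `gold_text` as invalid is an assumption drawn from the field being required.

```python
import json

SUPPORTED_LANGUAGES = {"ko", "en"}  # values listed in the options table


def build_options(gold_text: str, language: str = "ko") -> str:
    """Serialize the options form field, rejecting values the table rules out."""
    if not gold_text or not gold_text.strip():
        raise ValueError("gold_text is required and must not be empty")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"language must be one of {sorted(SUPPORTED_LANGUAGES)}")
    # ensure_ascii=False keeps non-ASCII reference text readable in the payload.
    return json.dumps({"gold_text": gold_text, "language": language}, ensure_ascii=False)


options = build_options("Wow, check out this castle.", language="en")
print(options)
```

The returned string can be passed as the options form field alongside the audio file.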
API Request
cURL
curl -X 'POST' \
  'https://stt.thetaone-ai.com/pronunciation-simple' \
  -H 'accept: application/json' \
  -H 'x-api-key: YOUR_API_KEY' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@student_audio.wav;type=audio/wav' \
  -F 'options={"gold_text": "Wow, check out this castle.", "language": "ko"}'
Python
import json

import requests

url = "https://stt.thetaone-ai.com/pronunciation-simple"
headers = {
    "x-api-key": "YOUR_API_KEY"
}

# Send the student audio and the JSON-encoded options as multipart form data.
with open("student_audio.wav", "rb") as audio_file:
    files = {"file": ("student_audio.wav", audio_file, "audio/wav")}
    data = {
        "options": json.dumps({
            "gold_text": "Wow, check out this castle.",
            "language": "ko"
        })
    }
    response = requests.post(url, headers=headers, files=files, data=data)

# Raise on 4xx/5xx so an error body is not treated as a result.
response.raise_for_status()
result = response.json()
# ensure_ascii=False keeps localized (e.g., Korean) feedback text readable.
print(json.dumps(result, indent=2, ensure_ascii=False))
Response Example
{
  "user_speech_components": [
    {
      "type": "word",
      "word": "Wow",
      "start": 0.45,
      "end": 1.02,
      "score": 72.0,
      "is_correct": true,
      "is_stressed": true,
      "feedback": null,
      "syllables": [
        {
          "syllable": "Wow",
          "grapheme": null,
          "score": 72.0,
          "start": 0.45,
          "end": 1.02
        }
      ],
      "phonemes": [
        {
          "phoneme": "w",
          "user_phoneme": "w",
          "score": 85.0,
          "is_correct": true,
          "is_stressed": true,
          "feedback": null,
          "start": 0.45,
          "end": 0.62
        },
        {
          "phoneme": "aʊ",
          "user_phoneme": "æ",
          "score": 58.0,
          "is_correct": false,
          "is_stressed": false,
          "feedback": "You pronounced 'æ' instead of 'aʊ'. Try opening your mouth wide and saying 'ah-oo'.",
          "start": 0.62,
          "end": 1.02
        }
      ]
    },
    {
      "type": "pause",
      "start": 1.02,
      "end": 1.45,
      "is_long": true
    },
    {
      "type": "word",
      "word": "check",
      "start": 1.45,
      "end": 1.88,
      "score": 65.0,
      "is_correct": false,
      "is_stressed": false,
      "feedback": "Pronunciation of 'check' is inaccurate.",
      "syllables": [
        {
          "syllable": "check",
          "grapheme": null,
          "score": 65.0,
          "start": 1.45,
          "end": 1.88
        }
      ],
      "phonemes": [
        {
          "phoneme": "ʧ",
          "user_phoneme": "ʃ",
          "score": 45.0,
          "is_correct": false,
          "is_stressed": false,
          "feedback": "You pronounced 'sh' instead of 'ch'. Place your tongue tip on the upper gum and release with a 'ch' sound.",
          "start": 1.45,
          "end": 1.58
        },
        {
          "phoneme": "ɛ",
          "user_phoneme": "ɛ",
          "score": 98.0,
          "is_correct": true,
          "is_stressed": false,
          "feedback": null,
          "start": 1.58,
          "end": 1.72
        },
        {
          "phoneme": "k",
          "user_phoneme": "k",
          "score": 92.0,
          "is_correct": true,
          "is_stressed": false,
          "feedback": null,
          "start": 1.72,
          "end": 1.88
        }
      ]
    }
  ],
  "feedback": {
    "pronunciation": {
      "score": 78,
      "feedback": "발음이 잘못된 부분이 2개 있어요. 다시 시도해보세요!"
    },
    "stress": null,
    "pause": null,
    "speed": null
  }
}
Response Fields
user_speech_components
The analysis result of the student's audio. Structure is identical to the Pronunciation Feedback API response.
When pronunciation is inaccurate, the feedback field contains specific correction guidance.
For field details, see the Response Fields in Pronunciation Feedback API.
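To decide which sounds a student should practice first, the phoneme entries can be flattened and sorted by score. This is a sketch under the assumption that each word component carries a phonemes list with the fields shown in the response example. The helper name `weakest_phonemes` and the output keys `expected` and `heard` are hypothetical.

```python
from typing import Any


def weakest_phonemes(components: list[dict[str, Any]], limit: int = 3) -> list[dict[str, Any]]:
    """Return the lowest-scoring phonemes across all words, worst first."""
    rows = []
    for comp in components:
        if comp.get("type") != "word":
            continue  # pause components carry no phoneme data
        for ph in comp.get("phonemes", []):
            rows.append({
                "word": comp["word"],
                "expected": ph["phoneme"],
                "heard": ph["user_phoneme"],
                "score": ph["score"],
                "feedback": ph.get("feedback"),
            })
    rows.sort(key=lambda row: row["score"])
    return rows[:limit]


# Sample trimmed from the response example on this page.
components = [
    {"type": "word", "word": "Wow", "phonemes": [
        {"phoneme": "w", "user_phoneme": "w", "score": 85.0, "feedback": None},
        {"phoneme": "aʊ", "user_phoneme": "æ", "score": 58.0, "feedback": "sample guidance"},
    ]},
    {"type": "pause", "start": 1.02, "end": 1.45, "is_long": True},
    {"type": "word", "word": "check", "phonemes": [
        {"phoneme": "ʧ", "user_phoneme": "ʃ", "score": 45.0, "feedback": "sample guidance"},
        {"phoneme": "ɛ", "user_phoneme": "ɛ", "score": 98.0, "feedback": None},
    ]},
]
worst = weakest_phonemes(components, limit=2)
for row in worst:
    print(row["word"], row["expected"], row["heard"], row["score"])
```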
feedback
| Dimension | Score Range | Description |
|---|---|---|
| pronunciation | 0-100 | Score based on percentage of correctly pronounced words |
| stress | null | Not available without native reference |
| pause | null | Not available without native reference |
| speed | null | Not available without native reference |
Feedback text is localized based on the language option.
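Client code that was written for /pronunciation may index into stress, pause, or speed and fail on null. A minimal sketch of null-safe handling follows. The helper name `feedback_lines` and the sample feedback text are made up for illustration.

```python
from typing import Any, Optional

DIMENSIONS = ("pronunciation", "stress", "pause", "speed")


def feedback_lines(feedback: dict[str, Optional[dict[str, Any]]]) -> list[str]:
    """Render one line per feedback dimension, tolerating null entries."""
    lines = []
    for dimension in DIMENSIONS:
        entry = feedback.get(dimension)
        if entry is None:
            # Null (or missing) means no comparative score is available.
            lines.append(f"{dimension}: not available")
        else:
            lines.append(f"{dimension}: {entry['score']} ({entry['feedback']})")
    return lines


sample_feedback = {
    "pronunciation": {"score": 78, "feedback": "sample text"},
    "stress": None,
    "pause": None,
    "speed": None,
}
lines = feedback_lines(sample_feedback)
print("\n".join(lines))
```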
Workflow Example (Python)
import json

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://stt.thetaone-ai.com"
GOLD_TEXT = "Wow, check out this castle."
headers = {"x-api-key": API_KEY}

# Single API call; no native audio needed
with open("student_audio.wav", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/pronunciation-simple",
        headers=headers,
        files={"file": ("student_audio.wav", f, "audio/wav")},
        data={
            "options": json.dumps({
                "gold_text": GOLD_TEXT,
                "language": "ko"
            })
        }
    )

# Raise on 4xx/5xx so an error body is not parsed as a result.
response.raise_for_status()
result = response.json()
feedback = result["feedback"]
print(f"Pronunciation: {feedback['pronunciation']['score']} - {feedback['pronunciation']['feedback']}")

# Word-level details (pause components are skipped)
for comp in result["user_speech_components"]:
    if comp["type"] == "word":
        status = "OK" if comp["is_correct"] else "NG"
        print(f"  [{status}] {comp['word']} (score: {comp['score']})")
        # Report only the phonemes that were mispronounced
        for ph in comp.get("phonemes", []):
            if not ph["is_correct"]:
                print(f"    Phoneme error: {ph['phoneme']} -> {ph['user_phoneme']} (score: {ph['score']})")
                if ph.get("feedback"):
                    print(f"      {ph['feedback']}")
Error Responses
If API processing fails, a JSON body containing an error message is returned with an HTTP error status code.
400 Bad Request
There is a problem with the request format. Please check:
- Whether options is a valid JSON string
- Whether gold_text is included
- Whether the audio file contains all words (a NotAllWordsSpokenError occurs if some words are missing)
401 Unauthorized
There is a problem with API authentication. Please check if the API key is entered correctly and if the API key status is valid.
402 PAYMENT_REQUIRED
Billing-related error. Please check that your remaining credit balance is sufficient and that your payment information is valid.
429 RATE_LIMIT_EXCEEDED
This error occurs when your requests exceed the allocated requests-per-minute (RPM) limit. Please try again later, or contact us to increase the limit to suit your needs.
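One common way to "try again later" is to retry with exponential backoff. The sketch below is an illustration only: the helper name `post_with_retry`, the attempt count, and the delays are assumptions, not values recommended by the API. The `send` argument is any zero-argument callable that performs the request and returns an object with a `status_code` attribute, such as a `requests.Response`.

```python
import time
from typing import Any, Callable


def post_with_retry(send: Callable[[], Any], max_attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call send() until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ... (illustrative values).
            sleep(base_delay * (2 ** attempt))
    return response  # still 429 after the final attempt


class FakeResponse:
    """Stand-in for an HTTP response so the demo runs offline."""
    def __init__(self, status_code: int) -> None:
        self.status_code = status_code


statuses = iter([429, 429, 200])
waits: list[float] = []
final = post_with_retry(lambda: FakeResponse(next(statuses)), sleep=waits.append)
print(final.status_code, waits)
```

When `send` uploads a file, it should reopen or rewind the file on each call, because a consumed file handle would otherwise upload an empty body on retry.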
500 INTERNAL_SERVER_ERROR
This error occurs on the Theta One API server side. If it occurs, please email the error log, the time of occurrence, and the API key you used to support@thetaone.co, and we will help you resolve it quickly.
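The status codes above can be mapped to short hints for logging or for display to an operator. This is a sketch only: the function name `explain_status` and the wording of the hints are hypothetical, and it deliberately does not parse the error body, since the fields of that body are not described here.

```python
ERROR_HINTS = {
    400: "Check that options is valid JSON, gold_text is included, and the audio contains every word.",
    401: "Check that the API key is entered correctly and that its status is valid.",
    402: "Check the credit balance and the payment information.",
    429: "Rate limit exceeded; try again later or request a higher limit.",
    500: "Server-side error; send the error log and time of occurrence to support.",
}


def explain_status(status_code: int) -> str:
    """Return a short hint for a response status code from the API."""
    if 200 <= status_code < 300:
        return "OK"
    return ERROR_HINTS.get(status_code, f"Unexpected HTTP status {status_code}")


for code in (200, 400, 429, 503):
    print(code, explain_status(code))
```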