- Input accented speech: the original audio
- Accent Conversion and Improving Pronunciation: the converted audio produced in each experiment
| Input with Arabic accent | Accent conversion and Improving Pronunciation | | | |
| | Baseline | Fine-tuning with knowledge distillation loss | Fine-tuning with knowledge distillation loss and Synthetic Ground-Truth | Synthetic Ground-Truth |

| Input with Vietnamese accent | Accent conversion and Improving Pronunciation | | | |
| | Baseline | Fine-tuning with knowledge distillation loss | Fine-tuning with knowledge distillation loss and Synthetic Ground-Truth | Synthetic Ground-Truth |

| Input with Indian accent | Accent conversion and Improving Pronunciation | | | |
| | Baseline | Fine-tuning with knowledge distillation loss | Fine-tuning with knowledge distillation loss and Synthetic Ground-Truth | Synthetic Ground-Truth |

| Input with Korean accent | Accent conversion and Improving Pronunciation | | | |
| | Baseline | Fine-tuning with knowledge distillation loss | Fine-tuning with knowledge distillation loss and Synthetic Ground-Truth | Synthetic Ground-Truth |

| Input with Chinese accent | Accent conversion and Improving Pronunciation | | | |
| | Baseline | Fine-tuning with knowledge distillation loss | Fine-tuning with knowledge distillation loss and Synthetic Ground-Truth | Synthetic Ground-Truth |