Khmer Script Misidentified as Thai in Vision Framework

It is vital for Apple to refine its OCR models to correctly distinguish between Khmer and Thai scripts. Incorrectly labeling Khmer text as Thai is more than a technical bug; it is a culturally insensitive error that impacts national identity, especially given the current geopolitical climate between Cambodia and Thailand. Implementing a more robust language-detection threshold would prevent these harmful misidentifications.

There is a significant logic flaw in the VNRecognizeTextRequest language detection when processing Khmer script. When the property automaticallyDetectsLanguage is set to true, the Vision framework frequently misidentifies Khmer characters as Thai.

While both scripts share historical roots, they are distinct writing systems with different alphabets, used for different languages. Currently, the model’s confidence threshold for distinguishing between these two scripts is too low, leading to incorrect OCR output in both developer-facing APIs and Apple’s native ecosystem (Preview, Live Text, and Photos).

import SwiftUI
import Vision

class TextExtractor {
  func extractText(from data: Data, completion: @escaping (String) -> Void) {
    let request = VNRecognizeTextRequest { (request, error) in
      
      guard let observations = request.results as? [VNRecognizedTextObservation] else {
        completion("No text found.")
        return
      }
      
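      let recognizedStrings = observations.compactMap { observation -> String? in
        // Skip observations that have no candidate instead of force-unwrapping.
        guard let str = observation.topCandidates(1).first?.string else { return nil }
        return "{text: \(str), confidence: \(observation.confidence)}"
      }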
    
      completion(recognizedStrings.joined(separator: "\n"))
    }
    
    request.automaticallyDetectsLanguage = true // <-- This is the issue.
    request.recognitionLevel = .accurate
    
    let handler = VNImageRequestHandler(data: data, options: [:])
    
    DispatchQueue.global(qos: .background).async {
      do {
        try handler.perform([request])
      } catch {
        completion("Failed to perform OCR: \(error.localizedDescription)")
      }
    }
  }
}

Recognizing Khmer

The confidence score is low for Khmer text, and the output is in Thai script.

Recognizing English

The confidence score is high, as expected.

Recognizing Thai

The confidence score is high, as expected.


Issues in Preview and Photos

Khmer text

Copied text

Kouk Pring Chroum Temple [19121 รอาสายสุกตีนานยารรีสใหิสรราภูชิตีนนสุฐตีย์ [รุก
เผือชิษาธอยกัตธ์ตายตราพาษชาณา ถวเชยาใบสราเบรถทีมูสินตราพาษชาณา ทีมูโษา เช็ก
อาษเชิษฐอารายสุกบดตพรธุรฯ ตากร"สุก"ผาตากรธกรธุกเยากสเผาพศฐตาสาย รัอรณาษ"ตีพย"
สเผาพกรกฐาภูชิสาเครๆผู:สุกรตีพาสเผาพสรอสายใผิตรรารตีพสๆ เดียอลายสุกตีน
ธาราชรติ ธิพรหณาะพูชุบละเาหLunet De Lajonquiere ผารูกรสาราพารผรผาสิตภพ ตารสิทูก ธิพิ
คุณที่นสายเระพบพเคเผาหนารเกะทรนภาษเราภุพเสารเราษทีเลิกสญาเราหรุฬารชสเกาก เรากุม
สงสอบานตรเราะากกต่ายภากายระตารุกเตียน

Recommended Solutions

1. Set a Threshold

Filter out any detected result whose confidence is less than or equal to 0.5, so that low-quality text, which is where the misidentification shows up, is not returned.

For example,

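let recognizedStrings = observations.compactMap { observation -> String? in
  // Drop low-confidence observations.
  if observation.confidence <= 0.5 {
    return nil
  }
  // Skip observations that have no candidate instead of force-unwrapping.
  guard let str = observation.topCandidates(1).first?.string else { return nil }
  return "{text: \(str), confidence: \(observation.confidence)}"
}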
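
A confidence cutoff alone may not catch every misidentified line. As a complementary sketch on the app side, the recognized string can be checked for which Unicode block its characters fall in, so that Thai-script output can be flagged when the input is known to be Khmer. The names DetectedScript and dominantScript(of:) are hypothetical helpers for illustration, not part of Vision; the ranges used are the Unicode Thai block (U+0E00 to U+0E7F) and the Khmer and Khmer Symbols blocks (U+1780 to U+17FF, U+19E0 to U+19FF).

```swift
import Foundation

// Hypothetical helper type: which script dominates a recognized string.
enum DetectedScript {
  case khmer, thai, other
}

// Counts Unicode scalars in the Khmer and Thai blocks and returns the
// script with more characters. Returns .other if neither script appears.
func dominantScript(of text: String) -> DetectedScript {
  var khmerCount = 0
  var thaiCount = 0
  for scalar in text.unicodeScalars {
    switch scalar.value {
    case 0x1780...0x17FF, 0x19E0...0x19FF: // Khmer, Khmer Symbols
      khmerCount += 1
    case 0x0E00...0x0E7F: // Thai
      thaiCount += 1
    default:
      break
    }
  }
  if khmerCount == 0 && thaiCount == 0 {
    return .other
  }
  return khmerCount >= thaiCount ? .khmer : .thai
}
```

In the extractor above, an app that expects Khmer input could return nil from the compactMap closure whenever dominantScript(of: str) is .thai. This only hides the wrong output; it does not make Vision recognize Khmer.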

2. Add Khmer Language Support

This issue would not occur if the model could detect and recognize Khmer text in images.
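
To see whether a given OS version lists Khmer at all, one option is to query the request's supported recognition languages. The following is a minimal sketch: containsLanguage(_:in:) and visionSupportsKhmer() are hypothetical helper names, and it assumes Vision would report Khmer with a "km" language identifier (for example "km" or "km-KH"), which is an assumption rather than something confirmed here. The Vision call used is the instance method supportedRecognitionLanguages(), available from iOS 15 and macOS 12.

```swift
import Foundation

// Returns true if any identifier (e.g. "th-TH") has the given primary
// language subtag (e.g. "th"). Comparison is case-insensitive.
func containsLanguage(_ code: String, in identifiers: [String]) -> Bool {
  let wanted = code.lowercased()
  return identifiers.contains { identifier in
    let primary = identifier.split(whereSeparator: { $0 == "-" || $0 == "_" }).first
    return primary.map { $0.lowercased() } == wanted
  }
}

#if canImport(Vision)
import Vision

// Returns nil if the language list could not be read.
// Assumption: Khmer, if supported, is reported with the "km" subtag.
@available(macOS 12.0, iOS 15.0, tvOS 15.0, *)
func visionSupportsKhmer() -> Bool? {
  let request = VNRecognizeTextRequest()
  request.recognitionLevel = .accurate
  guard let identifiers = try? request.supportedRecognitionLanguages() else {
    return nil
  }
  return containsLanguage("km", in: identifiers)
}
#endif
```

The list depends on the recognition level and OS version, so the result is worth including in a bug report alongside the OS build.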

Doc2Text GitHub: https://github.com/seanghay/Doc2Text-Swift

Rather than just posting in the forums about this issue - and remember, in the forums you're mainly talking to other developers like you - you should raise a bug.

You can do that here: https://feedbackassistant.apple.com/

Post the FB number here when you're done.
