Text Prediction & Suggestion Engine
Overview
FlorisBoard's NLP (Natural Language Processing) system provides text prediction, autocorrect, spell checking, and word suggestions through a pluggable provider architecture.
Introduction
The prediction engine is built on a flexible provider system that allows different NLP implementations for various languages and use cases. The system supports:
- Word Suggestions: Next-word prediction based on context
- Spell Checking: Real-time detection of misspelled words, with correction suggestions
- Autocorrect: Automatic correction of typos
- Emoji Suggestions: Emoji recommendations based on text
- Clipboard Suggestions: Quick access to recent clipboard items
Key Concepts
NLP Provider Architecture
FlorisBoard uses an abstract provider interface that allows pluggable NLP implementations:
interface NlpProvider {
val providerId: String
suspend fun create()
suspend fun preload(subtype: Subtype)
}
Provider Types
SuggestionProvider
Provides word suggestions and autocorrect:
interface SuggestionProvider : NlpProvider {
suspend fun suggest(
subtype: Subtype,
content: EditorContent,
maxCandidateCount: Int,
allowPossiblyOffensive: Boolean,
isPrivateSession: Boolean,
): List<SuggestionCandidate>
suspend fun notifySuggestionAccepted(subtype: Subtype, candidate: SuggestionCandidate)
suspend fun notifySuggestionReverted(subtype: Subtype, candidate: SuggestionCandidate)
suspend fun removeSuggestion(subtype: Subtype, candidate: SuggestionCandidate): Boolean
suspend fun getListOfWords(subtype: Subtype): List<String>
suspend fun getFrequencyForWord(subtype: Subtype, word: String): Double
}
SpellingProvider
Provides spell checking services:
interface SpellingProvider : NlpProvider {
suspend fun spell(
subtype: Subtype,
word: String,
precedingWords: List<String>,
followingWords: List<String>,
maxSuggestionCount: Int,
allowPossiblyOffensive: Boolean,
isPrivateSession: Boolean,
): SpellingResult
}
NlpManager
The central coordinator for all NLP operations:
class NlpManager(context: Context) {
    // Candidates currently shown in the Smartbar
    private val activeCandidates = mutableStateListOf<SuggestionCandidate>()
    // Request timestamp paired with the candidates it produced; guarded by internalSuggestionsGuard
    private val internalSuggestionsGuard = Mutex()
    private var internalSuggestions = 0L to emptyList<SuggestionCandidate>()

    fun suggest(subtype: Subtype, content: EditorContent)

    suspend fun spell(
        subtype: Subtype,
        word: String,
        precedingWords: List<String>,
        followingWords: List<String>,
        maxSuggestionCount: Int,
    ): SpellingResult
}
Suggestion Candidates
Each suggestion source produces its own candidate type, all derived from a common sealed class:
sealed class SuggestionCandidate {
abstract val text: String
abstract val confidence: Double
abstract val isEligibleForAutoCommit: Boolean
}
data class WordSuggestionCandidate(
override val text: String,
val secondaryText: String? = null,
override val confidence: Double,
override val isEligibleForAutoCommit: Boolean,
val sourceProvider: SuggestionProvider,
) : SuggestionCandidate()
data class ClipboardSuggestionCandidate(
val item: ClipboardItem,
val sourceProvider: SuggestionProvider,
) : SuggestionCandidate()
data class EmojiSuggestionCandidate(
val emoji: Emoji,
override val confidence: Double,
) : SuggestionCandidate()
Spelling Results
sealed class SpellingResult {
abstract val suggestionsInfo: SuggestionsInfo
data class ValidWord(...) : SpellingResult()
data class Typo(val suggestions: Array<String>) : SpellingResult()
data class GrammarError(val suggestions: Array<String>) : SpellingResult()
data class Unspecified(...) : SpellingResult()
}
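Callers usually branch on the concrete result type. The sketch below shows that pattern with simplified stand-in types; `SketchSpellingResult` and `describe` are hypothetical names for illustration, and the real `SpellingResult` additionally carries a `SuggestionsInfo`.

```kotlin
// Simplified stand-ins for illustration only, not the FlorisBoard classes.
sealed class SketchSpellingResult {
    object ValidWord : SketchSpellingResult()
    data class Typo(val suggestions: List<String>) : SketchSpellingResult()
    data class GrammarError(val suggestions: List<String>) : SketchSpellingResult()
    object Unspecified : SketchSpellingResult()
}

// Turns a result into a short human-readable description.
// The `when` is exhaustive because the class is sealed.
fun describe(result: SketchSpellingResult): String = when (result) {
    is SketchSpellingResult.ValidWord -> "ok"
    is SketchSpellingResult.Typo -> "typo, try: " + result.suggestions.joinToString()
    is SketchSpellingResult.GrammarError -> "grammar, try: " + result.suggestions.joinToString()
    is SketchSpellingResult.Unspecified -> "no verdict"
}
```

Because the hierarchy is sealed, adding a new result type forces every such `when` expression to handle it.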
Implementation Details
Suggestion Flow
User Types → EditorInstance → NlpManager.suggest()
↓
Parallel Suggestion Gathering:
├─► EmojiSuggestionProvider
├─► ClipboardSuggestionProvider
└─► SuggestionProvider (Language-specific)
↓
Candidate Assembly → activeCandidates
↓
UI Display (Smartbar)
NlpManager Suggestion Process
fun suggest(subtype: Subtype, content: EditorContent) {
val reqTime = SystemClock.uptimeMillis()
scope.launch {
val emojiSuggestions = when {
prefs.emoji.suggestionEnabled.get() -> {
emojiSuggestionProvider.suggest(
subtype = subtype,
content = content,
maxCandidateCount = prefs.emoji.suggestionCandidateMaxCount.get(),
allowPossiblyOffensive = !prefs.suggestion.blockPossiblyOffensive.get(),
isPrivateSession = keyboardManager.activeState.isIncognitoMode,
)
}
else -> emptyList()
}
val suggestions = when {
emojiSuggestions.isNotEmpty() && prefs.emoji.suggestionType.get().prefix.isNotEmpty() -> {
emptyList()
}
else -> {
getSuggestionProvider(subtype).suggest(
subtype = subtype,
content = content,
maxCandidateCount = 8,
allowPossiblyOffensive = !prefs.suggestion.blockPossiblyOffensive.get(),
isPrivateSession = keyboardManager.activeState.isIncognitoMode,
)
}
}
internalSuggestionsGuard.withLock {
if (internalSuggestions.first < reqTime) {
internalSuggestions = reqTime to buildList {
addAll(emojiSuggestions)
addAll(suggestions)
}
}
}
}
}
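The timestamp comparison at the end of suggest() keeps a slow, older request from overwriting the results of a newer one. A minimal sketch of that pattern in isolation follows. `LatestResultHolder` is a hypothetical class, and a plain `ReentrantLock` stands in for the lock used above so the example needs only the standard library.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical holder that keeps only the result of the most recent request.
class LatestResultHolder<T>(initial: T) {
    private val lock = ReentrantLock()
    private var requestTime = 0L
    private var value: T = initial

    // Stores the result unless a newer request has already published one.
    // Returns true if the value was accepted.
    fun publish(reqTime: Long, result: T): Boolean = lock.withLock {
        if (requestTime < reqTime) {
            requestTime = reqTime
            value = result
            true
        } else {
            false
        }
    }

    fun current(): T = lock.withLock { value }
}
```

A request started at time 10 that finishes after one started at time 20 is simply dropped, so the visible candidates never move backwards in time.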
Spell Checking Integration
override fun onGetSuggestions(textInfo: TextInfo?, suggestionsLimit: Int): SuggestionsInfo {
textInfo?.text ?: return SpellingResult.unspecified().suggestionsInfo
setupSpellingIfNecessary()
val spellingSubtype = cachedSpellingSubtype ?: return SpellingResult.unspecified().suggestionsInfo
return runBlocking {
nlpManager
.spell(spellingSubtype, textInfo.text, emptyList(), emptyList(), suggestionsLimit)
.sendToDebugOverlayIfEnabled(textInfo)
.suggestionsInfo
}
}
Clipboard Suggestions
override suspend fun suggest(
subtype: Subtype,
content: EditorContent,
maxCandidateCount: Int,
allowPossiblyOffensive: Boolean,
isPrivateSession: Boolean,
): List<SuggestionCandidate> {
if (!prefs.clipboard.suggestionEnabled.get()) return emptyList()
val currentItem = validateClipboardItem(clipboardManager.primaryClip, lastClipboardItemId, content.text)
?: return emptyList()
return buildList {
val now = System.currentTimeMillis()
if ((now - currentItem.creationTimestampMs) < prefs.clipboard.suggestionTimeout.get() * 1000) {
add(ClipboardSuggestionCandidate(currentItem, sourceProvider = this@ClipboardSuggestionProvider, context = context))
if (currentItem.isSensitive) {
return@buildList
}
}
}
}
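The freshness test above multiplies the timeout preference by 1000 before comparing it with a millisecond difference, which suggests the preference is stored in seconds. A small sketch of the same check as a pure function, under that assumption (`isClipboardItemFresh` is a hypothetical helper):

```kotlin
// Returns true while the clipboard item is younger than the timeout.
// Assumes the timeout is given in seconds and timestamps in milliseconds.
fun isClipboardItemFresh(creationTimestampMs: Long, nowMs: Long, timeoutSeconds: Int): Boolean {
    return (nowMs - creationTimestampMs) < timeoutSeconds * 1000L
}
```

Using `1000L` keeps the multiplication in `Long` arithmetic, which avoids overflow for very large timeout values.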
Latin Language Provider Example
class LatinLanguageProvider(context: Context) : SpellingProvider, SuggestionProvider {
private val wordData = guardedByLock { mutableMapOf<String, Int>() }
override suspend fun preload(subtype: Subtype) {
wordData.withLock { wordData ->
if (wordData.isEmpty()) {
val rawData = appContext.assets.readText("ime/dict/data.json")
val jsonData = Json.decodeFromString(wordDataSerializer, rawData)
wordData.putAll(jsonData)
}
}
}
override suspend fun spell(
subtype: Subtype,
word: String,
precedingWords: List<String>,
followingWords: List<String>,
maxSuggestionCount: Int,
allowPossiblyOffensive: Boolean,
isPrivateSession: Boolean,
): SpellingResult {
return when (word.lowercase()) {
"typo" -> SpellingResult.typo(arrayOf("typo1", "typo2", "typo3"))
"gerror" -> SpellingResult.grammarError(arrayOf("grammar1", "grammar2", "grammar3"))
else -> SpellingResult.validWord()
}
}
}
Fallback Provider
object FallbackNlpProvider : SpellingProvider, SuggestionProvider {
override val providerId = "org.florisboard.nlp.providers.fallback"
override suspend fun spell(...): SpellingResult {
return SpellingResult.unspecified()
}
override suspend fun suggest(...): List<SuggestionCandidate> {
return emptyList()
}
}
Code Examples
Creating a Custom Suggestion Provider
class CustomSuggestionProvider(context: Context) : SuggestionProvider {
    override val providerId = "com.example.custom.provider"

    // Word -> relative frequency, also used as the confidence score
    private val dictionary = mutableMapOf<String, Double>()

    override suspend fun create() {
        // Initialize provider
    }

    override suspend fun preload(subtype: Subtype) {
        // Load language-specific data
        val locale = subtype.primaryLocale
        loadDictionary(locale) // placeholder helper that fills `dictionary`, not shown here
    }

    override suspend fun suggest(
        subtype: Subtype,
        content: EditorContent,
        maxCandidateCount: Int,
        allowPossiblyOffensive: Boolean,
        isPrivateSession: Boolean,
    ): List<SuggestionCandidate> {
        val currentWord = content.composingText
        if (currentWord.isBlank()) return emptyList()
        return dictionary.entries
            .filter { it.key.startsWith(currentWord, ignoreCase = true) }
            .sortedByDescending { it.value }
            .take(maxCandidateCount)
            .map { (word, frequency) ->
                WordSuggestionCandidate(
                    text = word,
                    confidence = frequency,
                    isEligibleForAutoCommit = frequency > 0.9,
                    sourceProvider = this,
                )
            }
    }

    override suspend fun notifySuggestionAccepted(subtype: Subtype, candidate: SuggestionCandidate) {
        // Update frequency or learn from user; cap at 1.0 so it stays a valid confidence
        if (candidate is WordSuggestionCandidate) {
            dictionary[candidate.text] = ((dictionary[candidate.text] ?: 0.0) + 0.1).coerceAtMost(1.0)
        }
    }

    override suspend fun notifySuggestionReverted(subtype: Subtype, candidate: SuggestionCandidate) {
        // Nothing to undo in this minimal example
    }

    override suspend fun removeSuggestion(subtype: Subtype, candidate: SuggestionCandidate): Boolean {
        if (candidate is WordSuggestionCandidate) {
            // Report success only if the word was actually present
            return dictionary.remove(candidate.text) != null
        }
        return false
    }

    override suspend fun getListOfWords(subtype: Subtype): List<String> = dictionary.keys.toList()

    override suspend fun getFrequencyForWord(subtype: Subtype, word: String): Double = dictionary[word] ?: 0.0
}
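The core of suggest() is a filter, sort, and truncate pipeline over the dictionary. The sketch below isolates that pipeline so it can be tried without any FlorisBoard types; `suggestByPrefix` is a hypothetical helper and the sample frequencies are made up.

```kotlin
// Returns up to maxCount dictionary words that start with the prefix,
// most frequent first. A blank prefix yields no suggestions.
fun suggestByPrefix(dictionary: Map<String, Double>, prefix: String, maxCount: Int): List<String> {
    if (prefix.isBlank()) return emptyList()
    return dictionary.entries
        .filter { it.key.startsWith(prefix, ignoreCase = true) }
        .sortedByDescending { it.value }
        .take(maxCount)
        .map { it.key }
}

fun main() {
    val sample = mapOf("hello" to 0.9, "help" to 0.7, "helmet" to 0.2, "world" to 0.8)
    println(suggestByPrefix(sample, "hel", 2)) // [hello, help]
}
```

This linear scan is fine for small dictionaries; larger ones usually call for an indexed structure.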
Implementing Spell Checking
override suspend fun spell(
    subtype: Subtype,
    word: String,
    precedingWords: List<String>,
    followingWords: List<String>,
    maxSuggestionCount: Int,
    allowPossiblyOffensive: Boolean,
    isPrivateSession: Boolean,
): SpellingResult {
    // Dictionary keys are stored in lowercase, so normalize once up front
    val normalized = word.lowercase()
    // Check if word exists in dictionary
    if (dictionary.containsKey(normalized)) {
        return SpellingResult.validWord()
    }
    // Generate suggestions using edit distance, computing each distance only once
    val suggestions = dictionary.keys
        .map { it to editDistance(normalized, it) }
        .filter { (_, distance) -> distance <= 2 }
        .sortedBy { (_, distance) -> distance }
        .take(maxSuggestionCount)
        .map { (candidate, _) -> candidate }
        .toTypedArray()
    return if (suggestions.isNotEmpty()) {
        SpellingResult.typo(suggestions)
    } else {
        SpellingResult.unspecified()
    }
}
private fun editDistance(s1: String, s2: String): Int {
// Levenshtein distance implementation
val dp = Array(s1.length + 1) { IntArray(s2.length + 1) }
for (i in 0..s1.length) dp[i][0] = i
for (j in 0..s2.length) dp[0][j] = j
for (i in 1..s1.length) {
for (j in 1..s2.length) {
val cost = if (s1[i - 1] == s2[j - 1]) 0 else 1
dp[i][j] = minOf(
dp[i - 1][j] + 1, // deletion
dp[i][j - 1] + 1, // insertion
dp[i - 1][j - 1] + cost // substitution
)
}
}
return dp[s1.length][s2.length]
}
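The full matrix above is easy to follow but allocates memory proportional to the product of both string lengths. A sketch of a common variant that keeps only two rows is shown below; `editDistanceTwoRow` is a hypothetical name, and it should return the same values as the matrix version.

```kotlin
// Levenshtein distance using two rolling rows instead of a full matrix.
fun editDistanceTwoRow(s1: String, s2: String): Int {
    // previous[j] holds the distance between the first i-1 chars of s1 and the first j chars of s2
    var previous = IntArray(s2.length + 1) { it }
    var current = IntArray(s2.length + 1)
    for (i in 1..s1.length) {
        current[0] = i
        for (j in 1..s2.length) {
            val cost = if (s1[i - 1] == s2[j - 1]) 0 else 1
            current[j] = minOf(
                previous[j] + 1,        // deletion
                current[j - 1] + 1,     // insertion
                previous[j - 1] + cost, // substitution
            )
        }
        // Reuse the arrays for the next row
        val swap = previous
        previous = current
        current = swap
    }
    return previous[s2.length]
}
```

For example, "kitten" and "sitting" are three edits apart, while "typo" and "type" differ by one substitution.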
Registering a Custom Provider
// In Subtype configuration
val subtype = Subtype(
    id = System.currentTimeMillis(),
    primaryLocale = FlorisLocale.from("en", "US"),
    nlpProviders = SubtypeNlpProviderMap(
        // Must match the providerId declared by the provider
        spelling = "com.example.custom.provider",
        suggestion = "com.example.custom.provider",
    ),
    // ... other configuration
)
Best Practices
1. Handle Private Sessions
override suspend fun suggest(..., isPrivateSession: Boolean): List<SuggestionCandidate> {
if (isPrivateSession) {
// Don't learn from user input
// Only use pre-existing dictionary
return suggestFromStaticDictionary(content)
} else {
// Can learn and adapt
return suggestWithLearning(content)
}
}
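The same rule applies to the learning side: anything that persists user input should be skipped while a private session is active. A minimal sketch, assuming a hypothetical `LearningDictionary` that counts accepted words:

```kotlin
// Hypothetical learner that ignores all input during private sessions.
class LearningDictionary {
    private val learned = mutableMapOf<String, Int>()

    // Records an accepted word unless the session is private.
    fun recordAccepted(word: String, isPrivateSession: Boolean) {
        if (isPrivateSession) return
        learned[word] = (learned[word] ?: 0) + 1
    }

    fun timesAccepted(word: String): Int = learned[word] ?: 0
}
```

Passing the flag into the learning call, rather than checking it at each call site, keeps the privacy rule in one place.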
2. Respect Offensive Content Filtering
override suspend fun suggest(..., allowPossiblyOffensive: Boolean): List<SuggestionCandidate> {
val candidates = generateCandidates(content)
return if (allowPossiblyOffensive) {
candidates
} else {
candidates.filter { !isOffensive(it.text) }
}
}
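How isOffensive() is implemented is left open above. One simple option is a blocklist lookup, sketched below with plain strings. `filterOffensive` and the blocklist are hypothetical; a real provider might instead read a per-word flag from its dictionary.

```kotlin
// Drops blocklisted words unless possibly offensive content is allowed.
// The blocklist is expected to contain lowercase entries.
fun filterOffensive(
    candidates: List<String>,
    blocklist: Set<String>,
    allowPossiblyOffensive: Boolean,
): List<String> {
    if (allowPossiblyOffensive) return candidates
    return candidates.filter { it.lowercase() !in blocklist }
}
```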
3. Optimize for Performance
// Preload dictionaries off the main thread
override suspend fun preload(subtype: Subtype) {
    withContext(Dispatchers.IO) {
        val locale = subtype.primaryLocale
        // Load each locale at most once
        dictionaryCache.getOrPut(locale) {
            loadDictionaryFromAssets(locale)
        }
    }
}

// Use caching for frequent operations
private val suggestionCache = LruCache<String, List<SuggestionCandidate>>(100)
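For code that has to run outside Android, where `LruCache` is unavailable, a comparable cache can be sketched on top of `LinkedHashMap` in access-order mode. `SimpleLruCache` is a hypothetical class for illustration and is JVM-only.

```kotlin
// Minimal LRU cache built on LinkedHashMap's access-order mode (JVM only).
class SimpleLruCache<K, V>(private val maxEntries: Int) {
    private val map = object : LinkedHashMap<K, V>(16, 0.75f, true) {
        // Called after each insertion; returning true evicts the least recently used entry.
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean =
            size > maxEntries
    }

    @Synchronized
    fun get(key: K): V? = map[key]

    @Synchronized
    fun put(key: K, value: V) {
        map[key] = value
    }

    @Synchronized
    fun size(): Int = map.size
}
```

Reads count as accesses, so an entry that is looked up often stays in the cache even if it was inserted first.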
4. Provide Confidence Scores
// Compute the score once so it can be reused for the auto-commit decision
val confidence = calculateConfidence(word, content)
WordSuggestionCandidate(
    text = word,
    confidence = confidence,
    isEligibleForAutoCommit = confidence > 0.95,
    sourceProvider = this,
)

private fun calculateConfidence(word: String, content: EditorContent): Double {
    val frequency = getWordFrequency(word)
    val contextMatch = getContextualRelevance(word, content)
    // Weighted blend of frequency and context, clamped to a valid score
    return (frequency * 0.6 + contextMatch * 0.4).coerceIn(0.0, 1.0)
}
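To make the weighting concrete, the sketch below isolates the blend used in calculateConfidence() as a pure function. `blendConfidence` is a hypothetical name, and both inputs are assumed to lie between 0.0 and 1.0.

```kotlin
// Blends word frequency (60%) and contextual relevance (40%) into one score.
fun blendConfidence(frequency: Double, contextMatch: Double): Double =
    (frequency * 0.6 + contextMatch * 0.4).coerceIn(0.0, 1.0)

// Mirrors the auto-commit threshold used in the snippet above.
fun isAutoCommitEligible(confidence: Double): Boolean = confidence > 0.95
```

With a frequency of 0.9 and a perfect context match, the score is 0.9 * 0.6 + 1.0 * 0.4 = 0.94, which stays just below the 0.95 auto-commit threshold. Only a word that scores near the maximum on both inputs is committed automatically.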
5. Handle Notification Events
override suspend fun notifySuggestionAccepted(subtype: Subtype, candidate: SuggestionCandidate) {
if (candidate is WordSuggestionCandidate) {
// Increase word frequency
updateWordFrequency(candidate.text, increase = true)
// Learn n-grams for better context
learnContext(candidate.text, currentContext)
}
}
override suspend fun notifySuggestionReverted(subtype: Subtype, candidate: SuggestionCandidate) {
if (candidate is WordSuggestionCandidate) {
// Decrease confidence for this suggestion
updateWordFrequency(candidate.text, increase = false)
}
}
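The helper updateWordFrequency() is not defined above. One possible shape is sketched below, assuming frequencies are kept between 0.0 and 1.0 and adjusted by a fixed step. `FrequencyStore` and the step size are hypothetical choices.

```kotlin
// Hypothetical in-memory frequency store with clamped updates.
class FrequencyStore(private val step: Double = 0.1) {
    private val frequencies = mutableMapOf<String, Double>()

    // Raises or lowers the stored frequency by one step, staying within 0.0..1.0.
    fun update(word: String, increase: Boolean) {
        val delta = if (increase) step else -step
        frequencies[word] = ((frequencies[word] ?: 0.0) + delta).coerceIn(0.0, 1.0)
    }

    fun frequencyOf(word: String): Double = frequencies[word] ?: 0.0
}
```

Clamping keeps repeated reverts from pushing a word below zero and repeated accepts from exceeding the maximum score.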
Common Patterns
Multi-Language Support
class MultiLanguageProvider : SuggestionProvider {
private val providers = mutableMapOf<String, LanguageSpecificProvider>()
override suspend fun suggest(
subtype: Subtype,
content: EditorContent,
maxCandidateCount: Int,
allowPossiblyOffensive: Boolean,
isPrivateSession: Boolean,
): List<SuggestionCandidate> {
val primaryProvider = providers.getOrPut(subtype.primaryLocale.language) {
createProviderForLanguage(subtype.primaryLocale)
}
val primarySuggestions = primaryProvider.suggest(
subtype, content, maxCandidateCount, allowPossiblyOffensive, isPrivateSession
)
// Optionally include suggestions from secondary languages
val secondarySuggestions = subtype.secondaryLocales.flatMap { locale ->
providers.getOrPut(locale.language) {
createProviderForLanguage(locale)
}.suggest(subtype, content, 2, allowPossiblyOffensive, isPrivateSession)
}
return (primarySuggestions + secondarySuggestions).take(maxCandidateCount)
}
}
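Concatenating primary and secondary results can yield the same word twice when languages share vocabulary. The sketch below shows one way to merge such lists, keeping the higher-scoring duplicate. `ScoredWord` and `mergeCandidates` are hypothetical stand-ins for the real candidate types.

```kotlin
// Simplified candidate for illustration only.
data class ScoredWord(val text: String, val confidence: Double)

// Merges both lists, keeps the best-scoring entry per word (case-insensitive),
// and returns at most maxCount results ordered by confidence.
fun mergeCandidates(
    primary: List<ScoredWord>,
    secondary: List<ScoredWord>,
    maxCount: Int,
): List<ScoredWord> =
    (primary + secondary)
        .groupBy { it.text.lowercase() }
        .map { (_, group) -> group.maxByOrNull { it.confidence }!! } // groups are never empty
        .sortedByDescending { it.confidence }
        .take(maxCount)
```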
Context-Aware Suggestions
private fun getContextualSuggestions(
currentWord: String,
precedingWords: List<String>
): List<String> {
// Use n-gram model for context
val bigrams = ngramModel.getBigrams(precedingWords.lastOrNull() ?: "")
val trigrams = if (precedingWords.size >= 2) {
ngramModel.getTrigrams(precedingWords.takeLast(2))
} else emptyList()
return (trigrams + bigrams)
.filter { it.startsWith(currentWord, ignoreCase = true) }
.sortedByDescending { ngramModel.getFrequency(it) }
}
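The ngramModel used above is not defined in this document. As a rough illustration of what such a model could look like, the sketch below counts bigrams only; `BigramModel` and its methods are hypothetical and do not match the getBigrams() and getTrigrams() calls above.

```kotlin
// Hypothetical bigram model: counts which word follows which.
class BigramModel {
    private val counts = mutableMapOf<String, MutableMap<String, Int>>()

    // Records every adjacent word pair of a tokenized sentence.
    fun learn(sentence: List<String>) {
        for ((prev, next) in sentence.zipWithNext()) {
            val followers = counts.getOrPut(prev.lowercase()) { mutableMapOf() }
            val key = next.lowercase()
            followers[key] = (followers[key] ?: 0) + 1
        }
    }

    // Returns the most frequent followers of previousWord that start with prefix.
    fun predict(previousWord: String, prefix: String, limit: Int): List<String> {
        val followers = counts[previousWord.lowercase()] ?: return emptyList()
        return followers.entries
            .filter { it.key.startsWith(prefix, ignoreCase = true) }
            .sortedByDescending { it.value }
            .take(limit)
            .map { it.key }
    }
}
```

An empty prefix gives next-word prediction, while a non-empty prefix narrows the followers to completions of the word being typed.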
Troubleshooting
Suggestions Not Appearing
Problem: No suggestions show up while typing.
Solutions:
- Check if suggestions are enabled in preferences
- Verify provider is properly registered
- Ensure dictionary is loaded
- Check for errors in provider implementation
- Verify isSuggestionOn() returns true
Spell Checking Not Working
Problem: Misspelled words are not highlighted.
Solutions:
- Verify FlorisSpellCheckerService is enabled in system settings
- Check spell checker service is running
- Ensure provider implements SpellingProvider
- Verify dictionary contains words
- Check Android spell checker settings
Poor Suggestion Quality
Problem: Suggestions are irrelevant or low quality.
Solutions:
- Improve dictionary quality and coverage
- Implement better confidence scoring
- Use context-aware n-gram models
- Filter low-confidence suggestions
- Learn from user corrections
Performance Issues
Problem: Suggestions cause lag or stuttering.
Solutions:
- Preload dictionaries asynchronously
- Use caching for frequent lookups
- Limit suggestion count
- Optimize dictionary data structure (trie, hash map)
- Profile and optimize hot paths
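As an illustration of the trie option mentioned in the list above, the sketch below stores words character by character so that a prefix lookup only visits matching branches. `PrefixTrie` is a hypothetical class and omits frequencies for brevity.

```kotlin
// Minimal trie supporting insertion and prefix lookup.
class PrefixTrie {
    private class Node {
        val children = HashMap<Char, Node>()
        var isWord = false
    }

    private val root = Node()

    fun insert(word: String) {
        var node = root
        for (ch in word) node = node.children.getOrPut(ch) { Node() }
        node.isWord = true
    }

    // Returns up to limit words starting with prefix, in alphabetical order.
    fun wordsWithPrefix(prefix: String, limit: Int): List<String> {
        var node = root
        for (ch in prefix) node = node.children[ch] ?: return emptyList()
        val result = mutableListOf<String>()
        collect(node, StringBuilder(prefix), result, limit)
        return result
    }

    // Depth-first walk that stops adding once the limit is reached.
    private fun collect(node: Node, path: StringBuilder, out: MutableList<String>, limit: Int) {
        if (out.size >= limit) return
        if (node.isWord) out.add(path.toString())
        for ((ch, child) in node.children.toSortedMap()) {
            path.append(ch)
            collect(child, path, out, limit)
            path.setLength(path.length - 1)
        }
    }
}
```

Lookup cost depends on the prefix length and the number of results rather than on the size of the whole dictionary, which is why tries suit per-keystroke queries.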
Related Topics
- Internationalization - Multi-language support
- Architecture Overview - System architecture
- Input Processing - How input flows through the system
- Custom UI Components - Displaying suggestions
Next Steps
- Explore existing providers
- Contribute to language support
- Check the FlorisBoard documentation for upcoming guides on NLP providers and dictionary formats
Note: This documentation is continuously being improved. Contributions are welcome!