Skip to content

Anic888/russian-text-quality

Repository files navigation

russian-text-quality

License: MIT Release Stars Issues Discussions Claude Code skill i18n CLDR

A Claude Code skill that catches Russian-language bugs in code that general linters and LLMs consistently miss: broken pluralization in i18n locale files, case-agreement bugs in string concatenation, terminology drift, transliterated identifiers, and inconsistent language mixing.

Code-aware, not prose-focused. If you want a Russian prose/typography reviewer, install talkstream/ru-text alongside this skill — they complement each other.


Who this is for

  • Russian-speaking developers whose projects mix English code with Russian UI, docs, or comments.
  • Teams reviewing PRs that touch Russian locale files and want structural checks before merge.
  • Non-Russian-speaking developers using Claude Code on a codebase with Russian content — this skill explains every finding in English so you understand why the Russian is wrong.

What it checks

# Check Example of what it catches Severity
1 Plural structure ru.json has items_one and items_other but is missing items_few/items_many; ICU messages with only other {...} branch; vue-i18n values with fewer than 4 pipe segments Error
2 Terminology drift The same UI concept translates to "пользователь" in one file and "юзер" in another Warning
3 Concatenation case bug `Удалить ${item.name}?`Удалить requires the accusative but item.name is nominative Warning
4 Language mixing One file with half Russian comments, half English; inline Russian string literals outside the i18n system Warning
5 Transliterated identifiers Variable names like polzovatel, kolichestvo_tovarov, nazvanie_zakaza Info (low confidence)

Full detection algorithms and test cases live in references/plural-rules.md and the other reference files.

What it does NOT check

Out of scope Use instead
Typography (« », em-dash, NBSP) talkstream/ru-text
Spelling yaspeller
ё/е restoration eyo
UX writing quality, wooden MT copy talkstream/ru-text
Translation quality from a source language N/A — not a translation memory

Why the delegation: these areas are already well-covered by purpose-built tools. Duplicating them would fragment the ecosystem. This skill specializes in the code-aware checks that no existing tool handles.


Install

As a personal Claude Code skill

git clone https://github.com/Anic888/russian-text-quality.git ~/russian-text-quality
ln -s ~/russian-text-quality ~/.claude/skills/russian-text-quality

Or copy the directory:

cp -r russian-text-quality ~/.claude/skills/

Claude Code picks up skills in ~/.claude/skills/ automatically.

Verifying the install

Open a project with Russian content and ask Claude Code:

"Review this project for Russian text quality issues."

The skill should activate and produce a Markdown report. If the project has no Russian content, it will respond "Not activated — no Russian content detected" and stop.

Optional: project-level glossary

Copy .claude/russian-glossary.example.yml to your repo at .claude/russian-glossary.yml and customize. The glossary overrides the default auto-drift detection for check #2.


Activation — when this skill runs

The skill runs only when a project contains Russian content. Specifically, it requires:

  1. At least 10 total Cyrillic characters across code and locale files (node_modules, .git, dist, build, and .gitignore'd paths excluded).
  2. At least one file containing 5 consecutive Cyrillic characters (rules out single stray tokens).

Otherwise it stays silent. Zero noise on English-only projects is a hard design goal.

You can also invoke it explicitly by asking Claude Code to "use the russian-text-quality skill" — useful before adding Russian content to a previously English-only project.


Example output

# Russian Text Review — checkout-service

## Summary
- Files scanned: 42
- Russian content in: 7 files
- Findings: 3 errors, 2 warnings, 1 info

## Findings

### Errors (must fix)
#### #1 Plural structure
- [HIGH] src/locales/ru.json:45 — key `cart.items` missing `_few`, `_many`
  - Why: Russian CLDR requires one/few/many/other. Missing `_few` breaks n=2, 3, 4.
  - Fix: scaffolded below. Review and apply.

- [HIGH] src/messages/ru.json:12 — ICU plural block missing `one`, `few`, `many`
  - Bad: `{count, plural, other {# товаров}}`
  - Good: `{count, plural, one {# товар} few {# товара} many {# товаров} other {# товара}}`

### Warnings
#### #2 Terminology drift
- [MEDIUM] `user` is translated as "пользователь" in ru.json and "юзер" in admin/ru.json.
  - Fix: pick one, lock via .claude/russian-glossary.yml.

### Info
#### #5 Transliterated identifiers
- [LOW] src/order.ts:12 — `kolichestvo_tovarov` looks like transliterated Russian.
  - Suggestion: `product_quantity` or keep Cyrillic if your project convention uses Cyrillic.

## External tools we recommend
- Typography / UX writing: talkstream/ru-text
- Spelling: yaspeller
- ё/е: eyo

Limitations and honest caveats

  • i18next v3 unsupported. The legacy key_plural suffix format is out of scope for v1. Only the v4 _one/_few/_many/_other format is recognized.
  • vue-i18n pipe detection is heuristic. Pipes in prose can false-positive. Findings for vue-i18n are always Warning severity, never Error, to reflect this.
  • Concatenation check #3 is a regex heuristic. Expect roughly 10–15% false positives. Mitigation: always Warning severity, never auto-fixed.
  • Terminology normalization uses suffix stripping, not full morphology. A real pymorphy3 integration is on the roadmap for a future version.
  • Transliteration check #5 has thin public evidence. The dictionary is seeded from common Russian dev vocabulary because live GitHub permalinks for transliterated identifiers are rare — most of them live in private corporate codebases. Always Info severity.
  • Nested ICU plurals (plural inside select) may be missed by the v1 regex-based parser.
  • Russian comments in obscure languages' file formats are not scanned. Supported languages: TypeScript, JavaScript, TSX, JSX, Vue, Svelte, Python, Ruby, PHP, Go, Rust, Kotlin, Swift, Java, C#.

Design principles

  1. Zero false positives on English-only projects. Activation is gated on Cyrillic presence; non-Russian codebases never see findings from this skill.
  2. Every finding explains the why. Non-Russian-speaking developers must understand why the Russian is wrong, not just that it is.
  3. Errors only when deterministic. The skill reserves Error severity for structural bugs (check #1). All heuristics are Warning or Info.
  4. No auto-fix without approval. Only check #1 offers auto-fixes, and only after the user explicitly approves the suggested diff.
  5. Delegation over duplication. Areas already served by good tools are explicit non-goals — see the table above.

Contributing

New trigger words for check #3, new dictionary entries for check #5, and new examples in references/examples.md are all welcome. Please see CONTRIBUTING.md for the rules. Short version:

  • New triggers and dict entries need a linguistic source link (gramota.ru, Rosenthal reference, or similar).
  • New examples need a source link — no invented examples.
  • A PR that risks introducing false positives must include a before/after test run on at least 5 open-source Russian projects.

Security

See SECURITY.md for how to report prompt-injection payloads, malicious dictionary entries, or impersonation attempts on companion tools.

License

MIT. See LICENSE.


Acknowledgments

  • talkstream/ru-text — the existing Claude Code skill for Russian prose and typography. Install it alongside this one.
  • rails-i18n — reference implementation for the CLDR-compliant Russian locale file shape.
  • gramota.ru and orfogrammka.ru — authoritative sources for Russian punctuation and morphology cited throughout the detection rules.
  • Unicode CLDR — source of truth for Russian plural categories.
  • The Badoo, Kaspersky, VK, Yandex, and ua-hosting engineering blogs on Habr, whose localization war stories seeded the real-world example set in references/examples.md.

About

A Claude Code skill that catches Russian-language bugs in code: pluralization, case agreement, terminology drift, transliterated identifiers, language mixing. Complements talkstream/ru-text.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors