Question #459 | Medium | Dart Basics | Advanced Concepts

What are Runes in Dart?

#dart #runes #unicode #utf-16 #string #code-points

Answer

Overview

In Dart, runes are the Unicode code points of a `String`. Dart strings are sequences of UTF-16 code units, which means some characters (such as most emoji) are stored as two code units (a surrogate pair). Runes give you the Unicode code points rather than the underlying UTF-16 code units.


The Problem: UTF-16 Encoding

A Dart `String` is a sequence of UTF-16 code units. Characters in the Basic Multilingual Plane (U+0000 to U+FFFF) use one code unit. Characters outside it (U+10000 to U+10FFFF, which includes most emoji) use two code units (a surrogate pair).

```dart
void main() {
  final emoji = '👨';  // U+1F468 (Man emoji)

  print(emoji.length);         // 2 (two UTF-16 code units!)
  print(emoji.codeUnits);      // [55357, 56424] (surrogate pair)
  print(emoji.runes.toList()); // [128104] (one Unicode code point: 0x1F468)
}
```
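The surrogate pair `[55357, 56424]` can be derived by hand from the code point. The following is a minimal sketch of the standard UTF-16 arithmetic; `toSurrogatePair` is a hypothetical helper written for illustration, not part of `dart:core`.

```dart
// Sketch: derive the UTF-16 surrogate pair for a supplementary code point.
// `toSurrogatePair` is a hypothetical helper, not a dart:core API.
List<int> toSurrogatePair(int codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw ArgumentError.value(
        codePoint, 'codePoint', 'not a supplementary code point');
  }
  final offset = codePoint - 0x10000;    // 20-bit value
  final high = 0xD800 + (offset >> 10);  // top 10 bits -> high surrogate
  final low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits -> low surrogate
  return [high, low];
}

void main() {
  final pair = toSurrogatePair(0x1F468);
  print(pair); // [55357, 56424]

  // Building a string from the two code units gives back the same emoji.
  print(String.fromCharCodes(pair) == '\u{1F468}'); // true
}
```

Reading `.runes` performs the inverse step: it combines each high/low pair back into a single code point.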

String.runes Property

The `runes` property returns an `Iterable<int>` of Unicode code points, combining each surrogate pair into a single value:

```dart
void main() {
  // ASCII characters: no difference
  const ascii = 'Dart';
  print(ascii.length);             // 4
  print(ascii.runes.toList());     // [68, 97, 114, 116]
  print(ascii.codeUnits);          // [68, 97, 114, 116] (same!)

  // Emoji: runes and codeUnits differ
  const emoji = '🎯🔥';
  print(emoji.length);             // 4 (each emoji = 2 code units)
  print(emoji.codeUnits.length);   // 4
  print(emoji.runes.length);       // 2 (two code points)
  print(emoji.runes.toList());     // [127919, 128293]
}
```

The Runes Class

`Runes` is a class in `dart:core` that provides an iterable of Unicode code points from a string:

```dart
void main() {
  const text = 'Hello 🌍';

  // Using the Runes class directly
  final runes = Runes(text);
  for (final rune in runes) {
    print('U+${rune.toRadixString(16).toUpperCase()} -> ${String.fromCharCode(rune)}');
  }
  // U+48 -> H
  // U+65 -> e
  // U+6C -> l
  // U+6C -> l
  // U+6F -> o
  // U+20 ->   (a space)
  // U+1F30D -> 🌍
}
```
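When you also need to know where each code point sits in the underlying UTF-16 data, the iterator of `Runes` (a `RuneIterator`) exposes that. The sketch below uses its `rawIndex` property; `runeOffsets` is a hypothetical helper name chosen for this example.

```dart
// Sketch: list the code-unit offset at which each rune starts.
// `runeOffsets` is a hypothetical helper, not a dart:core API.
List<int> runeOffsets(String s) {
  final offsets = <int>[];
  final it = s.runes.iterator; // RuneIterator from dart:core
  while (it.moveNext()) {
    offsets.add(it.rawIndex); // index of the rune's first code unit
  }
  return offsets;
}

void main() {
  const text = 'Hi \u{1F30D}!'; // 'Hi 🌍!'
  print(runeOffsets(text)); // [0, 1, 2, 3, 5]

  final it = text.runes.iterator;
  while (it.moveNext()) {
    // currentSize is 2 for a surrogate pair and 1 otherwise.
    print('offset ${it.rawIndex}, size ${it.currentSize}: ${it.currentAsString}');
  }
}
```

The jump from offset 3 to 5 is the surrogate pair: the globe occupies two code units but is reported as one rune.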

Creating Strings from Runes

Use `String.fromCharCode()` to create a string from a single Unicode code point, or `String.fromCharCodes()` to create one from a sequence of code points:

```dart
void main() {
  // Create strings from Unicode code points
  final heart = String.fromCharCode(0x2764);    // ❤
  final dart = String.fromCharCodes([0x44, 0x61, 0x72, 0x74]); // Dart
  final emoji = String.fromCharCode(0x1F600);   // 😀

  print(heart);  // ❤
  print(dart);   // Dart
  print(emoji);  // 😀

  // Using Unicode escapes in string literals
  print('I \u2764 Dart');         // I ❤ Dart
  print('\u{1F600}');              // 😀 (use {} when there are not exactly 4 hex digits)
}
```
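One place where building a string from runes matters is reversal. The following sketch compares reversing by code points with reversing by code units; both function names are hypothetical helpers written for this example.

```dart
// Sketch: reversing a string by runes keeps surrogate pairs intact.
// Both helpers are hypothetical, written only for illustration.
String reverseByRunes(String s) =>
    String.fromCharCodes(s.runes.toList().reversed);

// Reversing raw code units swaps the halves of each surrogate pair.
String reverseByCodeUnits(String s) =>
    String.fromCharCodes(s.codeUnits.reversed);

void main() {
  const text = 'ab\u{1F600}'; // 'ab😀'

  print(reverseByRunes(text) == '\u{1F600}ba');              // true
  print(reverseByCodeUnits(text) == reverseByRunes(text));   // false (pair is broken)

  // Round trip: runes -> String gives back the original string.
  print(String.fromCharCodes(text.runes) == text);           // true
}
```

Reversing by runes still reorders the code points inside multi-code-point sequences (such as ZWJ emoji), so it is only safe at the code point level.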

Practical Use Cases

1. Correctly Counting Characters

```dart
int characterCount(String text) {
  // ❌ Wrong: counts UTF-16 code units
  // return text.length;

  // ✅ Better: counts Unicode code points (a surrogate pair counts once)
  return text.runes.length;
}

void main() {
  print(characterCount('Hello'));    // 5
  print(characterCount('Hello 🌍')); // 7 (not 8!)
}
```

2. Safe String Manipulation

```dart
// ❌ Wrong: may split a surrogate pair
// (and throws a RangeError if maxLength > s.length)
String wrongTruncate(String s, int maxLength) {
  return s.substring(0, maxLength);
}

// ✅ Better: never splits a surrogate pair
String safeTruncate(String s, int maxChars) {
  final runes = s.runes.toList();
  if (runes.length <= maxChars) return s;
  return String.fromCharCodes(runes.sublist(0, maxChars));
}

void main() {
  const text = '👨‍👩‍👧‍👦 Family';
  // Keeps the first 3 code points (man, ZWJ, woman). No surrogate pair is
  // broken, but the family emoji's ZWJ sequence is cut short.
  print(safeTruncate(text, 3));
}
```
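Truncating by runes protects surrogate pairs, but a ZWJ sequence such as the family emoji is made of several code points and can still be cut in the middle. A minimal sketch of a grapheme-aware variant follows, assuming the `characters` package is listed as a dependency; `graphemeTruncate` is a hypothetical helper name.

```dart
// Assumes `characters` is declared in pubspec.yaml.
import 'package:characters/characters.dart';

// Sketch: truncate by grapheme clusters instead of code points.
// `graphemeTruncate` is a hypothetical helper, not a package API.
String graphemeTruncate(String s, int maxGraphemes) =>
    s.characters.take(maxGraphemes).toString();

void main() {
  // Family emoji (man ZWJ woman ZWJ girl ZWJ boy) followed by ' Family'.
  const text =
      '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466} Family';

  // Keeps the whole family emoji, the space, and 'F'.
  final result = graphemeTruncate(text, 3);
  print(result);
  print(result.characters.length); // 3
}
```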

3. Character Validation

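```dart
// Rough heuristic: true if the text has any code point above U+FFFF.
// Most emoji live there, but some emoji are in the BMP (e.g. U+2764 ❤)
// and many non-emoji characters are above U+FFFF.
bool containsEmoji(String text) {
  return text.runes.any((rune) => rune > 0xFFFF);
}

void main() {
  print(containsEmoji('Hello'));      // false
  print(containsEmoji('Hello 🎉'));   // true
}
```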
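Because the check only tests whether a code point lies above U+FFFF, it is really a supplementary-plane test. The sketch below shows one miss and one false alarm; the function is renamed `hasSupplementaryCodePoint` here (a hypothetical name) to reflect what it actually measures.

```dart
// Sketch: the same range test, named for what it really checks.
// `hasSupplementaryCodePoint` is a hypothetical helper name.
bool hasSupplementaryCodePoint(String text) =>
    text.runes.any((rune) => rune > 0xFFFF);

void main() {
  // U+2764 (heart) is an emoji but sits in the BMP, so it is missed.
  print(hasSupplementaryCodePoint('I \u2764 Dart')); // false

  // U+1D49C (MATHEMATICAL SCRIPT CAPITAL A) is not an emoji,
  // but it is above U+FFFF, so it is reported.
  print(hasSupplementaryCodePoint('\u{1D49C}'));     // true
}
```

Accurate emoji detection needs Unicode property data, which a plain range comparison cannot provide.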

Runes vs codeUnits vs characters

| Property | Returns | Handles Emojis | Handles Grapheme Clusters |
| --- | --- | --- | --- |
| `.length` | UTF-16 code unit count | No | No |
| `.codeUnits` | UTF-16 code units | No | No |
| `.runes` | Unicode code points | Yes | No |
| `.characters` | Grapheme clusters | Yes | Yes |

```dart
import 'package:characters/characters.dart';

void main() {
  const family = '👨‍👩‍👧‍👦'; // Family emoji (one visual character)

  print(family.length);               // 11 (8 surrogate code units + 3 ZWJ)
  print(family.codeUnits.length);     // 11
  print(family.runes.length);         // 7  (4 emoji code points joined by 3 ZWJ)
  print(family.characters.length);    // 1  (one grapheme cluster ✅)
}
```

Key Insight: Use `.runes` when you need Unicode code points rather than UTF-16 code units. For counting or slicing user-visible characters (especially with complex emoji), use the `characters` package, which works on grapheme clusters.